Translating Virtual to Physical Address on Windows: PAE, Virtual and Linear Addresses
Dejan Lukan March 21, 2013
Checking if PAE is Enabled
This was discussed in the first portion of this tutorial: please review before proceeding.
Getting the Virtual Address
Next, we need to compile the program that we’ll debug and run it on Windows. When running the program on Windows, the following will be displayed, because we’ve coded the “int 3” software interrupt into the C++ code. When the “int 3” instruction is reached, the interrupt is raised, which causes WinDbg to pause the execution of the whole Windows operating system. WinDbg then presents us with a message about a “break instruction exception,” as can be seen in the picture below:
Let’s now list the whole assembly function that corresponds to the above C++ code. The assembly code can be seen in the output below. The first instruction is the “int 3” interrupt instruction that stopped the Windows operating system and invoked the debugger. From here we can use the WinDbg t command to step into, or the p command to step over, the instructions manually.
We’re particularly interested in the virtual addresses used for the variables x and y in the C++ code. Those variables are initialized by the instruction at address 0x004113f5, which stores the constant 0xA, and by the instruction at address 0x004113ff, which stores the constant 0x14. We could step through the program instruction by instruction with the p or t command, but it’s easier to set two breakpoints on the interesting addresses with the bp command like this:
The first two commands set breakpoints on the addresses 0x004113f5 and 0x004113ff, while the third command, bl, lists all breakpoints that are currently set, which are the two we’ve just set. Then we can use the g command to resume the program, upon which breakpoint 0 will be hit, as can be seen in the picture below:
Along with the breakpoint ID and its address, the instruction that’s about to be executed is also displayed. That instruction references the address [ebp-8], so we must calculate it by printing the value of the EBP register, subtracting 8 from it, and dumping the memory at the resulting address. Let’s first display the value of EBP and dump the memory:
The variable x is located at the address 0x0012ff60 and its content is 0xcccccccc. After executing the instruction that stores 0xA at that address, the content should be 0x0000000A, as can be seen in the picture below:
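The [ebp-8] calculation above can be sketched in a few lines of C++. This is only an illustration: the function name frame_slot_address is hypothetical, and the EBP value 0x0012ff68 is not quoted in the text; it is inferred from the fact that x sits at 0x0012ff60 and is addressed as [ebp-8].

```cpp
#include <cstdint>

// Resolve an [ebp + displacement] operand to a 32-bit virtual address.
// Unsigned arithmetic wraps modulo 2^32, as address arithmetic does on 32-bit x86.
// (Hypothetical helper, not part of WinDbg or of the debugged program.)
constexpr std::uint32_t frame_slot_address(std::uint32_t ebp, std::int32_t displacement) {
    return ebp + static_cast<std::uint32_t>(displacement);
}

// Assumption: EBP inferred from the article (0x0012ff60 + 8), not read from a register dump.
constexpr std::uint32_t kAssumedEbp = 0x0012ff68;

// [ebp-8] resolves to the address at which the article finds variable x.
static_assert(frame_slot_address(kAssumedEbp, -8) == 0x0012ff60, "x lives at ebp-8");
```

In the debugger itself, the same subtraction is done by reading EBP and dumping the memory 8 bytes below it.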
We’ve gotten our first virtual address, which is 0x0012ff60. The same steps can be repeated for the y variable as well. All of the commands can be seen in the picture below:
The value 0x14 was written to the address 0x00345988, which is the second address we’re interested in. Thus, we’ve successfully gotten the two virtual addresses that we wanted. The virtual addresses are summarized below:
variable x: 0x0012ff60
variable y: 0x00345988
Getting the Linear Address
The first thing that we must do is figure out which segment the address actually belongs to. It’s clear that the variable x, whose address is 0x0012ff60, is in the stack segment, since we’re initializing the variable on the stack.
Let’s print the values of all the segment registers with the r command. The results can be seen in the picture below:
The registers are actually 16-bit, so the values correspond to the following:
SS (Stack Segment) : 0x0023
CS (Code Segment) : 0x001b
DS (Data Segment) : 0x0023
ES (Extra Segment) : 0x0023
GS (Data Segment) : 0x0000
FS (Data Segment) : 0x003b
Right now you may be confused and thinking: okay, if segmentation is in use, why do the segment registers hold the same values no matter which process is currently being debugged? Shouldn’t every process have its own segment register values? After all, code and data are not shared between processes. This isn’t true for shared libraries, but right now we’re not talking about DLL files, just executables.
You may be scratching your head trying to figure out what’s happening, but the reason is very simple. Let’s take the stack segment register as an example: 0x0023 in binary form is 0000 0000 0010 0011. The two least significant bits are used for protection, but we won’t go into that right now. The third bit declares whether the descriptor should be looked up in the global or the local descriptor table. In this case the third bit (counting from the right) is set to 0, which means that the appropriate descriptor is located in the global descriptor table (GDT). The rest of the bits make up an index into the GDT that specifies the descriptor to be used: those bits are 0000000000100, which is 0x4 in hexadecimal form.
Let’s use the same formula to get the offset of all segment registers:
SS: 0x4
CS: 0x3
DS: 0x4
ES: 0x4
GS: 0x0
FS: 0x7
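The bit layout described above can be sketched in C++ as follows. The type and function names (Selector, decode_selector) are made up for this illustration, and the lowest two bits are labeled with their usual x86 name, the requested privilege level (RPL). The selector values are the ones printed by the r command.

```cpp
#include <cstdint>

// Fields of a 16-bit x86 segment selector (hypothetical helper type).
struct Selector {
    std::uint16_t index;  // bits 3..15: descriptor index within the table
    bool local_table;     // bit 2: false = global descriptor table, true = local
    std::uint8_t rpl;     // bits 0..1: requested privilege level (the "protection" bits)
};

// Split a raw selector value into its three fields.
constexpr Selector decode_selector(std::uint16_t raw) {
    return Selector{
        static_cast<std::uint16_t>(raw >> 3),
        ((raw >> 2) & 1) != 0,
        static_cast<std::uint8_t>(raw & 3),
    };
}

// The values listed above, checked at compile time.
static_assert(decode_selector(0x0023).index == 0x4, "SS, DS and ES");
static_assert(decode_selector(0x001b).index == 0x3, "CS");
static_assert(decode_selector(0x003b).index == 0x7, "FS");
static_assert(decode_selector(0x0000).index == 0x0, "GS");
static_assert(!decode_selector(0x0023).local_table, "third bit is 0, so the GDT is used");
```

Shifting right by three bits discards the two protection bits and the table bit in one step, which is why 0x0023 and 0x001b decode to the small indices 0x4 and 0x3.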
Let’s also use the “dg 0 40” command to print the first part of the GDT, which can be seen in the picture below:
Let’s take the virtual addresses that we got in the previous step and convert them to linear addresses. A virtual address is converted to a linear address by taking the base address from the GDT descriptor (at the index specified by the relevant segment register) and adding the virtual address to it. The base address of the first five segment descriptors, which span the entire linear address space, is 0x00000000, and the virtual addresses are 0x0012ff60 and 0x00345988, so they translate directly like this:
variable x : 0x00000000 + 0x0012ff60 = 0x0012ff60
variable y : 0x00000000 + 0x00345988 = 0x00345988
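The same addition, written as a minimal C++ sketch. The names to_linear and kFlatBase are made up for this illustration, and the sketch deliberately ignores the segment limit and access checks that the processor also performs.

```cpp
#include <cstdint>

// Linear address = base address from the segment descriptor + virtual address.
// (Hypothetical helper; limit and privilege checks are left out.)
constexpr std::uint32_t to_linear(std::uint32_t segment_base, std::uint32_t virtual_address) {
    return segment_base + virtual_address;
}

// Base address reported for the descriptors used in this example.
constexpr std::uint32_t kFlatBase = 0x00000000;

// With a base of zero, the linear address equals the virtual address.
static_assert(to_linear(kFlatBase, 0x0012ff60) == 0x0012ff60, "variable x");
static_assert(to_linear(kFlatBase, 0x00345988) == 0x00345988, "variable y");
```

A descriptor with a non-zero base would shift the result by that base, which is the only case in which this step changes the address.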
This shows that segmentation is being used, because it cannot be turned off, but with a base address of 0x00000000 it doesn’t actually change anything. We can conclude that virtual addresses are the same as linear addresses here, and no translation is needed to get from one to the other.
Conclusion
In this tutorial, we’ve looked at how to figure out whether PAE is enabled, and we’ve also started working through an example in which we resolved virtual addresses to linear addresses. We saw that the Windows operating system doesn’t make any real use of segmentation, since the virtual addresses are the same as linear addresses.