Lab 4: GDB
Fall 2023
This lab will introduce you to memory layout and GDB concepts that are critical for performing buffer-overflow attacks in project 4. You will not be actively attacking code in this lab, but instead dissecting an innocent C file to understand how it appears in memory when compiled and during execution.
Memory layout and buffer-overflow exploitation depend on details of the target system. For this lab and the corresponding project, you must create your solutions inside the Project 4 VM, which has been configured to standardize the stack layout and disable certain security features that would complicate your work.
Follow the setup instructions on the Project 4 VM page. You only need to do this once; if you follow the guide for this lab, you can use the same VM and workflow for the project.
Check out your starter code from the GitHub template inside the VM. You must do this in a folder in the native Linux filesystem. It won’t work correctly if you use a shared folder located in the host OS.
You do not need to worry about running ./test.sh for the lab.
You will make use of the GDB debugger for dynamic analysis within the
VM, which you should recall from EECS 280. Useful commands that you may
not know are
Consult the GDB help for details, and don’t be afraid to experiment. The GDB reference sheet may also be useful.
There are many good references for Intel’s assembly language, but note that our project targets use the 64-bit x86_64 ISA (sometimes abbreviated to x64), not the older 32-bit x86 ISA. The stack is organized differently in x86_64 and x86. If you are reading any online documentation, ensure that it is based on the x86_64 architecture, not x86.
Also note that there are two different syntaxes for this assembly language, known as Intel syntax and AT&T syntax. They are just two ways of expressing the same code. In this class we always use Intel syntax, but keep in mind that online resources might use AT&T syntax. You can tell which is which because AT&T syntax uses percent signs (%) everywhere and Intel syntax doesn’t.
Big versus Little Endian
The final task in this lab will involve endianness. Refer to this guide if you are unfamiliar with endianness or need a refresher. Also, there are helpful images you can find via Google that visually diagram this concept.
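As a quick refresher, here is a minimal Python sketch of the two byte orders. The value 0x12345678 is an arbitrary example and does not come from the lab binary.

```python
# Arbitrary 32-bit example value (an assumption, not from the lab binary).
value = 0x12345678

# Little-endian: least significant byte is stored at the lowest address.
little = value.to_bytes(4, byteorder="little")
# Big-endian: most significant byte is stored at the lowest address.
big = value.to_bytes(4, byteorder="big")

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78
```

Both byte strings encode the same number; only the order in which the bytes sit in memory differs.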
You will write all answers in
submit.toml. The Even Better TOML VS Code extension is recommended here. Make sure that it’s installed in the VM (not just locally).
Part 1 - Examine assembly code
Open a terminal within VS Code and start GDB on the lab4 compiled binary ($ gdb lab4). Using GDB’s disas (shorthand for disassemble) command, answer the following questions regarding the assembly code:
Task 1 Where in memory is the line of assembly code that makes a call to
Task 2 Where in memory does the function
Task 3 This code uses the
printf function from libc. Where in memory does this function begin?
Takeaway: Be mindful of the difference between the address of the line that calls a function versus the address of the function itself.
Part 2 - Peer into stack during execution
Project 4 will require extensive use of GDB’s
x command to look at stack contents at various stages of execution. First, you will need to set a breakpoint at your desired location, then run the binary. Assuming you have opened GDB with the compiled lab4 binary (
$ gdb lab4), use the following commands:
(gdb) break [address/function]
(gdb) run
The break parameter can be either an explicit address (such as one of your submissions in the previous task) in the form *0x################ or the name of a function.
Refer to the GDB reference sheet as well as the lab slides for various ways to print the stack using the x command.
Task 4 Set a breakpoint at foo. Run the code. At the start of the foo function, what is the hexadecimal value of the byte stored at $rbp + 8?
Continue to the end of execution
Task 5 Disassemble foo. Look for the line that calls printf and put a breakpoint at that instruction. The breakpoint should be in foo, not in printf. Run the code, then examine memory using the x command at $rsp + n, where n is replaced with an integer of your choice (start with 0 if you’re unsure). Locate where the word “THIS” is stored in memory. Change the value of n so that the memory dump begins exactly at the capital “T”. What value of n makes this happen?
The x command is extremely versatile. Make use of its features when constructing and debugging an attack.
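The offset search in Task 5 amounts to finding the index of a byte pattern within a dump. Here is a minimal Python sketch of that idea. The dump contents and the resulting offset are made up for illustration and are not taken from the lab binary.

```python
# Hypothetical bytes as they might appear starting at $rsp.
# The padding lengths are arbitrary assumptions, not the lab's real layout.
dump = bytes(5) + b"THIS\x00" + bytes(7)

# ASCII codes of the characters you would look for in a byte-wise hex dump.
pattern_hex = " ".join(f"0x{b:02x}" for b in b"THIS")
print(pattern_hex)  # 0x54 0x48 0x49 0x53

# n is the distance from the start of the dump to the capital "T".
n = dump.find(b"THIS")
print(n)  # 5
```

In GDB you can check a candidate n by printing memory as a string, for example with x/s $rsp + n, and confirming that the output begins with the capital “T”.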
Part 3 - Endianness
Keeping track of big versus little endian can get confusing when examining the stack. The final portion of this lab will familiarize you with the output style from different GDB
x commands regarding endianness. Note that x86_64 uses little-endian byte ordering.
Continue to the end of the previous execution and remove breakpoints again.
foo multiplies the variable num. Disassemble foo and see if you can find the instruction that performs this multiplication. The next instruction moves the result of this calculation back into memory. Set a breakpoint right after it is moved into memory. Now run the code, then examine the stack in the following two ways:
(gdb) x/1gx $rsp
(gdb) x/8bx $rsp
In both cases, we are viewing the same bytes, but you’ll notice that they appear in a different order. Printing in giant word format groups every 8 bytes together, interprets them as a little-endian integer, and displays that integer the way we’re used to reading hexadecimal numbers, with the most significant digits first (big-endian). Printing the individual bytes shows how the memory is actually laid out, from the lowest address upward (little-endian in our case for x86_64).
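The relationship between the two views can be sketched in Python. This is only an illustration that mimics the look of GDB’s output; the 8 bytes below are a made-up example, not what you will see in the lab.

```python
# Hypothetical 8 bytes in memory order (lowest address first).
raw = bytes.fromhex("efbeadde00000000")

# Like x/8bx: each byte printed in the order it sits in memory.
byte_view = " ".join(f"0x{b:02x}" for b in raw)

# Like x/1gx: the same 8 bytes read as one little-endian 64-bit integer,
# printed most significant digit first.
giant_view = f"0x{int.from_bytes(raw, 'little'):016x}"

print(byte_view)   # 0xef 0xbe 0xad 0xde 0x00 0x00 0x00 0x00
print(giant_view)  # 0x00000000deadbeef
```

Reading the byte view from right to left gives the digits of the giant word view, which is why the two outputs look reversed relative to each other.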
Examine the disassembly of foo to figure out where in memory it keeps the value of num. Try reading the memory there instead of at $rsp.
Task 6 What is the value of num?
Task 7 What are the values of the raw bytes that represent
num (in order)?
Takeaway: While giant word format is more concise, when in doubt use individual byte format, as it shows what the stack space actually looks like.
Submit the following file to the Autograder by the deadline: