There is a widespread tendency nowadays to offload computation from the CPU to dedicated hardware units that offer better execution time and performance per watt than general-purpose processors. This approach involves designing new hardware components, in silicon, that outperform CPUs in tasks like AI inference and graphics acceleration. These specialized components are placed on the same chip as the general-purpose processors, and in this way a System on Chip (SoC) is born.
Looking at the SoC world as a whole, I think ARM is the most widely used architecture for the CPUs inside SoCs.
Setting aside the advantages and disadvantages of ARM, a license-based architecture, my attention was caught by RISC-V, a promising new open source CPU architecture initially developed at the University of California, Berkeley.
I'm going to leave an in-depth analysis of RISC-V for another article and focus for now on the security concerns that I have about it.
In recent years there have been several exploits that targeted CPU architectures. The best known are probably Spectre and Meltdown, which were followed by many more. These attacks, which exploit side channels on modern processors without requiring any software vulnerability and independently of the operating system, reveal the importance of an instruction set architecture (ISA), and of its implementations, designed with security in mind.
In this context, I see many advantages for open source CPU architectures like RISC-V, which allow fast updates from many contributors including security researchers and companies.
Why I chose Return-Oriented Programming
After I graduated from university I decided to study cyber security in a master's program. Among the many interesting topics was Return-Oriented Programming (ROP), a pretty clever exploit technique that has proved effective on many CPU architectures, including x86-64, ARM and SPARC. There are software protection mechanisms against ROP, including Address Space Layout Randomization (ASLR) and stack canaries.
When I studied this technique I didn't find anything about ROP on RISC-V, so I decided to tackle the subject, which resulted in my dissertation.
In this article I present how one can exploit a RISC-V CPU using ROP.
Return-Oriented Programming is a code-reuse technique that allows an attacker to bypass the Write XOR Execute (W^X) protection mechanism provided by most modern CPU architectures and operating systems.
ROP was derived from the return-into-libc attack.
Similarly to return-into-libc, ROP does not inject new code into the victim's memory space. Instead it uses code from modules that are already loaded by the exploited application. Unlike return-into-libc, however, ROP does not use whole library functions. It relies on short sequences of valid instructions, called gadgets.
Each gadget performs a short computation, like an addition or a logic operation. The key to a powerful ROP attack is chaining together all the gadgets that are needed to perform a certain task. With this approach, ROP is more flexible than return-into-libc. Moreover, it is possible to obtain a Turing complete set of gadgets.
This gives an attacker a tool with the same capabilities as a programming language, and everything is based on legitimate code that already exists in the application's memory space.
A successful ROP attack needs the following ingredients:
- buffer overflow vulnerable application
- reliable gadget chaining mechanism
- a large set of gadgets
Buffer overflow vulnerable application
We need an application that is vulnerable to buffer overflow so that we can overwrite the return address of a benign function. In this way we can redirect the execution flow to our gadgets.
Reliable gadget chaining mechanism
Everything is about branching and jumping. We want to execute some instructions and then jump to another sequence of useful instructions (a gadget). The link between gadgets must be made by the CPU itself, because we don't control the code in memory; we control only the stack.
A large set of gadgets
We want to be as powerful as possible. By choosing the gadget's structure and chaining mechanism wisely we can obtain a huge set of available gadgets from the most popular libraries, which gives us the same capabilities as a programming language.
benign_function() is vulnerable to buffer overflow. By providing a carefully crafted payload, the attacker can replace the original return address with the address of the first gadget, G1.
The payload usually contains two types of data:
- the addresses for all gadgets that are needed to perform a malicious task
- the values that are needed by the gadgets to be loaded into registers
After the code execution is redirected to libc, G1 is executed, followed by G2 and G3. Please note that there is no malicious code in libc. G1, G2 and G3 are not malicious by themselves, but the order in which the instructions are executed (instruction 3 → instruction 8) and the values that these instructions use (taken from the stack, which is controlled by the attacker) can lead to malicious behavior.
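The two kinds of payload data described above can be sketched in a few lines of Python. This is only an illustration: the buffer size, the number of saved words and every address below are made-up values, and `build_payload` is a hypothetical helper, not part of any exploit framework.

```python
import struct

def build_payload(buffer_size, saved_words, chain, word_size=4):
    """Return bytes that fill the buffer and then overwrite the return address.

    buffer_size : bytes from the start of the buffer to the saved data
    saved_words : saved words (e.g. a frame pointer) that sit before the return address
    chain       : gadget addresses and register values, in the order they lie on the stack
    """
    fmt = "<I" if word_size == 4 else "<Q"   # little-endian 32-bit or 64-bit words
    padding = b"A" * (buffer_size + saved_words * word_size)
    return padding + b"".join(struct.pack(fmt, word) for word in chain)

# Hypothetical gadget addresses, for illustration only.
G1, G2, G3 = 0x4001_0000, 0x4001_0040, 0x4001_0080

# G1 replaces the original return address; 0xDEADBEEF is a value that G1 would
# load into a register; G2 and G3 are the next gadgets in the chain.
payload = build_payload(16, 1, [G1, 0xDEADBEEF, G2, G3])
print(payload.hex())
```

The first word after the padding is the one that lands on the saved return address, so it must be the address of the first gadget.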
Basic ROP example
Now let's have a look at a basic ROP attack on x86. We start from this architecture because it offers a straightforward and easy to understand example. I will use a 32-bit example, but it works in the same manner on 64-bit (x86-64). As you will see, the gadget chaining mechanism is specific to each CPU architecture because it depends on the instructions that perform stack operations.
This example is taken from Shacham’s “The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86)”. When benign_function() is called the return address is saved on the stack and the esp register points to that stack location. If the attacker overwrites the return address, and the locations that follow, the program execution is diverted to the pop %edx gadget.
It works like this:
- The ret instruction is executed and the execution jumps to the address pointed to by esp, in our case the address of pop %edx, and the value of esp is incremented by 32 bits (point 1 in the above figure).
- pop %edx loads the edx register with the value pointed to by esp (0xdeadbeef, point 2 in the above figure) and increments esp by 32 bits.
- esp now points to the next gadget’s address (point 3 in the above figure). The ret that follows pop %edx makes the program execution jump to another gadget (the new value pointed to by esp) and increments the value of esp by 32 bits.
In this way the gadgets are chained together on x86-64 CPU architecture, leading to a very powerful attack.
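To make the pop/ret mechanics concrete, here is a minimal Python simulation. It is a sketch under simplifying assumptions: the stack is a list with one word per slot, gadgets contain only pop and ret, and the gadget addresses (0x1000, 0x2000) and the second gadget are invented for the example.

```python
def run_chain(stack, gadgets):
    """Simulate how ret and pop consume an attacker-controlled stack.

    stack   : list of words, starting at the overwritten return address
    gadgets : maps an address to its instructions, each ("pop", reg) or ("ret",);
              every gadget is assumed to end with ("ret",)
    """
    regs, trace = {}, []
    esp = 0                          # index of the word esp points to
    pc = stack[esp]                  # ret of the vulnerable function:
    esp += 1                         # jump to the word at esp, then advance esp
    for _ in range(len(stack)):      # each gadget consumes at least one word
        if pc not in gadgets:
            break
        trace.append(pc)
        for op, *args in gadgets[pc]:
            if op == "pop":          # load the word at esp, then advance esp
                regs[args[0]] = stack[esp]
                esp += 1
            elif op == "ret":        # jump to the word at esp, then advance esp
                pc = stack[esp] if esp < len(stack) else None
                esp += 1
    return regs, trace

# Hypothetical gadget addresses and contents.
gadgets = {
    0x1000: [("pop", "edx"), ("ret",)],
    0x2000: [("pop", "eax"), ("ret",)],
}
stack = [0x1000, 0xDEADBEEF, 0x2000, 0x1234]
regs, trace = run_chain(stack, gadgets)
print(regs, [hex(a) for a in trace])
```

Note that the simulation never touches the gadget code itself. The attacker's only input is the stack list, which is exactly the situation described above.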
ROP on RISC-V
RISC-V is a RISC (Reduced Instruction Set Computer) architecture based on the load-store principle. That means that the only instructions that can access memory are load and store instructions. In the base ISA, all instructions have a fixed 32-bit width and must be naturally aligned (the optional compressed extension adds 16-bit instructions). Also, unlike x86-64 or ARM, RISC-V does not have dedicated stack manipulation instructions. As we have seen before, x86-64 has POP and RET instructions that update the value of the stack pointer to point to the next value on the stack. RISC-V, on the other hand, uses a sequence of lw (load word) and addi (add immediate) instructions to load a value from the stack and to update the stack pointer. RISC-V does not have a dedicated return instruction either: ret is a pseudo-instruction that expands to jalr zero, 0(ra), which sets the program counter to ra + 0 and saves the previous program counter’s value plus four to register zero, which is hardwired to zero. This implies that the return address must be copied from the stack into ra before returning.
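The behaviour of jalr described above can be modelled in a few lines of Python. This is a sketch for a 32-bit machine; the register file is a plain dictionary and the addresses are invented for the example.

```python
MASK32 = 0xFFFFFFFF

def jalr(regs, pc, rd, rs1, imm):
    """Model `jalr rd, imm(rs1)`: return the new pc and update regs in place."""
    # The target is computed first, because rd and rs1 may be the same register.
    # The specification clears the lowest bit of the target address.
    target = (regs[rs1] + imm) & MASK32 & ~1
    if rd != "zero":                 # writes to x0/zero are discarded
        regs[rd] = (pc + 4) & MASK32
    return target

# `ret` is `jalr zero, 0(ra)`: jump to ra and throw the link value away.
regs = {"zero": 0, "ra": 0x0001_0400}
new_pc = jalr(regs, pc=0x0001_0800, rd="zero", rs1="ra", imm=0)
print(hex(new_pc), regs)
```

Because the jump target comes from ra rather than directly from the stack, an attacker who controls only the stack depends on an earlier load that copies a stack word into ra.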
Registers and Calling Convention
RV32I (Base Integer Instruction Set, 32-bit) has 32 general purpose registers. Some important aspects of the register organization are:
- x0/zero is hardwired to zero
- x1/ra holds the return address from a function
- x2/sp holds the stack pointer
- x10-x17/a0-a7 hold arguments for functions
- x8-x9, x18-x27/s0-s11 (saved registers) preserve their values across function calls; any function that uses the saved registers must restore their original value before returning
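For reference, the mapping between register numbers and ABI names can be written as a small lookup table. The registers not listed above (gp, tp and the temporaries t0-t6) are filled in from the standard RISC-V calling convention; `abi_names` is just an illustrative helper.

```python
def abi_names():
    """Map register numbers x0..x31 to their ABI names."""
    names = {0: "zero", 1: "ra", 2: "sp", 3: "gp", 4: "tp"}
    names.update({5 + i: f"t{i}" for i in range(3)})        # x5-x7   -> t0-t2
    names.update({8: "s0", 9: "s1"})                        # x8-x9   -> s0-s1
    names.update({10 + i: f"a{i}" for i in range(8)})       # x10-x17 -> a0-a7
    names.update({18 + i: f"s{i + 2}" for i in range(10)})  # x18-x27 -> s2-s11
    names.update({28 + i: f"t{i + 3}" for i in range(4)})   # x28-x31 -> t3-t6
    return names

names = abi_names()
saved = sorted(n for n, name in names.items() if name.startswith("s") and name != "sp")
print(saved)
```

The saved registers are the interesting ones for the rest of this article, because function epilogues restore them from the stack.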
Divide and conquer
As you have seen, on x86-64 the ROP attack is pretty simple. You have the POP instruction to retrieve values from the stack and you can use RET to jump to the next gadget, all without thinking about updating the stack pointer. On RISC-V the task is more complicated: we have to use a load instruction with an explicit memory address to retrieve data (from the stack or from elsewhere in memory).
To obtain a powerful ROP attack you need a reliable gadget chaining method. On x86-64 this is accomplished by the RET instruction, which updates the program counter with the address of the next gadget and advances the stack pointer. Taking this into account, one can use any function epilogue as a gadget. This approach can be used on RISC-V too, but it comes with some disadvantages.
sw a5, 1518(a4)
ld ra, 8(sp)
addi sp, sp, 16
ret
In this example we have a gadget that stores a value in memory. The first line stores the value, the second one loads the ra register with the return address from the stack, the third one updates the stack pointer, and finally ret is executed. Simple, but let’s have a look at another example.
li a0, 0
ld ra, 40(sp)
ld s0, 32(sp)
ld s1, 24(sp)
ld s2, 16(sp)
ld s3, 8(sp)
addi sp, sp, 48
ret
In this case a0 is loaded with zero, ra is loaded with the return address and s0 → s3 are loaded with their previous values from the stack, because the calling convention specifies that the saved registers must preserve their values across function calls. If this gadget is used, the attacker has to provide dummy values on the stack to be loaded into the saved registers, which implies a bigger payload. This type of function epilogue is quite common in libc.
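The payload cost of such an epilogue can be estimated with a short Python sketch. It lays out the 48 bytes that the second gadget consumes, using the offsets from the listing above. The return address 0x10400 and the filler value are made up, and `epilogue_frame` is a hypothetical helper.

```python
import struct

def epilogue_frame(frame_size, loads, values, filler=0x4141414141414141):
    """Build the stack bytes consumed by one epilogue gadget (8-byte slots).

    loads  : maps register name -> offset from sp used by its ld instruction
    values : maps register name -> value the attacker actually needs there;
             registers missing from this map receive the filler value
    """
    slots = [filler] * (frame_size // 8)
    for reg, offset in loads.items():
        slots[offset // 8] = values.get(reg, filler)
    return b"".join(struct.pack("<Q", slot) for slot in slots)

# Offsets taken from the second epilogue: ld ra, 40(sp) ... ld s3, 8(sp).
loads = {"ra": 40, "s0": 32, "s1": 24, "s2": 16, "s3": 8}
frame = epilogue_frame(48, loads, {"ra": 0x10400})   # 0x10400 is a made-up address

useful = 8                      # only the slot loaded into ra matters here
dummy = len(frame) - useful     # everything else is filler
print(len(frame), useful, dummy)
```

In this case 40 of the 48 bytes are filler, and that overhead repeats for every gadget of this shape in the chain.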
For this reason I came up with another idea. I observed that libc contains many instruction sequences that end with a jump to a saved register. So I divided the gadgets into two categories: functional gadgets and the linking gadget.
The functional gadgets are the ones that execute the useful operations for the attack and end in jalr s0→s11. So the link to the next gadget is made through the saved registers.
Here are some examples of functional gadgets.
// Load from memory to register
ld a0, 0(s0)
// Load a constant to a register
li a2, 0
// Calling the execve system call
li a7, 221
Linking gadget (the charger)
This is a special gadget that is used to link all the other gadgets together. It is the first gadget executed in the attack. It loads the saved registers with the addresses of all the functional gadgets that are going to be used, hence the name charger.
In the above figure we have an example of a charger gadget.
- The first instruction (green) loads the ra register with the address of the first functional gadget.
- The following seven instructions (blue) load the saved registers s0→s6 with the addresses of all the functional gadgets that will be used in the attack.
- The yellow instruction updates the stack pointer.
- Finally, the orange instruction executes the ret that will jump to the address loaded in ra (the first functional gadget).
Here we have a theoretical return-oriented programming attack running on the RISC-V architecture.
- The stack space of benign_function() (and all the following positions) is overflowed by the attacker and the legitimate return address is replaced with the address of the charger gadget (malicious return address).
- Before returning, benign_function() frees its stack space by moving sp forward (to the next free position).
- When benign_function() returns, the program execution jumps to the charger gadget (ld ra, 72(sp), ...), which loads the ra register with the address of the first functional gadget. After that, the saved registers s0 → s6 are loaded with values from the stack, provided by the attacker.
- At the end, the charger gadget updates sp (other values useful to the attacker can follow on the stack) and executes ret.
- At this point the execution jumps to the first functional gadget, which loads a0 with the value found at the address stored in s0 (ld a0, 0(s0)) and jumps to the address stored in s1 (jalr s1), another functional gadget.
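The flow above can be replayed with a tiny Python interpreter that understands only the few instructions involved. Everything concrete here is an assumption made for the sketch: the gadget addresses, the stack address, the offsets of s0 → s6 in the charger (only ld ra, 72(sp) is given in the text), the stack adjustment of 80 bytes, and the second functional gadget.

```python
def run(memory, code, regs, pc, max_steps=50):
    """Execute gadgets until pc leaves the known code; return the visited addresses."""
    trace = []
    for _ in range(max_steps):
        if pc not in code:
            break
        trace.append(pc)
        for op, *a in code[pc]:
            if op == "ld":                   # ld rd, off(base)
                rd, off, base = a
                regs[rd] = memory[regs[base] + off]
            elif op == "addi":               # addi rd, rs, imm
                rd, rs, imm = a
                regs[rd] = regs[rs] + imm
            elif op == "li":                 # li rd, imm
                regs[a[0]] = a[1]
            elif op == "jalr":               # jalr rs: jump to the address in rs
                pc = regs[a[0]]              # (the link write to ra is not modelled)
            elif op == "ret":                # jalr zero, 0(ra)
                pc = regs["ra"]
    return trace

# Hypothetical addresses: charger, two functional gadgets, and a stop marker.
CHARGER, F1, F2, END = 0x1000, 0x2000, 0x3000, 0x0
code = {
    CHARGER: [("ld", "ra", 72, "sp")]
             + [("ld", f"s{i}", 64 - 8 * i, "sp") for i in range(7)]
             + [("addi", "sp", "sp", 80), ("ret",)],
    F1: [("ld", "a0", 0, "s0"), ("jalr", "s1")],
    F2: [("li", "a2", 0), ("jalr", "s2")],
}

SP, DATA = 0x7000, 0x8000                    # made-up stack and data addresses
memory = {SP + off: 0 for off in range(0, 80, 8)}
memory.update({SP + 72: F1,                  # -> ra: first functional gadget
               SP + 64: DATA,                # -> s0: pointer read by F1
               SP + 56: F2,                  # -> s1: gadget that F1 jumps to
               SP + 48: END,                 # -> s2: where F2 jumps to
               DATA: 0x1234})

regs = {"sp": SP}
trace = run(memory, code, regs, CHARGER)
print([hex(a) for a in trace], hex(regs["a0"]), hex(regs["sp"]))
```

Only the charger reads from the stack. The functional gadgets chain through s1 and s2 without consuming any further payload, which is the point of this method.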
A complete functional example of return-oriented programming on RISC-V will be presented in a future article.
Advantages and disadvantages of this method
The main advantage of this method is the smaller payload needed for the attack. If the method based on function epilogues is used, the attacker has to provide dummy values for the saved registers that are restored before the function returns. This can lead to large payloads that can be ineffective in some cases. On the other hand, I didn’t find a Turing complete set of gadgets in libc, which reduces the attacker’s capabilities.
Now that we know that the RISC-V architecture can be exploited using Return-Oriented Programming we can ask: is RISC-V insecure? No. ROP attacks leverage the function return mechanism and the calling convention, which are not malicious by themselves. Moreover, one needs a vulnerable application that allows the attacker to overwrite the return address of a function. Also, there are many protection mechanisms that mitigate ROP attacks, as we will see in the next section. From this point of view, I consider that RISC-V has a significant advantage over closed CPU architectures because each processor manufacturer can choose which protection mechanisms to integrate.
What can we do to prevent ROP attacks? There are many solutions that deal with this issue. Most of them combine software and hardware support.
- Address Space Layout Randomization: this is a software security technique that randomizes the memory addresses of a process's code, stack, heap and libraries. Each time a process is launched it is placed at a random address in memory. This means, using our attack example, that the attacker doesn’t know the addresses of the gadgets (libc). Read more about this here.
- G-Free: this is a compiler-based solution that eliminates all unaligned free-branch instructions inside a binary executable. More details here.
- Stack canaries: this technique is used to detect a stack buffer overflow and prevent the jump to ROP gadgets. It works by placing a random integer value (canary) on the stack, right before the return address. The canary value is checked before the return instruction is executed to make sure that the return address was not modified (an overflow that runs from the buffer up to the return address changes the canary value too). More details here.
- Zipper Stack: this is a novel technique (2019) that protects the return addresses with a chain structure based on encryption. It is able to protect all the return addresses on the stack and ensures that functions return in the correct order. The authors implemented this solution on a RISC-V CPU and modified the RISC-V instruction set. Here, the advantages of an open source CPU architecture are highlighted. More details here.
- FIXER: another novel technique (2019) that provides a protection mechanism against ROP by enforcing control-flow integrity. It was also implemented and tested on a RISC-V CPU.
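Of the mechanisms listed above, the stack canary is the easiest to illustrate. The Python sketch below models a frame as buffer, canary and return address, and shows that an unchecked copy which reaches the return address also destroys the canary. The frame layout, the 8-byte canary size and all addresses are simplifying assumptions, not how any particular compiler lays out its frames.

```python
import secrets

def make_frame(buffer_size, return_address):
    """Model a stack frame as: buffer | canary | return address."""
    canary = secrets.token_bytes(8)          # fresh random value per frame
    frame = bytearray(buffer_size) + canary + return_address.to_bytes(8, "little")
    return frame, canary

def unsafe_copy(frame, data):
    """Copy without a bounds check, as the vulnerable function would."""
    frame[:len(data)] = data

def canary_intact(frame, buffer_size, canary):
    """The check performed right before the function returns."""
    return bytes(frame[buffer_size:buffer_size + 8]) == canary

BUF = 16
frame, canary = make_frame(BUF, 0x10400)     # 0x10400 is a made-up return address

unsafe_copy(frame, b"A" * 8)                 # fits in the buffer
print(canary_intact(frame, BUF, canary))

# The overflow must cross the canary slot to reach the return address.
overflow = b"A" * BUF + b"B" * 8 + (0x1000).to_bytes(8, "little")
unsafe_copy(frame, overflow)
print(canary_intact(frame, BUF, canary))
```

The second check fails unless the attacker happens to write the exact canary value, which is why the canary has to be random and kept secret.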
If you like this subject and want to dive deeper I leave here a further reading list. I will come back soon with a hands-on article regarding ROP on RISC-V.
There are two separate research papers that tackle the subject:
Both papers were published in 2020 and I’m sure you will find interesting things there. I performed my research independently of them and published an abstract at ACM CCSW 2020.
Regarding ROP in general I recommend:
- The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86)
- When good instructions go bad: Generalizing return-oriented programming to RISC
- Return-oriented programming without returns
- Return oriented programming for the ARM architecture
- ROP is Still Dangerous: Breaking Modern Defenses