Programming, Software and Code

ParrotTheory: The C System Stack



I talked a lot about control flow on the C system stack in my post about the inferior runloops problem in Parrot. Since not everybody is familiar with all the mechanics of the platform ABI, I figured I would write up a theory post about how all that works.

C and Assembly Code

The compilation process for a language like C is very straight-forward: Write C code, invoke compiler, convert code to machine code, execute it. Ultimately for a program written in C code or any language at all for that matter to execute, it must be converted to a stream of machine code instructions. The hardware simply doesn't understand anything else. Any C code can be translated, by hand if necessary into an equivalent sequence of assembly instructions, which in turn can be translated (again, you can do this by hand) into machine code. Assembly instructions and machine code tend to have a one-to-one relationship, but C code has a one-to-many relationship with assembly. It's these variants that an optimizer searches for, finding faster ways to perform the same operations. It is also possible to convert a listing of assembly code into equivalent C code. This is called "decompilation", and is an important technique in the field of reverse engineering.

C code and assembly can be converted back and forth, and are generally equivalent. There are some operations in assembly that cannot be strictly converted to C code, but that's a subset we are able to ignore for now.

Function Calls

Look now at a piece of C code that we should all be familiar with, a function call:

x = Foo(a, b, c)

This line of code translates, more or less, into this sequence of x86 assembly:

mov eax, 'c'
push eax
mov eax, 'b'
push eax
mov eax 'a'
push eax
call _Foo
mov 'x', eax
sub esp, 12

I'm obviously simplifying in some places. This is probably not what your compiler would actually generate, but is not entirely dissimilar to what you would get if variables a, b, c, and x were all declared volatile, and if you didn't optimize. To make a function call, the arguments are all pushed onto the system stack (usually in left-to-right order) and the function is called. The 'call' operation does two things: it pushes a return address onto the stack and then jumps to the function Foo. When the function returns, the return value is stored in the eax register (except floating-point values, which are different and beyond the scope of this post). The important point to take away from this is that the arguments and the function return value are all on the system stack in a very particular order when the function call was made. Notice also that we need to "clean" the stack after the function call: We move the stack pointer back to where it was before pushing the three arguments. The return address is popped automatically in the ret command that we will see next.

As a general rule, the calling function should have absolutely no knowledge or interaction with the callee's stack frame or anything below it. Likewise, the callee should have no knowledge of the caller's stack frame or anything above it.

Now, let's look at the guts of the function Foo. Here's some C code to implement it:

int Foo(int a, int b, int c)
{
return a + b + c;
}

Converting to assembly, this is:

_Foo:
mov eax, 0
mov ebx, [esp+4]
add eax, ebx
mov ebx, [esp+8]
add eax, ebx
mov ebx, [esp+12]
add eax, ebx
ret

Again, not what you are likely to actually see as the output from a good compiler. I've intentionally omitted some details for simplicity. The values a, b and c are stored on the stack, so internally to the function we read them off the stack using offsets of the esp stack register. We cannot pop the values here because popping would move the return address (it's on the top of the stack, remember).

The Stack

This is all well and good for the simple example, but what if we want to have local variables in the function? And more importantly, how do we make local variables in a way that we can recurse properly? To do this, we need to create a stack frame. But before we talk about frames, let's talk about the stack.

The stack is just a contiguous region of memory that is pointed to by the stack register esp. The stack grows "down", so it starts at a high value in memory and each additional item pushed goes to the next lowest place on the stack. The "push eax" instruction does this, essentially:

mov [esp], eax
sub esp, 4

And the "pop eax" instruction reverses this process:

add esp, 4
mov eax, [esp]

The biggest difference is that the push and pop instructions execute much more quickly then the add/mov combination does otherwise.

Stack Frames

With this knowledge of the stack, we can now talk about local variables in a function. Let's rewrite Foo to have some local temporary variables:

int Foo(int a, int b, int c)
{
int x, y;
x = a + b;
y = x + c;
return y;
}

Converting to assembly now, we create space on the stack for x and y:

_Foo:
push ebp ; Save the old value of ebp
mov ebp, esp ; ebp points to the stack
sub esp, 8 ; make space on the stack for x and y
...
mov esp, ebp
pop ebp
ret

The code I've shown you here is nearly identical for every function created in C. These are the standard entry and exit sequences for x86 machines. The old value of ebp is pushed onto the stack to save it. This is so we can recurse easily without having to worry about the caller's stack. ebp is then set to point to esp, the "top" of the stack. Then we decrease esp, creating space on the stack to store local variables without having to manually "push" space for each one individually. From this point forward we can address all function arguments as offsets from ebp, and all function variables as offsets from esp. This moving stack frame allows us to recurse the function with no limitations besides stack space. Inside the function this is where the variables are stored. Remember that in x86 assembly, the [] brackets are pointer dereferences like "prefix:*" in C.
When we return, we move esp back to where ebp is, pop the stored value of ebp off the stack (for the calling function), and then pop the return address off the stack and jump to it. In the calling function we "add esp 12" to clean up the arguments from the stack and continue execution from there.

It's worth mentioning that if we create storage on the stack, such as a string or array, and write past the bounds of that array, we can accidentally overwrite things like the stored value of ebp or even the return address. A stack overflow attack does this exact thing to write a specific value to the return address, which causes execution to "return" to somebody's malicious code somewhere. The C standard library string.h is notorious for allowing these kinds of things to happen.

When I talk about having functions "on the stack", what I mean is that we have stack frames for those functions on the stack above where we are currently executing.

Backtraces

If we look at a backtrace in a debugger like GDB, we see a listing of functions representing a call sequence. This sequence can be found by tracing the stack: starting at the current function and iterating over the stack. GDB can find the return address at [ebp + 4], follow that to the caller function, and repeat the process until we reach main(). Combine finding addresses with finding metadata about the function name (in terms of debugging symbols) and we can produce a proper backtrace. Commands in GDB follow a lot of stack terminology: "frame" shows us the current frame, "up" moves us up to the previous stack frame and "down" moves us down to the next stack frame.

Continuation Passing Style

Parrot doesn't use stacks for it's control flow. Instead, it uses continuations. It's beyond the scope of this post to really talk about the differences and nuances of each. Instead of being stored on the system stack, continuation objects form a linked list (actually, more of a tree or graph) in the heap. By doing this, continuations aren't automatically cleaned up when they are invoked, and we can reuse them for things like loops, coroutines, closures, and all sorts of control flow optimizations and other fun things.

Conclusion

That's a quick overview of the C system stack. For more information, I've contributed to a book on Wikibooks called x86 Disassembly that talks more about the translation from assembly language back into C code. It covers function calls and stack frames in far more detail then I could in a single blog post. Of course, I would be happy to post more if people still have questions about it all.

In Parrot, it's difficult to think of control flow as a unified thing. We either think of PIR control flow or C control flow, both of which are very different from the other! We can, if we need to, convert PIR down to C code, and from there convert to assembly or machine code too. It's all equivalent and interrelated,

Comments

Is it possible to find the address of the caller function(address as in address of the function and not the return address) in C..btw nice post.

-Niks

Thanks! In a general sense, it's not really possible to find the address of the calling function. At least, it's not possible if you don't know which function it is.

In C, function names are all pointers. So if we have

int foo(void){}

Then the identifier "foo" is a pointer to that function. So, if we know that "foo" calls "bar", then bar can call "foo()", etc.

One thing you could try on some architectures, if you want to get really tricky, is this:
1) Find the return address on the stack
2) Take a pointer to that location and start looping through memory until you find the first instance of the instruction "push ebp"
3) In a non-optimized binary on x86 using normal calling conventions this SHOULD be the address of the function.

Some systems also have libraries that enable backtracing, so you may be able to get access to that and find out that way.



This entry was originally posted on Blogger and was automatically converted. There may be some broken links and other errors due to the conversion. Please let me know about any serious problems.