Tuesday, January 19, 2010

Memory Layout of a C Program

high     --------------
        |               |
        | Arguments and |
        |  environment  |
        |   variables   |
        |               |
        |---------------|
        |     Stack     |<--|--
        |(grow downward)|   |
        |               |   |User
        |               |   |Stack
        |               |   |Frame
        |               |   |
        | (grow upward) |   |( Mind the Gap )
        |      Heap     |<--|--
        |---------------|
        |      BSS      |<-- uninitialized static data(block started by symbol
|||| | | long sum[1000];
|| | |
        |---------------|
        |      Data     |<-- initilized static data(int   maxcount = 99)
        |---------------|
        |      Code     |<-- text segment machine instructions     low  
Stack : where automatic variables are stored, along with information that is
saved each time a function is called. Each time a function is called, the
address of where to return to and certain information about the caller's
environment, such as some of the machine registers, are saved on the stack. The
newly called function then allocates room on the stack for its automatic and
temporary variables. This is how recursive functions in C can work. Each time a
recursive function calls itself, a new stack frame is used, so one set of
variables doesn't interfere with the variables from another instance of the
function.
Text Segment: The text segment contains the actual code to be executed. It's
usually sharable, so multiple instances of a program can share the text segment
to lower memory requirements. This segment is usually marked read-only so a
program can't modify its own instructions.
Initialized Data Segment: This segment contains global variables which are
initialized by the programmer.
Uninitialized Data Segment: Also named "bss" (block started by symbol) which
was an operator used by an old assembler. This segment contains uninitialized
global variables. All variables in this segment are initialized to 0 or NULL
pointers before the program begins to execute.
The stack: The stack is a collection of stack frames which will be described in
the next section. When a new frame needs to be added (as a result of a newly
called function), the stack grows downward.
Every time a function is called, an area of memory is set aside, called a stack frame,
for the new function call. This area of memory holds some crucial information, like:
1. Storage space for all the automatic variables for the newly called function.
2. The line number of the calling function to return to when the called function
returns.
3. The arguments, or parameters, of the called function.
The heap: Most dynamic memory, whether requested via C's malloc() and friends
or C++'s new is doled out to the program from the heap. The C library also gets
dynamic memory for its own personal workspace from the heap as well. As more
memory is requested "on the fly", the heap grows upward.

Constants are stored in text segment since they are non-writeable. However u can make them to writeable by moving them to data segment. gcc provides this functionality
gcc -fwritable-strings sample.c // this will keep constants in data segment

References:
http://discussknowhow.blogspot.com/2008/08/memory-layout-of-c-program-stack-wise.html