Thursday 18 September 2014

concept of stack and heap memory

Q:

Programming language books usually explain that value types are created on the stack, and reference types are created on the heap, without really explaining what these two things are.  With my only programming experience being in high level languages, I haven't read a clear explanation of this.  I mean I understand what _a stack_ is, but where and what are they (relative to the physical memory of a real computer)? 

* To what extent are they controlled by the OS or language runtime?
* What is their scope?
* What determines the size of each of them?
* What makes one faster?


  [1]: http://stackoverflow.com/questions/655308/what-is-the-difference-between-the-stack-and-heap-data-structures

ANSWER-1:

The stack is the memory set aside as scratch space for a thread of execution.  When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data.  When that function returns, the block becomes unused and can be used the next time a function is called.  The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed.  This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

The heap is memory set aside for dynamic allocation.  Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time.  This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly: 

> _To what extent are they controlled by the OS or language runtime?_

The OS allocates the stack for each system-level thread when the thread is created.  Typically the OS is called by the language runtime to allocate the heap for the application.

> _What is their scope?_

The stack is attached to a thread, so when the thread exits the stack is reclaimed.  The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

> _What determines the size of each of them?_ 

The size of the stack is set when a thread is created.  The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

> _What makes one faster?_
 
The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or free.  Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration:
>![][1]
<br/><sub>Image source: [vikashazrati.wordpress.com](http://vikashazrati.wordpress.com/2007/10/01/quicktip-java-basics-stack-and-heap/)</sub>


  [1]: http://i.stack.imgur.com/i6k0Z.png

ANSWER-2:

**Stack:**

  - Stored in computer RAM just like the heap.
  - Variables created on the stack will go out of scope and automatically deallocate.
  - Much faster to allocate in comparison to variables on the heap.
  - Implemented with an actual stack data structure.
  - Stores local data, return addresses, used for parameter passing
  - Can have a stack overflow when too much of the stack is used. (mostly from infinite (or too much) recursion, very large allocations)
  - Data created on the stack can be used without pointers.
  - You would use the stack if you know exactly how much data you need to allocate before compile time and it is not too big.
  - Usually has a maximum size already determined when your program starts


**Heap:**

  - Stored in computer RAM just like the stack.
  - Variables on the heap must be destroyed manually and never fall out of scope.  The data is freed with delete, delete[] or free
  - Slower to allocate in comparison to variables on the stack.
  - Used on demand to allocate a block of data for use by the program.
  - Can have fragmentation when there are a lot of allocations and deallocations
  - In C++ data created on the heap will be pointed to by pointers and allocated with new or malloc
  - Can have allocation failures if too big of a buffer is requested to be allocated.
  - You would use the heap if you don't know exactly how much data you will need at runtime or if you need to allocate a lot of data.
  - Responsible for memory leaks


**Example:**

<!-- language: c++ -->

    int foo()
    {
      char *pBuffer; //<--nothing allocated yet (excluding the pointer itself, which is allocated here on the stack).
      bool b = true; // Allocated on the stack.
      if(b)
      {
        //Create 500 bytes on the stack
        char buffer[500];

        //Create 500 bytes on the heap
        pBuffer = new char[500];

       }//<-- buffer is deallocated here, pBuffer is not
    }//<--- oops there's a memory leak, I should have called delete[] pBuffer;

ANSWER-3:

The most important point is that heap and stack are generic terms for ways in which memory can be allocated.  They can be implemented in many different ways, and the terms apply to the basic concepts.

 - In a stack of items, items sit one on top of the other in the order they were placed there, and you can only remove the top one (without toppling the whole thing over).

 ![Stack like a stack of papers][1]

 - In a heap, there is no particular order to the way items are placed.  You can reach in and remove items in any order because there is no clear 'top' item.

 ![Heap like a heap of licorice allsorts][2]

It does a fairly good job of describing the two ways of allocating and freeing memory in a stack and a heap.  Yum!

 * To what extent are they controlled by the OS or language runtime?

 As mentioned, heap and stack are general terms, and can be implemented in many ways.  Computer programs typically have a stack called a [call stack][3] which stores information relevant to the current function such as a pointer to whichever function it was called from, and any local variables.  Because functions call other functions and then return, the stack grows and shrinks to hold information from the functions further down the call stack.  A program doesn't really have runtime control over it; it's determined by the programming language, OS and even the system architecture.

 A heap is a general term used for any memory that is allocated dynamically and randomly; i.e. out of order.  The memory is typically allocated by the OS, with the application calling API functions to do this allocation.  There is a fair bit of overhead required in managing dynamically allocated memory, which is usually handled by the OS.

 * What is their scope?

 The call stack is such a low level concept that it doesn't relate to 'scope' in the sense of programming.  If you disassemble some code you'll see relative pointer style references to portions of the stack, but as far as a higher level language is concerned, the language imposes its own rules of scope.  One important aspect of a stack, however, is that once a function returns, anything local to that function is immediately freed from the stack.  That works the way you'd expect it to work given how your programming languages work.  In a heap, it's also difficult to define.  The scope is whatever is exposed by the OS, but your programming language probably adds its rules about what a "scope" is in your application.  The processor architecture and the OS use virtual addressing, which the processor translates to physical addresses and there are page faults, etc.  They keep track of what pages belong to which applications.  You never really need to worry about this, though, because you just use whatever method your programming language uses to allocate and free memory, and check for errors (if the allocation/freeing fails for any reason).

 * What determines the size of each of them?

 Again, it depends on the language, compiler, operating system and architecture.  A stack is usually pre-allocated, because by definition it must be contiguous memory (more on that in the last paragraph).  The language compiler or the OS determine its size.  You don't store huge chunks of data on the stack, so it'll be big enough that it should never be fully used, except in cases of unwanted endless recursion (hence, "stack overflow") or other unusual programming decisions.

 A heap is a general term for anything that can be dynamically allocated.  Depending on which way you look at it, it is constantly changing size.  In modern processors and operating systems the exact way it works is very abstracted anyway, so you don't normally need to worry much about how it works deep down, except that (in languages where it lets you) you mustn't use memory that you haven't allocated yet or memory that you have freed.

 * What makes one faster?

 The stack is faster because all free memory is always contiguous.  No list needs to be maintained of all the segments of free memory, just a single pointer to the current top of the stack.  Compilers usually store this pointer in a special, fast [register][4] for this purpose.  What's more, subsequent operations on a stack are usually concentrated within very nearby areas of memory, which at a very low level is good for optimization by the processor on-die caches.


  [1]: https://i.stack.imgur.com/ZLzMV.jpg
  [2]: https://i.stack.imgur.com/kINqo.jpg
  [3]: http://en.wikipedia.org/wiki/Call_stack
  [4]: http://en.wikipedia.org/wiki/Stack_register

ANSWER-4:

(I have moved this answer from another question that was more or less a dupe of this one.)

The answer to your question is implementation specific and may vary across compilers and processor architectures. However, here is a simplified explanation.

- Both the stack and the heap are memory areas allocated from the underlying operating system (often virtual memory that is mapped to physical memory on demand).
- In a multi-threaded environment each thread will have its own completely independent stack but they will share the heap. Concurrent access has to be controlled on the heap and is not possible on the stack.

The heap
--------

- The heap contains a linked list of used and free blocks. New allocations on the heap (by `new` or `malloc`) are satisfied by creating a suitable block from one of the free blocks. This requires updating list of blocks on the heap. This _meta information_ about the blocks on the heap is also stored on the heap often in a small area just in front of every block.
- As the heap grows new blocks are often allocated from lower addresses towards higher addresses. Thus you can think of the heap as a _heap_ of memory blocks that grows in size as memory is allocated. If the heap is too small for an allocation the size can often be increased by acquiring more memory from the underlying operating system.
- Allocating and deallocating many small blocks may leave the heap in a state where there are a lot of small free blocks interspersed between the used blocks. A request to allocate a large block may fail because none of the free blocks are large enough to satisfy the allocation request even though the combined size of the free blocks may be large enough. This is called _heap fragmentation_.
- When a used block that is adjacent to a free block is deallocated the new free block may be merged with the adjacent free block to create a larger free block effectively reducing the fragmentation of the heap.

![The heap][1]

The stack
---------

- The stack often works in close tandem with a special register on the CPU named the _stack pointer_. Initially the stack pointer points to the top of the stack (the highest address on the stack).
- The CPU has special instructions for _pushing_ values onto the stack and _popping_ them back from the stack. Each _push_ stores the value at the current location of the stack pointer and decreases the stack pointer.  A _pop_ retrieves the value pointed to by the stack pointer and then increases the stack pointer (don't be confused by the fact that _adding_ a value to the stack _decreases_ the stack pointer and _removing_ a value _increases_ it. Remember that the stack grows to the bottom). The values stored and retrieved are the values of the CPU registers.
- When a function is called the CPU uses special instructions that push the current _instruction pointer_, i.e. the address of the code executing on the stack. The CPU then jumps to the function by setting the
instruction pointer to the address of the function called. Later, when the function returns, the old instruction pointer is popped from the stack and execution resumes at the code just after the call to the function.
- When a function is entered, the stack pointer is decreased to allocate more space on the stack for local (automatic) variables. If the function has one local 32 bit variable four bytes are set aside on the stack. When the function returns, the stack pointer is moved back to free the allocated area.
- If a function has parameters, these are pushed onto the stack before the call to the function. The code in the function is then able to navigate up the stack from the current stack pointer to locate these values.
- Nesting function calls work like a charm. Each new call will allocate function parameters, the return address and space for local variables and these _activation records_ can be stacked for nested calls and will unwind in the correct way when the functions return.
- As the stack is a limited block of memory, you can cause a _stack overflow_ by calling too many nested functions and/or allocating too much space for local variables. Often the memory area used for the stack is set up in such a way that writing below the bottom (the lowest address) of the stack will trigger a trap or exception in the CPU. This exceptional condition can then be caught by the runtime and converted into some kind of stack overflow exception.

![The stack][2]

> Can a function be allocated on the heap instead of a stack?

No, activation records for functions (i.e. local or automatic variables) are allocated on the stack that is used not only to store these variables, but also to keep track of nested function calls.

How the heap is managed is really up to the runtime environment. C uses `malloc` and C++ uses `new`, but many other languages have garbage collection.

However, the stack is a more low-level feature closely tied to the processor architecture. Growing the heap when there is not enough space isn't too hard since it can be implemented in the library call that handles the heap. However, growing the stack is often impossible as the stack overflow only is discovered when it is too late; and shutting down the thread of execution is the only viable option.


  [1]: http://i.stack.imgur.com/0Obi0.png
  [2]: http://i.stack.imgur.com/9UshP.png

ANSWER-5:

In the following C# code

    public void Method1()
    {
        int i = 4;
        int y = 2;
        class1 cls1 = new class1();
    }


Here's how the memory is managed

![Picture of variables on the stack][1]

`Local Variables` that only need to last as long as the function invocation go in the stack. The heap is used for variables whose lifetime we don't really know up front but we expect them to last a while. In most languages it's critical that we know at compile time how large a variable is if we want to store it on the stack.

Objects (which vary in size as we update them) go on the heap because we don't know at creation time how long they are going to last. In many languages the heap is garbage collected to find objects (such as the cls1 object) that no longer have any references.

In Java, most objects go directly into the heap. In languages like C / C++, structs and classes can often remain on the stack when you're not dealing with pointers.

More information can be found here:

[The difference between stack and heap memory allocation &laquo;  timmurphy.org][2]

and here:

[Creating Objects on the Stack and Heap][3]

This article is the source of picture above: [Six important .NET concepts: Stack, heap, value types, reference types, boxing, and unboxing - CodeProject][4]

but be aware it may contain some inaccuracies.

  [1]: http://i.stack.imgur.com/NS0k7.jpg
  [2]: http://timmurphy.org/2010/08/11/the-difference-between-stack-and-heap-memory-allocation/
  [3]: http://root.cern.ch/download/doc/ROOTUsersGuideHTML/ch06s03.html
  [4]: http://www.codeproject.com/Articles/76153/Six-important-NET-concepts-Stack-heap-value-types#Stack%20and%20Heap

ANSWER-7:

**The Stack**
When you call a function the arguments to that function plus some other overhead is put on the stack. Some info (such as where to go on return) is also stored there.
When you declare a variable inside your function, that variable is also allocated on the stack.

Deallocating the stack is pretty simple because you always deallocate in the reverse order in which you allocate. Stack stuff is added as you enter functions, the corresponding data is removed as you exit them. This means that you tend to stay within a small region of the stack unless you call lots of functions that call lots of other functions (or create a recursive solution).

**The Heap**
The heap is a generic name for where you put the data that you create on the fly. If you don't know how many spaceships your program is going to create, you are likely to use the new (or malloc or equivalent) operator to create each spaceship. This allocation is going to stick around for a while, so it is likely we will free things in a different order than we created them.

Thus, the heap is far more complex, because there end up being regions of memory that are unused interleaved with chunks that are - memory gets fragmented. Finding free memory of the size you need is a difficult problem. This is why the heap should be avoided (though it is still often used).

**Implementation**
Implementation of both the stack and heap is usually down to the runtime / OS. Often games and other applications that are performance critical create their own memory solutions that grab a large chunk of memory from the heap and then dish it out internally to avoid relying on the OS for memory.

This is only practical if your memory usage is quite different from the norm - i.e for games where you load a level in one huge operation and can chuck the whole lot away in another huge operation.

**Physical location in memory**
This is less relevant than you think because of a technology called [Virtual Memory][1] which makes your program think that you have access to a certain address where the physical data is somewhere else (even on the hard disc!). The addresses you get for the stack are in increasing order as your call tree gets deeper. The addresses for the heap are un-predictable (i.e implimentation specific) and frankly not important.


  [1]: http://en.wikipedia.org/wiki/Virtual_memory

ANSWER 8:

Others have answered the broad strokes pretty well, so I'll throw in a few details.

1. Stack and heap need not be singular.  A common situation in which you have more than one stack is if you have more than one thread in a process.  In this case each thread has its own stack.  You can also have more than one heap, for example some DLL configurations can result in different DLLs allocating from different heaps, which is why it's generally a bad idea to release memory allocated by a different library.

2. In C you can get the benefit of variable length allocation through the use of alloca, which allocates on the stack, as opposed to alloc, which allocates on the heap.  This memory won't survive your return statement, but it's useful for a scratch buffer.

3. On windows making a huge temporary buffer that you don't use much of is not free. This is because the compiler will generate a stack probe loop that is called every time your function is entered to make sure the stack exists (because windows uses a single guard page at the end of your stack to detect when it needs to grow the stack, if you access memory more than one page off the end of the stack you will crash).  Example:

<!-- language: c++ -->

    void myfunction()
    {
       char big[10000000];
       // do something that only uses for first 1K of big 99% of the time.
    }

No comments:

Post a Comment