0% found this document useful (0 votes)
142 views

Prog 8

Uploaded by

capiron22
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
142 views

Prog 8

Uploaded by

capiron22
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 61

EPITA - Practical Programming

Memory
Overview
1. Multi-task And Memory Management
a. Issues ?
b. Virtual Memory
2. Processus Memory Organization
3. malloc(3)
a. Heap Management ?
b. A Simple First Fit Allocator
4. Other Kind Of Allocators
Multi-Task And Memory
Issues ?

➢ Sharing one memory with several processes


➢ Each process wants contiguous memory blocks
➢ We need confinement between processes
➢ How to deal with dynamic memory
➢ Binary program may not be relocatable
➢ …
Solutions
➢ Fixed size memory partitions: too much
constraints
➢ Full swapping: way too slow …
➢ Dynamics partitioning: continuity issues and
probably no possible confinement …
➢ Relocatable code: complex and risky

Can we do better ?
Virtual Memory

Virtual Memory maps memory addresses used by a


program, called virtual addresses, into physical
addresses in computer memory.
Virtual Memory

➢ All memory accesses are rewritten on the fly


➢ Each process has its own virtual address space
➢ Rewriting is done by the MMU (hardware)
➢ Rewriting tables are managed by the Kernel
Benefits

➢ Memory is always contiguous for each process


➢ Memory has a fixed layout for each process
➢ No swapping are required
➢ Each process is alone in its virtual address space

Virtual Memory provides a complete solution.


Mapping ?

➢ Can we do a one-to-one mapping ?


➢ A translation table needs 8 bytes (64bits) per
translated address, let’s do some math …
➢ For n bytes of virtual memory, we need 8n bytes

To map the memory we need 8 times its size !


Frames and Pages

➢ Memory is splitted in blocks


➢ Block in physical memory is called frame
➢ Block in virtual memory is called page
➢ We only translate page base into fram base
➢ Page/Frame base addresses are shorter
➢ Usually, page/frame have fixed size
Virtual Memory
Virtual Unmapped Current Physical
Memory Memory
Mapped Other

Unmapped Other

Unmapped Current


Mapped Other

Mapped Current

Pages Translation Example
Intel ia32 (32bits) schema:
➢ We use 4KB pages (12bits)
➢ Pages are grouped in page directory
➢ Translation is done in two step:
1. Translate the page directory base address
2. Translate the page base address
➢ The page directory table contains 1024 entry of 4
bytes each pointing to page table of the same
size. A full mapping takes at most 4MB for 4GB.
Process Memory
Memory Organization
➢ Each process has its own address space
➢ The memory is divided in segments:
● Text Segment: the process image (code)
● Data Segment: static initialized variables
● BSS Segment: static uninitialized variables (fill with 0)
● Heap: where dynamic allocation should take place
● Memory Mapping Segment: mapped files, libs …
● Stack: process (main thread) stack
➢ The kernel is also mapped, in the higher part of
the memory.
Process Layout (Linux 32bits)
0

Text Segment
Data Segment
BSS Segment

3GB Heap

Memory Mapping Segment

Stack

1GB Kernel Space 232


Where’re My Variables ?

➢ Global variables belongs to the BSS Segment


➢ Global constants belongs to the Data Segment
➢ Local variables are on the stack
➢ Dynamic allocation is done on the heap
➢ Parameters use registers, if possible, and the
stack.
Where’re My Variables
void f(int x) { // Let’s print some addresses
// x is in a register but has printf("x : %p\n",&x);
// an address on the stack printf("y : %p\n",&y);
int y; // on the stack printf("tab : %p\n",tab);
int tab[8]; // on the stack printf("&tab: %p\n",&tab);
char *s1 = "a constant string"; printf("s1 : %p\n",s1);
// s1 is on the stack but points printf("s2 : %p\n",s2);
// to the Data segment printf("s3 : %p\n",s3);
char *s2 = malloc(32); printf("&s3 : %p\n",&s3);
// s2 is on the stack but points }
// to the heap
char s3[] = "a local string";
// the string is on the stack
Where’re My Variables
x : 0x7fffffffd5ec
y : 0x7fffffffd5e8
tab : 0x7fffffffd5c0
&tab: 0x7fffffffd5c0
s1 : 0x400732
s2 : 0x800c07040
s3 : 0x7fffffffd5b0
&s3 : 0x7fffffffd5b0
Static Arrays

int tab[8];

➢ Static arrays always verify: (tab == &tab)


➢ The address is not stored
➢ Address usage are replaced at compile time
➢ Static local string literals are similar
Mind Variables Lifetime
struct list { int main() {
struct list *next; struct list *l = NULL;
int value; for (int i=0; i<10; ++i)
}; l = add(l, i);
assert(l != l->next);
struct list * return 0;
add(struct list *l, int x) { }
struct list *r, tmp;
tmp.next = l;
tmp.value = x; Assertion failed:
r = &tmp; (l != l->next), function
return r; main, file stupid_list.c,
} line 23.
malloc(3)
malloc(3)
➢ C defines allocation functions: malloc(3)
The malloc() function allocates size bytes of uninitialized
memory. The allocated space is suitably aligned (after possible
pointer coercion) for storage of any type of object.

The malloc() and calloc() functions return a pointer to the
allocated memory if successful; otherwise a NULL pointer is
returned and errno is set to ENOMEM.
FreeBSD man page
malloc(3)

➢ The returned pointer points to a contiguous


memory zone with a size superior or equal to the
asked size.
➢ No further invocation of malloc(3) will return a
pointer to an overlapping area.
➢ Memory acquire with malloc(3) can be release
using free(3).
Managing The Heap

➢ Dynamic allocation goes to the heap


➢ The heap start after the end of the BSS
➢ The actual end of the heap is called the break
➢ The break can be moved using brk(2) and sbrk(2)
The Heap

break

Mapped Area Unmapped Area

page boundary

Growing Memory Addresses


brk(2) and sbrk(2)
int brk(const void *addr);
void *sbrk(intptr_t incr);

➢ brk(addr) moves the break at address addr


● returns 0 if successful
● returns -1 and set errno otherwise
➢ sbrk(incr) moves the break by incr bytes
● returns the address of the old break when succesful
● returns (void*)(-1) and set errno otherwise
Simplest Allocator

void *malloc(size_t s) {
void *r;
if ( (r = sbrk(s)) == (void*)(-1) )
return NULL;
return r;
}
Comments

Pros:
➢ Minimal cost (just the cost of sbrk(2))
➢ No waste space and no overhead
Cons:
➢ Can’t implement free(3)
➢ realloc(3) is unsafe
What Do We Need ?

➢ We need information about memory chunk:


● start of the chunk
● end of the chunk
● availability (freed or not)
➢ We need to store those data
➢ We need to find available chunks of a given size

We need meta-data !
Strategy

➢ First-Fit: when searching for a given size returns


the first chunk sufficiently large.
➢ Best-Fit: when searching for a given size returns
the smallest chunk sufficiently large.
➢ Simple linear organization (linked list)
➢ More advanced structures (buddies, slab ... )
A First Fit Allocator
Meta-Data ?

➢ We add meta-data to each chunk

➢ we add the meta-data before the chunk

➢ We link chunk like a long linked list


Meta-data

meta-data data
➢ size
➢ free
➢ next

size bytes

Pointer
returned
Structured Heap

data data data data


Meta-Data

struct chunk {
struct chunk *next; ➢ next is the list pointer
size_t size; ➢ we keep data size only
➢ free indicate chunk availability
int free;
➢ we may add information later
};
Construction Of malloc

➢ Manage the first call for init


➢ Enforce aligned addresses: align sizes.
➢ When calling malloc with size s:
● align the size (eventually force a minimal size)
● scan the list of chunk for a free chunk sufficiently large
● if none are available, extend the heap with sbrk(2)
● mark the chunk as used and return a pointer just after the
meta-data struct.
● if it goes wrong, return NULL
Heap Init
void *base() {
static struct chunk *b = NULL;
➢ Get the base of the heap
if (!b) {
/* Initial call */ ➢ if not initialized:
b = sbrk(sizeof (struct chunk));
if (b == (void*)(-1)) { ● add place for an empty chunk
/* error management */ ● save the old break
_exit(71 /* EX_OSERR */); ● build a sentinel
}
/* Build a sentinel */ ➢ return the saved base
b->next = NULL;
b->size = 0; ➢ if something goes wrong, die.
b->free = 0;
}
➢ We should probably test the initial
return b; address of the break for alignment
}
Searching For A Free Chunk
struct chunk* find_chunk(size_t s)
{
struct chunk *cur = get_base();
for (; cur->next; cur = cur->next) {
if (cur->next->free && cur->next->size >= s)
break;
}
return cur;
}

➢ Basic list search


➢ Returns the pointer to the element before the free block
Adding New Chunk
struct chunk *new_chunk(size_t s,
struct chunk *prev)
{
➢ Used when no chunk are available
struct chunk *b;
b = sbrk(s + sizeof (struct chunk)); ➢ Move the break of asked size plus
if (b == (void*)(-1)) { meta-data size
if (errno == ENOMEM)
return NULL; ➢ Old break points to the new chunk
/* error management */
_exit(71 /* EX_OSERR */); ➢ Init the new meta-data block
}
prev->next = b; ➢ In case of error:
b->next = NULL; ● If not enough mem: return NULL
b->size = s; ● Otherwise: die !
b->free = 1;
return b;
}
free ?

free(p)
➢ First we need to verify p:
● it must be between the base of the heap and its end
● it must aligned on sizeof (void*)
➢ Then we must access the meta-data:
● they lie (sizeof (struct chunk)) bytes before p
● we just have to set the free field to 1
Valid call to free ?

➢ Is p a pointer returned by malloc ?


➢ We’ve done only the minimal tests
➢ Can we do better ?
● traverse the list of chunk expensive
● add a magic number to the meta-data not sure
● available data can be verified some how ...
Tricks
➢ First add data address in the meta-data
struct chunk {
struct chunk *next;
size_t size;
int free;
void *data; // pointer to data
};

➢ Then check for the address in free


struct chunk *b = p;
b -= 1;
if (b->data == p)
b->free = 1;
More ...
➢ calloc(3) : checks for integer overflow
➢ realloc(3) :
● simple version: malloc, copy, free
● try to reuse current block
● Beware of special cases
➢ Special cases:
● malloc(0) : must return a valid pointer
● realloc(NULL, size) : alias for malloc(size)
● realloc(ptr, 0) : alias for free(ptr)
● when realloc(3) fails, it returns NULL and does not free
Wasted Space And
Fragmentation
Wasted Space

➢ first-fit may choose ridiculously too large chunk


➢ best-fit requires clever storage for decent perf
➢ we should split chunk when they are too large:
● if the founded check is larger than required
● if the extra amount is enough to store meta-data
● force a minimal size to avoid too small chunk
● update the size of the chunk and populate meta-data
Split Chunk

meta meta
data
data data

meta meta meta


data data
data data data

k n-k-sizeof(chunk)
Fragmentation

➢ Internal Fragmentation: wasted space → solved


➢ External Fragmentation: too many small chunks
● after some round of malloc/free, all chunks are small
● sum of free space is sufficient for large allocation
● but no suitable chunk can be found
● a lot of space is lost in meta-data
Merge !

➢ The solution is to merge chunks


➢ When performing free:
● if the next chunk is also free we can merge with it
● we just update the size
● and make next point to the next of the merged chunk
➢ Is it enough ?
Merge More ?

Sure that all adjacent free chunks are merged ?

➢ The next of the next ? No, to expensive !


➢ The previous chunk ? Yes
➢ If each time a chunk is freed we merge its
previous and next chunk, we win !
➢ We need to access the previous chunk !
Double Link !

data data data data


Merge
p

meta meta
meta
data data data data data
data
(free) (free)

meta
data
data
Doing Better ?

➢ Only link free chunks


➢ Embedded meta-data inside the chunk
➢ Using a best-fit ?
● Finding the best chunk is expensive
● Free chunk can be linked with chunks of the same size
● Buddy Algorithm: chunks are split/merge by pair
● Slab Allocator: pools of chunk of the same size
➢ Use mmap(2) with or instead of sbrk(2)
More Allocators !
More Allocators ?

malloc(3) is highly constrained:


● Allocate any possible size
● Free chunk independently

For specific usage we may have better strategy


Allocate Many, Free Once

➢ Many programs can be sliced into stages


➢ Memory will be freed all at once at end of stage
➢ malloc(3) is not adapted to this case
● we uselessly pay the cost of fine grain free
● Freeing one by one can be difficult and expensive
➢ For that we can use a pool allocator !
Pool Allocator

➢ Allocate once and for all a huge single block


➢ Maintain an index of the currently used space
➢ Give chunk piece by piece as required
➢ Release memory by simply set the index to 0
➢ Allocation and release are in O(1)
➢ No overhead, no wasted space !
Pool Allocator

index

Data

Allocated Area Free Area


Pool Allocator
typedef struct pool {
size_t poolsize, index;
char *data;
} *mempool;

void reset_pool(mempool mem) {


mem->index = 0;
}

void *alloc_pool(mempool mem, size_t s) {


void *r = NULL;
if (mem->index + s < mem->poolsize) {
r = mem->data + mem->index;
mem->index += s;
}
return r;
}
Recycling Pool

➢ We have a bunch of allocations of the same size


➢ We often allocate/free blocks
➢ We can use a pool for allocation
➢ Rather than really freed blocks, we reuse them !
➢ Again we have O(1) allocate and free !
Free List
➢ To provide easy reuse we add a free list
➢ We keep a list of free block
➢ When freeing a block:
● we use the first word to store a pointer (our next field)
● we bind the block to the free list (first place add)
● we replace the entry of the free list by the block
➢ When allocate:
● try to take the first element of the free list
● take another chunk of the pool
Recycler
typedef struct rpool {
size_t poolsize, bsize, index;
char *data;
void **free_list;
} *recycler;

void *alloc_rpool(recycler mem) {


void *r = NULL;
if (mem->free_list) {
r = mem->free_list;
mem->free_list = *(mem->free_list);
} else if (mem->index + mem->bsize < mem->poolsize) {
r = mem->data + mem->index;
mem->index += mem->bsize;
}
return r;
}

void free_rpool(recycler mem, void *p_) {


void **p = p_;
*p = mem->free_list;
mem->free_list = p;
}

You might also like