Project 1
Feedback from 2017 said this project was too simple. To that end, there is now an addition called Default Destructors below. The rest of the text is unchanged because there was no time to update it. So first read everything through, then read Default Destructors and think of it as an extension which might augment or contradict things stated elsewhere (like the size of the object header, etc.).
1 Introduction and Domain
This project is about manual memory management in C. There are several reasons for choosing this domain for the class project:
- We do not get to work on it enough earlier in the course
- It is at a suitable level of abstraction for a project in C
- It forces you to really get comfortable with pointers, which is one of the hardest concepts in programming
The goal of this project is to develop a C library for managing memory in arbitrary C programs. Traditionally, C programs use malloc() (plus calloc() and realloc()) and free() to reserve and return memory.
This approach is hard in the sense that there is nothing that helps a programmer decide when some memory can safely be returned to the system. Failing to return memory causes leaks, which will eventually make programs crash and burn. Returning memory too early will instead cause programs to corrupt memory (because technically we then have two objects overlapping in memory). Returning the same memory multiple times with free() tends to lead to hard crashes. In short, manual memory management is an unforgiving undertaking.
2 Functional Requirements
2.1 Reference Counting
The system you are implementing should support reference counting. This means that each object has an associated counter that tracks the number of pointers to that object. When an object is created, its reference count is 0. The count can subsequently be manipulated by the functions retain() and release(), which increment and decrement the reference count by 1, respectively.
If release() is called on an object with a reference count of 1, the object is considered garbage and shall be free'd.
Calling retain() and release() on NULL is supported and should simply be ignored.
For simplicity, we will only count pointers on the heap. If an object is allocated and never retain()'d, it can be deleted by calling deallocate(). Calling deallocate() on an object with a non-zero reference count is an error.
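To make these rules concrete, here is a minimal sketch of retain() and release(), assuming each object is preceded by a hidden header. The header layout is an illustration only, not part of the specification, and the cascade limit from section 2.3 is ignored here.

#include "refmem.h"
#include <stdlib.h>

typedef struct header
{
  size_t rc;                /* reference count, starts at 0 */
  function1_t destructor;   /* may be NULL */
} header_t;

static header_t *header_of(obj *o)
{
  return ((header_t *) o) - 1;    /* the header sits just before the user data */
}

void retain(obj *o)
{
  if (o) header_of(o)->rc += 1;   /* retain(NULL) is ignored */
}

void release(obj *o)
{
  if (o == NULL) return;          /* release(NULL) is ignored */
  header_t *h = header_of(o);
  if (h->rc <= 1)
    {
      /* the count was 1: the object is now garbage */
      if (h->destructor) h->destructor(o);
      free(h);
    }
  else
    {
      h->rc -= 1;
    }
}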
2.2 Destructors
The system you are implementing should support destructor functions. This means that each object may have one associated function that is called right before the object's memory is free'd.
Destructors are useful for resource management, including freeing complicated data structures. For example, consider a linked list. The destructor for a link should be a function that calls release() on the link's next pointer. Similarly, the destructor for the list head calls release() on both first and last.
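As an illustration, such destructors could look like the sketch below. The field names next, first and last come from the text above; the struct layouts themselves are assumptions.

#include "refmem.h"

struct link { struct link *next; int value; };            /* illustrative layouts */
struct list { struct link *first; struct link *last; };

void link_destructor(obj *l)
{
  release(((struct link *) l)->next);   /* release() accepts NULL */
}

void list_destructor(obj *l)
{
  struct list *list = l;
  release(list->first);
  release(list->last);
}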
2.3 Dealing with Cascading Frees
Consider the linked list example above: if implemented naively, freeing a very large linked list (think millions of elements) will at best cause the program to spend many consecutive cycles on freeing links, and at worst cause the program to run out of stack space. Neither behaviour is acceptable.
The system you are implementing must avoid both these pathological cases, by supporting setting an upper limit on how many objects are free'd at once (set_cascade_limit()).
To avoid oversubscribing to memory, you can delay outstanding free's until the next allocation. When freeing in conjunction with allocation, the cascade limit must be respected as usual, unless the amount of memory free'd is less than the number of bytes requested for allocation. In that case, your system is allowed to free objects until the requested number of bytes has been free'd or there are no more objects to free.
To reduce memory pressure, the system should also provide a cleanup() function that free's all objects whose reference count is 0, regardless of the cascade limit.
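Below is a minimal sketch of how the cascade limit and the deferred frees could fit together. It assumes an internal FIFO of zero-reference objects; pending_count(), pending_pop(), object_size() and destroy() are hypothetical helpers, not part of the required interface.

#include "refmem.h"

extern size_t pending_count(void);   /* hypothetical: number of zero-rc objects waiting */
extern obj   *pending_pop(void);     /* hypothetical: remove and return the oldest one */
extern size_t object_size(obj *);    /* hypothetical: size stored in the object's metadata */
extern void   destroy(obj *);        /* hypothetical: run destructor (if any), return memory */

static size_t cascade_limit = 128;   /* some default; the specification does not fix one */

void set_cascade_limit(size_t limit) { cascade_limit = limit; }
size_t get_cascade_limit()           { return cascade_limit; }

/* Called by allocate()/allocate_array() before servicing a new request. */
static void free_pending(size_t bytes_requested)
{
  size_t freed_objects = 0;
  size_t freed_bytes   = 0;
  while (pending_count() > 0
         && (freed_objects < cascade_limit || freed_bytes < bytes_requested))
    {
      obj *o = pending_pop();
      freed_bytes += object_size(o);
      destroy(o);
      ++freed_objects;
    }
}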
2.4 Complete Header File
#pragma once

#include <stddef.h>   // needed for size_t

typedef void obj;
typedef void(*function1_t)(obj *);

void retain(obj *);
void release(obj *);
size_t rc(obj *);
obj *allocate(size_t bytes, function1_t destructor);
obj *allocate_array(size_t elements, size_t elem_size, function1_t destructor);
void deallocate(obj *);
void set_cascade_limit(size_t);
size_t get_cascade_limit();
void cleanup();
void shutdown();
The function rc() returns an object's reference count. The allocate() function is the equivalent of malloc() and takes a function pointer to a destructor function, possibly NULL. The allocate_array() function is the equivalent of calloc() and also takes a function pointer to a destructor function, possibly NULL, which is called for each non-NULL element. The function deallocate() is the equivalent of free() but is only allowed to be called on objects p for which rc(p) == 0. Calling deallocate() on an object runs its destructor (if any).
Note: You are allowed to call malloc(), calloc() and free() under the hood to allocate and return memory. Think of these as low-level functions whose existence you are hiding from your users.
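For concreteness, here is a small usage example of the interface above. It is a sketch only; the zeroing behaviour of allocate_array() is inferred from "the equivalent of calloc()".

#include "refmem.h"
#include <assert.h>

int main(void)
{
  /* An object that is never retained can be removed with deallocate(). */
  int *numbers = allocate_array(10, sizeof(int), NULL);   /* calloc-like: presumably zeroed */
  assert(rc(numbers) == 0);
  deallocate(numbers);      /* legal: rc is 0; there is no destructor to run */

  int *scalar = allocate(sizeof(int), NULL);
  retain(scalar);
  assert(rc(scalar) == 1);
  /* deallocate(scalar) here would be an error, since rc(scalar) != 0 */
  release(scalar);          /* the count was 1, so the object is free'd */

  shutdown();               /* tear down the library's internal data */
  return 0;
}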
2.5 Dealing With Reference Count Overflow
Naturally, the size of the reference counter is an important design choice. For example, by choosing an 8-bit integer we keep the memory overhead small, but are then only allowed 255 incoming references to each object. Choose the size of your reference counter wisely, document the choice, and explain how your system handles (or does not handle) reference counter overflow.
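One possible policy (a sketch, not a requirement) is to let a small counter saturate at its maximum value, so that a saturated object is simply never freed, rather than having its count wrap around to 0:

#include <stdint.h>

#define RC_MAX UINT16_MAX

typedef struct header
{
  uint16_t rc;
  /* ... destructor, size, etc. ... */
} header_t;

static void rc_increment(header_t *h)
{
  if (h->rc < RC_MAX) h->rc += 1;   /* saturate instead of wrapping around */
}

static void rc_decrement(header_t *h)
{
  if (h->rc == RC_MAX) return;      /* a saturated object is pinned: never freed */
  h->rc -= 1;
}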
3 Non-Functional Requirements
Naturally, your program must not leak memory, have other memory errors, or contain bugs in general.
3.1 Amortising the Memory Management Overhead over Program Run-Time
Amortising the memory management overhead over the program run-time means that the cost of managing memory is distributed relatively evenly over the program's execution. Contrast this with automatic garbage collectors that suspend the program completely to manage memory from time to time.
You achieve this by properly respecting the cascade limit (see above).
3.2 Memory Efficiency
You may only use a constant factor of memory overhead in your implementation, relative to malloc()/calloc(). For example, a program that uses B bytes of memory with malloc()/calloc() may only use k * B bytes using your system. Additionally, k may not be larger than 2.
You must hand in a convincing proof/back-of-envelope calculation that explains why your implementation satisfies this requirement.
For the sake of the calculation, you may assume that no allocation is smaller than 8 bytes, but you may not assume a limit on the number of allocations made by programs using the system, nor a limit on the amount of live objects or the amount of (logical) garbage.
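As a sketch of such a calculation (the header layout is an assumption, not part of the specification): if every object carries a hidden header consisting of an 8-byte destructor pointer and an 8-byte reference count, the worst case is a program that only makes minimum-sized allocations, giving \(k = (8 + 16)/8 = 3 > 2\). Shrinking the header to 8 bytes in total (e.g. a small counter plus a compact destructor representation) gives \(k = (8 + 8)/8 = 2\), which just satisfies the requirement.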
Keeping memory overhead low is important, but ignore that in your first prototype. Solve the problem of getting something up and running first, then figure out how to optimise. Here are some tricks you can use to keep memory overhead small. These may or may not be usable together:
- Use a small scalar value for your reference count.
- When an object is garbage, you can reuse its memory – including its reference counter – to store whatever you want.
- If the number of destructors is small, there might be a more compact way to store them than a full pointer.
Try to make the simplest and smallest number of optimisations needed to satisfy the requirements. Remember that keeping the code readable is of paramount importance! Don't go XOR'ing pointers or anything until you actually need to.
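To make the third trick concrete, here is one hedged sketch: instead of an 8-byte function pointer, the header stores a 1-byte index into a table of registered destructors. The 255-destructor limit is an assumption that would have to be documented.

#include <stdint.h>
#include <assert.h>
#include "refmem.h"

static function1_t destructor_table[256];   /* slot 0 is reserved for "no destructor" */
static uint8_t destructor_count = 0;

/* Returns a 1-byte id that can live in the object header instead of an 8-byte pointer. */
static uint8_t destructor_id(function1_t d)
{
  if (d == NULL) return 0;
  for (size_t i = 1; i <= destructor_count; ++i)
    {
      if (destructor_table[i] == d) return i;
    }
  assert(destructor_count < 255);           /* documented limit of this representation */
  destructor_table[++destructor_count] = d;
  return destructor_count;
}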
3.3 Validation and Verification
Since you are providing a low-level service on top of which many other services are expected to build, your implementation must be rock solid. You must provide a comprehensive set of tests that demonstrate the stability of your system and the fulfilment of its specification. Use a combination of CUnit and valgrind to harden your implementation.
You must document code coverage for your tests.
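As a starting point, a CUnit test for the retain/release rules from section 2.1 might look like the sketch below. Run the resulting binary under valgrind as well; gcc's --coverage flag together with gcov is one way to document coverage.

#include <CUnit/Basic.h>
#include "refmem.h"

void test_retain_release(void)
{
  obj *o = allocate(16, NULL);
  CU_ASSERT_EQUAL(rc(o), 0);
  retain(o);
  CU_ASSERT_EQUAL(rc(o), 1);
  release(o);   /* count drops from 1: o is free'd; valgrind should report no leak */
}

int main(void)
{
  CU_initialize_registry();
  CU_pSuite suite = CU_add_suite("refmem", NULL, NULL);
  CU_add_test(suite, "retain/release round trip", test_retain_release);
  CU_basic_set_mode(CU_BRM_VERBOSE);
  CU_basic_run_tests();
  CU_cleanup_registry();
  return 0;
}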
4 Demonstration
As additional proof that your implementation is correct, you will provide a version of a successfully demonstrated Z101 program, adapted to use your memory management system. This involves using retain() and release() sensibly throughout the program.
You are required to change your tree and list libraries so that trees and lists storing object pointer elements call retain() on the elements on insertion (etc.). Naturally, removing an object pointer element from a list will call release() on it.
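As a sketch of what such a change could look like (list_t, link_in() and unlink_at() are placeholders for whatever your own list library provides):

#include "refmem.h"

typedef struct list list_t;                        /* your own list type */
extern void link_in(list_t *list, obj *element);   /* hypothetical helper from the list library */
extern obj *unlink_at(list_t *list, int index);    /* likewise hypothetical */

void list_append(list_t *list, obj *element)
{
  retain(element);        /* the list now holds a pointer to the element */
  link_in(list, element);
}

obj *list_remove(list_t *list, int index)
{
  obj *element = unlink_at(list, index);
  release(element);       /* the list no longer holds a pointer to it */
  return element;
}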
5 Simplifications and Assumptions
You may have to make assumptions about the programs that your system supports. All such assumptions must be carefully documented. Any upper limit on reference count values, for example, falls under this category.
6 Example Use
This program creates a chain of two linked cells, ensures that their reference counts are both 1, and then releases the first cell. Since that cell holds the second one, both cells should be free'd as a side effect. You should be able to run this program with valgrind and find zero memory errors.
#include "refmem.h" #include <assert.h> struct cell { struct cell *cell; int i; char *string; }; void cell_destructor(obj c) { release(((struct cell *) c)->cell); } int main(void) { struct cell *c = allocate(sizeof(struct cell), cell_destructor); assert(rc(c) == 0); retain(c); assert(rc(c) == 1); c->cell = allocate(sizeof(struct cell), cell_destructor); assert(rc(c->cell) == 0); retain(c->cell); assert(rc(c->cell) == 1); c->cell->cell = NULL; release(c); return 0; }
7 Getting Technical Help
Naturally, you are allowed to ask for technical help (as well as help on anything else).
8 Addition in 2018: Default Destructors
To simplify programming, the system should support a notion of default destructors, which saves the programmer from the burden of writing a destructor manually. Let \(o\) be an object with a default destructor that is just about to be freed. Before freeing \(o\), the default destructor will call release() on all the pointers in \(o\). For example, if \(o\) is a linked list head in a list with a head and tail pointer, we are going to call release() on both the head and the tail. If \(o\) is a link in the list, we are going to call release() on the next pointer.
Implementing default destructors can be done in several ways. Here are two suggestions:
- Scan the object for pointers
- Require the programmer to describe the object
8.1 Scan the Object for Pointers
Sticking with the \(o\) example from above, in this implementation we need to look at all possible pointers in \(o\) – essentially all sizeof(void *) blocks that could possibly hold a pointer, e.g., all starting addresses \(o+i\) where \(i\) denotes all offsets into \(o\) where a pointer could be stored. (Note that alignment rules for structs impose restrictions on the starting addresses of fields based on the sizes of fields.)
For each possible pointer, we need to check whether the data we are looking at is a pointer or not. For example, let's say that we look at sizeof(void *) bytes starting at \(o+4\) (e.g., void *possible_pointer = o + 4;) – how can we know if that is a pointer to something, or the list's size field? One possibility to answer this question with relatively high probability is to keep track of where we have allocated data. Since all allocations go through our own functions, we can simply store that somewhere.
Now, the algorithm is essentially this:
- For each possible starting location \(p\) of a pointer inside \(o\):
  - check if \(p\) is an address where we currently have allocated memory
  - if so, treat \(p\) as a valid pointer and call release() on it
  - if not, ignore \(p\) and move on to the next \(p\)
This also requires:
- Whenever we allocate at an address \(p\), remember \(p\) in some collection \(A\).
- Whenever we free an object at an address \(p\), remove \(p\) from \(A\).
- Add the size of each object to its metadata.
Aside: We would like to store \(A\) in an efficient way. Naturally, we only optimise once we have something that works, so think of that second, not first. (A bitmap is a good idea.)
Feel free to improve on this technique if you want to!
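Putting the pieces together, the scanning default destructor could look roughly like this. It is a sketch: is_allocated() and object_size() are hypothetical helpers standing in for the membership test against \(A\) and the size stored in each object's metadata.

#include "refmem.h"
#include <stdbool.h>
#include <string.h>

extern bool is_allocated(void *p);     /* hypothetical: membership test against the set A */
extern size_t object_size(obj *o);     /* hypothetical: size stored in the object's metadata */

static void default_destructor_by_scanning(obj *o)
{
  unsigned char *bytes = o;
  size_t size = object_size(o);
  /* try every pointer-aligned slot that could hold a pointer */
  for (size_t i = 0; i + sizeof(void *) <= size; i += sizeof(void *))
    {
      void *candidate;
      memcpy(&candidate, bytes + i, sizeof(void *));   /* copy out to avoid unaligned reads */
      if (is_allocated(candidate))
        {
          release(candidate);   /* treat it as a pointer to one of our own objects */
        }
    }
}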
8.2 Require the Programmer to Describe the Object
A problem with the technique above is that it is approximate. If we happen to have an integer field in a place where a pointer could be stored, and the value of that integer happens to be an address in \(A\), then we might cause an error in the program due to premature deallocation.
An alternative approach is to require the programmer to specify where in the struct pointers are stored. This can be handled in several ways. One method is explained in painful detail in Project 2; another is to add a metadata function that uses C macros and requires that the programmer names the fields which contain pointers, e.g., like this:
// Register metadata about the pointer fields f1 .. fn of type type
register_typeN(type, f1, f2, ... fn);
// Allocate using a registered type
allocate_from_type(type);
If these calls are implemented as macros, they can be turned into calls to the "actual methods", here named with a __ prefix.
__register_typeN(#type, sizeof(type), offsetof(type, f1), offsetof(type, f2), ... offsetof(type, fn));
__allocate_from_type(#type);
Note that #type in a macro turns the argument into a string (e.g. "int" if the argument is int), so if I write
register_type2(list_t, first, last);
allocate_from_type(list_t);
and my macro is defined thus (fill in the blanks for allocate_from_type()):
#define register_type2(type, f1, f2) \
  __register_type2(#type, sizeof(type), offsetof(type, f1), offsetof(type, f2))
the resulting call after macro expansion is this:
__register_type2("list_t", 24, 0, 8); __allocate_from_type("list_t");
if the size of list_t is 24 bytes and the offsets of the fields are 0 and 8 respectively. The offsetof() macro calculates the starting point of a field \(f\) in a type \(t\), so it basically gives you the \(i\)'s discussed in the scanning strategy.
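As a hint for filling in the blank, one possible definition of allocate_from_type() is simply a stringising wrapper around the underlying function (a sketch, consistent with the expansion shown above):

#define allocate_from_type(type) \
  __allocate_from_type(#type)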
To support different numbers of fields, you either implement several functions for registering and allocating, e.g., register_type1(), register_type2(), …, register_typeN(), or use a trick to get overloading based on the number of arguments.
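One such trick (a sketch, assuming C99 variadic macros and shown here for one to three fields) dispatches on the number of arguments:

// Picks the fourth argument after the field names, i.e. the right register_typeN macro.
#define GET_REGISTER(_1, _2, _3, NAME, ...) NAME
#define register_type(type, ...) \
  GET_REGISTER(__VA_ARGS__, register_type3, register_type2, register_type1, unused) \
    (type, __VA_ARGS__)

// register_type(list_t, first, last) now expands to register_type2(list_t, first, last)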
The key difficulty with this implementation is to store the metadata somewhere. For this, we can use the hash table from Assignment 1, where the keys are the types and the values are the size of the type and the offsets of the pointer fields. Abstractly speaking, __register_type2("list_t", 24, 0, 8); will map "list_t" to [24, 0, 8], and __allocate_from_type("list_t"); will look up "list_t" to learn that 24 bytes should be allocated.
Each object now needs to know its type so that when an object \(o\) is destroyed we can do the following:
- Get \(o\)'s type \(t\)
- Use \(t\) in the global hash map to get the offsets \(i_1 ... i_n\) of all the pointer fields in \(t\)
- For each offset \(i\), call release(o+i).
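Put together, the metadata-driven default destructor could look roughly like this. It is a sketch: type_info_t, lookup_type() and object_type() are hypothetical names for the hash-table lookup and for reading the type name stored in the object's header.

#include "refmem.h"
#include <string.h>

typedef struct type_info
{
  size_t size;          /* total size of the type, e.g. 24 for list_t */
  size_t field_count;   /* number of registered pointer fields */
  size_t *offsets;      /* their offsets, e.g. {0, 8} */
} type_info_t;

extern type_info_t *lookup_type(char *type_name);   /* hypothetical: consults the global hash table */
extern char *object_type(obj *);                     /* hypothetical: stored in the object's header */

static void default_destructor_from_metadata(obj *o)
{
  type_info_t *info = lookup_type(object_type(o));
  unsigned char *bytes = o;
  for (size_t i = 0; i < info->field_count; ++i)
    {
      void *field;
      memcpy(&field, bytes + info->offsets[i], sizeof(void *));   /* read the pointer field */
      release(field);                                             /* release() accepts NULL */
    }
}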
To make memory leaks easier to detect, we should also remove the hash table etc. in the shutdown() function, whose job it is to completely tear down the library and its associated data.
8.3 Wrapping Up
It should be possible to allocate an object and get a default destructor "for free". What "for free" means depends on your implementation. Strategy 1 (scanning) is truly for free, but may under rare circumstances not work out in practice. Strategy 2 (metadata) is a little harder on the programmer, but less brittle (unless you use union types, in which case you need to fall back on explicit destructors). Feel free to come up with alternative ways of implementing support for default destructors. Always start by doing the simplest thing possible and, once that works, move to e.g. a more efficient or more general solution.