Project 1

1. Introduction and Domain
2. Functional Requirements
3. Non-Functional Requirements
4. Demonstration
5. Simplifications and Assumptions
6. Example Use
7. Getting Technical Help
8. Addition in 2018: Default Destructors

Feedback from 2017 said this project was to simple. To that end, there is now an addition called Default destructors below. The rest of the text is unchanged because there was no time to update it. So first read everything through, then Default destructors and think of that as an extension which might augment or contradict other things stated elsewhere (like the size of the object header etc.).

1 Introduction and Domain

This project is about manual memory management in C. The are several reasons for choosing this domain for the class project:

We do not get to work enough on it before in the course
It is at a suitable level of abstraction for a project in C
It forces you to really get comfortable with pointers, which is one of the hardest concepts in programming

The goal of this project is to develop a C library for managing memory in arbitrary C programs. Traditionally, C programs use malloc()¹ and free() to reserve and return memory.

This approach is hard in the sense that there is nothing that helps a programmer to make the call when some memory can safely be returned to the system. Failing to return memory will cause leaks which will eventually make programs crash and burn. Returning memory too early will instead cause programs to corrupt memory (because technically we have two objects overlapping in memory). Returning the same memory multiple times with free() tends to lead to hard crashes. In short, manual memory management is an unforgiving undertaking.

2 Functional Requirements

2.1 Reference Counting

The system you are implementing should support reference counting. This means that each object has an associated counter that tracks the number of pointers to that object. When an object is created, its reference count is 0. Its count can subsequently be manipulated by the functions retain() and release() which increments respectively decrements a reference count by 1.

If release() is called on an object with a reference count of 1, the object is considered garbage and shall be free’d.

Calling retain() and release() on NULL is supported and should simply be ignored.

For simplicity, we will only count pointers on the heap. If an object is allocated and never retained(), it can be deleted by calling deallocate(). Calling deallocate() on an object with a non-zero reference count is an error.

2.2 Destructors

The system you are implementing should support destructor functions. This means that each object (may) have one associated function that is called right before the object’s memory is free’d.

Destructors are useful for resource management, including freeing complicated data structures. For example, consider a linked list. The destructor for a link should be a function that calls release() on the link’s next pointer². Similarly, the destructor for the list head calls release() on both first and last.

2.3 Dealing with Cascading Frees

Consider the linked list example above: if implemented naively, freeing a very large linked list (think millions of elements) will at best cause the program to spend many consecutive cycles on freeing links, and at worst cause the program to run out of stack space. Neither behaviour is acceptable.

The system you are implementing must avoid both these pathological cases, by supporting setting an upper limit on how many objects are free’d at once (set_cascade_limit()).

To avoid oversubscribing to memory, you can delay outstanding free’s until next allocation. When freeing in conjunction with allocation, the cascade limit must be respected as usual, unless the amount of memory free’d is less than the bytes requested for allocation. In that case, your system is allowed to free objects until the requested number of bytes have been free’d or there are no more objects to free.

To reduce memory pressure, the system should also provide a cleanup() function that free’s all objects whose reference count is 0, regardless of the cascade limit.

2.4 Complete Header File

Listing 1: refmem.h

#pragma once

typedef void obj;
typedef void(*function1_t)(obj *);

void retain(obj *);
void release(obj *);
size_t rc(obj *);
obj *allocate(size_t bytes, function1_t destructor);
obj *allocate_array(size_t elements, size_t elem_size, function1_t destructor);
void deallocate(obj *);
void set_cascade_limit(size_t);
size_t get_cascade_limit();
void cleanup();
void shutdown();

The function rc() returns an object’s reference count. The allocate() function is the equivalent of malloc() and takes a function pointer to a destructor function, possibly NULL. The allocate_array() function is the equivalent of calloc() and too takes a function pointer to a destructor function, possibly NULL, which is called for each non-NULL element. The function deallocate() is the equivalent of free() but is only allowed to be called on objects p for which rc(p) = 0. Calling deallocate() on an object runs its destructor (if any).

Note: You are allowed³ to call malloc(), calloc() and free() under the hood to allocate and return memory. Think of these as low-level functions whose existence you are hiding from your users.

2.5 Dealing With Reference Count Overflow

Naturally, the size of the reference counter is an important design choice. For example, choosing an 8 bit integer, we keep the memory overhead small but are then only allowed 255 incoming references to each object⁴. Choose the size of your reference counter wisely and document the choice and explain how your system handles (or not) reference counter overflow.

3 Non-Functional Requirements

Naturally your program must not leak memory or have other memory bugs or bugs in general.

3.1 Amortising the Memory Management Overhead over Program Run-Time

Amortising the memory management overhead over the program run-time means that the cost of managing memory is distributed relatively evenly over the the program’s execution. Contrast this with automatic garbage collectors that suspend the program completely to manage memory from time to time.

You achieve this by properly respecting the cascade limit (see above).

3.2 Memory Efficiency

You may only use a constant factor of memory overhead in your implementation, relative to malloc() / calloc(). For example, a program that uses B bytes of memory with malloc() / calloc() may only use k * B bytes using your system. Additionally, k may not be larger than 2.

You must hand in a convincing proof/back-of-envelope calculation that explains why your implementation satisfies this requirement⁵.

For the sake of the calculation, you may assume that no allocation is smaller than 8 bytes⁶, but you may not assume a limit on the number of allocations made by programs using the system, nor a limit on the amount of live objects or the amount of (logical) garbage.

Keeping memory overhead low is important, but ignore that in your first prototype. Solve the problem of getting something up and running first, then figure out how to optimise. Here are some tricks you can use to keep memory overhead small. These may or may not be usable together:

Use a small scalar value for your reference count.
When an object is garbage, you can reuse its memory – including its reference counter – to store whetever you want.
If the number of destructors is small, there might be a more compact representation to store them than through a pointer.

Try to make the simplest and smallest number of optimisations needed to satisfy the requirements. Remember that keeping the code readable is of paramount importance! Don’t go XOR’ing pointers or anything until you actually need to⁷.

3.3 Validation and Verification

Since you are providing a low-level service on top of which many other services are expected to build, your implementation must be rock solid. You must provide a comprehensive set of tests that demonstrate the stability of your system, and the fulfilment of its specification. Use a combination of CUnit and valgrind to harden your implementation.

You must document code coverage for your tests.

4 Demonstration

As additional proof that your implementation is correct, you will provide a version of a successfully demonstrated Z101 program, adapted to use your memory management system. This involves using retain() and release() sensibly throughout the program.

You are required to change your tree and list libraries so that trees and lists storing object pointer elements call retain() on the elements on insertion (etc.). Naturally, removing an object pointer element from a list will call release() on it.

5 Simplifications and Assumptions

You may have to make assumptions about the programs that your system supports. All such assumptions must be carefully documented. Any upper limit on reference count values, for example, fall under this category.

6 Example Use

This program creates a chain of two linked cells, ensures that their reference counts are both 1, and then releases the first cell. Since that cell holds the first one, both cells should be free’d as a side-effect. You should be able to run this program with valgrind and find zero memory errors.

Listing 2: example.c

#include "refmem.h"
#include <assert.h>

struct cell
{
  struct cell *cell;
  int i;
  char *string;
};

void cell_destructor(obj c)
{
  release(((struct cell *) c)->cell);
}

int main(void)
{
  struct cell *c = allocate(sizeof(struct cell), cell_destructor);
  assert(rc(c) == 0);
  retain(c);
  assert(rc(c) == 1);

  c->cell = allocate(sizeof(struct cell), cell_destructor);
  assert(rc(c->cell) == 0);
  retain(c->cell);
  assert(rc(c->cell) == 1);

  c->cell->cell = NULL;

  release(c);

  return 0;
}

7 Getting Technical Help

Naturally, you are allowed to ask for technical help (as well as help on anything else).

8 Addition in 2018: Default Destructors

To simplify programming, the system should support a notion of default destructors, that saves the programmer from the burden of writing a destructor manually. Let \(o\) be an object with a default destructor that is just about to be freed. Before freeing \(o\), the default destructor will call release() on all the pointers in \(o\). For example, if \(o\) is a linked list head in a list with a head and tail pointer, we are going to call release() on both the head and the tail. If \(o\) is a link in the list, we are going to call release() on the next pointer.

Implementing default destructors can be done in several ways. Here are two suggestions:

Scan the object for pointers
Require the programmer to describe the object

8.1 Scan the Object for Pointers

Sticking with the \(o\) example from above, in this implementation, we need to look at all possible pointers in \(o\) – essentially all sizeof(void *) blocks that could possibly hold a pointer, e.g., all starting addresses \(o+i\) where \(i\) denotes all offsets into \(o\) where a pointer could be stored. (Note that alignment rules for structs impose restrictions on the starting addresses of fields based on the sizes of fields.)

For each possible pointer, we need to check whether the data we are looking at is a pointer or not. For example, let’s say that we look at sizeof(void *) bytes starting at \(o+4\) (e.g., void *possible_pointer = o + 4;) – how can we know if that is a pointer to something, or the list’s size field? One possibility to answer this question with relatively high probability is to keep track of where we have allocated data. Since all allocations go through our own functions, we can simply store that somewhere.

Now, the algorithm is essentially this:

For each possible starting location \(p\) of a pointer inside \(o\):
1. check if \(p\) is an address where we currently have allocated memory
  1. if so, treat \(p\) as a valid pointer and call release() on it
  2. if not, ignore \(p\) and move on to the next \(p\)

This also requires:

Whenever we allocate at an address \(p\), remember \(p\) in some collection \(A\).
Whenever we free an object at an address \(p\), remove \(p\) from \(A\).
Add the size of each object to its metadata.

Aside: We would like to store \(A\) in an efficient way. Naturally, we only optimise once we have something that works, so think of that second, not first. (A bitmap is a good idea.)

Feel free to improve on this technique if you want to!

8.2 Require the Programmer to Describe the Object

A problem with the technique above is that it is approximate. If we happen to have an integer field in a place where a pointer could be stored, and the value of that integer happens to be a an address in \(A\), then we might cause an error in the program due to premature deallocation.

An alternative approach is to require the programmer to specify where in the struct pointers are stored⁸. This can be handled in several ways. One method is explained in painful detail in Project 2, another way is to add a metadata function that uses C macros and require that the programmer names the fields which contain pointers, e.g., like this:

// Register metadata about the pointer fields f1 .. fn of type type
register_typeN(type, f1, f2, ... fn); 
// Allocate using a registered type 
allocate_from_type(type);

If these calls are implemented as macros, they can be turned into calls to the “actual methods”, here named with prefixing __.

__register_typeN(#type, sizeof(type), offsetof(type, f1), offsetof(type, f2), ... offsetof(type, fn));
__allocate_from_type(#type);

Note that #int in a macro returns the string "int" so if I write

register_type2(list_t, first, last);
allocate_from_type(list_t);

and my macro is defined thus (fill in the blanks for allocate_from_type()):

#define register_type2(type, f1, f2) \
   __register_type2(#type, sizeof(type), offsetof(type, f1), offsetof(type, f2))

the resulting call after macro expansion is this:

__register_type2("list_t", 24, 0, 8);
__allocate_from_type("list_t");

if the size of list_t is 24 bytes and the offsets of the fields are 0 and 8 respectively. The offsetof() macro calculates the starting point of a field \(f\) in a type \(t\), so basically gives you the \(i\)’s discussed in the scanning strategy.

To support different numbers of fields, you either implement several functions for registering and allocating, e.g., register_type1(), register_type2(), …, register_typeN(), or use a trick to get overloading based on the number of arguments.

The key difficulty with this implementation is to store the metadata somewhere. For this, we can use the hash table from Assignment 1 where the keys are the types and the values is the size of the types the offsets of the pointer fields. Abstractly speaking, __register_type2("list_t", 24, 0, 8); will map “list_t” to [24, 0, 8] and __allocate_from_type("list_t"); will look up 24 to know how many bytes to allocate.

Each object now needs to know its type so that when an object \(o\) is destroyed we can do the following:

Get \(o\)’s type \(t\)
Use \(t\) in the global hash map to get the offsets \(i_1...i_n\) of all the pointer fields in \(t\)
For each offset \(i\), call release(o+i).

To make memory leaks easier to detect, we should also remove the hash table etc. in the shutdown() function whose job it is to completely tear down the library and its associated data.

8.3 Wrapping Up

It should be possible to allocate an object and get a default destructor “for free”. What “for free” means depends on your implementation. Strategy 1 (scanning) is truly for free but may under rare circumstances not work out in practise. Strategy 2 (meta data) is a little harder on the programmer, but less brittle – unless you use union types, in which case you need to fall back on explicit constructors. Feel free to come up with alternative ways of implementing support for default constructors. Always start by doing the simplest thing possible and once that works, move to e.g. a more efficient or general solution.

Questions about stuff on these pages? Use our Piazza forum.

Want to report a bug? Please place an issue here. Pull requests are graciously accepted (hint, hint).

Nerd fact: These pages are generated using org-mode in Emacs, a modified ReadTheOrg template, and a bunch of scripts.

Ended up here randomly? These are the pages for a one-semester course at 67% speed on imperative and object-oriented programming at the department of Information Technology at Uppsala University, ran by Tobias Wrigstad.

Footnotes:

And friends calloc() and realloc().

And also on its element, if that is an object pointer.

Or possibly more correct: expected to. It is possible to request memory from the system directly, but that would complicate this project considerably.

⁴

Is that enough? How does one find out?

⁵

It can be expected that your system needs a few data structures set up even before the first allocation, so for programs that do not use lots of memory, you are going to have k > 2. Thus, we relax the requirement to say must approach k = 2 as the number of allocation grow. Yes, I buried this relaxation in a margin note to stress the importance of reading specifications closely. Contratuations!

⁶

For other reason, it is not clear that it makes sense to allocate less than 8 bytes even for a smaller request, but that is an orthogonal issue.

⁷

Where needing to must be proven empirically by testing a prototype of the system.

⁸

Note that this does not handle unions well.