Posts Tagged ‘c’
The simplest linked list implementation in C
I recently needed to develop a way to perform pseudo-random memory accesses in a C program, in order to fool the hardware prefetchers. Without going into any details, my solution was to represent some continuous memory as a linked list with random next pointers.
Naturally, the simplest list that can be developed has nodes that do not contain any other elements apart from the next pointers that link the nodes together. In C, this can be easily done with a structure similar to the following:
struct node { struct node* next; } |
However, one can leverage the flexibility of C to even avoid using the node
structure: you are allowed to cast some memory to any type you like. For instance, you can use 8 bytes as an integer(uint64_t
) , or as a pointer to an integer (uint64_t*
).
Doing so, we can develop, what I believe is, the simplest linked list possible.
The code that generates a random list is the following:
static void create_rand_list(volatile uint64_t* list, size_t n) { srand(time(NULL)); uint8_t* used = calloc(n, sizeof(uint8_t)); assert (used != NULL); size_t idx = 0; size_t used_num = 0; while (used_num < n - 1) { used[idx] = 1; used_num++; size_t nxt; do { nxt = rand() % n; } while (used[nxt]); list[idx] = (uint64_t) (list + nxt); idx = nxt; } list[idx] = 0; /* NULL pointer in the last element */ free(used); } |
The “magic” happens when we set list[idx] = (uint64_t) (list + nxt)
, so we set the “next” pointer for the location idx.
What is even more interesting is how we can now parse and get the size of the list:
static void list_parse(volatile uint64_t* l) { while (*l != 0) { l = (uint64_t*) *l; } } static size_t list_size(volatile uint64_t* l) { size_t s = 1; while (*l != 0) { l = (uint64_t*) *l; s++; } return s; } |
Pretty slick, right? 🙂
GCC — 64bits addressing: function returns 32bits pointer
Today, it was the second time I stepped on an interesting problem with GCC Â (well, it is actually a behavior). I created a function which returns a void*
in a .c file. This C file was then compiled and added in a library (.a). When I used this function in an application, I was getting a void*
pointer were the 32 most significant bits were either zeroed (0x00000000...
) or set to 1 (0xFFFFFFFF...
). My application is 64bit!!
For example, the debug prints I added would return:
[lib] allocated 0x7f756d6fa048 [app] allocated 0x6d6fa048 |
where you can see the “conversion”.
What was the problem? After some time of debugging, I realized that I had forgotten to include the aforementioned function in the corresponding header file :|. So, although GCC could find the function in the library I was linking the application to, I guess it was assuming a wrong return value/header for that function.
The conclusion: Be more careful 😉
PS. I am using the following version of GCC:
$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.3-1ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) |
Structures and Memory Consumption in C
Motivation
The way you design you structures in C can have a (significant) impact on the size they occupy in memory!
Memory structure of a structure
A structure occupies contiguous space in memory. Though, in order to provide efficient access to the structure’s fields, C uses padding. In other words, the fields are aligned to the memory words
.
Example
In a system with:
word : 4 bytes
char : 1 byte
int : 4 bytes
without padding would be:
char (8bit) |
int (32bit) |
---|
and since the word
size is 4 bytes
the integer would belong to 2 different words
, so in order to access it, two memory accesses would be necessary.
In order to avoid it, the actual memory organization is:
char (8bit) |
padding (24bit) |
int (32bit) |
---|
The Possible Problem
A bad designed structure can possible occupy more space than needed.
Example
Take a look at the following structure:
struct bad { char c1; int i1; char c2; }; |
One would possibly say that since 1 + 4 + 1 = 6 bytes
is the total size of the fields, 2 words = 8 bytes
will be needed to store it. This is not true. Due to the padding we have:
c1 (8bit) |
padding (24bit) |
i1 (32bit) |
c2 (8bit) |
padding (24bit) |
---|
3 words = 12 bytes
, 4 more bytes
than necessary.
Solution
Just take into account how the types fit together:
struct good { char c1; //c1 + c2 = 2 bytes char c2; int i1; }; |
This design needs the minimum number of 2 words = 8 bytes
.
Outcome
The bigger the structure gets, the more space you can lose.. So, take care of it while designing the structures!
Pointers and Arrays in C
As anyone who had programmed in C knows, pointers can be (also) used to access the data of an array.
From the Wikipedia article about C language:
C supports the use of pointers, a very simple type of reference that records, in effect, the address or location of an object or function in memory. Pointers can be dereferenced to access data stored at the address pointed to, or to invoke a pointed-to function. Pointers can be manipulated using assignment and also pointer arithmetic. The run-time representation of a pointer value is typically a raw memory address (perhaps augmented by an offset-within-word field), but since a pointer’s type includes the type of the thing pointed to, expressions including pointers can be type-checked at compile time. Pointer arithmetic is automatically scaled by the size of the pointed-to data type.