← Back to blog

Refreshing C Fundamentals: Rebuilding Data Structures From Scratch

·10 min read·By Mohamed Yassine Hemissi
CData Structures

Introduction:

This is my attempt to refresh my understanding after not coding in C for a while, and to bypass the AI-induced recall issues that come from day-to-day exposure to autocompletion. I’ll be replicating data structures and algorithms from scratch.

Tools:

In order to make this happen, I decided to just use Neovim. I’ve never really been a Neovim guy in my life (I used Vim mostly in terminals for quick edits, but it was never my main editor), but at least I know how to quit it. I first tried the kickstart configuration, but it seems to be deprecated, so I moved to LazyVim, which is more than enough for my use case.

I also had to get Neovide running with the --opengl flag, since my Windows setup seems to have an issue rendering the window opacity correctly. For algorithm references, I decided to use a mix of what I already know — since I’ve studied most of these structures throughout my CS studies — along with well-known books like Grokking Algorithms and Data Structures and Algorithms (the bible). I’ve also included ChatGPT as a partner for writing tests for my code and auditing when needed.

Data Structures & Algorithms Implemented:

Hashtable:

I started with hash tables since I’ve always been a fan of their O(1) retrieval, which helps a lot with overall complexity, and because key-value retrieval logic is always a boost. This implementation is simple: I created a structure that holds a pointer to a hash table record, along with the total size and capacity.

I use the well-known load factor threshold formula to dynamically increase the table capacity, and some chained list logic since the hash table record contains a pointer to next, forming a singly linked list. Basically, we take a key, hash it, apply modulo with the capacity, and store it at the index corresponding to the remainder of the division, chaining entries when collisions occur.

Vectors:

I still remember the first time I used C after coming from high-level languages like Lua and JavaScript — I was wondering why arrays don’t grow dynamically. Either you had to use the classic (and space-inefficient) technique of allocating a large fixed size, or spend time crying over realloc error handling.

Fortunately, C++ had std::vector, and here I implemented something similar. The structure is essentially a pointer to pointers, but this time generic, with dynamic allocation when the total size is reached. The rest is basic array logic with element shifting when needed.

Heap:

Being an iconic data structure used to implement some of the cleanest algorithm versions like Heap Sort, Dijkstra, or Prim (for that tree stuff), I had to implement it. I reused my vector object, which made it more fun at this point since I’m building on top of what I already wrote.

The implementation is based on the parent / two-children logic, with support for both min-heap and max-heap behavior. The most important functions here are step_up and step_down: one moves an element up, the other moves it down if constraints allow it. A heapify function orchestrates these steps recursively to maintain heap constraints.

I also implemented a heap_build function that uses these step functions starting from the last parent, allowing a full heap construction in O(n).

Links

Github Repository