Learning Zig Through Brute Force Pt. 1: Optional Pointers

Background

I want to do more coding, sharpen my own skills, and get back to a goal of mine from ~7 years ago: building my own operating system from “scratch”. When considering which language to pick up, I evaluated these factors.

The little code I do write is usually in Go, so I want something different and something lower level.
Rust is an obvious choice, but as a previous card-carrying member of the Rust evangelism task force I have already been on the Rust journey. To be frank, I like Rust as a concept more than I enjoy writing Rust.
C or C++: I haven’t written C since college, and there are dozens of off-the-shelf low-level things already implemented in C and C++, so I wouldn’t exactly be blazing any trails. This led me to believe I would end up copying someone else’s code more than writing my own.

Then a new contender appeared: Zig. I remember hearing about Zig circa 2019, but I was too immersed in Rust to pay it much attention. I was recently reading about the Ghostty terminal and discovered, much to my surprise, that it was written in Zig.

At a quick glance, Zig's syntax appeared to be very similar to C's, but somehow far more pared down. After a grand total of 5 minutes reading some of the Zig docs I decided I would give it a fair shake.

In order to learn Zig I decided to build something. In about an hour I managed to get Zig installed, got my neovim config supporting Zig’s syntax, and was able to get a pretty basic Stack implementation running and working.

This exposed me to a few concepts that I found useful, but I wanted to do something a bit more complex. I decided to try to write a Binary Tree without using any of the standard library's data structures. This seemed like a reasonable enough task, and I implemented a similar Tree in Rust years ago. I assumed this would give me a chance to compare Zig and Rust, but I wanted to go a bit further with the Zig implementation. Little did I know this simple tree would end up taking me down several deep rabbit holes into Zig and Zig’s ecosystem that taught me a ton, and took way longer than I would like to admit.

I would not recommend diving into a language like Zig without reading the docs thoroughly, but I was too eager, and I truly feel like I internalized much more by doing this than I would have by reading the language spec first.

Building a Basic Tree

I could have taken a shortcut by building a Node type and then using it in a main function to simulate the behavior of a Tree. Instead, I wanted to fully implement a Tree type built on top of Node, so that the user of Tree doesn’t have to think about the underlying Nodes and is given a simple interface to interact with the tree.

Let’s Define the Node

While implementing a Tree in Zig I defined my Node type like this:

pub fn Node(comptime T: type) type {
  return struct {
    value: T,
    left_child: ?*Node(T),
    right_child: ?*Node(T),

    // ... more implementation
  };
}

At a high level, this code block:

declares a function Node that accepts a type argument T, which must be known at compile time, and returns a new struct type.
Each Node has a value field of type T.
Each Node has a left_child and a right_child; these are optional pointers to other Node values of the same type T.

I made each child an optional pointer, which may not seem appropriate at first. I chose this because a plain Zig pointer (*T) is not allowed to be null; it is one of the memory protections built into the language. Wrapping the pointer in an optional (?*T) is how I was able to create references to things that may or may not exist. In other languages I could have just checked for null pointers.
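To make the difference between a plain pointer and an optional pointer concrete, here is a minimal sketch. The test name and the variables x, p, and maybe are made up for illustration and are not part of the tree code.

```zig
const std = @import("std");

test "optional pointer vs plain pointer" {
    var x: i32 = 42;

    // A plain pointer must always point at something; null is not a valid value for it.
    const p: *i32 = &x;
    // const bad: *i32 = null; // would be rejected by the compiler if uncommented

    // An optional pointer may be null or hold an address.
    var maybe: ?*i32 = null;
    try std.testing.expect(maybe == null);
    maybe = &x;
    try std.testing.expect(maybe != null);
    try std.testing.expect(maybe.? == p);

    // An optional pointer is guaranteed to be the same size as a plain pointer,
    // so the "may be missing" information costs no extra memory.
    try std.testing.expect(@sizeOf(?*i32) == @sizeOf(*i32));
}
```

The nice part of this design is that the type itself tells you whether a null check is needed: code that receives a *i32 never has to ask, and code that receives a ?*i32 cannot dereference it without unwrapping first.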

Now that we have a base structure, we need to define how we will initialize the struct when one is created.

pub fn Node(comptime T: type) type {
  return struct {
    // ...
    const Self = @This();

    pub fn init(value: T) @This() {
      return .{
        .left_child = null,
        .right_child = null,
        .value = value,
      };
    }
  };
}

There are a few interesting syntax points here:

const Self = @This() is a common Zig idiom for giving the enclosing struct type a name we can refer to. This will be used in just a second, bear with me.
We define a function init() that accepts a value of type T and returns @This(), which here is the concrete Node type being defined, so callers get back an instance of it.
We return a struct literal with value set to the argument and both children set to null.
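Here is a rough, self-contained sketch of how init() might be used from the outside. The IntNode alias and the test are my own illustration, and I wrote the child fields as ?*Self, which I am assuming is interchangeable with ?*Node(T) inside the struct.

```zig
const std = @import("std");

// Same shape as the Node in this post, repeated so the example stands alone.
pub fn Node(comptime T: type) type {
    return struct {
        value: T,
        left_child: ?*Self,
        right_child: ?*Self,

        const Self = @This();

        pub fn init(value: T) Self {
            return .{
                .left_child = null,
                .right_child = null,
                .value = value,
            };
        }
    };
}

test "init builds a childless node" {
    // Calling Node with a concrete type yields a concrete struct type.
    const IntNode = Node(i32);
    var root = IntNode.init(10);
    var left = IntNode.init(5);

    try std.testing.expect(root.left_child == null);
    try std.testing.expect(root.right_child == null);

    // Children are wired up by storing the address of another node.
    root.left_child = &left;
    try std.testing.expectEqual(@as(i32, 5), root.left_child.?.value);
}
```

Both nodes live on the stack here, which is fine for a test but would not survive returning from a function; a real tree needs to decide who owns the memory.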

The return struct syntax of Zig is a little strange at first, but in more advanced usages it actually makes a ton of sense. Before we go much deeper, let's finish out the node type by defining how to determine if our Node is a leaf.

pub fn Node(comptime T: type) type {
  return struct {
    // ...
    pub fn is_leaf(self: *Self) bool {
      // `or` is Zig's boolean OR; optionals must be compared against null explicitly
      if (self.left_child != null or self.right_child != null) {
        return false;
      }
      return true;
    }
  };
}

Now we see where const Self = @This() comes into play: we can create instance methods by taking a pointer to Self as the first parameter of our functions. It feels slightly Python-y to me, but the syntax makes sense. We don’t have to create the constant and could just use @This() everywhere, but when you have longer function signatures the *Self idiom is a lot easier to read.
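As a rough sketch of what this looks like at the call site, the example below calls is_leaf() both with dot-call syntax and through the type's namespace. The IntNode alias and the test are hypothetical, and I took self as *const Self here (an assumption on my part, not what the post uses) since the method only reads from the node.

```zig
const std = @import("std");

pub fn Node(comptime T: type) type {
    return struct {
        value: T,
        left_child: ?*Self,
        right_child: ?*Self,

        const Self = @This();

        pub fn init(value: T) Self {
            return .{ .value = value, .left_child = null, .right_child = null };
        }

        // A node is a leaf only when both children are null.
        pub fn is_leaf(self: *const Self) bool {
            return self.left_child == null and self.right_child == null;
        }
    };
}

test "dot-call syntax passes self for us" {
    const IntNode = Node(i32);
    var child = IntNode.init(1);
    var parent = IntNode.init(2);
    parent.left_child = &child;

    // Dot-call: Zig supplies the pointer to child as the first argument.
    try std.testing.expect(child.is_leaf());
    // The same call spelled out through the type's namespace.
    try std.testing.expect(IntNode.is_leaf(&child));

    try std.testing.expect(!parent.is_leaf());
}
```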

This is also our first foray into dealing with Optionals in Zig, so let's dive a little deeper.

All the Options

I wanted to experiment a little, and luckily Zig allows us to define and execute test cases that live alongside our source code in the same files!

Let’s whip up a quick little test case to verify what we can and can’t do with optionals in Zig.

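const std = @import("std");

test "ptr" {
    var num: i32 = 5; // a mutable variable (var), not a constant
    var optptr: ?*i32 = null; // an optional pointer: either null or the address of an i32
    try std.testing.expect(optptr == null); // our option has nothing in it, so it is null

    optptr = &num; // the same option can later hold the address of num
    try std.testing.expect(optptr != null);
}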

Ok, great, so options are null when there is no value. You can also directly unwrap an optional in Zig using the my_option.? syntax. You must be careful with this, because unwrapping a null option is not a recoverable error: in safe build modes it panics, and in unsafe build modes it is undefined behavior. You are generally expected to handle the null case explicitly, and there are a few different techniques for that.

// assumes `const std = @import("std");` at the top of the file
test "ptr" {
    var num: i32 = 5;
    var optptr: ?*i32 = &num; // optptr now points to num, and is no longer null
    // we can unwrap and dereference the pointer and see that the value is num's value.
    try std.testing.expectEqual(@as(i32, 5), optptr.?.*);

    // normal pointer rules apply: update num and the value seen through the pointer updates too.
    num = 6;
    try std.testing.expectEqual(@as(i32, 6), optptr.?.*);

    // Handling our option with an explicit null check
    optptr = &num;
    if (optptr == null) {
        // our option was null
        std.debug.print("optptr was null!\n", .{});
    } else {
        // Option had a value, safe to unwrap
        std.debug.print("optptr was not null!\n", .{});
        std.debug.print("derefed pointer was {d}\n", .{optptr.?.*});
    }

    // Now we can use a capture, with some syntactic sugar
    if (optptr) |value| {
        // this means optptr was not null
        // value is equal to optptr.?, which is a pointer to an i32
        std.debug.print("optptr not null, value: {d}\n", .{value.*});
    } else {
        // this means our option was null
        std.debug.print("optptr was null\n", .{});
    }

    // we can also unwrap options with orelse, supplying a default to use if the option is null
    var default: i32 = 7;
    const ptr: *i32 = optptr orelse &default; // a plain pointer, no longer optional
    try std.testing.expectEqual(@as(i32, 6), ptr.*);
}

Wonderful, we have a few ways of handling options and managing them. You can also traverse optional values with a loop, but we will cover that once we get to our Tree.
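One more variation worth sketching: the right-hand side of orelse does not have to be a value, it can also be control flow such as return. The valueOr helper below is hypothetical, written only to illustrate the pattern.

```zig
const std = @import("std");

// Hypothetical helper: read through an optional pointer, or fall back to a default.
fn valueOr(ptr: ?*const i32, fallback: i32) i32 {
    // If ptr is null we leave the function early; otherwise p is a plain pointer.
    const p = ptr orelse return fallback;
    return p.*;
}

test "orelse with an early return" {
    const num: i32 = 6;
    try std.testing.expectEqual(@as(i32, 6), valueOr(&num, 0));
    try std.testing.expectEqual(@as(i32, 0), valueOr(null, 0));
}
```

After the orelse line, the rest of the function works with a non-optional pointer, so no further null handling is needed.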

Now that the option syntax is a little clearer, my implementation of is_leaf() should be pretty straightforward. We check whether either child is non-null: if both are null the node is a leaf, otherwise it is not.

Much More to Come

We have a basic node, and up next we are going to build a Tree. If you enjoyed this, stick around because I have many more painfully learned lessons to follow.