LinkedList in Rust

03 Jan, 2018

I have never actually implemented any data structure in Rust from scratch. So, I thought why not spend a couple of hours doing that over the last weekend.

If you don’t know already, Rust is a programming language designed to be safe and fast. The compiler requires all programs to be written following certain ownership rules, as they call it and ensures that programs don’t access invalid memory. There’s more to it, but that would suffice for now.

I will assume you are familiar with generics, types, regular programming constructs like if, for etc.. I will also assume that you have some familiarity with how the borrow checker works.

In this post, let’s implement a generic list which supports append and remove operations and later on, ways to extend the list to support more operations.

Like many languages, Rust supports structs. Let’s start by defining our List and Node structures. Our List would contain a length field, and a pointer to the first node. Node will contain the actual data along with a pointer to the next Node.

1struct List<T> {
2    head: Option<Box<Node<T>>>,
3    len: usize
4}
5
6struct Node<T> {
7    data: T,
8    next: Option<Box<Node<T>>>
9}

You probably guessed it, but <T> denote the use of generic types as in many other languages.

Option and Box must have piqued your interest.

In Rust, Box is a heap-backed pointer type(like a pointer returned by malloc). Box<T> is a pointer to the memory location which contains a value of type T. Also, Box is the owner of whatever it points to which means that you can change the contents of the memory location pointed to by the Box only through it.

There is no such thing as an empty Box. This means that Box always points to a valid memory location. To represent a NULL pointer, we make use of the Option type and wrap our Box inside that. Option is an enum, which either contains some value denoted by Some(val) or None which denotes nothing is inside. In this case, we are telling that the head is either a pointer to a Node or nothing.

Let’s go ahead and create an associated function(similar to a static method in other languages) which will return an empty list. In Rust, we define associated functions and methods for a struct inside an impl block. We will see why when we implement the remove method.

 1struct Node<T> {
 2    data: T,
 3    next: Option<Box<Node<T>>>
 4}
 5
 6struct List<T> {
 7    head: Option<Box<Node<T>>>,
 8    len: usize
 9}
10
11impl<T> List<T> {
12    // Create an empty List
13    fn new() -> Self {
14        List {
15            head: None,
16            len: 0usize
17        }
18    }
19}
20
21fn main() {
22    let _ = List::new();
23}

1$ rustc list.rs

Before going on to implement other functionality, let’s talk a bit about the ownership rules I mentioned above because they will come into play around this point.

In Rust, there’s always an owner to a value. If you do let list = List::new();, list is the owner of the value returned by List::new. If you assign, list to another variable(list2) using a let expression like below, you can no longer access the list variable. You effectively transferred the ownership of the value from list to list2.

1let list = List::new();
2
3let list2 = list;

But having only owned values doesn’t cut it, so we also have

mutable(&mut T) references: can modify the contents and
immutable(&T) references: can’t modify the contents.

Ownership rules ensure that at any time there is either a single mutable reference or multiple immutable references but not both. I won’t get into the details why(read the book). All programs should follow the rules or else they won’t compile.

Appending to the list

Attempt #1

 1impl<T> List<T> {
 2    fn new() -> Self {
 3        List {
 4            head: None,
 5            len: 0usize,
 6        }
 7    }
 8
 9    fn append(&mut self, data: T) {
10        let new_node = Some(Box::new(Node {
11            data: data,
12            next: None,
13        }));
14        if self.head.is_none() {
15            self.head = new_node;
16            self.len += 1;
17            return;
18        }
19        let mut head = self.head.as_mut().unwrap();
20        loop {
21            let next = head.next.as_mut(); // <- right here
22            if let Some(next) = next {
23                head = next;
24            } else {
25                break;
26            }
27        }
28        head.next = new_node;
29        self.len += 1;
30    }
31}

Of course, this didn’t compile. The reason is that the mutable reference inside the loop should be valid till the end of the method to enable us to do head.next = new_node;. But when we hit the second iteration of the loop, we get another mutable reference which violates the single mutable reference rule. So, rustc complains:

1$ rustc list.rs
2error[E0499]: cannot borrow `head.next` as mutable more than once at a time
3  --> test.rs:33:24
4   |
533 |             let next = head.next.as_mut();
6   |                        ^^^^^^^^^ mutable borrow starts here in previous iteration of loop
7...
843 |     }
9   |     - mutable borrow ends here

Attempt #2

After thinking for a while, I came up with an idea along these lines: move the head out of the List, taking ownership of the first node, iterate till the end checking the next pointer for None at every node and inserting the new node there.

As we move the value out of the list, there would only be a single owner and we can mutate it safely. All good, Rust is happy and so am I!

But there’s a small gotcha here. Until you add the new element at the end, you can put the node back into it’s position as that would get us back to the same problem we started with. Recursion to our rescue!

 1impl<T> List<T> {
 2    fn new() -> Self {
 3        List {
 4            head: None,
 5            len: 0usize,
 6        }
 7    }
 8
 9    fn append(&mut self, data: T) {
10          // allocate new node on the heap
11        let new_node = Some(Box::new(Node {
12            data: data,
13            next: None,
14        }));
15        // move first node out of the list
16        if let Some(mut head) = self.head.take() {
17            add_node(&mut *head, new_node);
18
19            // put back the first node
20            self.head = Some(head);
21
22        } else { // list is empty
23            self.head = new_node;
24        }
25        // increment the length
26        self.len += 1;
27    }
28}
29
30fn add_node<T>(node: &mut Node<T>, new_node: Option<Box<Node<T>>>) {
31    if node.next.is_none() {
32        // append to the list
33        node.next = new_node;
34    } else {
35        // move the value out of the next pointer
36        let mut next = node.next.take().unwrap();
37
38        // send in a mutable reference
39        add_node(&mut next, new_node);
40        
41        // put back the node
42        node.next = Some(next);
43    }
44}

That’s it! You can append a value to the list now.

 1fn main() {
 2    // mut here says that you are going to modify the value of list later on
 3    let mut list = List::new();
 4    
 5    list.append(7);
 6    list.append(8);
 7    
 8    let first = list.head.unwrap().data;
 9    let second = list.head.unwrap().next.unwrap().data;
10    
11    println!("{} {}", first, second);
12}

Removing from the list

Removing from the list is pretty straightforward, given that we have already seen how to implement append.

 1struct Node<T> {
 2	...
 3}
 4
 5struct List<T> {
 6	...
 7}
 8
 9impl<T> List<T> {
10	fn new() -> Self {
11		...
12	}
13
14	fn append(&mut self) {
15		...
16	}
17}
18
19impl<T: PartialOrd> List<T> {
20    fn remove(&mut self, data: T) {
21        if let Some(mut head) = self.head.take() {
22            if head.data == data {
23                self.head = head.next.take();
24                self.len -= 1;
25            } else {
26                if remove_node(&mut *head, data) {
27                    self.len -= 1;
28                }
29                self.head = Some(head);
30            }
31        }
32    }
33}
34
35fn remove_node<T: PartialOrd>(node: &mut Node<T>, data: T) -> bool {
36    if node.next.is_none() {
37        false
38    } else {
39        let mut next = node.next.take().unwrap();
40        if next.data == data {
41            node.next = next.next.take();
42            true
43        } else {
44            if remove_node(&mut *next, data) {
45                node.next = Some(next);
46                true
47            } else {
48                node.next = Some(next);
49                false
50            }
51        }
52    }
53}

Perhaps, the most interesting thing here is the new syntax: impl<T: PartialOrd> which says - implement the remove method for this list if the underlying type T can be compared using ==. PartialOrd is a trait (similar to an interface) and T: PartialOrd is a trait bound on T.

With multiple impl blocks having different trait bounds, we can implement additional functionality for our list depending on what the underlying type supports.

At this point, you are probably bored reading stuff and want to do something on your own. Try changing our append only list to be able to insert/remove element at an index.

You can find the full code sample here. If you find any mistakes or have suggestions, I am all ears.

#Rust