Learn With Me: Elixir

Lists in Elixir are actually linked lists, which are particularly suitable for functional programming because they go well together with recursion. Since a list is a linked list, modification and sharing is cheap, but random access is expensive, since it is an O(n) operation.

I've talked about how Elixir likes to take advantage of data being immutable to share pieces of data structures so that it doesn't have to make a full copy the entire data structure. Lists are very well suited to that. You can not only reuse portions of the list when it is modified (thereby creating a new list). List are highly optimized for recursion because you can create a sub-list from bigger list just by pointing to the next node in the list.

So let's look at a list literal. It uses a pair of brackets [] and the elements are separated by commas. Like Javascript arrays, lists can contain items of any data type.

iex> list = ["Gaz", :name, 3.14, 2]
["Gaz", :name, 3.14, 2]

The closest thing to a list in Javascript is an array, but an array is not a linked list. In C#, List<T> comes to mind, since it has the same name, but an Elixir list is more like LinkedList<T>, which is a linked list implementation.

An Elixir list is best used for iteration, prepend operations (adding to the front of the list), and sequential-access operations.

Modifying a list is cheaper than modifying a tuple, depending on how a list is modified. Prepending to the front of a list is an O(1) operation whereas appending to the end of a list is an O(n) operation. Here's what a linked list looks like:

The stars are pointers to the next element in the list.

If you prepend an element to the front of the list, a new node is created that points to the the node that used to be the front of the list. Any references to the previous list would still point to the original node and no copying would be done.

A new node is prepended to the linked list

The reddish node is the prepended node. It is added to the beginning of the list and points at the node that was the previous first node in the list.

If you append an element to the end of the list, you're going to have to copy the end node and point itto the new node, which will require the previous node to be copied and made to point to that copy, and so forth back down to the beginning. At least, that's how I think Elixir implements it. It's certainly possible that there's a better implementation in place, but Elixir makes it clear that appending is not efficient.

Making Heads and Tails of Lists

Lists are made up of a head and a tail. The head is the first item in the list and the tail is the rest of the list.

That tail also consists of a head (the first element) and a tail (the rest of the list).

This keeps going until the tail consists of an empty list. An empty list does not have a head and tail. As you can see, this a great data structure for recursive calls, where we can keep getting one element at a time (the head) and the function can keep calling itself using the tail until there's just an empty list left.

Since a list is just a recursive structure of heads and tails, a head element can be separated from a tail by the | character. So the list [1, 2, 3] can also be written like [ 1 | [2, 3]], which is a list where the head is 1 and the tail is [2, 3]. That tail can also be expanded to a head and a tail, and that tail can be expanded, and so forth.

Fully expanded, the list [1, 2, 3] is [1 | [2 | [3 | []]]], where each tail is further divided into a head and tail until the tail is just an empty list.

Let's demonstrate this in IEx. The most compact syntax (which we will mostly use in our actual work) creates the same list as the fully expanded syntax.

iex> [1, 2, 3]
[1, 2, 3]
iex> [1 | [2, 3]]
[1, 2, 3]
iex> [1 | [2 | [3]]]
[1, 2, 3]
iex> [1 | [2 | [3 | []]]]
[1, 2, 3]

You can also think of the internals of the list as looking like this, a classic linked-list representation.

Prepending

We can prepend to a list using the | operator as well. It's the same basic concept: adding a head element to an existing list, which becomes a tail of the new list.

iex> list = [3 | []]
[3]
iex> list = [2 | list]
[2, 3]
iex> list = [1 | list]
[1, 2, 3]
iex> list = [4 | list]
[4, 1, 2, 3]

At this point you might be thinking "But I want to add items to the end of my list, otherwise I'll have a list in reverse order!". That's not a problem. The way Elixir code solves that problem is to prepend elements to a list and then reverse it all using the Enum.reverse/1 function. The Enum.reverse/1 function is highly optimized, and doing this is way more efficient that constructing the same list by appending.

iex> list = [4 | [3, 2, 1]]
[4, 3, 2, 1]
iex> Enum.reverse(list)
[1, 2, 3, 4]

Appending

You can append to a list. My first instinct was to use the same operator to append a value.

iex> [[1, 2, 3] | 4]
[[1, 2, 3] | 4]

This result made no sense to me when I first saw it. Lists don't look like that! IEx should be showing it as [1, 2, 3, 4]. It turned out that this is the notation for an "improper list", which is Elixir-speak for "Your list is all messed up". The problem is that by prepending (which is what this operator actually does) the list [1, 2, 3] to the value 4, we no longer have a list. The value 4 is not a proper tail (which is alway a list) and the list that was created does not end in an empty list.

From what I can tell, improper lists are useless and cannot be used for anything. I wonder why Elixir just doesn't thrown an error instead of creating the stupid improper list in the first place. I have no doubt that was a deliberate decision, but I have no idea why.

So my next thought was to make 4 into a list so that we'd have a properly formed list! Good idea, right? Well, this does result in something, but it wasn't what I expected.

iex> [[1, 2, 3] | [4]]
[[1, 2, 3], 4]

We end up with a list of length 2, where the first element is [1, 2, 3] and the second element is 4. That's because whatever comes before the | operator is the head of the list, which is always a single element. So it made the list [1, 2, 3] the head element. An element in a list can be any sort of data, even another list. Lists can be nested within each other with no problems.

Attempting to concatenate two lists together with the | operatior results in something very similar.

iex> [[1, 2] | [3, 4]]
[[1, 2], 3, 4]

The head element became the first element in the new list, because that's how heads and tails work.

It turns out that the correct way to append an element to a list is to make it into a list and use the list concatenation operator ++. Lists of any size can be concatenated in that way. It's not a efficient as prepending, but it gets the job done.

iex> [1, 2, 3] ++ [4]
[1, 2, 3, 4]
iex> [1, 2, 3] ++ 4
[1, 2, 3 | 4]  #Improper list
iex> [1, 2] ++ [3, 4]
[1, 2, 3, 4]

I would recommend against implementing an algorithm that repeatedly concatenates lists to build a bigger one, but doing one-time concatentations will probably be just fine in many cases.

Keep in mind that none of these list operations actually modify a list because the data in Elixir is immutable. They just create a new list that is a modified version of the original. You can rebind the new list to the original variable to make it look like a modification, but in reality the old list is thrown away and replaced with a new list.

The advantages of immutable data are still a little hard for me to wrap my head around. A lot of the stuff I had to worry about with mutable data just doesn't apply here. I can throw any data structure around how I like and not have to worry about it getting modified somehow. It can be shared and reused, but it will never change from under me. Variables can be rebound, but they will always refer to a completely new data structure.

Other List Operations

You can find the difference between two lists with the -- operator.

iex> [5, 10, 2, 6] -- [4, 5, 6]
[10, 2]

The result is the elements in the first list that were not in the second list. You can think of it a bit like subtraction where we are subtracting the second list from the first list. What happens if there are elements in the second list that are not in the first list?

iex> [5, 10, 2, 6] -- [4, 5, 6, 21, 1]
[10, 2]

It comes up with the same results. This operation just examines what was in the first list and takes away any elements that are also in the second list. Any elements in the second list that are not in the first list have no effect on the results.

You can test if an element is in a list using the in operator.

iex> 10 in [5, 10, 2, 6]
true
iex> 1 in [5, 10, 2, 6]
false

You can find the length of a list using the length/1 function.

iex> length([1, 2, 3])
3
iex> length(["Dib"])
1
iex> length([])
0

Functions that are specific to lists can be found in the List module in the standard Elixir library. A list is also an enumerable, so the functions in the Enum module are also available to us when operating on a list. In fact, most of the functions we'll apply to a list actually come from the Enum module. We'll go over what enumerables are in a following post.

Interestingly, the length function is found in the Kernel module instead of the List or Enum module. I'm curious why that is. It would make more sense to me if it were in the Enum module.

You can retrieve an element from the list with the Enum.at/2 function.

iex> Enum.at([1, 2, 3], 0)
1
iex> Enum.at([1, 2, 3], 2)
3
iex> Enum.at([1, 2, 3], 5)
nil

Keep in mind that unlike a tuple where it's an O(1) operation, accessing a specific element in a list is a O(n) operation.

Learn With Me: Elixir - Lists (#14)

Kevin Peter

Kevin Peter

Making Heads and Tails of Lists

Prepending

Appending

Other List Operations

Learn With Me: Elixir - ElixirLargeSort IntSort Project Part 4 (#82)

Learn With Me: Elixir - ElixirLargeSort IntSort Project Part 3 (#81)

Learn With Me: Elixir - ElixirLargeSort IntSort Project Part 2 (#80)

Learn With Me: Elixir - Character lists (#15)

Learn With Me: Elixir - Tuples (#13)