Learn With Me: Elixir - Tuples (#13)
We're now going to start going over Elixir collection types. These are interesting to me because they are a bit different in nature than collections in most languages I've encountered.
What is a Tuple?
Some of you may already be familiar with tuples from other languages or from relational database concepts. A tuple is simply a group of data items. Think of a tuple as an array that's not intended for iteration, but simply for random access to data. Tuples are also typically immutable.
Python has tuples, and more recent versions of C# have them as well. Ruby, despite the syntax Elixir borrows from it, has no dedicated tuple type; arrays usually fill that role there.
Here's an example from Python:
data_tuple = ('Dib', 'Zim', 10, 3.14)
Here's an example from C#:
//Using Tuple class (3.14 is a double literal, so the last type argument must be double)
Tuple<string, string, int, double> data_tuple1 = Tuple.Create("Dib", "Zim", 10, 3.14);
//Built-in tuples as of C# 7
var data_tuple2 = ("Dib", "Zim", 10, 3.14);
JavaScript has no concept of tuples, but something roughly equivalent can be created with an array.
Elixir Tuples
Elixir tuples mostly resemble tuples in other languages. Elixir tuples are stored contiguously in memory and are accessed by index. A tuple literal uses a pair of curly braces {}.
iex> data_tuple = {"Dib", "Zim", 10, 3.14}
{"Dib", "Zim", 10, 3.14}
The elem/2 function is used to access a data item in a tuple. Indexes start at 0.
iex> elem(data_tuple, 0)
"Dib"
iex> elem(data_tuple, 1)
"Zim"
iex> elem(data_tuple, 2)
10
iex> elem(data_tuple, 3)
3.14
iex> elem(data_tuple, 4)
** (ArgumentError) argument error
:erlang.element(5, {"Dib", "Zim", 10, 3.14})
The length of a tuple can be found with the tuple_size/1 function.
iex> tuple_size({"Dib", "Zim", 10, 3.14})
4
iex> tuple_size({"Bob", 3})
2
iex> tuple_size({3})
1
iex> tuple_size({})
0
Tuples in Elixir, as in other languages, are meant to serve as a container for grouping data. They are not meant to serve as an iterable collection like an array would.
A typical use for tuples is to return multiple items of data from a function. Many functions in Elixir return a tuple containing a status code and data. By convention the status code is an atom, and :ok and :error are the most common ones.
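To make the status-tuple convention concrete, here's a minimal sketch. The TupleDemo module and its safe_divide/2 function are hypothetical names I made up for illustration; they aren't part of Elixir.

```elixir
defmodule TupleDemo do
  # Hypothetical helper: returns a status atom plus data, following the convention.
  # Only an integer zero divisor is guarded here; a float 0.0 would still raise.
  def safe_divide(_a, 0), do: {:error, :division_by_zero}
  def safe_divide(a, b), do: {:ok, a / b}
end

# The caller matches on the status atom to decide what to do with the data.
case TupleDemo.safe_divide(10, 4) do
  {:ok, result} -> IO.puts("Result: #{result}")
  {:error, reason} -> IO.puts("Failed: #{inspect(reason)}")
end
```

The caller never has to index into the tuple by hand: matching on the shape pulls the status and the data apart in one step.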
Interesting fact: There aren't arrays in Elixir. Instead, you get to choose between types that are more suited for functional programming.
Modifying Tuples is Expensive
Modifying a tuple in Elixir is expensive. Tuples are optimized toward read operations, so when you modify a tuple, the entire thing has to be copied.
You may be thinking "Modifying? But data in Elixir is immutable and cannot be modified!" That is indeed true, but when we talk about "modifying" in Elixir, we are always talking about taking the original data and constructing an entirely new data item with a modification. We are not talking about modifying the original data in place, but rather creating a new one with the modification we wanted.
Now you may be thinking "Won't we always have to copy the entire set of data when we modify something?". The answer is no. Many data structures in Elixir are implemented so that when we "modify" an existing set of data, it reuses much of the data and shares it between the two data structures. This sharing would be problematic for mutable data because modifying one set of data would modify another set where that same data also exists. That's not an issue in Elixir: since everything is immutable, data can be shared between multiple collections without an issue. This keeps Elixir from having to make copies of the entire set of data.
That said, when you modify a tuple, Elixir has to copy the entire tuple, which makes modifying a tuple more expensive than modifying some other types of collections.
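Here's a small sketch of what "modifying" a tuple looks like in practice, using the put_elem/3 function from Elixir's Kernel module. The variable names are just for illustration.

```elixir
original = {"Dib", "Zim", 10, 3.14}

# put_elem/3 builds a new tuple with the element at index 2 replaced.
updated = put_elem(original, 2, 11)

IO.inspect(original) # {"Dib", "Zim", 10, 3.14}
IO.inspect(updated)  # {"Dib", "Zim", 11, 3.14}
```

The original tuple is untouched; the result is a separate four-element tuple.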
Let's clarify what the copying involves. Data in Elixir is immutable, so within a process it can be shared freely instead of being duplicated. A tuple doesn't embed its larger elements. Each slot in a tuple holds either a small value stored directly (a small integer such as 3, or an atom) or a reference to data that lives elsewhere on the heap (a string such as "Gir", or a list). When a tuple is copied, only those slots are copied, and any new values are swapped in for the new tuple. This is often referred to as a "shallow copy".
I'm not sure exactly how references are implemented under the surface in the Erlang VM, but it's probably something pointer-like and relatively small in size. This means that copying a tuple containing 3 integers costs about the same as copying a tuple containing 3 lists of 100,000 integers, because in either case only 3 references are being copied.
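One way to check the shallow copy for yourself is to compare how much memory a pair of tuples occupies with and without sharing. This sketch assumes the :erts_debug.size/1 and :erts_debug.flat_size/1 functions from Erlang's debugging module, which report a term's size in machine words with shared data counted once and counted per reference, respectively. They're debugging aids and not a stable API, so treat the exact numbers as version-dependent.

```elixir
# Build the list at runtime so it lives on the process heap.
list = Enum.to_list(1..1000)

original = {list, :a, :b}
updated = put_elem(original, 1, :z)

# Wrap both tuples in one term so the two measurements can be compared.
pair = {original, updated}

shared_words = :erts_debug.size(pair)     # shared data counted once
flat_words = :erts_debug.flat_size(pair)  # shared data counted per reference

IO.puts("with sharing: #{shared_words} words")
IO.puts("without sharing: #{flat_words} words")
```

If put_elem/3 had duplicated the list, the two figures would match. The shared measurement coming out noticeably smaller suggests both tuples point at the same list.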
So modifying tuples isn't as bad as what I had originally envisioned, which was making copies of the referenced data, but Elixir still needs to make a copy of the tuple itself. It's an O(n) operation in the number of elements, and that's far more expensive than an O(1) in-place array modification.
Read operations for Elixir tuples are always O(1), which is where tuples excel.
Functions for manipulating tuples are found in the Tuple module. The Elixir documentation notes that the functions for adding and removing elements are rarely needed in practice, because reaching for them usually means you are using the tuple as a true collection and not just a simple grouping of data.
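For completeness, here's a brief sketch of a few functions from the Tuple module. The sample data is made up, and each call returns a new value instead of changing the tuple it was given.

```elixir
t = {"Dib", "Zim", 10}

# Convert to a list when you actually need to iterate.
as_list = Tuple.to_list(t)

# Insert "Gir" at index 1; later elements shift right in the new tuple.
inserted = Tuple.insert_at(t, 1, "Gir")

# Remove the element at index 2 in a new, shorter tuple.
deleted = Tuple.delete_at(t, 2)

IO.inspect(as_list)  # ["Dib", "Zim", 10]
IO.inspect(inserted) # {"Dib", "Gir", "Zim", 10}
IO.inspect(deleted)  # {"Dib", "Zim"}
```

If you find yourself calling these often, that's usually a sign a list or map would be a better fit.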
That's it for tuples: they're pretty simple.