Learn With Me: Elixir - Structs (#18)
A struct is a data structure that stores properties and values. It is similar to a class or struct in C#, except that it consists only of data.
A struct is actually a specialized map with predefined keys and default values, which functions as a data type. Unlike a map, where any key can be added, a struct can only contain the keys that were predefined at compile time. Any attempt to add a new property using the struct syntax results in an error, so the shape of the data is fixed at compile time.
Defining Structs
Structs are defined within the context of a module using defstruct to define the struct's properties and default values.
defmodule Person do
  defstruct name: "", age: 0, stage: :baby
end
This defines a struct named Person, which has three properties: name, age, and stage. The module name becomes the name of the data type.
Unlike maps, only atoms can be used as keys in structs. That makes sense to me, seeing as how they are meant to be data structures, not dictionaries.
If your struct property list is long or contains some lengthy default values, you can define each property on a separate line:
defmodule Person do
  defstruct [
    name: "",
    age: 0,
    stage: :baby
  ]
end
The concept of a data type seems very fluid in Elixir. It seems that any data can serve as its own data type if it is accompanied by functions that can manipulate it. Strings and regular expressions are great examples of this. You can use them without realizing that they are actually implemented as a binary and a data structure.
Instantiating Structs
We create an instance of the struct using the map notation (since a struct is just a specialized map), but we put the name of the struct between the "%" and "{" characters.
iex> unknown_person = %Person{}
%Person{age: 0, name: "", stage: :baby}
iex> bob = %Person{name: "Bob"}
%Person{age: 0, name: "Bob", stage: :baby}
iex> gaz = %Person{name: "Gaz", age: 10, stage: :child}
%Person{age: 10, name: "Gaz", stage: :child}
When a predefined struct property is not specified when instantiating a struct, the default value is used.
I've put the Person struct module in the code examples in the Learn With Me: Elixir repository on GitHub so that you can load it into IEx and use it. It's in the person.exs file under the "lwm 18 - Structs" folder.
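We can see from the is_map/1 function that structs are indeed a specialized type of map. There was no is_struct/1 function in the version of Elixir I used for this session, as the error below shows; one was added to the Kernel module in Elixir 1.10.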
iex> is_map(bob)
true
iex> is_struct(bob)
** (CompileError) iex:16: undefined function is_struct/1
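To convince myself that a struct really is just a map underneath, here is a small sketch that pokes at one with ordinary Map functions. The MapDemo.Person module name is my own invention so that it doesn't clash with the article's Person module, and I build the struct with struct/2 so the whole thing can run as a single script.

```elixir
# Hypothetical module, named MapDemo.Person to avoid clashing with Person.
defmodule MapDemo.Person do
  defstruct name: "", age: 0, stage: :baby
end

# struct/2 builds a struct at runtime from a module name and a keyword list.
bob = struct(MapDemo.Person, name: "Bob")

# A struct passes the map check.
IO.inspect(is_map(bob))

# The only extra thing in the map is the :__struct__ key, which holds the module name.
IO.inspect(bob.__struct__)
IO.inspect(Enum.sort(Map.keys(bob)))

# Map.from_struct/1 strips :__struct__ and leaves a plain map.
IO.inspect(Map.from_struct(bob))
```

So the "type" of a struct appears to be nothing more than the module name stored under the :__struct__ key.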
Accessing Structs
The struct properties (actually keys in a map) can be accessed using the dot notation, just like a map.
iex> bob = %Person{name: "Bob"}
%Person{age: 0, name: "Bob", stage: :baby}
iex> gaz = %Person{name: "Gaz", age: 10, stage: :child}
%Person{age: 10, name: "Gaz", stage: :child}
iex> bob.name
"Bob"
iex> bob.age
0
iex> gaz.stage
:child
iex> gaz.favorite_color
** (KeyError) key :favorite_color not found in: %Person{age: 10, name: "Gaz", stage: :child}
Attempting to access a property that does not exist results in a KeyError being raised.
Unlike with maps, bracket notation cannot be used with structs by default.
iex> bob[:name]
** (UndefinedFunctionError) function Person.fetch/2 is undefined (Person does not implement the Access behaviour)
Person.fetch(%Person{age: 0, name: "Bob", stage: :baby}, :name)
(elixir) lib/access.ex:318: Access.get/3
Interestingly, the error implies that I can create a Person.fetch/2 function and the bracket notation would work. It looks like Elixir has some sort of convention (called the "Access behaviour" in the above error message) that allows a data structure to enable the use of the bracket syntax. I imagine that this is a more advanced feature I haven't learned about yet.
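In the meantime, when the key is only known at runtime, the regular Map functions seem to work fine on a struct, since it is a map. This is a minimal sketch; AccessDemo.Person is a made-up module name used only for this example.

```elixir
# Hypothetical module, named AccessDemo.Person to avoid clashing with Person.
defmodule AccessDemo.Person do
  defstruct name: "", age: 0, stage: :baby
end

bob = struct(AccessDemo.Person, name: "Bob")

# The key can come from a variable, which dot notation does not allow.
key = :name
IO.inspect(Map.get(bob, key))

# Map.fetch/2 returns :error instead of raising when the key is missing.
IO.inspect(Map.fetch(bob, :favorite_color))

# Map.get/3 lets us supply a fallback value.
IO.inspect(Map.get(bob, :favorite_color, "none"))

# Map patterns also match against structs.
%{name: name} = bob
IO.inspect(name)
```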
Updating Structs
Structs can be updated using the map pipe syntax. The struct to be updated is put before the pipe character and the keys and values that are to be updated are put after the pipe character.
iex> bob = %Person{name: "Bob"}
%Person{age: 0, name: "Bob", stage: :baby}
iex> old_bob = %Person{bob | age: 95, stage: :elderly}
%Person{age: 95, name: "Bob", stage: :elderly}
As with a map, no keys can be added to a struct using the pipe syntax. In fact, since the structure of a struct is defined at compile time, keys can never be added with the struct syntax.
iex> old_bob = %Person{bob | age: 95, stage: :elderly, location: "home"}
** (CompileError) iex:15: unknown key :location for struct Person
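That compile-time check only covers the %Person{...} syntax. When the fields arrive at runtime (from a keyword list, say), the Kernel functions struct/2 and struct!/2 handle unknown keys differently. Here is a small sketch of the difference; BuildDemo.Person and the fields list are hypothetical names for this example only.

```elixir
# Hypothetical module, named BuildDemo.Person to avoid clashing with Person.
defmodule BuildDemo.Person do
  defstruct name: "", age: 0, stage: :baby
end

# Pretend this keyword list was produced at runtime.
fields = [name: "Bob", location: "home"]

# struct/2 quietly drops any key the struct does not define.
lenient = struct(BuildDemo.Person, fields)
IO.inspect(lenient.name)
IO.inspect(Map.has_key?(lenient, :location))

# struct!/2 raises a KeyError for the unknown :location key instead.
strict =
  try do
    struct!(BuildDemo.Person, fields)
  rescue
    _ in KeyError -> :key_error
  end

IO.inspect(strict)
```

If I wanted the runtime behaviour to match the strictness of the struct syntax, struct!/2 looks like the one to reach for.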
Struct Key Types
Although struct keys have default values associated with them, Elixir still allows them to be dynamically typed and does not enforce a particular type at either compile time or runtime. This means that we can change the name in a %Person struct to be something other than a string.
iex> person = %Person{name: 3.13, age: "Vlaai"}
%Person{age: "Vlaai", name: 3.13, stage: :baby}
I think that type enforcement could be useful here, at least at runtime, but that probably goes against the nature of Elixir, and gets into the realm of statically-typed languages. It would also make things more difficult for keys that are intended to have values of any data type, such as a generic :data key that can be assigned any kind of data.
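Nothing stops a module from doing its own runtime checks, though. Here is a minimal sketch of one way that might look, using guard clauses on a constructor function. The GuardDemo.Person name, the new/2 function, and the {:ok, ...} / {:error, ...} return shape are all my own assumptions, not something Elixir requires. Note that it only protects structs built through this function; the %Person{} syntax still accepts anything.

```elixir
# Hypothetical module, named GuardDemo.Person to avoid clashing with Person.
defmodule GuardDemo.Person do
  defstruct name: "", age: 0, stage: :baby

  # This clause only matches a string name and a non-negative integer age.
  def new(name, age) when is_binary(name) and is_integer(age) and age >= 0 do
    {:ok, %__MODULE__{name: name, age: age}}
  end

  # Anything else falls through to this clause.
  def new(_name, _age), do: {:error, :invalid_fields}
end

{:ok, bob} = GuardDemo.Person.new("Bob", 54)
IO.inspect(bob.age)

# The mismatched values from the earlier example are now rejected.
IO.inspect(GuardDemo.Person.new(3.13, "Vlaai"))
```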
Packaging Struct Data and Functions
Since the struct definition is wrapped in a module, an Elixir struct module can also contain functions that can operate on that struct.
defmodule Person do
  defstruct name: "", age: 0, stage: :baby

  # Constructor function that takes the person's name
  def new(name) do
    %Person{name: name}
  end

  def increment_age(person) do
    %Person{person | age: person.age + 1}
  end

  def babify(person) do
    %Person{person | age: 1, stage: :baby}
  end

  def can_retire?(person, retirement_age) do
    person.age >= retirement_age
  end
end
Let's load that module into IEx and start using the functions.
iex> bob = Person.new("Bob")
%Person{age: 0, name: "Bob", stage: :baby}
iex> bob = Person.increment_age(bob)
%Person{age: 1, name: "Bob", stage: :baby}
iex> bob = Person.increment_age(bob)
%Person{age: 2, name: "Bob", stage: :baby}
iex> Person.can_retire?(bob, 60)
false
iex> bob = %Person{bob | stage: :retired}
%Person{age: 2, name: "Bob", stage: :retired}
iex> bob = Person.babify(bob)
%Person{age: 1, name: "Bob", stage: :baby}
iex> bob = %Person{bob | age: 62, stage: :retired}
%Person{age: 62, name: "Bob", stage: :retired}
iex> Person.can_retire?(bob, 60)
true
As you can see, if we add enough functions to the module, it will be possible to do everything we want to a data structure without having to touch the data structure itself, thereby making the implementation details less relevant.
I've noticed that a common convention in Elixir is to put a new function in struct modules to provide a constructor function. So we provide one that takes a single argument. It's also possible to provide different construction functions that accept enough parameters to populate all the possible properties in a Person struct or a construction function that accepts no parameters at all.
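One way to get several of those constructor variations at once is with default arguments. This is just a sketch of the idea; CtorDemo.Person is a hypothetical module name, and a real module might prefer separate, more descriptive constructor functions.

```elixir
# Hypothetical module, named CtorDemo.Person to avoid clashing with Person.
defmodule CtorDemo.Person do
  defstruct name: "", age: 0, stage: :baby

  # The default arguments make Elixir generate new/0, new/1, new/2, and new/3.
  def new(name \\ "", age \\ 0, stage \\ :baby) do
    %__MODULE__{name: name, age: age, stage: stage}
  end
end

# No arguments: everything takes its default value.
unknown = CtorDemo.Person.new()
IO.inspect({unknown.name, unknown.age, unknown.stage})

# One argument: only the name is set.
bob = CtorDemo.Person.new("Bob")
IO.inspect({bob.name, bob.age, bob.stage})

# Three arguments: every property is set.
gaz = CtorDemo.Person.new("Gaz", 10, :child)
IO.inspect({gaz.name, gaz.age, gaz.stage})
```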
The above example is an example of some very nice packaging of the data definition in the same module as the functions that operate on that data. Since a struct module contains the structure of the data and the associated functions, it looks a bit like classes in object-oriented languages.
However, they are not the same thing. Unlike classes, Elixir struct data and functions are strictly separate. There is no instance data in a module and there is nothing like a this context in Elixir. The module just contains a definition of the structure of the data, not the data itself.
Data structures are created using the struct definition as a template, but they are just data and nothing else. To do something with the data, they need to be passed to the functions provided in the related module. Do not attempt to do object-oriented programming in Elixir. You will be working against the design of the language and that never turns out well no matter what language you're working with.
Struct Key Value Enforcement
By default, you don't have to provide the values for any keys when creating a struct. If you want to, you can require that certain keys be provided when constructing the struct by using the @enforce_keys module attribute. The @enforce_keys attribute consists of a list of keys whose values must be provided.
Here's an example where I enforce the :name and :age keys in the Person module.
defmodule Person do
  @enforce_keys [:name, :age]
  defstruct name: "", age: 0, stage: :baby

  def new(name, age) do
    %Person{name: name, age: age}
  end
  ...
end
Since the Person.new/1 constructor function only accepted a name, I got a compile error when I loaded the module. So I had to add the "age" parameter, transforming it into Person.new/2.
The version of the Person struct module with key enforcing can be found in the person_enforced.exs file in the "lwm 18 - Structs" folder in the code examples in the Learn With Me: Elixir repository on GitHub.
iex> person = Person.new("Bob", 54)
%Person{age: 54, name: "Bob", stage: :baby}
iex> person = %Person{name: "Bob"}
** (ArgumentError) the following keys must also be given when building struct Person: [:age]
expanding struct: Person.__struct__/1
iex:4: (file)
iex> person = %Person{age: 12}
** (ArgumentError) the following keys must also be given when building struct Person: [:name]
expanding struct: Person.__struct__/1
iex:5: (file)
iex> person = %Person{}
** (ArgumentError) the following keys must also be given when building struct Person: [:name, :age]
expanding struct: Person.__struct__/1
iex:5: (file)
iex> person = %Person{name: "Bob", age: 12}
%Person{age: 12, name: "Bob", stage: :baby}
You can see from the example that if we create a %Person struct without an age, we get an ArgumentError. Since both keys are listed in the @enforce_keys attribute, Elixir now requires values to be provided for both the :name and :age keys. The value for the :stage key can still be omitted, since it is not among the required keys. In that case, it is just set to the default value.
Structs With a Large Hierarchy
The keys in a struct can contain any other data type, including other structs. So it's quite possible to build up a data structure with a large hierarchy.
Let's say that you have a struct with a hierarchy that's several levels deep.
defmodule Manufacturer do
  defstruct name: "", num_of_factories: 0
end
defmodule Car do
  defstruct make: "", model: "", color: "", manufacturer: %Manufacturer{}
end
defmodule Person do
  defstruct name: "", age: 0, car: %Car{}
end
A person has a car, which has a manufacturer, giving us a hierarchy 3 levels deep. Normally, these modules would each be defined in their own file, but I put them in a single file here for simplicity.
This code can be found in the struct_hierarchy.exs file in the "lwm 18 - Structs" folder in the code examples in the Learn With Me: Elixir repository on GitHub.
It's going to be a bit of a pain to modify the name of the manufacturer of the person's car. Let's see what that looks like.
iex> person = %Person{}
%Person{
age: 0,
car: %Car{
color: "",
make: "",
manufacturer: %Manufacturer{name: "", num_of_factories: 0},
model: ""
},
name: ""
}
iex> person = %Person{ person | car: %Car{person.car | manufacturer: %Manufacturer{person.car.manufacturer | name: "Toyota"}}}
%Person{
age: 0,
car: %Car{
color: "",
make: "",
manufacturer: %Manufacturer{name: "Toyota", num_of_factories: 0},
model: ""
},
name: ""
}
This updates the name of the car's manufacturer to "Toyota".
Yeah, I know that in real code, you'd have a collection (probably a map) of reusable manufacturer instances and wouldn't update the name like this, but I'm doing this as an example to show what updating something deep in a hierarchy looks like.
As you can see, every level of the hierarchy causes the update statement to get much longer and more complex. You have to update the manufacturer name, which creates a new instance of %Manufacturer (since the old one was immutable). You have to then update the %Car to refer to the new instance of %Manufacturer, which creates a new instance of %Car. Then you have to update the %Person instance to refer to the new instance of %Car, which of course creates a new instance of %Person, which is then bound to the person variable.
Now you have an updated data structure which shares some of the original key-value pairs from the non-updated version and has new key-value pairs.
Yuck, that was some complex syntax. Fortunately, Elixir provides us a couple of macros that make this easier, put_in/2 and update_in/2, both located in the Kernel module. put_in/2 allows us to set a specific value, whereas update_in/2 applies a function to update the value.
Let's use that simpler syntax.
iex> person = %Person{}
%Person{
age: 0,
car: %Car{
color: "",
make: "",
manufacturer: %Manufacturer{name: "", num_of_factories: 0},
model: ""
},
name: ""
}
iex> person = put_in(person.car.manufacturer.name, "Toyota")
%Person{
age: 0,
car: %Car{
color: "",
make: "",
manufacturer: %Manufacturer{name: "Toyota", num_of_factories: 0},
model: ""
},
name: ""
}
iex> person = update_in(person.car.manufacturer.num_of_factories, &(&1 + 1))
%Person{
age: 0,
car: %Car{
color: "",
make: "",
manufacturer: %Manufacturer{name: "Toyota", num_of_factories: 1},
model: ""
},
name: ""
}
The call to put_in/2 updates the name of the car's manufacturer to "Toyota" in a much simpler way by specifying a path to a key and the value to use to update the key. So much easier to read and understand!
person = put_in(person.car.manufacturer.name, "Toyota")
The call to update_in/2 updates the number of factories a manufacturer has by specifying a path to the key and the function to apply to that key. In this case, the number of factories is incremented by 1.
person = update_in(person.car.manufacturer.num_of_factories, &(&1 + 1))
Note that there are also put_in/3 and update_in/3 functions, which accept a list of keys that make up the path instead of a path defined at compile time, like we used above. This allows the path to be determined at runtime instead of compile time.
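One wrinkle with the list-based versions: plain atoms in the path go through the Access behaviour, which structs don't implement (the same reason bob[:name] failed earlier). Wrapping each key in Access.key/1 gets around that. Here is a minimal sketch; the PathDemo.* module names are hypothetical stand-ins for the Manufacturer, Car, and Person modules above.

```elixir
# Hypothetical modules mirroring the article's hierarchy under a PathDemo namespace.
defmodule PathDemo.Manufacturer do
  defstruct name: "", num_of_factories: 0
end

defmodule PathDemo.Car do
  defstruct make: "", model: "", color: "", manufacturer: %PathDemo.Manufacturer{}
end

defmodule PathDemo.Person do
  defstruct name: "", age: 0, car: %PathDemo.Car{}
end

# struct/1 builds a struct with all of its default values.
person = struct(PathDemo.Person)

# The path is an ordinary list, so it can be assembled at runtime.
name_path = Enum.map([:car, :manufacturer, :name], &Access.key/1)
person = put_in(person, name_path, "Toyota")
IO.inspect(person.car.manufacturer.name)

# update_in/3 takes the same kind of path plus a function to apply.
factories_path = Enum.map([:car, :manufacturer, :num_of_factories], &Access.key/1)
person = update_in(person, factories_path, &(&1 + 1))
IO.inspect(person.car.manufacturer.num_of_factories)
```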
Initializing Structs With Large Hierarchies
Structs with large hierarchies can be initialized with a large literal, just like how complex JavaScript objects are initialized.
iex> person = %Person {
...> name: "Bob",
...> car: %Car {
...> make: "Toyota",
...> model: "Prius",
...> color: "silver",
...> manufacturer: %Manufacturer {
...> name: "Toyota",
...> num_of_factories: 25
...> }
...> }
...> }
%Person{
age: 0,
car: %Car{
color: "silver",
make: "Toyota",
manufacturer: %Manufacturer{name: "Toyota", num_of_factories: 25},
model: "Prius"
},
name: "Bob"
}
I'll bet that is useful for unit testing.
I also put this same initialization code into a module and then tried it out in IEx. This module can be found in the struct_hierarchy_initialization.exs file in the "lwm 18 - Structs" folder in the code examples in the Learn With Me: Elixir repository on GitHub.
defmodule PersonData do
  def get_person() do
    %Person{
      name: "Bob",
      car: %Car{
        make: "Toyota",
        model: "Prius",
        color: "silver",
        manufacturer: %Manufacturer{
          name: "Toyota",
          num_of_factories: 35
        }
      }
    }
  end
end
Here's what it looks like to call the PersonData.get_person/0 function from IEx.
iex> c "examples/lwm 18/struct_hierarchy_initialization.exs"
[PersonData]
iex> PersonData.get_person()
%Person{
age: 0,
car: %Car{
color: "silver",
make: "Toyota",
manufacturer: %Manufacturer{name: "Toyota", num_of_factories: 35},
model: "Prius"
},
name: "Bob"
}
Why Use Structs?
Structs allow some compile-time checking and guarantees regarding which properties will be in the data structure. So if you define a key, aka property, in a struct, that property is guaranteed to be there. Anyone who is using your data type will know exactly what its structure is and what it is used for.
It's a lot easier to maintain an application when dealing with 10 specific structs than if you have a bunch of data in 10 generic maps, which provide little idea what sort of structure you'll find in those maps.
I'm also under the impression from what I've read so far that structs may be important to various advanced concepts and programming conventions in Elixir, but I don't know that for sure yet. At any rate, structs seem to be used often in real Elixir code.