Learn With Me: Elixir - Type Specifications (#69)

Since Elixir is a dynamically-typed language, it's not always obvious which types a function expects to receive and return. Some functions can handle parameters of any type while other functions will only work with parameters of a particular type. To help with figuring out which types a function is designed to work with, Elixir provides something called a type specification.

A type specification (also called a "typespec") is a special notation that allows the developer to document which types a function expects to receive and return. This helps the developer quickly know what to pass to a function and what to expect in return.

Typespecs associated with functions take the form "function_name(parameter1_type(), parameter2_type()) :: return_type()", where the function name is followed by parentheses containing the expected type of each parameter. Then there is a :: followed by the return type.

This information gets included as metadata in a module and is included in the documentation. When I was doing in-depth coverage of various Elixir modules and examining what each function did, I found the typespecs in the documentation to be very valuable to understanding how to use the functions.

Type Specification Examples

The best way to take a look at typespec examples is to look at the typespecs in the Elixir documentation. Here are some examples.

map_size(map()) :: non_neg_integer()

The function is named "map_size" (actually Kernel.map_size/1). The typespec shows that it receives a map as a parameter and returns a non-negative integer.

to_string(integer()) :: String.t()

The function is named "to_string" (actually Integer.to_string/1). It receives an integer as a parameter and returns a string.
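To make those two typespecs concrete, here's a minimal sketch that calls both functions from a script or an IEx session. The sample values are assumptions chosen only for illustration.

```elixir
# Kernel.map_size/1: map() in, non_neg_integer() out.
size = map_size(%{a: 1, b: 2})
IO.inspect(size)
# prints 2

# An empty map is still a map, and 0 is still a non-negative integer.
empty_size = map_size(%{})
IO.inspect(empty_size)
# prints 0

# Integer.to_string/1: integer() in, String.t() out.
text = Integer.to_string(42)
IO.inspect(text)
# prints "42"
```

In both cases the value that comes back matches the type to the right of the :: in the typespec.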

Since Elixir is a dynamic language, it's entirely possible for a parameter to be of several different types. You can specify multiple types using the | character.

For example:

floor_div(integer(), neg_integer() | pos_integer()) :: integer()

This is the typespec for Integer.floor_div/2, which indicates that the function will accept two parameters. The first parameter must be an integer and the second parameter can be a negative integer or a positive integer (which excludes 0). The function returns an integer. This makes sense, since Integer.floor_div/2 is a division function, and it cannot divide by 0.
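Here's a short sketch of what that typespec describes in practice. The inputs are arbitrary examples, and the zero divisor is stored in a variable (an illustrative choice) just to show what happens when the second parameter falls outside "neg_integer() | pos_integer()".

```elixir
# Both parameters are integers and the divisor is non-zero,
# so these calls match the typespec.
positive = Integer.floor_div(7, 2)
IO.inspect(positive)
# prints 3

# floor_div rounds toward negative infinity, unlike div/2.
negative = Integer.floor_div(-7, 2)
IO.inspect(negative)
# prints -4

# A divisor of 0 is neither neg_integer() nor pos_integer().
# The typespec itself does not stop the call; the division fails at runtime.
zero = 0

outcome =
  try do
    Integer.floor_div(7, zero)
  rescue
    ArithmeticError -> :cannot_divide_by_zero
  end

IO.inspect(outcome)
# prints :cannot_divide_by_zero
```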

Elixir has a lot of types available for use in type specifications, including more specific types like "neg_integer()" and "pos_integer()" for when "integer()" isn't specific enough.

Sometimes any type is acceptable for a parameter, in which case the typespec would use any() for the type.

delete(list(), any()) :: list()

This is the typespec for List.delete/2. It accepts a list and an item of any type and it returns a list.
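As a quick illustration of the "any()" parameter, this sketch deletes items of different types from lists. The lists are made-up sample data.

```elixir
# The second parameter can be any type: an integer here...
numbers = List.delete([1, 2, 3], 2)
IO.inspect(numbers)
# prints [1, 3]

# ...and a string here, taken from a list of mixed types.
mixed = List.delete([:a, "b", 3.0], "b")
IO.inspect(mixed)
# prints [:a, 3.0]

# If the item isn't in the list, a list still comes back, unchanged.
unchanged = List.delete([1, 2, 3], :missing)
IO.inspect(unchanged)
# prints [1, 2, 3]
```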

Typespecs have quite a variety of types and constraints available to them, and can describe a lot of functions. Here's the typespec for Integer.parse/2.

parse(binary(), 2..36) :: {integer(), binary()} | :error

It takes a binary (which is typically a string) and a number between 2 and 36 (inclusive) representing the base. It will return either a tuple containing an integer and a binary or the atom :error.
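The two possible return shapes in that typespec are easier to see with a few calls. This is a small sketch with sample strings of my choosing; the binary in the returned tuple is whatever was left over after the integer was parsed.

```elixir
# Returns {integer(), binary()} when the string starts with an integer.
whole = Integer.parse("34", 10)
IO.inspect(whole)
# prints {34, ""}

# The binary in the tuple holds the unparsed remainder.
partial = Integer.parse("34.5", 10)
IO.inspect(partial)
# prints {34, ".5"}

# The second parameter is the base, anywhere in 2..36.
hex = Integer.parse("ff", 16)
IO.inspect(hex)
# prints {255, ""}

# Returns the atom :error when no integer can be parsed.
failed = Integer.parse("three", 10)
IO.inspect(failed)
# prints :error
```

Because the return type is a union, calling code usually pattern matches on both shapes, for example with a case expression.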

There are a lot of built-in types that can be used in type specifications, and a typespec can contain a surprising amount of complexity. I'm not going to duplicate the documentation here, and I haven't yet used type specifications enough to go into great depth, so I encourage you to read the typespec documentation or at least skim over it. I've personally learnt the most just from reading the typespecs in the Elixir documentation.

I knew nothing about typespecs when I first started reading the Elixir documentation, but I was able to figure out most of them on my own. They are quite readable and useful.

User-Defined Typespecs

It's also possible to define special types to use in typespecs. These special types are called user-defined types. User-defined types are created either because they're more readable than the full typespec that they represent or they have some special meaning to humans in the context of the documentation (or both of those things). You can think of these user-defined types as type aliases that make type specifications more readable and meaningful.

For example, if you look at the types defined in the Enum module, you'll see something called t() which is defined as t() :: Enumerable.t(). This indicates that the user-defined type t() is actually an Enumerable.t(), which indicates an enumerable type.

I believe that anything that ends with ".t()" is a type that corresponds to a struct, protocol, etc. that is defined in a module. The point of creating a user-defined type here is that it's frequently used in typespecs in the Enum module and saves space. Just typing in "t()" indicates that it is the type associated with the current module.

Here's an example of "t()" being used in the typespec for an Enum function, Enum.any?/2.

any?(t(), (element() -> as_boolean(term()))) :: boolean()

The first parameter is "t()", which indicates that it is the type of the current module. What is the current module? It's Enum, so the type must be anything that is an enumerable. The second parameter is a function, which takes the form of "([parameters] -> [result])". This particular parameter must be a function that accepts an element as a parameter and returns a term() whose "truthiness" is used to make a decision. The "as_boolean()" indicates that the truthiness of a value will be used.
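Here's a minimal sketch of Enum.any?/2 that shows both halves of that typespec: different enumerables being passed as the first parameter, and a function whose result is treated as truthy or falsy rather than as a strict boolean. The data and anonymous functions are illustrative assumptions.

```elixir
# A list is an enumerable, and this function returns a real boolean.
has_large = Enum.any?([1, 2, 3], fn x -> x > 2 end)
IO.inspect(has_large)
# prints true

# A range and a map are enumerables too.
range_result = Enum.any?(1..3, fn x -> x > 5 end)
IO.inspect(range_result)
# prints false

map_result = Enum.any?(%{a: 1, b: 2}, fn {_key, value} -> value == 2 end)
IO.inspect(map_result)
# prints true

# The function may return any term; only its truthiness matters.
# "a" is truthy, so the overall result is the boolean true.
truthy_result = Enum.any?([nil, "a"], fn x -> x end)
IO.inspect(truthy_result)
# prints true

# nil and false are the only falsy values.
falsy_result = Enum.any?([nil, false], fn x -> x end)
IO.inspect(falsy_result)
# prints false
```

Whatever the function returns, Enum.any?/2 itself always returns a boolean(), just as the typespec says.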

There are also some types you're probably not familiar with, like "element()" and "term()". "element()" is also a user-defined type, defined in the Enum module as element() :: any(), and "term()" is a built-in type defined as term() :: any(). These are defined in order to make the type specification more meaningful to humans.

That leads us to the next example of user-defined types, where types are specified just to make the type specification more meaningful to the humans that are reading it. If you take a look at the types defined in the Map module, you'll see that there are two types defined: key() and value(). They look like this: key() :: any() and value() :: any().

Why would they define two types that both map to "any()"? It's because they do hold special meaning to humans to help clarify the role of certain parameters. When a human sees these types being used in a typespec, they know that the key should go in a particular spot (and that it can be of any type) and the value should go in a particular spot (and that it can be of any type).

For example, here's the typespec for Map.update/4: update(map(), key(), value(), (value() -> value())) :: map(). That typespec indicates that the first parameter is a map, the second parameter is a key (of any type), the third parameter is a value (of any type), and the fourth parameter is a function that takes in a value and returns a value. The entire function then returns a map.

It's so much easier to figure out what to pass this function than if the typespec had been update(map(), any(), any(), (any() -> any())) :: map(). If I had seen this last typespec, I would still be wondering what to pass to those "any()" parameters. When those "any()" types are substituted with "key()" and "value()" types, it becomes a lot more clear what I need to pass the function.
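To see how the "key()" and "value()" roles line up with an actual call to Map.update/4, here's a small sketch. The map contents and the update function are assumptions made up for the example.

```elixir
inventory = %{apples: 1}

# map(), key(), value(), (value() -> value())
# The key exists, so the function receives the current value
# and returns the new one.
incremented = Map.update(inventory, :apples, 10, fn count -> count + 1 end)
IO.inspect(incremented)
# prints %{apples: 2}

# The key is missing, so the third parameter (a value) is stored as-is
# and the function is not called.
defaulted = Map.update(inventory, :pears, 10, fn count -> count + 1 end)
IO.inspect(defaulted)
# prints %{apples: 1, pears: 10}
```

The typespec says where the key, the value, and the function go; the documentation is still what explains that the third parameter is only used when the key is missing.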

Note that the typespec probably won't tell me everything I need to know to call the function. I'll probably still have to read the documentation as well. However, it will work together with the documentation to clarify how I can use the function.

You can also create user-defined types that represent a particular list, tuple, or map. The documentation gives the example color :: {red :: integer, green :: integer, blue :: integer}, which defines a type that is a tuple containing three integers. There are also human-readable names for each value in the tuple, which lets us know that they are intended to store red, green, and blue color values. So wherever I want to pass a color to a function, I can just use the "color()" type, and if I don't already know what that is, I can click on it in the typespec to be taken to the definition.
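Here's a sketch of how that color type might be used. The color definition comes from the documentation example above, but the Palette module and its to_hex/1 function are hypothetical names invented for this illustration.

```elixir
defmodule Palette do
  # Hypothetical module, for illustration only.
  # The named elements document what each position in the tuple means.
  @type color :: {red :: integer, green :: integer, blue :: integer}

  # The spec uses the user-defined type instead of repeating the tuple.
  @spec to_hex(color) :: String.t()
  def to_hex({red, green, blue}) do
    [red, green, blue]
    |> Enum.map(fn channel ->
      # Convert each channel to base 16 and pad it to two digits.
      channel |> Integer.to_string(16) |> String.pad_leading(2, "0")
    end)
    |> Enum.join()
  end
end

orange = Palette.to_hex({255, 128, 0})
IO.inspect(orange)
# prints "FF8000"
```

Someone reading the spec for to_hex/1 sees "color" and can look up a single definition, rather than decoding a three-integer tuple in every typespec that uses it.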

If you think of a user-defined type as an alias, you might think of another advantage to user-defined types. If you use a particular type in many typespecs in a module, and the definition of a type changes, you only have to change it in one place instead of many different places throughout the module. This can make user-defined types very convenient indeed.

How to Define Type Specifications

Typespecs are defined in the source code using the @spec module attribute, and they then show up in the generated documentation. A function's typespec is defined just above the function. Here's an example of the type specification for Integer.floor_div/2 that I took directly from the Elixir source code for the Integer module.

@spec floor_div(integer, neg_integer | pos_integer) :: integer
def floor_div(dividend, divisor) do
  if dividend * divisor < 0 and rem(dividend, divisor) != 0 do
    div(dividend, divisor) - 1
  else
    div(dividend, divisor)
  end
end

User-defined types are defined using the @type module attribute. These are typically just located on their own lines underneath the module documentation because they are usually used in multiple typespecs throughout the module. Here's an example of typespecs defining user-defined types. These are the key() and value() types from the Map module.

@type key :: any
@type value :: any

I've noticed that sometimes the types have parentheses following them in the source code and sometimes they don't. They always do in the generated documentation. I suspect that this is because they follow Elixir conventions where parentheses are optional for function calls. I'm not sure if typespec types are functions, but it would not surprise me if they actually are. It seems like almost everything in Elixir corresponds to some kind of function.
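Putting @type and @spec together, here's a minimal sketch of a module that defines its own types and uses them in a function typespec. The Scoreboard module, its types, and its function are all hypothetical. The types are deliberately written both with and without parentheses, since both forms are accepted.

```elixir
defmodule Scoreboard do
  # Hypothetical module, for illustration only.

  # Written without parentheses...
  @type player :: String.t()
  # ...and with parentheses. Both forms compile.
  @type score() :: non_neg_integer()

  # t is the conventional name for a module's main type.
  @type t :: %{optional(player) => score}

  # The user-defined types make each parameter's role clear.
  @spec add_points(t, player, score) :: t
  def add_points(board, player, points) do
    Map.update(board, player, points, fn current -> current + points end)
  end
end

board = Scoreboard.add_points(%{}, "ana", 3)
board = Scoreboard.add_points(board, "ana", 2)
IO.inspect(board)
# prints %{"ana" => 5}
```

If the definition of score ever needed to change, only the @type line would have to be edited, and every typespec that mentions it would follow.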

The Other Uses Of Type Specifications

Type specifications actually come from Erlang. The typespec syntax is different in Erlang, but the concepts are the same. This means that Erlang code can have type specifications associated with it, although I don't know how they are shown in the Erlang documentation.

There's a tool called dialyzer that can perform static analysis on Elixir code using the type specifications and can spot possible problems with your code. Apparently, it's able to make pretty good guesses regarding typespecs for functions that don't have one, so it can even be helpful when analyzing code that doesn't have them. I don't know much more about dialyzer than this, but I may cover this tool in the future.
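To show why a static analysis tool is useful here, this sketch has a typespec that deliberately disagrees with its function. The Mismatch module is hypothetical. The code runs without complaint because typespecs are not enforced when the function is called; a mismatch like this is the kind of thing I'd expect a tool such as dialyzer to report, although I haven't shown its output here.

```elixir
defmodule Mismatch do
  # Hypothetical module, for illustration only.

  # The spec promises an integer...
  @spec double(integer) :: integer
  # ...but the function actually returns a string.
  def double(n), do: Integer.to_string(n * 2)
end

# The call still succeeds at runtime and returns a string.
result = Mismatch.double(2)
IO.inspect(result)
# prints "4"
```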

Typespecs can almost certainly help with autocomplete functionality and related features in code editors. In fact, the code editor I typically use appears to be able to use this information. I'll go into more detail on which code editor I use and how I have it set up another time.

I'm far from an expert in typespecs at the moment simply because I haven't yet used them in my own code, but I certainly appreciate them from reading the Elixir documentation. I'm looking forward to making use of them in future projects.