Learn With Me: Elixir - Behaviours (#71)
Finally! I've finally gotten to the point where I've learned about behaviours. As I suspected all along, they are interfaces: an interface for an Elixir module.
What is an Interface?
Before I continue, I'm going to talk about what an interface is. Not all of you may be familiar with the concept or know it by that name. An interface is a concept of a contract that a unit of code (class, module, etc.) can fulfill. An interface is an abstraction that defines the functionality to be implemented, but not how that functionality is implemented.
So instead of a piece of code calling functions in a particular module or class, it can call functions from an interface instead. A particular implementation of that interface can then be passed to the code that is using the interface, but the code won't have any knowledge of any particular implementation. This allows loose coupling between units of code because the dependent code depends only on the interface, not on any particular implementation. This provides a big advantage by allowing different implementations to be swapped without affecting the code that depends on the interface.
When using an interface to provide loose coupling for an e-mail provider, for example, you can easily switch to another e-mail provider without changing the dependent code, as long as you create a new implementation that implements the same e-mail interface. Database providers in certain environments often work the same way: they implement a particular interface that allows you to use the same API to talk to a different database without having to redo all your code. Operating system device drivers operate in a similar manner: the operating system doesn't have to know what a particular driver is doing as long as the driver implements the operating system's standard device interface for whatever type of device the driver can talk to.
A more typical use of interfaces is to allow code to be more testable. In your application, you can provide the "real" implementation of an interface when the code is running normally. Under unit test conditions, however, you can provide a mock implementation that just pretends to do something, allowing you to write unit tests for code without also testing all their dependencies.
Statically-typed languages such as C# and Java provide language support for interfaces, which is necessary, since the compiler needs to bind to an interface at compile time. Interfaces in these languages allow for loose coupling between units of code.
Dynamically-typed languages such as JavaScript and Python tend to use "duck typing", where an object is assumed to have certain functions associated with it; if it doesn't, there's a runtime error. There's less of a need for a formal concept of an interface in these languages since variables aren't statically-typed. So as long as a variable refers to an object that implements certain functions, you already have loose coupling. You can substitute the object with another one that has the same functions associated with it.
Behaviours in Elixir
Although Elixir is a dynamically-typed language, it does have the concept of an interface in the form of a behaviour. Elixir code is always compiled to bytecode for the BEAM virtual machine before it runs, whether the project is compiled in advance or a script is compiled in memory as it's run. The concept of a behaviour provides a formal notion of interfaces so that certain errors can be caught during the compilation process. As we'll see, however, the fact that Elixir is a dynamically-typed language means that some behaviour-related issues may not be caught until runtime.
Elixir uses the British spelling of "behaviour", so that's the spelling I'll be using here when I'm talking about the Elixir concept of a behaviour. I'm used to the American spelling of "behavior", so it wouldn't surprise me if I inadvertently use that spelling as I go on about Elixir behaviours.
I created an Elixir project that defines a behaviour and then has some modules that implement it. You can find this project in the "lwm 71 - Behaviours" folder in the code examples in the Learn With Me: Elixir repository on GitHub. I'll be showing code from this project as I explain behaviours, along with command line output showing the final result.
I created a behaviour for loggers, and here's what it looks like.
defmodule BehaviorExample.Logger do
  @type log_list :: list({level(), String.t()})
  @type level :: (:info | :warning | :error)

  @callback log_message(log_list(), level(), String.t()) :: log_list()
  @callback log_data(log_list(), level(), any()) :: log_list()
end
I defined a module that contains @callback attributes. Each @callback defines a function that must be implemented. Notice that learning about type specifications was very convenient because type specifications are used to define the @callback functions. I'm not sure why @callback is used, since these are just function descriptions. Perhaps these function definitions are also used to define callback functions in other places in Elixir. I'm not sure.
So we have two functions. They both receive a list of log messages that will be added to by the functions. For the sake of simplicity, I'm having them add to a list rather than creating some sort of side effect, like logging to a file. One of the functions receives a log level and a message, returning the list with the new log message added to it. The other function does the same thing with any arbitrary data, logging a string representation of that data. The data can be of any type, which is why the @callback specification contains any().
Note that I made use of @type definitions to make the @callback definitions more meaningful and readable. The first defines log_list() as a list of tuples, where the first element in the tuple is a log level, and the second element in the tuple is a string representing a log message. The second defines level() as an atom that can have three possible values: :info, :warning, or :error.
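To make those shapes concrete, here is a small sketch of a value that fits log_list(). The messages are made up for illustration, and the check at the end is just a hand-written sanity test, not something Elixir does with the @type definitions at runtime.

```elixir
# A value that fits log_list(): a list of {level, message} tuples.
# The messages are invented for illustration.
log_list = [
  {:error, "disk is full"},
  {:info, "application started"}
]

# Hand-written check: every entry pairs one of the three allowed
# levels with a string.
valid? =
  Enum.all?(log_list, fn {level, message} ->
    level in [:info, :warning, :error] and is_binary(message)
  end)

IO.inspect(valid?)
```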
That's our behaviour! It's a contract that other modules can implement. Next, I'm going to create some modules that implement the behaviour.
Implementing a Behaviour
I created two modules that implement the behaviour. One is an upper-case logger: it changes the log message to upper-case characters. The other is a lower-case logger: it changes the log message to lower-case characters. Let's first look at the upper-case logger.
defmodule BehaviorExample.UpperCaseLogger do
  @behaviour BehaviorExample.Logger

  alias BehaviorExample.Logger

  @impl Logger
  @spec log_message(Logger.log_list(), Logger.level(), String.t()) :: Logger.log_list()
  def log_message(log_list, level, message) do
    [{level, String.upcase(message)} | log_list]
  end

  @impl Logger
  @spec log_data(Logger.log_list(), Logger.level(), any()) :: Logger.log_list()
  def log_data(log_list, level, data) do
    [{level, inspect(data)} | log_list]
  end
end
This module is called UpperCaseLogger and we tell Elixir that it is implementing a behaviour by using the @behaviour module attribute. The line @behaviour BehaviorExample.Logger indicates that it is implementing BehaviorExample.Logger. The compiler will check to see that it actually does implement the functions in that behaviour and give us warnings if it does not. If the behaviour functions are not all implemented, the compiler will still compile and let the code run, but it warns us because not fully implementing a behaviour could lead to trouble at runtime. This is unlike languages such as C# and Java, where not implementing all the methods in an interface will lead to an outright compile error. Elixir is a dynamically-typed language, and it can still run even with an improperly-implemented behaviour.
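Here is a minimal sketch of that warning-but-still-runs situation. The WarningSketch modules are hypothetical stand-ins, not part of the example project, and the callback types are simplified. Running it as a script should produce a compiler warning about the missing callback (the exact wording varies between Elixir versions, so I won't reproduce it), and then the rest of the script carries on.

```elixir
# Hypothetical, simplified behaviour used only for this sketch.
defmodule WarningSketch.Logger do
  @callback log_message(list(), atom(), String.t()) :: list()
  @callback log_data(list(), atom(), any()) :: list()
end

defmodule WarningSketch.HalfLogger do
  @behaviour WarningSketch.Logger

  # log_data/3 is deliberately missing, so the compiler warns about the
  # unimplemented callback but still produces the module.
  def log_message(log_list, level, message) do
    [{level, message} | log_list]
  end
end

# The callback that was implemented works as usual.
IO.inspect(WarningSketch.HalfLogger.log_message([], :info, "hello"))

# The missing one is simply not there; calling it would fail at runtime.
IO.inspect(function_exported?(WarningSketch.HalfLogger, :log_data, 3))
```

If a project is built with warnings treated as errors, this warning would stop the build, which gets you closer to what C# and Java do.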
The UpperCaseLogger module then implements the behaviour functions. The log_message/3 function converts the message to an upper-case string and adds it to the list to be returned. The log_data/3 function converts the data to a string by using the inspect/1 function, which can create a string representation of any data, and adds that string to the list.
The @impl attribute above each function indicates that the function implements a @callback function in a behaviour. You can just type @impl true if you want to, but it's clearer if you specify which behaviour the implementation applies to.
The @impl attribute is not actually required. You can remove the @impl attribute, and nothing bad will happen. However, specifying an @impl attribute provides some advantages. It documents which functions are behaviour implementations, which helps with maintenance and reading the code. It also disables documentation for that function, since the real documentation is associated with the @callback definition in the behaviour. Finally, it helps static analysis tools like dialyzer understand your code better and make more helpful suggestions.
If you specify the @impl attribute for one behaviour-based function, you will have to specify it for all @callback implementations in a module or Elixir will complain. Likewise, if you specify the @impl attribute for functions that are not @callback implementations, Elixir will complain.
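The @impl rules above can be sketched like this. The ImplSketch modules and the count/1 helper are hypothetical names I made up for the illustration, with a simplified single-callback behaviour.

```elixir
# Hypothetical, simplified behaviour used only for this sketch.
defmodule ImplSketch.Logger do
  @callback log_message(list(), atom(), String.t()) :: list()
end

defmodule ImplSketch.PlainLogger do
  @behaviour ImplSketch.Logger

  # `@impl true` is the short form; `@impl ImplSketch.Logger` would name
  # the behaviour explicitly.
  @impl true
  def log_message(log_list, level, message) do
    [{level, message} | log_list]
  end

  # Not a callback, so it carries no @impl. Putting @impl above this
  # function would make the compiler warn, since no behaviour declares it.
  def count(log_list), do: length(log_list)
end

log_list = ImplSketch.PlainLogger.log_message([], :warning, "low battery")
IO.inspect(ImplSketch.PlainLogger.count(log_list))
```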
Let's look at the lower-case logger now. It's very similar to the upper-case logger.
defmodule BehaviorExample.LowerCaseLogger do
  @behaviour BehaviorExample.Logger

  alias BehaviorExample.Logger

  @impl Logger
  @spec log_message(Logger.log_list(), Logger.level(), String.t()) :: Logger.log_list()
  def log_message(log_list, level, message) do
    [{level, String.downcase(message)} | log_list]
  end

  @impl Logger
  @spec log_data(Logger.log_list(), Logger.level(), any()) :: Logger.log_list()
  def log_data(log_list, level, data) do
    [{level, inspect(data)} | log_list]
  end
end
Note that the module that implements a behaviour doesn't have to only implement the functions in the behaviour. It can implement multiple behaviours and contain other functions that are unrelated to any behaviour.
This is as far as the behaviour functionality of Elixir goes. Unlike statically-typed languages, there is no behaviour type associated with variables. It's just up to the developer to use a module that implements a behaviour when that is expected. If you use a different module, that error will be caught at runtime rather than by the compiler. So duck typing could be used in Elixir as well. As long as the module contains the correct functions that will be called, then there will be no errors, even if it does not explicitly declare that it's implementing a behaviour. Elixir just gives us the concept of a behaviour to help us out a bit in organizing our code and to provide warnings when modules don't fully implement a behaviour. You should still use behaviours rather than duck typing because that helps developers understand the code and understand that certain modules need to implement certain functions.
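To illustrate the duck typing point, here is a sketch with one module that declares a behaviour and one that doesn't. All the DuckSketch names are hypothetical. The declares?/2 helper is my own assumption about how you could check for a declared behaviour yourself at runtime, by reading the module attributes that the compiler stores; nothing in Elixir does this check for you when you call a function.

```elixir
# Hypothetical, simplified behaviour used only for this sketch.
defmodule DuckSketch.Logger do
  @callback log_message(list(), atom(), String.t()) :: list()
end

defmodule DuckSketch.DeclaredLogger do
  @behaviour DuckSketch.Logger

  @impl true
  def log_message(log_list, level, message), do: [{level, message} | log_list]
end

# No @behaviour here, but the function has the same name and arity.
defmodule DuckSketch.UndeclaredLogger do
  def log_message(log_list, level, message), do: [{level, message} | log_list]
end

defmodule DuckSketch.Check do
  # Returns true when the module declared the behaviour with @behaviour.
  def declares?(module, behaviour) do
    module.module_info(:attributes)
    |> Keyword.get_values(:behaviour)
    |> List.flatten()
    |> Enum.member?(behaviour)
  end
end

# Both modules can be called in exactly the same way.
for logger <- [DuckSketch.DeclaredLogger, DuckSketch.UndeclaredLogger] do
  IO.inspect(logger.log_message([], :info, "quack"))
end

IO.inspect(DuckSketch.Check.declares?(DuckSketch.DeclaredLogger, DuckSketch.Logger))
IO.inspect(DuckSketch.Check.declares?(DuckSketch.UndeclaredLogger, DuckSketch.Logger))
```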
Using the Loggers
Now that we have a logger behaviour and logger implementations, let's make use of them. I wrote a program that allows the user to type in messages to be logged and adds them to a list of log messages. Then it logs some data and prints out the final list of log messages. When a message is logged, a random logger module is chosen: the upper-case logger or the lower-case logger. The end result will be a random selection of upper-case and lower-case messages.
Here's what the topmost functions look like in lib/cli.ex.
def main(_argv) do
  process()
end

@doc """
Runs the program logic
"""
@spec process() :: :ok
def process() do
  log_list = fill_log_list([])
  logger = get_random_logger()
  log_list = logger.log_data(log_list, get_random_log_level(),
    %{name: "Zim", occupation: :invader})
  print_log_list(log_list)
end
The main/1 function is the main entry point. It ignores any command line arguments and just makes a call to process/0 to do the actual work. First it makes a call to fill_log_list/1 to prompt the user to enter some log messages. Then it logs some data and prints out the final log list.
Whenever a message is logged, a random logger and a random log level are used. Here are the implementations of the get_random_log_level/0 and get_random_logger/0 functions.
# Returns a random log level
defp get_random_log_level() do
  Enum.random([:info, :warning, :error])
end

# Returns a random logger implementation
defp get_random_logger() do
  Enum.random([BehaviorExample.LowerCaseLogger, BehaviorExample.UpperCaseLogger])
end
The get_random_logger/0 function will return a random module. In Elixir we can treat modules like any other data and assign them to variables. This allows us to call a function on a module without knowing exactly which module it is.
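Here's a tiny sketch of treating a module as data, using the standard String module so it stands on its own. A module name is really just an atom, which is why it can sit in a variable. The variable names are mine.

```elixir
# A module name is an atom, so it can be stored in a variable.
module = String
IO.inspect(is_atom(module))

# Calling a function through the variable dispatches to whichever
# module the variable holds at runtime.
shouted = module.upcase("zim")

# apply/3 does the same thing when the function name is also data.
also_shouted = apply(module, :upcase, ["zim"])

IO.inspect({shouted, also_shouted})
```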
Now here's the implementation of fill_log_list/1, which gathers log messages from the user.
def fill_log_list(log_list) do
  # Get what the user types in and handle it
  IO.gets("> ") |> String.trim() |> handle_message(log_list)
end
I used IO.gets/1 to display a prompt and retrieve the string that the user entered. Once the user has entered a log message, the message is passed to String.trim/1, which removes leading and trailing whitespace, including the newline at the end of the string (since the user pressed the Enter key to indicate that they had typed in the entire message). From there, it gets passed to handle_message/2, which takes the log message and the log list and logs the message.
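One thing the pipeline above doesn't handle is that IO.gets/1 can return :eof or an error tuple instead of a string (for example, when input is piped in and runs out), and String.trim/1 would crash on those. Here's a sketch of a hypothetical clean/1 helper, not in the project, that normalizes all three cases. Treating end-of-input as an empty message is my assumption, chosen because an empty message is what ends the loop.

```elixir
defmodule InputSketch do
  # Normalizes what IO.gets/1 can return into a plain string.
  # End of input and read errors are treated as an empty message
  # (an assumption made for this sketch).
  def clean(:eof), do: ""
  def clean({:error, _reason}), do: ""
  def clean(line) when is_binary(line), do: String.trim(line)
end

# Leading and trailing whitespace, including the newline, is removed.
IO.inspect(InputSketch.clean("  My hovercraft is full of eels\n"))
IO.inspect(InputSketch.clean(:eof))
```

With this in place, the pipeline could read IO.gets("> ") |> InputSketch.clean() |> handle_message(log_list).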
Here is the implementation of handle_message/2.
# Handles the log message entered by the user
defp handle_message("", log_list), do: log_list

defp handle_message(message, log_list) do
  # Get a random logger
  logger = get_random_logger()

  # Log the message to the log list
  log_list = logger.log_message(log_list, get_random_log_level(), message)

  # Go back for more
  fill_log_list(log_list)
end
The handle_message/2 function has two clauses. One handles the case where the user has entered something other than whitespace. The code gets a random logger and logs the message with a random log level. Then it does something interesting: it calls fill_log_list/1 again. This call goes back to the calling function and repeats the message entry process, forming a loop. Whereas other languages would use a while loop to achieve this, Elixir does not have a while loop or a similar imperative loop construct. Instead, it can make a recursive call.
This sort of thing feels wrong to me because in other languages I've used it could conceivably cause the call stack to get rather large. However, as I discussed when learning about recursion in Elixir, since this function call is the very last thing this function does, tail call optimization kicks in and instead of using up stack space, Elixir optimizes it to a loop in the final running code. So making recursive calls like this to implement looping functionality is quite normal in Elixir.
The other handle_message/2 function clause is called when the user doesn't enter anything other than whitespace, usually just pressing the Enter key instead of typing in anything. This causes the function to stop the looping behavior and just return the current log list.
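The same loop-by-recursion pattern can be sketched without any user interaction. In this hypothetical LoopSketch module, a list of strings stands in for what the user would type, and every message is logged at :info to keep things deterministic. One clause stops on an empty message, and the other records the message and makes a tail call.

```elixir
defmodule LoopSketch do
  # An empty message ends the loop and returns what was collected.
  def collect(["" | _rest], log_list), do: log_list

  # Running out of input ends it too.
  def collect([], log_list), do: log_list

  # Otherwise record the message and recurse. The recursive call is the
  # last thing this clause does, so it is a tail call.
  def collect([message | rest], log_list) do
    collect(rest, [{:info, message} | log_list])
  end
end

inputs = ["first", "second", "", "never reached"]
IO.inspect(LoopSketch.collect(inputs, []))
```

Because new entries are added to the front of the list, the result comes back newest-first, which is why the real program reverses the list before printing it.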
Finally, here's the code for the printing functionality that prints out the log list to the console so that we can see what the end result is.
@doc """
Prints a log list to the console
"""
def print_log_list(log_list) do
  print_separator()

  log_list
  |> Enum.reverse()
  |> Enum.each(&print_log_entry/1)
end

# Prints a separator to separate content
defp print_separator(), do: IO.puts("----------------------")

# Displays a log entry on its own line in the console
defp print_log_entry(log_entry) do
  log_entry
  |> get_printable_log_entry()
  |> IO.puts()
end

# Returns a printable string that corresponds to the log entry
defp get_printable_log_entry(log_entry) do
  log_level = get_printable_log_level(log_entry)
  log_message = get_printable_log_message(log_entry)
  "#{log_level} - #{log_message}"
end

# Returns a printable string that corresponds to the log level
defp get_printable_log_level(log_entry) do
  "#{inspect(elem(log_entry, 0))}"
end

# Returns a printable string that corresponds to the log message
defp get_printable_log_message(log_entry) do
  "\"#{elem(log_entry, 1)} \""
end
I'll leave it to you to figure out how it works. It prints each log entry in the log list to the screen with the log level and message being extracted from the tuple that contains them.
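As a hint, the core of it is pulling the two elements out of each tuple. Here's an alternative sketch of that step that destructures the tuple in the function head instead of calling elem/2. The PrintSketch module and format_entry/1 are hypothetical names, and unlike the project code, this version doesn't put a space before the closing quote.

```elixir
defmodule PrintSketch do
  # Destructures the {level, message} tuple directly in the function head.
  def format_entry({level, message}) do
    "#{inspect(level)} - \"#{message}\""
  end
end

# Entries are stored newest-first, so reverse them before printing.
[{:warning, "ANOTHER MESSAGE"}, {:error, "this is a message"}]
|> Enum.reverse()
|> Enum.map(&PrintSketch.format_entry/1)
|> Enum.each(&IO.puts/1)
```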
Here's the console output from the program being run. I typed in a variety of messages and then it displayed what was logged.
> ./behavior_example
> This is a message
> Another message
> The widget has been unsufferable
> My flowers are windy
> My hovercraft is full of eels
>
----------------------
:error - "this is a message "
:warning - "ANOTHER MESSAGE "
:info - "the widget has been unsufferable "
:warning - "MY FLOWERS ARE WINDY "
:warning - "MY HOVERCRAFT IS FULL OF EELS "
:warning - "%{name: "Zim", occupation: :invader} "
You can see how the messages were randomly converted to lower- or upper-case text depending on which log module was used to do the logging. The very last warning was not something I typed in, but is an example of doing data logging that was done as part of the application code.
Conclusion
Behaviours are a bit weird to me. Elixir is a dynamically-typed language, but it has a formal definition of an interface for modules. The compiler checks that modules that claim to implement a behaviour actually do so, and produces a warning if they don't. However, that's as far as it goes.
Unlike statically-typed languages, you can't indicate that the module you want to use must be a module that implements a behaviour. You just use the module and if the function you expect to be there is there, it will get called regardless of whether the behaviour is implemented. So it's like duck typing in that respect.
It's something between what JavaScript does and what C# does, but as a dynamically-typed language, it's closer to JavaScript than to C#. Note that since Elixir is a dynamic language, there's nothing preventing you from just using duck typing and ignoring behaviours.
I think the main point of behaviours is defining a contract that provides guidance to developers, who have a clearly-defined contract to implement instead of having to figure out what functions need to be implemented, as they would with duck typing. Behaviours are also pretty useful to have in the documentation so that a developer knows that they can count on certain functions being in a module. A static code analysis tool like dialyzer can also make use of the behaviour construct to catch possible issues.
Remember how I've been saying that behaviours resemble interfaces and protocols also resemble interfaces and I've been unclear what the difference is? Well, in the next post I'm going to cover what I've learned about protocols.