Learn With Me: Elixir - Functional Programming (#3)

Since Elixir is a functional language, it's best to become familiar with how functional programming works. Although I have used features in JavaScript and C# in a functional manner before, this is my first functional language. So I'm going to write what I know here, but I'm almost certain there's more to functional programming than what I will cover.

I want to give you at least some understanding of the nature of functional programming from my current viewpoint so that you will understand Elixir better as we go through it. I have no doubt that we will both be learning more about the nature of functional programming as we go along.

Data vs Functions

In object-oriented languages, data is typically combined with the functions that operate on it. In a functional language, the data and functions are strictly separated. Functions live on their own, although related functions are often grouped together into modules, and they are passed data as parameters and return data to the caller. Functions are regarded as data transformers.
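Here is a minimal sketch of what that separation might look like in Elixir. The Account module, its functions, and the sample data are all made up for illustration. The data is a plain map, and the functions that work on it live in a module, taking data in and handing data back.

```elixir
# Hypothetical module for illustration: the data is a plain map,
# and the functions that operate on it live separately in a module.
defmodule Account do
  # Takes account data and an amount, returns new account data.
  def deposit(account, amount) do
    %{account | balance: account.balance + amount}
  end

  # Takes account data, returns a description string.
  def describe(account) do
    "#{account.owner} has a balance of #{account.balance}"
  end
end

account = %{owner: "Ada", balance: 100}
updated = Account.deposit(account, 50)

IO.puts(Account.describe(updated))
```

Nothing in the map knows about Account, and Account holds no data of its own. It only transforms what it is given.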

Messy Side-Effects

A side-effect is where a function affects something outside of itself. Functions with side-effects will typically do things like modify shared data that wasn't passed to them, something which is normal for object-oriented code. Functions with side effects can also affect the environment by reading input data, producing output data, sending data over the network, drawing on the screen, etc. These are all effects which leak beyond the bounds of the function.

Functions without side effects take in a set of parameters, perform calculations based solely on that data, and return a value. Such functions are called pure functions and can exist completely independently. Pure functions are the ideal in functional programming. Not only do they make your code much more understandable and help you avoid certain types of bugs (how did this data get modified and where did that happen?), but they also make your functions easy to test. If a function does nothing but return a value based only on the input parameters, testing will be a lot simpler than if you have to figure out how the environment affects it and how it affects the environment.
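As a small sketch of the difference, here is a hypothetical Greeter module (the names are my own). The greeting function is pure because its result depends only on the hour passed in. The greeting_now function is not, because it reads the system clock, so the same call can give different answers at different times.

```elixir
defmodule Greeter do
  # Pure: the result depends only on the `hour` argument.
  def greeting(hour) when hour < 12, do: "Good morning"
  def greeting(hour) when hour < 18, do: "Good afternoon"
  def greeting(_hour), do: "Good evening"

  # Not pure: reads the system clock, so the result can differ between calls.
  def greeting_now do
    greeting(DateTime.utc_now().hour)
  end
end

IO.puts(Greeter.greeting(9))
IO.puts(Greeter.greeting(20))
```

Testing greeting only requires passing in an hour and checking the string that comes back. Testing greeting_now properly would mean controlling the clock somehow.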

Now clearly a program that has no side effects would be pretty much useless. Programs need to read input from elsewhere, draw on the screen, communicate with other programs, and generally interact with their environment. So you need to have at least some functions with side-effects. Functional programming practice recognizes this and recommends isolating those side effects in special functions, where each function is devoted to a side effect and nothing else.

Do you need to calculate a value and write it to the screen? That can be done, but don't do it in the same function. Have a pure function that receives the input, does the calculation, and returns the result, and another function that takes a piece of data, displays it on the screen, and does nothing else. Isolating those side effects will help make functional code much more testable and easier to read and maintain. There's no guessing in which function values get written to the screen. It's all done in functions that specialize in doing that.
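A rough sketch of that split, using a made-up Invoice module and made-up line items: one function calculates, the other displays.

```elixir
defmodule Invoice do
  # Pure: calculates a total from the line items and returns it.
  def total(items) do
    Enum.reduce(items, 0, fn item, acc -> acc + item.price * item.quantity end)
  end

  # Side effect only: writes an already-calculated value to the screen.
  def print_total(total) do
    IO.puts("Total: #{total}")
  end
end

items = [%{price: 5, quantity: 2}, %{price: 3, quantity: 1}]

total = Invoice.total(items)
Invoice.print_total(total)
```

The calculation can be tested without capturing any screen output, and the only place that writes to the screen is print_total.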

First-Class Functions

Functional programming needs to have functions that are first-class citizens of their language. Functions should be able to be passed around just like any data can. They should be able to be assigned to variables, passed to other functions as parameters, and passed back from functions as return values.
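In Elixir, as far as I understand it so far, this looks something like the sketch below. The variable names are just for illustration. Anonymous functions are created with fn and called with a dot before the parentheses.

```elixir
# Assign an anonymous function to a variable.
double = fn x -> x * 2 end

# Pass a function to another function as a parameter.
doubled = Enum.map([1, 2, 3], double)

# Return a function from a function (`make_adder` is a hypothetical name).
make_adder = fn n -> fn x -> x + n end end
add_five = make_adder.(5)

IO.inspect(double.(4))
IO.inspect(doubled)
IO.inspect(add_five.(10))
```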

Some imperative languages like JavaScript can do this. It's possible to do some functional programming in a language like JavaScript, although the language was not built around that concept. It's much harder in languages like C and in early versions of C++ and Java, where functions can't be passed around like ordinary values without workarounds such as function pointers or single-method objects. Early versions of C# had delegates, but without lambdas they were verbose to use this way.

Functions are Small and Composable

Functional programming emphasizes functions that are small, making them easy to understand and easy to test. Those functions can be combined in various ways to make other functions, and are often (but not always) reusable in a variety of situations. Any of you familiar with Unix will note the similarity between this and the Unix philosophy, which is to have a lot of small, specialized tools that can be combined in various different ways to build something more sophisticated.

It's common in functional languages to set up a data transformation pipeline, where a set of data goes through a set of simple transformation functions, with the result of the previous transformation function being passed to the next function. Applying multiple simple functions can often achieve the same thing as a more complex function, and they're much easier to test and understand.
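Elixir has a pipe operator, |>, for exactly this: it passes the result on its left as the first argument to the function on its right. Here is a small sketch with a made-up input string, where each step is a simple function from Elixir's String or Enum module.

```elixir
# Each step receives the result of the previous step as its first argument.
words =
  "  The quick brown fox  "
  |> String.trim()
  |> String.downcase()
  |> String.split(" ")
  |> Enum.map(&String.capitalize/1)

IO.inspect(words)
```

Without the pipe, the same thing would be written inside-out as nested calls, which is a lot harder to read.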

Immutable Data

Ideally, data in a functional language is immutable. This solves a lot of programming issues: not only issues related to the sharing of mutable state in concurrent programming, but also avoiding issues where data is modified unexpectedly and you have to go tracing through the code trying to figure out how the data got into this state. You'll never truly appreciate the advantages of immutable data until you've written multi-threaded code that shares a mutable state and had to deal with the issues that arise from that.

Sharing mutable data typically involves locks to control access to one thread at a time, which can result in performance hits when one thread is waiting for another thread to finish. That isn't so bad at 2 or 3 threads, but when you scale up to hundreds of threads, that will kill performance and limit scalability. Elixir, ever optimized toward concurrency and scaling, sidesteps the issue entirely by making all data immutable.

Functional languages don't modify data: they transform it into a new set of data. Many functions in a functional language are dedicated to transforming data. This fits right in with the concept of immutable data.
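A quick sketch of what that means in practice, with made-up data: functions like Map.put and Enum.map hand back new data and leave the original alone.

```elixir
original = %{name: "widget", count: 1}

# Map.put/3 does not change `original`; it returns a new map.
updated = Map.put(original, :count, 2)

numbers = [1, 2, 3]

# Enum.map/2 likewise returns a new list and leaves `numbers` as it was.
squares = Enum.map(numbers, fn n -> n * n end)

IO.inspect(original)
IO.inspect(updated)
IO.inspect(numbers)
IO.inspect(squares)
```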

The advantage of immutable data is that it makes programming simpler and less error prone. The disadvantage is that it tends to be less efficient than changing mutable data. Modifying immutable data often involves some copying of the original data, but some surprising optimizations can be made if the compiler knows that the data is immutable.

Data can be shared between different data structures: since it's never modified, it can be reused without copying. One such data structure is a "trie" (sometimes pronounced "tree" and sometimes pronounced "try"). It allows large immutable data structures such as an array to be modified with minimal copying. After modification, most of the data is shared between the two data structures.
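A trie is more than a short snippet can show, but the same idea of sharing shows up in a much simpler structure, the linked list. The sketch below prepends to a list. The sharing itself happens inside the runtime and isn't visible from the code, so treat the comments as my understanding rather than something the code proves.

```elixir
list = [2, 3, 4]

# Prepending builds one new cell whose tail is the existing list,
# so nothing in `list` needs to be copied or changed.
longer = [1 | list]

IO.inspect(list)
IO.inspect(longer)

# The tail of the new list is equal to the old list.
IO.inspect(tl(longer) == list)
```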

For a really interesting explanation of tries and immutable data structures, check out this video of Anjana Vakil's presentation on immutable data structures at JSConf. That was really mind-opening for me. There are a lot of other interesting videos out there explaining tries as well.

Functional languages use these sorts of optimizations in interesting ways so that modifying an immutable data structure to get another one is a lot more efficient than it appears. A non-functional language like JavaScript does not natively have any immutable data structures optimized for functional programming, but libraries like Immutable.js, which implements immutable data structures for JavaScript, are available.

So although copying entire structures of immutable data in Elixir is much more efficient than it sounds to someone unfamiliar with the implementations of such structures, it will never be quite as efficient as just modifying a mutable data structure. Elixir happily takes that efficiency penalty to enable scalable concurrency. For Elixir, a minor performance hit to a particular piece of code is preferable to the major performance hit on an entire system as it attempts to scale up.

Higher-Order Functions

A higher-order function is one that can accept a function as a parameter or return a function. Higher-order functions are very common in functional programming. A very typical use is to pass a transformation function to another function, which will apply that transformation to every element in a collection. The Array.map function in JavaScript and the IEnumerable.Select method in C# are examples of higher-order functions, since they apply a function parameter to every element in the collection.
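To make both halves of that definition concrete, here is a sketch with a hypothetical Transform module. The apply_to_all function is a hand-rolled stand-in for what Elixir's Enum.map already does, written out only to show a function being accepted as a parameter. The multiplier function shows a function being returned.

```elixir
defmodule Transform do
  # Accepts a function as a parameter and applies it to every element.
  # (A hand-rolled sketch for illustration; Enum.map/2 does the same job.)
  def apply_to_all([], _fun), do: []
  def apply_to_all([head | tail], fun), do: [fun.(head) | apply_to_all(tail, fun)]

  # Returns a function that multiplies its argument by `factor`.
  def multiplier(factor), do: fn x -> x * factor end
end

triple = Transform.multiplier(3)

IO.inspect(Transform.apply_to_all([1, 2, 3], triple))
IO.inspect(Enum.map([1, 2, 3], triple))
```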

Functional vs Procedural

Functional languages and procedural languages (like C and Pascal) both tend to separate data and functions, but there is a big difference in philosophy. Functions usually aren't first-class citizens in procedural languages and procedural programs have much more of an imperative style. Functional programs tend to be more declarative. Procedural programs emphasize how something is to be done and functional programs tend to emphasize what needs to be done. The declarative functional mindset is one I have not yet mastered, but I hope that I'll get a better grasp on it as I move along through Elixir.
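Here is my attempt at showing the difference, with the caveat that I'm still learning this mindset. Both functions in this made-up SumSquares module add up the squares of the even numbers in a list. The first spells out how to walk the list and keep a running total. The second reads more like a description of what is wanted.

```elixir
defmodule SumSquares do
  # "How": walk the list one element at a time, carrying a running total.
  def step_by_step(list), do: step_by_step(list, 0)

  defp step_by_step([], total), do: total

  defp step_by_step([head | tail], total) do
    if rem(head, 2) == 0 do
      step_by_step(tail, total + head * head)
    else
      step_by_step(tail, total)
    end
  end

  # "What": keep the even numbers, square them, add them up.
  def declarative(list) do
    list
    |> Enum.filter(fn n -> rem(n, 2) == 0 end)
    |> Enum.map(fn n -> n * n end)
    |> Enum.sum()
  end
end

IO.inspect(SumSquares.step_by_step([1, 2, 3, 4, 5, 6]))
IO.inspect(SumSquares.declarative([1, 2, 3, 4, 5, 6]))
```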

Procedural languages don't usually offer features like immutable data, higher-order functions, and easy composition. Functional languages emphasize function consistency (output is always the same for every set of input) and a lack of side effects.

Languages can be flexible though. C could probably be tortured into doing functional-like programming with function pointers and a lot of specialized libraries, and JavaScript is so flexible that it can do a lot of that stuff as well, particularly with the help of libraries. Languages can always be bent to do something they weren't designed to do, but they are always nicer to use when you use them for what they were designed for.

Functional languages, while oriented toward functional programming, won't prevent you from abusing them. I imagine that an imperative programmer could still manage to write more imperative-style code with a functional language and not use the language as it was intended. This is something that I dread doing, so I hope to better learn the functional style of programming as I learn Elixir.

The solution to this particular problem is to pay attention to existing Elixir code and practice, practice, practice. I'll never master Elixir and functional programming by doing nothing but reading books and watching videos. I'll have to actually do a lot of programming in Elixir.

Mastering Functional Programming

I have a feeling that I have a long way to go before I will have mastered functional programming. It's a different way of thinking about things and I suspect that I'll be making some mistakes as I go along. Many of the familiar patterns of programming that have worked for me in the past will likely be either useless or misleading in a functional programming environment.

Kevin Peter
