Learn With Me: Elixir - Recursive Functions (#27)

Functional languages achieve repetition via recursive calls. If you're not familiar with recursion, I recommend reading up on the topic, or you will get really lost here. Understanding recursion is key to understanding functional programming. I'll just summarize recursion briefly for those of you who may need a little refresher.

About Recursion

A recursive function is one that calls itself. A recursive function will typically call itself using a slightly smaller problem set and will stop calling itself once it has achieved the smallest possible problem set, where the solution is usually trivial. Then on the way back up the call stack, each function call context takes the result of the previous call (the smaller problem set) and uses that to solve its problem set.

In order to prevent infinite recursion, which is the equivalent of an infinite loop, the recursive function must stop calling itself under certain conditions. This often known as the "base case". The base case stops the chain of recursive calls. When the function calls itself with a problem set of a smaller size, this is known as the "recursive case".

To make this more clear, let's consider an example. Let's say we want to find the biggest number in a collection of integers. When looping, we'd iterate over the collection, keeping track of the biggest number we've found so far. At the end of the loop, the biggest number we've found is the biggest number in the collection

With recursion we'd call a function with the entire problem set: the collection of integers. That function would call itself passing the collection minus one element (the smaller problem set). That is the recursive case in action. This would continue until the size of the collection is a single integer, and the problem set cannot be reduced any more. This is the base case.

What's the biggest number solution for a collection with a single element? That number, of course! It's trivial to find the biggest number in the smallest possible collection of numbers.

On the way back up the callstack, each recursive call would complete. The code would look at the biggest element from the recursive call (the biggest number in the smaller problem set), compare it to the element that was left out of the recursive call, and then pass back the biggest value. Each operation is quite simple, and at the end, the largest value would bubble to the top.

Here's a breakdown of finding the biggest number in list [1, 2, 3, 4, 5]. Let's say the recursive function is called "biggest_value"

  1. Call biggest_value with the list [8, 12, -4, 6, 2]
  2. biggest_value recursively calls itself with the list [12, -4, 6, 2]
  3. biggest_value recursively calls itself with the list [-4, 6, 2]
  4. biggest_value recursively calls itself with the list [6, 2]
  5. biggest_value recursively calls itself with the list [2]
  6. The base case is triggered. The biggest number [2] is 2, so the function returns 2 to its caller
  7. The function compares its current number (6) to the result of the recursive function call, which is 2. 6 is bigger than 2 so the function returns 6 to its caller
  8. The function compares its current number (-4) to the result of the recusive function call, which is 6. 6 is bigger than -4, so the function returns 6 to its caller.
  9. The function compares its current number (12) to the result of the recursive function call, which is 12. 12 is bigger than 6, so the function returns 12 to its caller.
  10. The function compares its current number (8) to the result of the recursive function call, which is 12. 12 is bigger than 8, so the function returns 12 to its caller.
  11. The chain of recursive function calls ends and the biggest number in the list is 12.

This is a fine example of how recursion is used to solve problems in place of a loop.

Recursion in Javascript

Here's what this recursive algorithm looks like in Javascript

//This function assumes that numbers.length > 0
function biggestValue(numbers) {
	//Base Case
	if(numbers.length == 1) {
		return numbers[0];
	}
	//Recursive Case
	else
	{
		return Math.max(numbers[0], biggestValue(numbers.slice(1)));
	}
}			

The array in Javascript has a convenient slice() function to make things easier. However, since the slice() function copies the subset of an array, this is rather inefficient. It would be more efficient to pass the original array with index values indicating which part of the array we are considering. We'll do that it the C# example, since the C# List class has no built-in method to create a subset of the collection.

Recursion in C#

Here's the same algorithm implemented in C#.

//This method assumes that numbers.Count > 0
public int BiggestValue(List<int> numbers)
{			
	return BiggestInSubset(numbers, 0, numbers.length - 1)
}

//This method assumes that numbers.Count > 0
private int BiggestInSubset(List<int> numbers, int beginIndex, int endIndex)
{
	//Base Case
	if(beginIndex == endIndex)
	{
		return number[beginIndex];
	}
	//Recursive Case
	else
	{
		return Math.Max(number[beginIndex], BiggestInSubset(numbers, beginIndex + 1, endIndex));
	}
}

Other Recursive Strategies

There's also the divide-and-conquer approach to solving a problem using recursion where we chop the problem set in half in each recursive call, make two recursive calls for each half, and then merge the results together. This is a common approach in sorting algorithms.

A binary search of an ordered tree can be done recursively using this method as well, although we would only have to make the call for one of the halves, since the ordered nature of the tree allows us to already know which half the value being searched for is located in.

Recursive Functions in Elixir

In Elixir, recursive functions are used instead of loops, which is typical for a functional language. In fact, Elixir has no loops at all!

Let's look at an example of a recursive function in Elixir. We'll implement the same biggest number algorithm implemented in the C# and Javascript code examples above.

You can find the functions I show here in a module in the code examples in the Learn With Me: Elixir repository on Github. They are in the recursive_examples.exs file under the "lwm 27 - Recursive Functions" folder.

def biggest_number([number | []]) do
	number
end
def biggest_number([number | tail]) do
	max(number, biggest_number(tail))
end

Instead of using an if-else, which is how we would create a recursive function in most languages, we use multiple function clauses with pattern matching. The first clause matches a list with a single element, and handles the base case. When a list has only one element, then the solution is trivial: that element is the biggest number in that list.

The second clause matches all other lists containing elements and handles the recursive case. It does a recursive call, passing in the tail of the list. Once it has the biggest number in the tail, it uses the max/2 function to compare the biggest number in the tail to the current element, returning whichever number is larger.

The only case we don't handle here is an empty list. Empty lists have no numbers, and no biggest number, so I don't even handle those.

Here's the function being used in IEx

iex> RecursiveExamples.biggest_number([3, 14, 13, 19, 5, -10, 0, 12])
19
iex> RecursiveExamples.biggest_number([-43, -66, -2, 0, -1])
0
iex> RecursiveExamples.biggest_number([2, 4])
4
iex> RecursiveExamples.biggest_number([8])
8
iex> RecursiveExamples.biggest_number([])
** (FunctionClauseError) no function clause matching in RecursiveExamples.biggest_number/1

    The following arguments were given to RecursiveExamples.biggest_number/1:

        # 1
        []

    examples/lwm 27/recursive_examples.exs:2: RecursiveExamples.biggest_number/1

Attempting to call the function with an empty list results in an error because the empty list does not match any of the patterns on the function clauses.

Recursive functions like this are common in Elixir: far more common than recursive functions in imperative languages.

Recursion and the Call Stack

Whenever a function call is made in about every language and computing model I've ever heard of, some information about the function call is pushed onto the call stack. This is so the processor knows where to pick up after the function completes and returns, and it also saves the state of a function in the call stack with all the local variables. This is often called a "stack frame". A stack frame is pushed on to the stack when a function is called and a stack frame is popped off of the stack when the function returns. Computers have been implementing functions this way since the dawn of computers.

In all languages I've worked with so far, stack space is limited, and if you create a recursive function, it can cause the process to run out of stack space if it works with so much data that all the stack frames added onto the stack after each recursive call will actually overflow the available stack space. This is space typically about 1MB - 8MB, depending on the operating system. That's a lot of stack space, and there's room for a call stack many thousands of calls deep, but it can run out if you're not careful. I'm always careful when I implement a recursive function in most languages, ensuring it doesn't have to deal with a huge amount of data.

The stack space limitation is a big reason recursive functions aren't as common in imperative languages. When I saw how important recursion is to Elixir, my thoughts immediately went to stack space, which is the major factor limiting recursion.

I suspected that the stack space is generous in Elixir, but I didn't know the limitations, since I had been unable to find that information. I did know that the stack size cannot be infinite, since computers do not have infinite resources, so enough recursion would certainly bring Elixir to its knees as well.

In order to get an idea of the stack size in Elixir, I ran the following code .

defmodule ExampleModule do		
	def do_something(param) do
		do_something(param) + 1
	end
end

Then I ran ExampleModule.do_something(1) in IEx.

This is an example of infinite recursion, since it calls itself recursively indefinitely. There's no base case here, just a recursive case. I also added an addition operation so that it would have to save the function state on the call stack. Otherwise, it could see that the parameter never changes and not even bother with saving it on the call stack.

I expected this call to shortly fail with some sort of stack overflow error, which is what typically happens in other languages I've used. Instead, absolutely nothing happened. It just sat there, running and running. After a few minutes I noticed that my computer was starting to slow down and became less responsive.

Curious about what was happening, I looked at how much memory and processing power the Erlang VM was using. It turned out that the Erlang VM wasn't using that much processing power, but it was currently using 30GB! of memory. I don't have anywhere near 30GB of memory on my machine, so it quickly became obvious that that the Erlang VM was steadily chewing through the virtual memory, and my operating system was swapping memory pages to the hard drive, causing things to slow down. This meant that by the time I finally killed the IEx process, my function had recursively called itself billions of times!

I was seriously impressed by the robustness of the Erlang VM at this point. Even when a function was greedily gobbling up stack space, it didn't hit any particular stack limit, but kept going. Based on this observation, I'm pretty sure that the Erlang VM doesn't use the operating system stack for the stack used by code running in the VM, but instead allocates memory for the stack from the operating system heap, growing the stack as needed. I believe that the Erlang VM would keep allocating memory for the stack until my machine hit its virtual memory limit and then the whole thing would fail.

So I concluded that stack for Erlang VM processes is dynamic and that the stack size is simply limited by how much memory the Erlang VM can allocate on the machine. I had heard that the Erlang VM was robust, but I hadn't expected or imagined this level of robustness. This is ridiculously awesome.

This means that it's really difficult for recursive functions to run out of stack space. So as long as my functions are behaving themselves, I shouldn't have to worry much.

Tail Call Optimization

Fortunately, Elixir also an optimization method that allows recursive calls to use very little stack space. We can make use of something called "tail call optimization".

If the last expression in a function is just a recursive call without anything happening after the recursive call, that recursive call does not cause more space to be used on the stack. Elixir will optimize it into a loop (or something like a that), and the use of stack space will be a constant O(1) instead of O(n). This means we can have no fear of processing billions of data elements using recursion, just as long as we make sure we are allowing tail call optimization. This optimization, combined with Elixir's robust call stack, means that all my previous concerns about using recursive functions have been allayed.

Let's take a look at an example of a function that does not allow tail call optimization. This is a function that sums all the numbers in a list.

def sum_unoptimized([]), do: 0
def sum_unoptimized([head | tail]) do
	head + sum_unoptimized(tail)
end	

Tail call optimization cannot happen here because the recursive call is not the last thing in the function. Yes, the recursive call is located after the addition in the source code, but that's not the order in which the computer will run it. The computer will first make the recursive call and then do the addition. It's impossible to do the addition before the recursive function call has been completed.

Here's the same function with tail call optimization.

def sum_optimized(list) do
	sum_optimized(list, 0)
end

defp sum_optimized([head | tail], current_sum) do
	sum_optimized(tail, current_sum + head) 
end	
defp sum_optimized([], current_sum), do: current_sum

The first function, list/1 serves just as the interface. The caller does not need to pass in the initial sum, unless we wanted to specifically make that a feature. The two clauses of the list/2 function are private and serve as the base and recursive cases.

Tail call optimization is possible here because the recursive call to sum_optimized/1 is the very last thing the function does. We compute the result on the way in rather than the way out, passing it onto the next recursive call. When the call chain finally ends, the result of the last call just gets passed back to the beginning. This allows Elixir to not need to use additional stack space for each successive call: it just needs to keep track of the very first call to the recursive function and the call that it's currently in.

Since the current state of the summation is passed along as the parameter, there's no state to save and no call stack frames to add. As a result, we have a summation function that uses the same amount of stack space whether there are 10 elements or 10 billion elements.

The trade-off here is that the tail-call-optimized functions have the tendency to be harder to read and understand. I advise optimizing if your function is going to be going through a lot of data or if the function would still be easy to understand anyway. If the tail-call-optimized version is hard to understand and it will not be handling significant amounts of data, there's no need to use tail call optimization. There's no need to trade readability and maintainability for an insigificant performance benefit.

Tail call optimization in not something that Elixir invented, but was a concept that existed prior to the language. My guess is that tail call optimization is actually part of the underlying Erlang VM and probably existed as a concept prior to Erlang.

C# has no tail call optimization, which comes as no surprise to me, seeing in that it was not built for recursive functions. Javascript actually does have tail call optimization in the language standard as of ES6/2015. However, as of this writing, Safari is the only Javascript environment to implement it. There appear to be no plans to implement it in Chrome, and by extension, Node.js. I suspect that tail call optimization is not a priority for the maintainers of most Javascript environments.

There's so much more to functions in Elixir than there is in other languages I've learned! I guess that shouldn't be a surprise to me. Elixir is a functional language after all, so functions are going to be a key part of the language with less emphasis on the usual programming constructs that appear in imperative languages.