Deeper Dive Into (Tail) Recursion - Part 2: Tail Calls Boosting Recursive Algorithms

Posted on April 9, 2017

In part #1 of this series I wrote about the reasons why I think recursion is important and tried to reevaluate a few common opinions on the topic. This time I would like to finally focus on tail recursion itself.

One of the occasions on which I discussed tail recursion was while staying in a hotel in London with one of my colleagues. The original conversation was about Prolog, therefore I wrote the original examples in Erlang (since I don't know Prolog at all, but I do know that Erlang's syntax is based on it). Anyway, I'll also add a few examples in other languages to make this article a bit more interesting.

Tail Recursion

If you're interested in a proper definition of tail recursion, have a look at the Wikipedia article. To translate that definition into language you might find easier to understand, we can go back to recursion itself. In the previous part we said that recursion means self-reference - speaking about functions, we can simply say it's a function which calls itself internally. One thing that might be confusing is the term "tail recursion" itself. In the article about recursion we were often using functions for manipulating a linked list. This makes sense, because a list is a useful and simple recursive data structure. As you might know, a list is defined as being either empty (nil) or a head consed onto a tail, where the tail is another list (the one without the head). I intentionally used the term (variable) tail in some examples in the previous article just to make it clear that using the tail of a list doesn't automatically mean your implementation is tail recursive. Tail recursion and the tail of a linked list are two completely unrelated things. Honestly, I don't know where the name tail recursion comes from, and even though I believe there are reasons for calling it that, I think it mostly causes confusion.

Another misunderstanding I see pretty often is folks thinking that every function containing a recursive call literally on the "last line" is tail recursive - it's not. Let's leave lazy languages like Haskell aside and think just about the execution of something like length(list) = 1 + length(tail(list)) - you can't add 1 before you calculate length(tail(list)), right?

A tail recursive function is a function where the recursive call is the very last expression (from the execution point of view) within that function. This means there is nothing left to do when the recursion finishes, and the result of the recursive call is also the result of the parent call (you don't need to do anything like `add 1` to the result of the recursive call). In such cases compilers can implement an optimization where the parent function "exits" and the recursive call takes its place on the stack. Almost like a loop, right?
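
To make the difference concrete, here is a minimal, contrived sketch (the module name and the helper g/1 are made up purely for illustration); tail recursion is just the case where the call in tail position happens to be a recursive one:

-module(tail_call_sketch).

-export([not_tail/1, tail/1]).

%% Made-up helper, only here so the example compiles.
g(X) -> X * 2.

%% NOT a tail call: after g(X) returns we still have to add 1,
%% so the caller's stack frame must be kept alive.
not_tail(X) -> 1 + g(X).

%% Tail call: the result of g(X) is directly the result of tail(X),
%% so the caller's frame can be reused.
tail(X) -> g(X).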

There is also one other way I like to think about this - a tail recursive function just keeps calculating the arguments for the next recursive call, up to the point where it can calculate the final result without any further recursive call.

Optimization

One thing worth mentioning is that writing tail recursive code doesn't automatically mean that the tail call optimization takes place. The optimization step depends on the compiler. Generally, every language with decent functional support (by decent I mean every language that isn't necessarily pure but claims at least partial support for the functional paradigm) has this optimization built in. I would also like to mention two controversial platforms. JavaScript is not yet capable of this optimization, despite the promise of including it in the ECMA standard made a long time ago. However, languages like PureScript, Elm or GHCJS do this optimization for you at compile time (the generated JS uses loops). On the JVM the situation is quite similar. There were attempts to add this optimization to Java, and in some sense they are still ongoing, yet the JVM doesn't have it. Scalac (Scala's compiler) can optimize your code at compile time (you need to add the @tailrec annotation). Clojure, on the other hand, has a construct called recur to work around the absence of support at the VM level.

Writing Tail Recursive Code

Rewriting recursive functions into tail recursive ones may seem a bit hard at first. In my experience it differs from case to case, and after some practice it may even feel simpler to think in tail recursion from the start. However, I agree this is not true for every recursive algorithm, and sometimes it requires some effort. That said, it might seem that some points I made in the previous article about recursion being the ultimate technique over loops were a bit misleading. Honestly, I still think a complicated tail recursion is easier to understand and, more importantly, much easier to reason about than a complex loop.

Examples

Let's have a look at a few examples of how we can turn a recursive function into a tail recursive one.

Factorial

This is one of the basic implementations of factorial in Erlang.

-module(fac).

-export([fac/1]).

%% Basic recursive implementation of factorial
fac(0) -> 1;
fac(N) when N > 0 ->
    N * fac(N - 1).

Obviously this implementation is not tail recursive. The last expression evaluated is * applied to N and the result of the recursive call fac(N - 1). Let's change that. To do so we will use another "private" function which takes one more argument. This argument is often called an accumulator, since it's used for storing intermediate results. Let's look at the final implementation, since it says more than a thousand words.

-module(tail_fac).

-export([tail_fac/1]).

%% Tail recursive factorial - public function
tail_fac(N) -> tail_fac(N, 1).

%% Actual private tail recursive implementation
tail_fac(0, Acc) -> Acc;
tail_fac(N, Acc) when N > 0 ->
    tail_fac(N - 1, Acc * N).

As you can see, Erlang is a good fit for an example like this, because a function is defined for a particular arity. This means we can use the same name for a function that takes one argument (tail_fac/1) and two arguments (tail_fac/2) and use them as if they were completely different functions. We can also expose just one of them.

As you can see, tail_fac/1 just calls tail_fac/2 with the initial accumulator 1. We can say this is just initialization for the call to tail_fac/2. tail_fac/2, on the other hand, returns Acc when N == 0. The way I like to think about this is that we've changed the direction in which we compute the factorial. Instead of starting from 1 we start from N. Let's say we're evaluating tail_fac(3). This calls tail_fac/2 with N = 3 and Acc = 1. Acc * N is then 1 * 3, which is 3. This is just the "identity" of N, so if we write 3! = 3 * 2 * 1, this step first calculates just 3 as the initial value. You can also think of what we did as rewriting the factorial as 3! = ((1 * 3) * 2) * 1, so each sub-expression has two arguments. N, on the other hand, keeps track of how many times we still need to keep going. In the next call N = 2 and Acc = 3. We call recursively one more time with N = 1 and Acc = 3 * 2 = 6. In the next recursive call N = 0 and Acc = 6 * 1 = 6. Now we match the first pattern (tail_fac(0, Acc)) and just return Acc, which is our result - 6.
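
Here is the same evaluation written out as a compact trace (just comments, following the module above):

%% tail_fac(3)    -> tail_fac(3, 1)
%% tail_fac(3, 1) -> tail_fac(2, 3)   %% Acc = 1 * 3
%% tail_fac(2, 3) -> tail_fac(1, 6)   %% Acc = 3 * 2
%% tail_fac(1, 6) -> tail_fac(0, 6)   %% Acc = 6 * 1
%% tail_fac(0, 6) -> 6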

As you can see, the last call with N = 1 is not necessary, because n * 1 = n. This means we can add a small optimization to our code:

%% Actual private tail recursive implementation
tail_fac(N, Acc) when N < 2 -> Acc;
tail_fac(N, Acc) when N > 1 ->
    tail_fac(N - 1, Acc * N).

so we can return at N = 1 without an extra recursive call.
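
With this tweak the trace above gets one step shorter - the final multiplication by 1 never happens:

%% tail_fac(3, 1) -> tail_fac(2, 3) -> tail_fac(1, 6) -> 6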

Length

Now let's have a look at the length implementation. We already know this function from the previous article. This is the basic recursive implementation:

-module(length).

-export([length/1]).

%% length/1 clashes with the auto-imported BIF erlang:length/1,
%% so we disable the auto-import to be able to call our own version.
-compile({no_auto_import, [length/1]}).

%% Basic implementation of length
length([]) -> 0;
length([_]) -> 1;
length([_|T]) -> 1 + length(T).

This really looks much better than our previous attempt in JS (and this one is not broken). How does it work? Simply: for an empty list (by the way, [] is the empty list in Erlang, if I haven't mentioned it before) the length is 0. For a list with just one element the length is 1. For any other list it's 1 + the length of the rest of the list. Say we have the list [1,2,3]. The length of this list is 1 + length([2,3]) -> 1 + (1 + length([3])) -> 1 + (1 + 1) -> 1 + 2 -> 3. Let's make this tail recursive.

-module(tail_length).

-export([tail_length/1]).

%% Tail recursive length - public function
tail_length(L) -> tail_length(L, 0).

%% Actual private tail recursive implementation
tail_length([], _) -> 0;
tail_length([_], Acc) -> Acc + 1;
tail_length([_|T], Acc) -> tail_length(T, Acc + 1).

Once again, I like to think about this as a calculation from the other end. In the previous example we calculated the length of a 3 element list by constructing an expression containing a length calculation sub-expression for each tail and then evaluating it. There is certainly nothing bad about that from the mathematical point of view. However, our machines have certain attributes and limitations (like time and space). How does the tail recursive implementation work? Instead of calculating the length of the list's tail first, we calculate the length of the heads we've seen so far and keep adding the length of the next head up to the point where there is nothing left. If this sounds confusing, don't worry. Just follow the computation with me.

Again, let's assume we have the list [1,2,3]. tail_length/1 acts just like a public interface (initializer) for our private implementation, so the actual call goes to tail_length/2 and looks like tail_length([1,2,3], 0). What we do next is calculate the length up to this point by adding the result for the current head to the intermediate result (Acc) - 0 + 1. Since we don't have a length for any element yet, we pass 0 in the initial call (you can think of it as the length of an empty list if you wish). OK, so the length of the heads up to this point is 1 (0 + 1). Then we need to calculate the length of the tail in the next call, tail_length([2,3], 1). See that there is nothing we have to wait for? The result of this expression is just the result of the recursive call. This is why the compiler is able to optimize this under the hood. Let's continue. The next call will look like tail_length([3], 1 + 1) = tail_length([3], 2). Because the length of a head is always one, we just add 1 to the length of the previous heads every time we don't match the empty list. Finally the last call matches the second clause, which evaluates 2 + 1 = 3, and this is our result.
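
And once more as a compact trace, following the tail_length module above:

%% tail_length([1,2,3])    -> tail_length([1,2,3], 0)
%% tail_length([1,2,3], 0) -> tail_length([2,3], 1)
%% tail_length([2,3], 1)   -> tail_length([3], 2)
%% tail_length([3], 2)     -> 3   %% second clause: Acc + 1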

You can see I'm using the term tail a lot. Once again, this has nothing to do with tail recursion itself, but rather with the lists we are using in the examples. This might be a bit confusing, but even though recursion is common when working with lists, lists are really not the only place where we speak about (tail) recursion. This is why I used factorial as the first example.

So far we have two nice little examples of tail recursion in Erlang. However, you can easily transform all of this to any other language (even to one with no tail call optimization built in, if you want - just don't expect any better runtime characteristics then). Of course, every language has its own specifics. Let's have a look at a possible Haskell implementation.

Notice that Haskell's pattern matching is slightly different here. In Erlang there were 3 patterns to match a list: [] for the empty list (nil), [_] for a list with one element and [H|T] for a list with more elements (Head and Tail). In Haskell you need just 2 patterns, since a list with one element (just a head) is a list with a head and an empty list as the tail - head:[]. With this in mind, let's have a look at the actual code.

module Length (length) where

-- Hide the Prelude's length so it doesn't clash with our own definition.
import Prelude hiding (length)

length :: [a] -> Int
length = length' 0

length' :: Int -> [a] -> Int
length' acc []    = acc
length' acc (_:t) = length' (acc + 1) t

You may notice a few other differences from the previous example. First, we are using a prime (') in the name of the private function. Also, the order of arguments is different. Since Haskell uses currying, this argument order seems more logical. This is a nice example of how language features might affect the way you design your API to make its usage convenient. However, we're not here to talk about language features or currying. You can find plenty of articles to learn more about both.

Recursive functions for data structures like lists have one interesting property. There are really a lot of useful functions like map, filter, length, zip, sum and similar. What do they have in common? Try to think about a possible implementation. You always initialize them with some value. This value is of the same type as the result of the recursive call. And then you walk over all the elements inside the data structure. In fact this pattern is so common that it has its own name and abstraction in almost every language (at least those with higher order functions). It's called folding. You might know it as the fold, foldl or reduce function in your language. Maybe you're also familiar with the fancy buzzword MapReduce. Fun fact - map is just one specific kind of reducer. The term "map-reduce" is in that sense a bit like saying "paint-in-red-paint-in-color". Anyway, the name makes some sense, since due to the distributed nature of such systems map is the kind of reducer you can run in parallel (distributed), and then you reduce all the collected data the way you want.
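
Just as a teaser before the next part: with a fold, the whole length implementation shrinks to a single line. Here is a sketch using the standard lists:foldl/3 (fold_length is just a name made up for this example):

%% length expressed as a left fold: start from 0 and add 1 for every element
fold_length(L) -> lists:foldl(fun(_Elem, Acc) -> Acc + 1 end, 0, L).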

The next (and last) part of this series will be dedicated just to folding. Anyway, for now let me show you one more example. The next algorithm is recursive, but it can't be implemented using folding.

Fibonacci Number

Fibonacci numbers are a highly overused example of recursive algorithms. You've probably seen it a dozen times, so here it is once again:

-module(fib).

-export([fib/1]).

fib(0) -> 0;
fib(1) -> 1;
fib(N) -> fib(N - 2) + fib(N - 1).

And most conference slides end right here. For example this one by "Pragmatic Dave Thomas".

I really don't want to undermine Mr. Thomas in any way, but this is how the same thing looks in Ruby:

def fib(n)
  return 0 if n == 0
  return 1 if n == 1
  fib(n-1) + fib(n-2)
end

Doesn't look so different to me ¯\_(ツ)_/¯. Even though a more Ruby-like implementation would probably be:

class Integer
  def fib
    return 0 if self == 0
    return 1 if self == 1
    (self - 1).fib + (self - 2).fib
  end
end

So then you can use it like 10.fib => 55.

Anyway, the reason I'm showing you Fibonacci numbers is that you need to know the 2 previous results to calculate the next number in the sequence. It turns out that a lot of people find it difficult to transform this into a tail recursive implementation. In fact it's fairly simple! Just use two accumulators instead of one!

-module(tail_fib).

-export([tail_fib/1]).

%% Tail recursive Fibonacci - public function
tail_fib(0) -> 0;
tail_fib(1) -> 1;
tail_fib(N) -> tail_fib(N - 2, 0, 1).

%% Actual private tail recursive implementation
tail_fib(0, Acc1, Acc2) -> Acc1 + Acc2;
tail_fib(N, Acc1, Acc2) -> tail_fib(N - 1, Acc2, Acc1 + Acc2).

That's it! The principle is still the same, just applied to a different problem. I'm not going to go through the evaluation once again, so take it as an exam if you want.
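
If you just want to check your answer, a quick run in the Erlang shell (assuming the module above is compiled) should look like this:

1> tail_fib:tail_fib(0).
0
2> tail_fib:tail_fib(1).
1
3> tail_fib:tail_fib(10).
55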

Final thoughts

I hope you now have a better understanding of what tail recursion is, how it works and how to write your own tail recursive functions. However, in most cases you don't really need to implement everything we did in this article by hand. For instance, our length function can be written in much less code using just one simple level of abstraction - foldr, foldl, reduce or Foldable if you want. More about this next time. Peace.