Introduction to Erlang : Recursion (1/2)

March 30, 2011 |

Author Trigonakis Vasileios

Recursion

The definition of the word recursion is “(mathematics) an expression such that each term is generated by repeating a particular mathematical operation”, according to the WordNet. Recursion is one of the most powerful “tools” in a functional programming language and so it is for Erlang. Recursion can be used to apply divide and conquer techniques to problem solving, where a problem is broken to smaller subproblems, the subproblems are solved, and the results are “merged” to generate the final result.

Recursion happens when a function’s body definition includes a call to the function itself.

functionA(...) ->
    Body_before_recursion, % optional
    functionA(...),
    Body_after_recursion. % optional

Recursion is used instead of the conventional loop statements of other programming languages, such as while and for in C.

Examples

mlists:is_list/1

This is a Prolog-like definition of a function that checks if a construct is a list:

is_list([]) -> % the empty list is a list
    true;
is_list([_ | T]) ->
    mlists:is_list(T); % a compound construct is a list if it consinsts of
                       % a head and tail that has to be a list
is_list(_) ->
    false.
 
is_list2([]) -> % a more Erlang-friendly definition
    true;
is_list2([_ | _]) ->
    true;
is_list2(_) ->
    false.

mlists:length/1

This function calculates the lenght of a list:

length([]) ->
    0;
length([_ | T]) ->
    1 + mlists:length(T).
 
% an optimized version or length/1
length1([]) ->
    0;
length1([_, _, _ | T]) ->
    3 + length1(T);
length1([_ | T]) ->
    1 + length1(T).

Pretty simple, right? Lets measure the performance of mlists:length/1 and length1/1. For this purpose we will use a very useful BIF (Built-In Function); the timer:tc/3. This function can be used as:

% Module & Fuction : atoms
% Parameters : list
% returns: {TimeInMicroSeconds, Result}
timer:tc(Module, Function, Parameters).

Using it we get:

1> L = lists:seq(1, 100000).   
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
2> timer:tc(mlists, length, [L]).
{10864,100000}
3> timer:tc(mlists, length1, [L]).
{2165,100000}
4> L1 = lists:seq(1, 100).        
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
5> timer:tc(mlists, length, [L1]).
{11,100}
6> timer:tc(mlists, length1, [L1]).
{4,100}

Notice how a simple “trick”, consuming more than one element at a time, improves the performance.

for/1

The following function implements a construct similar to a for loop using higher level functions.

for(I, Condition, Increase, Do, Data) ->
    case erlang:apply(Condition, [I]) of
	true ->
	    DataNew = erlang:apply(Do, [I, Data]),
	    INew = erlang:apply(Increase, [I]),
	    for(INew, Condition, Increase, Do, DataNew);
	_ ->
	    Data
    end.

And can be used as:

2> I = 0, Condition = fun(J) -> J < 10 end.
#Fun<erl_eval.6.13229925>
3> Increase = fun(J) -> J + 1 end.
#Fun<erl_eval .6.13229925>
4> Do = fun(J, D) -> [J | D] end.
#Fun</erl_eval><erl_eval .12.113037538>
5> Data = [].
[]
6> loop:for(I, Condition, Increase, Do, Data).
[9,8,7,6,5,4,3,2,1,0]
7> f(). % can be used to forget the existing binding,
        % use f/1 as f(Binding) to forget a specific binding
ok
8> I = [a, b, c, d], Condition = fun(J) -> J =/= [] end.
#Fun</erl_eval><erl_eval .6.13229925>
9> Increase = fun(J) -> [_H | T] = J, T end.
#Fun</erl_eval><erl_eval .6.13229925>
10> Do = fun(J, D) -> {J, D} end.
#Fun</erl_eval><erl_eval .12.113037538>
11> Data = {}.
{}
12> loop:for(I, Condition, Increase, Do, Data).          
{[d],{[c language=",d"][/c][/c],{[b,c,d],{[a,b,c,d],{}}}}}
</erl_eval>

fib/1

Fibonacci of course!

fib(0) ->
    0;
fib(1) ->
    1;
fib(N) ->
    fib(N-1) + fib(N-2).

Tail Recursion

A function is called tail recursive when the recursive call to itself happens only in the last expression of the body in every clause. So,

non_tail_recursive(...) ->
    non_tail_recursive(...),    
    other_expression,
    ... .
 
tail_recursive(...) ->
    other_expression,
    ...     
    tail_recursive(...).

the first function is not tail recursive because the recursive call is followed by other expressions, while the second is, since the recursive call is the last statement.

Tail Recursion & Performance

In many programming languages tail recursion is good approach performance wise. In general, this is not the case in the latest releases of Erlang. Tail recursion is not guaranteed to give you better performance. This fact is even stated in the Efficiency Guide in Erlang documentation, as one of the “myths” about Erlang’s performance.

The proper way to evaluate performance is by benchmarking!

Tail vs. No Tail Recursive example

A tail recursive mlists:length_tr/1:

% not tail recursive
length([]) ->
    0;
length([_ | T]) ->
    1 + mlists:length(T).
 
% tail recursive
length_tr(List) ->
    length_tr(List, 0).
 
length_tr([], Length) ->
    Length;
length_tr([_ | T], Length) ->
    length_tr(T, Length + 1).

And the benchmarking:

2> L = lists:seq(1, 100000).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
3> timer:tc(mlists, length, [L]).
{9538,100000}
4> timer:tc(mlists, length_tr, [L]).
{6421,100000}
5> L1 = lists:seq(1, 100).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
6> timer:tc(mlists, length, [L1]).  
{21,100}
7> timer:tc(mlists, length_tr, [L1]). 
{9,100}
8> L2 = lists:seq(1, 1000000).       
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
9> timer:tc(mlists, length, [L2]).   
{126760,1000000}
10> timer:tc(mlists, length_tr, [L2]). 
{32865,1000000}
11> L3 = lists:seq(1, 10000000).                              
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
 23,24,25,26,27,28,29|...]
12> timer:tc(mlists, length, [L3]).       
{1262031,10000000}
13> timer:tc(mlists, length_tr, [L3]).
{286848,10000000}

Which reveals that in this case the tail recursive solution is quite faster than the non-tail recursive! The followed technique for transforming a function to tail recursive is very common and quite usefull and thus will be discussed in the next post.

Tail Recursion & Memory Leaks

Try running the explode/0 function:

nothing() ->
    nothing.
 
explode() ->
    explode(),
    nothing().

After some time you will get the following output:

11> loop:explode().
 
Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 1140328500 bytes of memory (of type "heap").
Aborted

The reason why this happened is simple and should be obvious to most of you. Every call to explode/1 makes a new call, while the current one has not yet finished (non tail recursive -> something has to run after the recursive call returns) and so the space in the call stack cannot be freed for the current call until the next call returns. This never happens though, so eventually the runtime cannot allocate the memory space needed for creating a new call stack entry and crashes!

I will break this post here cause it is already quite long and I still have many things to write! So the next post will be also about recursion. I will present more examples and, for the first time, I will include some small programming excercises :-).

Series NavigationIntroduction to Erlang : Control FlowIntroduciton to Erlang : Recursion (2/2)

Posted in Erlang, Erlang Series, Programming |

Tags: erlang, recursion, series

Distributed Life