Parallelizing simple algorithms series
- Parallelizing simple algorithms series
- Parallelizing simple algorithms : Fibonacci
In the following weeks, I will try to write some posts about parallelizing simple algorithms with the Erlang programming language. The main motivation for executing program instructions in parallel is to complete the computation faster than the sequential equivalent solution. As it is used, when someone wants to present the “parallelization power” of a programming language/model, the Fibonacci numbers are used. So, following this conversion, my first post will be about parallelizing the calculation of the Fibonacci integer sequence.
Theory & Goals
Parallel computations can generally be categorized in two main classes; the data and the task parallel computations. These classes can be defined as:
- Data parallel: A data parallel computation is one in which parallelism is applied by performing the same operation to different items of data at the same time; the amount of parallelism grows with the size of the data.
- Task parallel: A task parallel computation is one in which parallelism is applied by performing distinct computations – or tasks – at the same time. Since the number of tasks is fixed, the parallelism is not scalable.
What I currently have in mind is to mostly (probably only) explore data parallelism. One simple reason for this decision is that the algorithms that I plan to parallelize are rather simple to be task parallelizable. In any case, in every post I will mention in detail the theory/motivation behind the parallelization.
Next I will formulate parallelism according to the level of parallelization:
- No parallelism: sequential execution of the program instructions.
- Fixed parallelism: a solution that applies a fixed amount of parallelism. For example, if the solution is designed for k-core machine, it would possibly be a k-way parallel algorithm. It is obvious that fixed parallelism has scalability problems, since it is built for a specific execution environment.
- Unlimited parallelism: a solution that applies all possible parallelism. For example, if we wanted to calculate the number of even numbers in an k-length integer array, we would use k-way parallelism. In most of the cases (where k >> number of cores) unlimited parallelism is going to have the opposite results than the ones expected; it will degrade the performance of the algorithm.
- Scalable parallelism: a solution that applies the proper amount of parallelism, according to the execution environment available and the problem to be solved. As it is easily understood, scalable parallelism is the one that someone should try to achieve.
Of course, achieving scalable parallelism is in most of the times severely more difficult than fixed and unlimited parallelism. A characteristic example is the Bitonic sort, which is the scalable parallel version of the mergesort sorting algorithm. In my tests I will try to create scalable parallel solutions, but in many cases I will simply use fixed parallel ones.
In a nutshell, my main goals from the following posts will be:
- to introduce the power of parallelism
- to relate the theory of parallelism (task vs. data, no vs. fixed vs. unlimited vs. scalable) with some practical examples
- and to present how “interesting” the Erlang programming language is
About Erlang
Erlang is a functional programming language which has many features more commonly associated with an operating system than with a programming language: concurrent processes, scheduling, memory management, distribution, networking, etc.
The initial open-source Erlang release contains the implementation of Erlang, as well as a large part of Ericsson’s middleware for building distributed high-availability systems.
“Experimental Setup”
The programmed solutions will be benchmarked in a machine with the following characteristics:
- Intel Core 2 Duo @ 1.8GHz, 2GB RAM, 3GB swap
- Ubuntu 10.10, kernel version 2.6.35-24
- Erlang R13B03, SMP enabled