September 12, 2013

Concurrent Ada Programming

Ada's model for doing concurrent programming is absolutely marvelous, and today I'm going to give you a small taste of it. More or less all programming languages provide tools for concurrent programming, but few do it as elegant as Ada, where the whole concept has been build into the language from it's inception in 1983.

In Ada concurrent programming is a first class citizen, not an afterthought, and in this day and age where even the cheapest of processors have multiple cores, being able to properly harness the power of those many cores is important. A lot of programs could benefit from a bit of concurrent magic, but since concurrent programming brings with it a whole new set of problems, many programmers shy away from it. Not so with Ada programmers. We have tasks, protected objects, scheduling, guards and entries right there in front of us. In this article I'm going to show you how tasks and protected objects can be used to add concurrency to a very simple program.

The first thing we're going to do is setup a small program that solves the plain problem of counting 20 times to 1_000_000_000.

One possible Ada solution to this problem is this:



Before we get into actually compiling and running Sequential, allow me to first explain what's going on in the program.

First we with the two packages Ada.Task_Identification and Ada.Text_IO. What this does is make the subprograms and types from those packages visible and available to our program, somewhat analogous to the #include directive used in C. After that we setup our "main" procedure and name it Sequential. In Ada you can call your main procedure whatever you like. The use clauses we encounter next lets us access the tools of the Ada.Text_IO and Ada.Task_Identification packages without having to prefix every type and subprogram with the package name. This is equivalent to the using namespace statement in C++.

The Ada.Task_Identification package makes it possible for us to figure out the id of a running task and Ada.Text_IO makes the Put_Line procedure available to Sequential.

We declare and initialize two objects in the declarative part of the program (between is and begin): Id and Jobs. The Id constant contains a String representation of the environment task identifier. As our program only have one running task (or thread if you will), which is the main environment task, the output done in the Put_Line line will show the same Id for every job completed.

The Jobs variable is the queue of jobs, in our case 20. Note that the Positive type is actually a subtype of Integer with the range 1 .. Integer'Last. If you try to assign a value below 1 or above Integer'Last then the program will raise a Constraint_Error exception.

After the begin keyword we're in the actual body of the program, and the first thing we do is setup a for loop that'll count down from 20 to 1, as indicated by the reverse keyword and the given range of 1 .. Jobs.

Inside this loop we do the actual work: Counting to 1_000_000_000 for each job and then outputting a line of text indicating whenever a job has been completed and by which task id.

And that's it.

In order to compile the program do:



And then to execute and time:



My laptop yields the following when executed and timed:

thomas@t420:~/Ada/Simple_Tasking time sequential
Job 20 done by main_task_000000000064A010
Job 19 done by main_task_000000000064A010
Job 18 done by main_task_000000000064A010
Job 17 done by main_task_000000000064A010
Job 16 done by main_task_000000000064A010
Job 15 done by main_task_000000000064A010
Job 14 done by main_task_000000000064A010
Job 13 done by main_task_000000000064A010
Job 12 done by main_task_000000000064A010
Job 11 done by main_task_000000000064A010
Job 10 done by main_task_000000000064A010
Job 9 done by main_task_000000000064A010
Job 8 done by main_task_000000000064A010
Job 7 done by main_task_000000000064A010
Job 6 done by main_task_000000000064A010
Job 5 done by main_task_000000000064A010
Job 4 done by main_task_000000000064A010
Job 3 done by main_task_000000000064A010
Job 2 done by main_task_000000000064A010
Job 1 done by main_task_000000000064A010

real 0m52.737s
user 0m52.636s
sys 0m0.001s

As you can see it took a good 52 seconds to finish all 20 jobs. At no point during the execution of the program did the system utilize more than one CPU core. Each job was completed in an absolute sequential manner, starting with job 20 and ending with 1.

Surely this is sad, considering my laptop is equipped with a CPU sporting a whooping four cores. Lets add some tasking magic to the program and see how much faster we can make it go.



Let's dive in!

We with and use the same packages as before, but after that there are quite a few new things, first of which is a protected object. In Ada these are objects that export procedures, functions and entries that enables us to interact with the encapsulated data structure. In our case we've setup a protected object called Jobs, but we could just as well have created a protected type and then declared one of more objects to be of that type.

We can only operate on the data structure of a protected object via the exported subprograms and this is done under automatic mutual exclusion. A protected object will allow many concurrent readers (via exported functions), but only one writer (via exported procedures and entries).

A protected object is just what we need to protect our job queue from race conditions, now that we have several tasks querying and updating it.



First we declare our Jobs protected object. We export the sole Get (Job : out Natural) procedure, which is the only way for us to interact with the very simple data structure, consisting only of the variable J.

In the body of Jobs we find the actual implementation of the Get procedure. All it does is put the current value of J into the Job parameter and then decrease the value of J by one. With this simple system in place we're ensured that access to J is only ever granted to one single task at the same time. Only when that task is done is the next one allowed access.

With a secure job queue in place we move on to creating the tasks that will be doing the actual work:



First we declare a task type called Worker. We could've setup a task object, just as we did with the Jobs protected object, but since we want more than one worker a type is the way to go. The body of the task looks a lot like the body from the Sequential program, with a few minor differences. We declare the A_Job variable to be of the type Natural, which is a subtype of Integer with the range 0 .. Integer'Last. The Id constant is exactly like the one from the first program.

In the body of the Worker task we've got a plain loop that starts out with putting a job into the A_Job variable. We then check if the job is valid in the exit when... line and if not we exit the loop and the task is then completed.

The for loop and the Put_Line code is more or less the same as before.

Last we have this:



Here we declare an array of 4 Worker tasks. As soon as the program reach this spot, four workers are created and when we pass begin they spring to life. Since we already have 4 workers counting to 1_000_000_000 like there's no tomorrow, we don't really need to add any code to the main environment task, so we simply add a null statement. The main environment task is the master of the 4 workers, so it wont complete until all 4 workers have completed.

Lets see how the program performs now:

thomas@t420:~/Ada/Simple_Tasking$ time concurrent
Job 20 done by workers(4)_000000000065FCC0
Job 19 done by workers(2)_0000000000659240
Job 18 done by workers(3)_000000000065C780
Job 17 done by workers(1)_0000000000655D00
Job 16 done by workers(4)_000000000065FCC0
Job 15 done by workers(2)_0000000000659240
Job 13 done by workers(1)_0000000000655D00
Job 14 done by workers(3)_000000000065C780
Job 12 done by workers(4)_000000000065FCC0
Job 10 done by workers(1)_0000000000655D00
Job 11 done by workers(2)_0000000000659240
Job 9 done by workers(3)_000000000065C780
Job 6 done by workers(2)_0000000000659240
Job 8 done by workers(4)_000000000065FCC0
Job 7 done by workers(1)_0000000000655D00
Job 5 done by workers(3)_000000000065C780
Job 4 done by workers(2)_0000000000659240
Job 3 done by workers(4)_000000000065FCC0
Job 1 done by workers(3)_000000000065C780
Job 2 done by workers(1)_0000000000655D00

real 0m18.727s
user 1m11.454s
sys 0m0.025s

YAY! It is much faster now, and all four cores on my CPU is running at 100% while the program executes. Also note how the jobs are no longer completed sequentially. You can experiment with the amount of workers by changing the size of the Workers array. On my box the sweet spot is firing up 20 workers at the same time. With twenty workers I'm very close to a clean 18 seconds. The second fastest result is obtained with 4 workers. Every other amount of workers I've tried is slower.

There's of course a whole lot more to Ada and concurrent programming than shown here. This little example has barely scratched the surface. A good place to start is the Ada Programming Wikibook.

You can grab the code for these examples at github.com/ThomasLocke/Simple_Tasking.

Adding concurrency to an Ada program is very simple, and the model for doing it is both easy to understand and to work with. When you're using Ada there's simply no excuse for not adding concurrency to your program where it will benefit from it.