Structured Concurrent Programming

Luis Mateu
e-mail: XXlmateu@dcc.uchile.clXX (without the X)

Abstract

Today, the usual environment for programming concurrent applications is a system which provides multiple threads sharing all the objects. This environment is easy to implement on a multiprocessor with shared memory. However, trying to implement it efficiently on a distributed memory system is difficult. This situation has led to distributed object systems like Java RMI, CORBA or Nexus. These systems simplify the implementation of sharing objects by imposing restrictions on which objects can be shared.

There are two problems with this approach. Firstly, porting multi-threaded applications from shared memory multiprocessors to distributed object systems requires a lot of redesign and reprogramming, because the semantics of both models are different. And secondly, multi-threaded programming has a bad image among developers because it can lead to some errors which are devilishly hard to find.

The goal of this research is to promote a new structured approach to concurrent programming. This approach is based on a new programming model called safe-threads which offers a uniform semantics for both shared memory mutiprocessors and distributed memory machines. We claim that our approach is structured because it ensures the mutual exclusion of safe-threads when manipulating the same object and therefore it eliminates the source of programming errors which are the most difficult to find.

The problem of mutual exclusion

One of the most hazardous programming errors in concurrent applications is when the programmer fails to ensure the mutual exclusion of threads when manipulating the same shared object. This kind of errors can produce incorrect outputs at random times without any warning to the user. In other cases, the incorrect output is detected or the application crashes but the error is imposible to diagnose because, due to its random nature, it can not be reproduced in a debugging environment.

A widely accepted design decision for concurrent languages has been to delegate the responsability of ensuring the mutual exclusion of threads to the programmer (through locks, semaphores or monitors variants for example), opening the door to the fearsome programming errors mentioned above. For this reason, it can be argued that modern concurrent programming languages are unstructured.

In the seventies, Brinch Hansen and C.A.R. Hoare pionnered researches on structured concurrent programming by introducing the critical region[Brinch72] and the monitor[Hoare74] concepts. These were structured constructions for concurrent languages because they ensured always the mutual exclusion of threads while performing operations on share data. Unfortunately, as stated by Brinch Hansen in [Brinch99], those researches have been ignored in recent concurrent languages (as in Java for example).

The model of safe-threads

Besides the critics of Brinch Hansen, we have found in the Internet numerous critics of multi-threaded programming. Some writers compare threads to the pointer arithmetic of C, others recommend seriouly not using them and even they qualify threads like a design flaw of Java.

These critics motivated us to conceive a new model of concurrency for object oriented languages. We have named this model as safe-threads because we claim that it allows safer programming than with traditional threads. Safe-threads are based on the original concept of monitors and the remote method invocation of Java.

From a conceptual point of view, the main idea of the safe-threads model consists in visualizing the runtime environment as a large number of logical machines, each one executing a process communicating to other processes through remote method invocations (as in Java RMI or CORBA). A logical machine works conceptually as a true machine with its own memory for allocating objects and with its own single processor to execute threads. The logical machine also acts like a monitor executing remote calls one at a time. This ensures that the threads will mutually exclude while running on the same logical machine.

From the implementation point of view, the model can be implemented efficiently in a shared memory multiprocessor by placing all logical machines in the same physical address space (typically a Unix heavy process with multiple threads). This approach will introduce much less overhead than placing each logical machine in an independent address space. On distributed memory machines (such as clusters or networks of workstations) the model can be implemented by partitioning the set of logical machines and placing each partition on one physical address space of one processor.

The design methodology behind the safe-threads model consists in decomposing a complex system in a multitude of components (active or pasive), each one executing on a logical machine and communicating with objects living in others components through remote invocations. These components are like heavy processes from a conceptual point of view but are light-weight in implementation.

The light-weight implementation of the model of safe-threads makes possible to create a large number of this components because their overhead in memory usage and execution time is minimal when compared with the same components implemented with traditional threads. For example, a remote invocation of an object living in the same physical address space of the caller can be implemented with an overhead as low as the overhead of the invocation of a synchronized method in Java.

On the other hand, the safe-threads allows the programmer to not worry about the physical location of two interacting components. If a method invocation refers to an object living in another component, the semantics of parameter passing is the same for the case in which the object lives in the same physical address space or when it belongs to another physical machine. This means that object parameters are passed by reference when they have a remote interface and by copy in any other case (as occurs with Java RMI).

Moreover, each component has its own set of class variables (static variables in Java). They will never share class variables even when they live in the same physical address space, to guarant a uniform behavior with the case when they live in different address spaces.

The disadvantages of the model of safe-threads

The model of safe-threads stresses on the correctness of the computation by ensuring that a component will never be lead to a inconsistent state as can occur with traditional threads working simultaneously with the same shared data. However some efficiency is sacrified, because the safe-threads always enforce the mutual exclusion, hence diminuishing the concurrency of the application.

With traditional threads programmers can avoid to synchronize threads in places where no inconsistent state can be reached. In other cases, programmers can implement readers and writers type of synchronization. In this way, the programmer can increase the available concurrency of the application. A programmer is not allowed to decide on these optimizations with the model of safe-threads.

One way to circumvent this inconvinience of the model of safe-threads is to make the components more fine-grain.

Goals of this research

The general goal of this research is to develop an implementation of the model of safe-threads by embedding it into an existing programming language. We want to publish this implementation on the internet, so that the community can experiment with the model to evaluate its vertues.

The specific goal is to implement a dialect of Java where the traditional model of threads has been replaced with the model of safe-threads. This dialect is completely compatible with standard Java for strictly sequential programs, but incompatible for multi-threaded ones. Unfortunately, keeping the compatibility with the standard Java threads would keep also their fragility.

The implementations of safe-threads

The first implementation of the model of safe-threads for Java was developed by Jurgen VanHam as his thesis for the European Master in Object Oriented & Software Engineering Technologies. In his implementation, he modified Kaffe: an open source distribution of the Java Virtual Machine. He enriched this VM with the safe-threads, by modifying the way that certain instructions are interpreted.

This implementation has served as a proof of concept: to show that safe-threads can be implemented. It is also possible to write and test programs that use safe-threads.

The main problem with this implementation is that it is too slow, because the modifications work for the kaffe interpreter only, not the jit compiler.

New research directions

The idea is to achieve a safe-threads implementation for Java more efficient than an interpreter. This can be obtained by:

There are also other directions of research:

Bibliography

The most important research about the subject has been published in the following papers:

Further reading: