Luis Mateu
e-mail: XXlmateu@dcc.uchile.clXX (without the X)
There are two problems with this approach. Firstly, porting multi-threaded applications from shared memory multiprocessors to distributed object systems requires a lot of redesign and reprogramming, because the semantics of both models are different. And secondly, multi-threaded programming has a bad image among developers because it can lead to some errors which are devilishly hard to find.
The goal of this research is to promote a new structured approach to concurrent programming. This approach is based on a new programming model called safe-threads which offers a uniform semantics for both shared memory mutiprocessors and distributed memory machines. We claim that our approach is structured because it ensures the mutual exclusion of safe-threads when manipulating the same object and therefore it eliminates the source of programming errors which are the most difficult to find.
A widely accepted design decision for concurrent languages has been to delegate the responsability of ensuring the mutual exclusion of threads to the programmer (through locks, semaphores or monitors variants for example), opening the door to the fearsome programming errors mentioned above. For this reason, it can be argued that modern concurrent programming languages are unstructured.
In the seventies, Brinch Hansen and C.A.R. Hoare pionnered researches on structured concurrent programming by introducing the critical region[Brinch72] and the monitor[Hoare74] concepts. These were structured constructions for concurrent languages because they ensured always the mutual exclusion of threads while performing operations on share data. Unfortunately, as stated by Brinch Hansen in [Brinch99], those researches have been ignored in recent concurrent languages (as in Java for example).
These critics motivated us to conceive a new model of concurrency for object oriented languages. We have named this model as safe-threads because we claim that it allows safer programming than with traditional threads. Safe-threads are based on the original concept of monitors and the remote method invocation of Java.
From a conceptual point of view, the main idea of the safe-threads model consists in visualizing the runtime environment as a large number of logical machines, each one executing a process communicating to other processes through remote method invocations (as in Java RMI or CORBA). A logical machine works conceptually as a true machine with its own memory for allocating objects and with its own single processor to execute threads. The logical machine also acts like a monitor executing remote calls one at a time. This ensures that the threads will mutually exclude while running on the same logical machine.
From the implementation point of view, the model can be implemented efficiently in a shared memory multiprocessor by placing all logical machines in the same physical address space (typically a Unix heavy process with multiple threads). This approach will introduce much less overhead than placing each logical machine in an independent address space. On distributed memory machines (such as clusters or networks of workstations) the model can be implemented by partitioning the set of logical machines and placing each partition on one physical address space of one processor.
The design methodology behind the safe-threads model consists in decomposing a complex system in a multitude of components (active or pasive), each one executing on a logical machine and communicating with objects living in others components through remote invocations. These components are like heavy processes from a conceptual point of view but are light-weight in implementation.
The light-weight implementation of the model of safe-threads makes possible to create a large number of this components because their overhead in memory usage and execution time is minimal when compared with the same components implemented with traditional threads. For example, a remote invocation of an object living in the same physical address space of the caller can be implemented with an overhead as low as the overhead of the invocation of a synchronized method in Java.
On the other hand, the safe-threads allows the programmer to not worry about the physical location of two interacting components. If a method invocation refers to an object living in another component, the semantics of parameter passing is the same for the case in which the object lives in the same physical address space or when it belongs to another physical machine. This means that object parameters are passed by reference when they have a remote interface and by copy in any other case (as occurs with Java RMI).
Moreover, each component has its own set of class variables (static variables in Java). They will never share class variables even when they live in the same physical address space, to guarant a uniform behavior with the case when they live in different address spaces.
With traditional threads programmers can avoid to synchronize threads in places where no inconsistent state can be reached. In other cases, programmers can implement readers and writers type of synchronization. In this way, the programmer can increase the available concurrency of the application. A programmer is not allowed to decide on these optimizations with the model of safe-threads.
One way to circumvent this inconvinience of the model of safe-threads is to make the components more fine-grain.
The specific goal is to implement a dialect of Java where the traditional model of threads has been replaced with the model of safe-threads. This dialect is completely compatible with standard Java for strictly sequential programs, but incompatible for multi-threaded ones. Unfortunately, keeping the compatibility with the standard Java threads would keep also their fragility.
This implementation has served as a proof of concept: to show that safe-threads can be implemented. It is also possible to write and test programs that use safe-threads.
The main problem with this implementation is that it is too slow, because the modifications work for the kaffe interpreter only, not the jit compiler.
Daniel Romero has been working on this approach, building a preprocessor with the help of JavaCC and Java Tree Builder. One of the main conclusions of his work is that descendent parsers (like those generated by JavaCC) are not well suited for semantic analisys. Therefore the need to reprogram the preprocessor with a generator of ascendent parsers (like SableCC).
The most complex work here is to perform the type analisys of the sources. This analysis is needed to be able to detect all the cases where the invariants of the safe-thread model could be violated and hence generate safe Java code.
This approach is probably the simplest way to implement the safe-threads model, because there is no need to perform type analysis, no complex sintax to work with, etc. The implementation must only transform certain JVM instructions into sequences of instructions which ensure the invariants of the safe-threads model.
In this paper we introduce the model of safe-threads as a way to program robust multi-threads applications. We explain what is the problem with traditional threads and describe from a conceptual point of view our model of safe-threads. We also discuss the decisions that we took when designing the model and why.
Our model is based on the ideas of remote method invocation (RMI) and Hoare's monitors.
This paper describes an extension of Common Lisp for programming object oriented distributed applications. The first goal of the authors in this work was distributed transparency instead of robustness, but they obtain a programming model which is conceptually very close to what we have done for Java.
The paper can be downloaded here.
This is a Java environment for programming distributed applications based on mobile components. As in NetClos the main goal is transparent distribution. The concept of mobile component here is comparable to the LJVMs of our safe-thread model.
The home page of the project is in http://www.dsg.technion.ac.il/fargo/.
This is the best book that I have read about concurrent programming in Java. For a basic introduction of the perils of concurrency I recommend specially reading the chapters 2 and 3. For understanding the basis of synchronization read the chapter 4.
In this paper the author examines the synchronization features of Java and finds that they are insecure variants of his earliest ideas in parallel programming published in 1972-73. The claim that Java supports monitors is shown to be false. The author concludes that Java ignores the last twenty-five years of research in parallel programming languages.
This and the next two papers are important for historical reasons. They are the most important research in structured concurrent programming.
This paper presented the syntax construct known as critical regions in classic books on operating systems design. This construct was a proposal for structured representation of multi-threaded programming in high level languages. The main characteristic of critical regions is that they ensure the mutual exclusion of threads when manipulating shared data.
Monitors are another well known syntax construct for ensuring the mutual exclusion of multi-threaded programs. Java monitors are a mix between Hoare's monitors and Brinch Hansen's regions, but sacrifying the robustness because they don't ensure mutual exclusion: it is the responsability of the programmer to find the place where synchronization is needed.
This work is a departure from monitors and regions. It introduces a syntax notation to describe parallel algorithms. The author models parallel programs as multiple processes which don't share any data, but communicate through messages.
This paper shows a simple way to implement distributed object systems such as Java RMI. Moreover, I suspect that Java RMI is based entirely on this paper. The only problem is that all the code is written in Modula-3 which is unknown for most people. But don't be affraid about that.
This paper shows the first system of distributed objects. It was a very ambitious project, designing a completely new language from scratch.