Friday, 24 August 2007
Mini Axon for Java
A few months ago I was at LugRadio Live, where I attended a talk by Michael Sparks from the Kamaelia project. Kamaelia is a Python-based component framework for building concurrent applications. The core of the Kamaelia system is called Axon, and the main concept of Axon is the Microprocess. A Microprocess is simply a class that implements a generator method. For those unfamiliar with Python, generators are a form of coroutine: calling a generator function does not execute its body and return a value, it returns an object that produces values on demand. For example, the following code prints the three integers 1, 2 and 3.
def generate_ints(N):
    # Yield 1..N one value at a time; execution pauses at each yield.
    for i in range(1, N + 1):
        yield i

gen = generate_ints(3)
print(next(gen))  # 1
print(next(gen))  # 2
print(next(gen))  # 3
The interesting part is that each call to next resumes execution just after the last yield, with the state of all of the local variables restored. It is this feature that Axon exploits to implement its concurrent behaviour. In some ways it's reminiscent of Java's green threads or SEDA.
One thing that really stood out about Axon is that the getting started tutorial doesn't go through contrived use cases for the framework. Instead it walks you through building a cut-down implementation of the framework itself. By implementing the system from the ground up (albeit a very cut-down one), you get a much better understanding of the system when it comes to using it. It also demonstrates that the system is built on simple, clear concepts.
One of the goals for Kamaelia was to support scalable network servers (although it is much more broadly applicable), which makes it interesting for implementing a mail server. I decided to run through the Mini-Axon tutorial, implementing it in Java rather than Python. The first issue is that Java does not support coroutines. It is possible to simulate the behaviour by using an inner class that stores all state as fields rather than local variables. This was the basis for my initial implementation, but the result was clunky and inelegant. I then stumbled upon a library called Yielder, which provides coroutines through bytecode manipulation of class files. To implement the above Python example in Java:
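// Yielder is the base class supplied by the Yielder library.
Iterable<Integer> gen = new Yielder<Integer>() {
    public void yieldNextCore() {
        for (int i = 1; i <= 3; i++) {
            yieldReturn(i);  // hand i to the caller and pause here
        }
    }
};
// Each call to next() resumes the loop above from its last yieldReturn.
Iterator<Integer> i = gen.iterator();
System.out.println(i.next());
System.out.println(i.next());
System.out.println(i.next());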
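For comparison, the hand-written approach mentioned earlier, where state lives in fields rather than local variables, might look something like the sketch below. CountingGenerator is a name made up for this illustration; it is not part of Yielder or Axon, and it is not my original implementation.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hand-written equivalent of the generator: the loop counter that would be a
// local variable in a coroutine is kept in a field so that it survives
// between calls to next().
class CountingGenerator implements Iterable<Integer> {
    private final int limit;

    CountingGenerator(int limit) {
        this.limit = limit;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int i = 1; // the loop's "local variable", saved as state

            @Override
            public boolean hasNext() {
                return i <= limit;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return i++; // stands in for yieldReturn(i)
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}

class CountingGeneratorDemo {
    public static void main(String[] args) {
        for (int value : new CountingGenerator(3)) {
            System.out.println(value); // prints 1, 2 and 3 on separate lines
        }
    }
}
```

Even with a single loop counter the bookkeeping is noticeable, and it grows quickly once there are several yield points. That is the boilerplate Yielder's bytecode rewriting takes away.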
To allow this behaviour the JVM option '-javaagent:lib/yielder.jar' is required, so that subclasses of the Yielder class can be identified and modified as they are loaded. To implement Mini-Axon using Yielder, a Microprocess is a direct subclass of Yielder.
The next most important class in the Axon framework is the Component. Functional areas of an application extend the Component class. In order to support safe concurrent applications, Components behave according to a specific pattern, where each Component will have only a single reader or writer at one time. Each Component is initialised with a number of boxes (or queues, or buffers, or whichever term you prefer to use). In a typical situation, within the main method (yieldNextCore), the Component loops reading requests from its inbox, processing them and writing responses to its outbox. The Component should yield at an appropriate point during its execution. Components are not restricted to a single inbox/outbox combination: it is possible to implement aggregation or multicast by reading from or writing to multiple boxes.
Communication between Components is handled by wiring their boxes together. This can be done in a number of ways; a simple example is a Postman, which copies items from the outbox of one Component into the inbox of another. It is also necessary to have a mechanism to actually run the Components. This is handled by yet another Microprocess, which schedules the running of other Microprocesses.
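The pieces described above can be sketched in plain Java. To keep the sketch runnable without the Yielder agent, each Microprocess here exposes a hand-written step() method in place of a yielding main method, so this is an illustration of the structure rather than my Yielder-based code. The method names send, recv and dataReady follow the Python Mini-Axon tutorial; step(), activate(), Producer, Collector and MiniAxonDemo are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// One unit of cooperative work. step() runs up to the next "yield" point
// and returns false once the microprocess has finished.
interface Microprocess {
    boolean step();
}

// A microprocess with named boxes (simple FIFO queues).
abstract class Component implements Microprocess {
    private final Map<String, Deque<Object>> boxes = new HashMap<String, Deque<Object>>();

    Component() {
        boxes.put("inbox", new ArrayDeque<Object>());
        boxes.put("outbox", new ArrayDeque<Object>());
    }

    void send(Object value, String box) { boxes.get(box).addLast(value); }
    Object recv(String box) { return boxes.get(box).removeFirst(); }
    boolean dataReady(String box) { return !boxes.get(box).isEmpty(); }
}

// Copies one item per step from a box on one component to a box on another.
class Postman implements Microprocess {
    private final Component source, sink;
    private final String sourceBox, sinkBox;

    Postman(Component source, String sourceBox, Component sink, String sinkBox) {
        this.source = source; this.sourceBox = sourceBox;
        this.sink = sink; this.sinkBox = sinkBox;
    }

    public boolean step() {
        if (source.dataReady(sourceBox)) {
            sink.send(source.recv(sourceBox), sinkBox);
        }
        return true; // a postman never finishes on its own
    }
}

// Itself a microprocess: each step gives every active microprocess one turn.
class Scheduler implements Microprocess {
    private final List<Microprocess> active = new ArrayList<Microprocess>();

    void activate(Microprocess m) { active.add(m); }

    public boolean step() {
        for (Iterator<Microprocess> it = active.iterator(); it.hasNext(); ) {
            if (!it.next().step()) {
                it.remove(); // finished, so drop it from the run list
            }
        }
        return !active.isEmpty();
    }
}

// Writes 1, 2 and 3 to its outbox, one value per step, then finishes.
class Producer extends Component {
    private int next = 1;
    public boolean step() {
        if (next > 3) return false;
        send(next++, "outbox");
        return true;
    }
}

// Drains its inbox into a list on every step.
class Collector extends Component {
    final List<Object> received = new ArrayList<Object>();
    public boolean step() {
        while (dataReady("inbox")) received.add(recv("inbox"));
        return true;
    }
}

class MiniAxonDemo {
    public static void main(String[] args) {
        Producer producer = new Producer();
        Collector collector = new Collector();
        Scheduler scheduler = new Scheduler();
        scheduler.activate(producer);
        scheduler.activate(new Postman(producer, "outbox", collector, "inbox"));
        scheduler.activate(collector);
        // Bounded number of rounds, since the postman and collector never finish.
        for (int round = 0; round < 10 && scheduler.step(); round++) { }
        System.out.println(collector.received); // [1, 2, 3]
    }
}
```

Because the Scheduler is just another Microprocess, it could itself be activated inside a second Scheduler, which is the kind of flexibility the model allows.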
Because everything in Axon is a Microprocess, including the scheduling components, there are all sorts of ways that Components can be wired together. It shows that the model that Axon uses is very flexible. I will probably do some experimentation with using Axon in Meldware. I am currently envisaging a solution using Apache MINA or Grizzly to generate events that would drive an Axon scheduler. The incoming data would be partitioned, either into commands (single text lines) or into parts of a larger message request (e.g. the message part of an SMTP DATA command), and passed as messages between Components. Components would handle activities such as command parsing, retrieving mail data from the folders and streaming the bodies of messages from the store.
One of the trickier aspects of moving to an event-based approach to building a system like Meldware is how transactions should be handled. Declarative transactions (such as those used in JBoss AOP) generally rely on a single thread processing the entire transaction. However, if we want to move away from blocking I/O and use an event-based approach, it will be necessary to have transactions that span multiple threads. To handle this case it looks like I will need to suspend and resume transactions manually via the JTA API. It should be possible to implement some sort of transactional Component that handles the suspension and resumption of transactions around the yield call. This class could be extended by Components that need transactional behaviour. How this would work with multiple Components sharing the same transaction is something I still need to work through.
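As a rough sketch of that transactional Component idea: the base class below resumes a parked transaction before each step and suspends it again afterwards. Everything here is hypothetical. TxManager is a stand-in that only mirrors the suspend/resume pair on JTA's TransactionManager so that the example runs without a JTA implementation, step()/doStep() replace the real yield call, and FakeTxManager exists purely for the demo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the suspend/resume pair on JTA's
// TransactionManager; a real version would use that interface instead.
interface TxManager {
    Object suspend();        // detach the current transaction, if any
    void resume(Object tx);  // re-attach a previously suspended transaction
}

// Resumes the parked transaction before each step and suspends it again
// afterwards, so consecutive steps could run on different threads.
abstract class TransactionalComponent {
    private final TxManager txManager;
    private Object parked; // transaction held between steps, or null

    TransactionalComponent(TxManager txManager) {
        this.txManager = txManager;
    }

    final boolean step() {
        if (parked != null) {
            txManager.resume(parked);
            parked = null;
        }
        try {
            return doStep();
        } finally {
            parked = txManager.suspend(); // the "yield" point
        }
    }

    // One slice of work; returns false when the component has finished.
    protected abstract boolean doStep();
}

// In-memory fake used only to make the sketch runnable.
class FakeTxManager implements TxManager {
    Object current;
    final List<String> log = new ArrayList<String>();

    public Object suspend() {
        Object tx = current;
        current = null;
        log.add("suspend " + tx);
        return tx;
    }

    public void resume(Object tx) {
        current = tx;
        log.add("resume " + tx);
    }
}

class TransactionalDemo {
    public static void main(String[] args) {
        final FakeTxManager tm = new FakeTxManager();
        TransactionalComponent c = new TransactionalComponent(tm) {
            private int steps = 0;
            protected boolean doStep() {
                if (steps == 0) tm.current = "tx-1"; // pretend to begin a transaction
                return ++steps < 2;
            }
        };
        while (c.step()) { }
        System.out.println(tm.log);
    }
}
```

This only covers one Component owning one transaction. It says nothing yet about several Components sharing a transaction, which is the part I still need to work through.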
This is all still speculative. I have yet to implement any useful behaviour with my implementation of Axon, so I will see how it goes. There are still some bugs in the Yielder library (it doesn't work with ecj and requires debug information to be enabled), which I am working with the author of Yielder to resolve. If you are interested in the Java implementation of Axon it is available here.

