NETLIB II

You'll laugh again! You'll cry again!! You'll URL again!!!

Multithreading the NETLIB protocols

Abstract

The existing NETLIB protocol code -- that which takes a URL, fetches the object, and presents the contents -- is, for historical reasons, written in a single-threaded, callback-based model. Now that the needs of Java have prompted the creation of an all-platform multithreaded process support library, we can rewrite the NETLIB protocol code to use multiple threads. The advantages we anticipate include threadsafe support for the java.net.URLConnection interface, a more natural code design leading to better understanding and support, and a more modular design easing the task of any future protocol additions.

NETLIB now

Netlib actually consists of several conceptually separate modules, all lumped together by historical accident. Lou is proposing a functional reorganisation; here we are concerned only with what he classifies as the URL Loader Library: basically, the functions that are now called NET_GetURL, NET_ProcessNet, etc.

Currently, one presents an object by calling NET_GetURL with a URL structure specifying the object, the desired presentation type, the front-end window context, and an exit callback routine pointer. NET_GetURL eventually calls the protocol-specific load routine (e.g., NET_HTTPLoad) to initiate the object retrieval. This load routine starts the connection to the appropriate server, and sets up the protocol state machine. When the load routine finds itself doing a potentially blocking operation -- e.g. connect(2) -- it performs the operation in a non-blocking manner, registers the socket with the front end, and returns.

When the front end detects that the registered socket is ready, it calls NET_ProcessNet. This then calls the protocol-specific process routine. The protocol-specific routine looks at the current state of the protocol FSM, uses this to complete the current operation and continue through to the initiation of the next potentially blocking operation before setting the state and returning. When the blocking operation is complete, the front end again calls NET_ProcessNet and the FSM continues.
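
The following is a minimal sketch of that callback-driven pattern, not the actual NETLIB code. The names FetchEntry, toy_load and toy_process are hypothetical, and the socket operations are reduced to comments so that only the shape of the state machine is visible: each entry into the process routine finishes the pending step and initiates the next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol states; each one names the operation that is
 * currently pending, not the one about to start. */
typedef enum {
    ST_CONNECT_PENDING,
    ST_REQUEST_PENDING,
    ST_RESPONSE_PENDING,
    ST_DONE
} FetchState;

typedef struct {
    FetchState state;
    int steps_run;      /* how many times the process routine was entered */
} FetchEntry;

/* Stand-in for a protocol load routine: initiate the first non-blocking
 * operation, record which one is pending, and return immediately. */
void toy_load(FetchEntry *e)
{
    assert(e != NULL);
    e->state = ST_CONNECT_PENDING;  /* non-blocking connect initiated here */
    e->steps_run = 0;
}

/* Stand-in for a protocol process routine, entered each time the front end
 * sees the registered socket become ready. Returns 1 while work remains. */
int toy_process(FetchEntry *e)
{
    assert(e != NULL);
    e->steps_run++;
    switch (e->state) {
    case ST_CONNECT_PENDING:
        /* finish the connect, then initiate the request write */
        e->state = ST_REQUEST_PENDING;
        return 1;
    case ST_REQUEST_PENDING:
        /* finish the write, then initiate the response read */
        e->state = ST_RESPONSE_PENDING;
        return 1;
    case ST_RESPONSE_PENDING:
        /* finish the read; nothing further to initiate */
        e->state = ST_DONE;
        return 0;
    case ST_DONE:
    default:
        return 0;
    }
}
```

A driver would call toy_load once and then toy_process each time the socket is reported ready, until it returns 0.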

When enough information is obtained to determine the object type, the protocol process routine calls NET_StreamBuilder to create a chain of modules ending with the front-end presentation code. Then, as the FSM reads in data, it pushes the data through this stream.

The protocol-specific modules make several front-end calls for a number of reasons: to update the progress display, to prompt for passwords, to alert the luser, etc.

When all data has been fetched, everything is cleaned up nicely. If one of the NETLIB interrupt routines is called to interrupt one or more operations, the appropriate protocol-specific interrupt routines are called to jump the FSM into a shutdown state.

Other protocol operations -- POST, etc. -- are handled similarly but will not be detailed here.

The problem

The biggest problem with this design is the design criteria for the FSM: because of the constraint not to block, the state boundaries must occur "in the middle" of blocking operations. In other words, every state consists of finishing the previous step and initiating the next. This design requires a certain twisted thinking, which does not lead to easy creation or support. Further, protocols such as IMAP4, in which the server may "volunteer" data unexpectedly, may not be properly supported. Finally, this design does not naturally lend itself to the requirements of the blocking java.net.URLConnection interface.

The opportunity

Along with giving us the problem of URLConnection, the incorporation of Java has given us an opportunity: it necessitated the creation of a multithreaded-process-supporting "Portable Runtime" library. Now, with the ability to spin off a new thread for each URL fetch, we can recode the protocol modules in a simple, natural way without blocking the main UI thread. These modules will by necessity be thread-safe, and will lend themselves nicely to URLConnection. They should be easier to create from the protocol specifications. They should be much more comprehensible, shortening the ramp-up time for new people, making bugfixing and support easier, and allowing more people to work with new protocol modules.

The design

In short, the new NET_GetURL will do the following:

Create a set of resources that can be used by the protocol modules. This would include all the information currently set by or fetched from the front end and used by the protocol modules.
Create a bidirectional message queue (we expect to use two PREventQueues).
Register the incoming queue monitor with the front end, with a callback to NET_ProcessNet (or rather, its replacement).
Create a new thread, passing it the new queue.
Create an event, with the handler being the cross-protocol thread-side "get URL" routine, and drop this event in the outgoing queue.
Return.
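
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: POSIX threads stand in for the Portable Runtime, a small mutex-and-condition FIFO stands in for a PREventQueue, and every name (ToyEvent, ToyQueue, ToyFetch, toy_get_url) is hypothetical. Registering the incoming queue with the front end is left out, since that call does not exist yet.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One event: a handler plus the data it acts on. */
typedef struct ToyEvent {
    void (*handler)(struct ToyEvent *ev);
    const char *url;
    int handled;                 /* set by the handler, for illustration */
    struct ToyEvent *next;
} ToyEvent;

/* A minimal blocking FIFO, standing in for one event queue. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    ToyEvent *head, *tail;
} ToyQueue;

void toy_queue_init(ToyQueue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
}

void toy_queue_post(ToyQueue *q, ToyEvent *ev)
{
    ev->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = ev; else q->head = ev;
    q->tail = ev;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Block until an event is available, then remove and return it. */
ToyEvent *toy_queue_wait(ToyQueue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    ToyEvent *ev = q->head;
    q->head = ev->next;
    if (q->head == NULL) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return ev;
}

/* Per-fetch resources: the two queue directions and the thread. */
typedef struct {
    ToyQueue to_thread;          /* main thread -> fetch thread */
    ToyQueue to_main;            /* fetch thread -> main thread */
    pthread_t thread;
} ToyFetch;

/* Thread-side "get URL" routine; a real one would run the protocol. */
static void get_url_handler(ToyEvent *ev) { ev->handled = 1; }

static void *fetch_thread_main(void *arg)
{
    ToyFetch *f = arg;
    ToyEvent *ev = toy_queue_wait(&f->to_thread);  /* may block freely */
    ev->handler(ev);
    return NULL;
}

/* Create queues, start the thread, post the first event, return. */
int toy_get_url(ToyFetch *f, ToyEvent *ev, const char *url)
{
    toy_queue_init(&f->to_thread);
    toy_queue_init(&f->to_main);
    /* registering to_main with the front end would happen here */
    if (pthread_create(&f->thread, NULL, fetch_thread_main, f) != 0)
        return -1;
    ev->handler = get_url_handler;
    ev->url = url;
    ev->handled = 0;
    toy_queue_post(&f->to_thread, ev);
    return 0;
}
```

The caller gets control back as soon as the event is posted; in this sketch, joining f->thread is the only way to learn that the handler has run.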

The new thread then proceeds with the operations common to all protocols, then jumps into the protocol-specific code. This thread may happily block anywhere -- connect(2), read(2), gethostbyname(3), wherever. When the thread needs to communicate with the front end -- to update the progress display, to prompt for a password, etc. -- it would not call the front-end function directly, but rather a replacement call provided to the threaded module code. This replacement call would wrap up the parameters in an event and drop it in that thread's message queue. Calls requiring a response would automatically use synchronous events.

This event would trigger the condition of the monitor registered with the front end. The front end would eventually get around to calling the callback, NET_ProcessNet. This call would call the event's handler routine, which would proceed to do the actual front-end call. The existing portable runtime event queue operations support all of this already.
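
As a rough illustration of such a wrapper, the sketch below marshals a progress update into an event and dispatches it later. All names here (FEEvent, FEQueue, fe_set_progress, thread_set_progress, toy_process_net) are hypothetical, and the queue is a plain ring buffer with no locking, so it shows only the wrap-and-dispatch shape; a queue really shared between threads would need the synchronisation that the portable runtime event queues provide. Synchronous events are not shown.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* An event carries its handler and the wrapped call parameters. */
typedef struct FEEvent {
    void (*handler)(struct FEEvent *ev);
    int percent;
} FEEvent;

/* Fixed-size ring buffer; no locking, illustration only. */
typedef struct {
    FEEvent slots[MAX_EVENTS];
    size_t head, count;
} FEQueue;

int last_progress_shown = -1;   /* stand-in for front-end display state */

/* The "real" front-end call; meant to run on the main thread only. */
void fe_set_progress(int percent) { last_progress_shown = percent; }

/* Handler run on the main thread: unwrap and make the actual call. */
void progress_handler(FEEvent *ev) { fe_set_progress(ev->percent); }

/* Replacement call handed to protocol code: wrap the parameters in an
 * event and enqueue it. Returns -1 if the queue is full. */
int thread_set_progress(FEQueue *q, int percent)
{
    assert(q != NULL);
    if (q->count == MAX_EVENTS)
        return -1;
    FEEvent *ev = &q->slots[(q->head + q->count) % MAX_EVENTS];
    ev->handler = progress_handler;
    ev->percent = percent;
    q->count++;
    return 0;
}

/* Stand-in for the NET_ProcessNet replacement: drain the queue, calling
 * each event's handler. Returns the number of events dispatched. */
int toy_process_net(FEQueue *q)
{
    int dispatched = 0;
    assert(q != NULL);
    while (q->count > 0) {
        FEEvent *ev = &q->slots[q->head];
        q->head = (q->head + 1) % MAX_EVENTS;
        q->count--;
        ev->handler(ev);
        dispatched++;
    }
    return dispatched;
}
```

Nothing reaches the front end until toy_process_net runs, which is the point: the protocol code never touches front-end state itself.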

When the thread has enough data for the output stream to be built, it will call a new StreamBuild operation which, like the front-end call wrappers, will send an event to the main thread. The main thread can safely build the outgoing stream, and will note the resulting stream operations. This synchronous event will block the subthread until the output stream is ready. We decided to place the actual stream building in the master thread, to avoid having to rewrite all the stream modules in a thread-safe manner. Perhaps eventually some of these modules might be rewritten so that they may be called from the individual threads, but this is not necessary.

When the thread fetches some data, it would enqueue an event to the main thread. It should not have to block on this, modulo space constraints. The front-end will, as usual, notice the monitor condition and call back to ProcessNet. ProcessNet will call the stream's WriteReady service to determine how much data it may send, and then call the Write operation with up to that much data. Any remaining data will be conceptually pushed back on the top of the queue.
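
The WriteReady and Write interplay might look something like the sketch below. ToyStream, Chunk and deliver_chunk are hypothetical names, and the 16-byte stream capacity is chosen only to make the partial-write case easy to see. "Pushing back" the remainder is modelled by advancing the chunk in place, so that it stays at the head of the queue for the next pass.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical output stream with a small fixed capacity. */
typedef struct {
    char buf[16];
    size_t used;
} ToyStream;

/* How many bytes the stream can accept right now. */
size_t toy_write_ready(const ToyStream *s)
{
    return sizeof s->buf - s->used;
}

/* Write exactly len bytes; the caller must not exceed toy_write_ready. */
void toy_write(ToyStream *s, const char *data, size_t len)
{
    assert(len <= toy_write_ready(s));
    memcpy(s->buf + s->used, data, len);
    s->used += len;
}

/* A pending block of fetched data at the head of the queue. */
typedef struct {
    const char *data;
    size_t len;
} Chunk;

/* Deliver as much of the chunk as the stream will take. Whatever is left
 * remains in the chunk for the next call. Returns the bytes written. */
size_t deliver_chunk(ToyStream *s, Chunk *c)
{
    size_t room = toy_write_ready(s);
    size_t n = c->len < room ? c->len : room;
    if (n > 0) {
        toy_write(s, c->data, n);
        c->data += n;
        c->len -= n;
    }
    return n;
}
```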

The NET_Interrupt call will handle interrupts from the main thread by signalling or killing the fetch thread. If killed, the thread's cleanup routines could handle closing the incoming connection and outgoing stream cleanly. If signalled, the handler could trigger the cleanup, and then let the thread continue around again to process another "get URL" event. This way, if we find that thread creation is expensive, we could keep around a "stable" of get-URL threads that keep getting reused.

New resources required

We will need a new front-end call that can take a PRMonitor and a callback and register them with the main UI event loop. The callback should be called when the monitor's condition is signalled. Similar functionality is already in place for Java, though it is hardcoded in place. The XFE can do this with existing code, and I am told that the other platforms can do this easily too.

It would be nice if the NSPR code had a way of signalling a thread that may be blocked in a kernel call. There are some concerns about being able to kill (cancel) blocked threads on all platforms. One possible solution might be to have a function that simulates the blocking call by initiating an asynchronous IO operation, then waiting on either its completion or interruption.
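
One way to approximate this on POSIX systems is sketched below. Note that it swaps in a simpler technique than the one described: rather than starting a true asynchronous IO operation, it waits with poll(2) for either the data descriptor or a separate "interrupt" descriptor (for example the read end of a pipe) to become readable. The function name interruptible_read and the READ_INTERRUPTED code are hypothetical, and nothing here is an NSPR interface.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_INTERRUPTED (-2)

/* Behave like a blocking read(2) on fd, except that the wait is abandoned
 * as soon as interrupt_fd becomes readable. Returns the byte count from
 * read, -1 on error, or READ_INTERRUPTED if the interrupt arrived. */
ssize_t interruptible_read(int fd, int interrupt_fd, void *buf, size_t len)
{
    struct pollfd fds[2];

    assert(buf != NULL);
    fds[0].fd = fd;
    fds[0].events = POLLIN;
    fds[1].fd = interrupt_fd;
    fds[1].events = POLLIN;

    for (;;) {
        fds[0].revents = 0;
        fds[1].revents = 0;
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR)
                continue;           /* a signal woke us; wait again */
            return -1;
        }
        /* An interrupt takes priority over available data. */
        if (fds[1].revents & (POLLIN | POLLHUP | POLLERR))
            return READ_INTERRUPTED;
        if (fds[0].revents & (POLLIN | POLLHUP | POLLERR))
            return read(fd, buf, len);
    }
}
```

The main thread would interrupt a blocked fetch by writing a byte to the other end of the interrupt pipe. This covers only descriptor-based waits; calls such as gethostbyname(3) would still need separate treatment.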

We would like prioritised event queues. We might end up just doing this ourselves.