Programming Languages can either directly implement application-specific functionality or provide tools for implementing such functionality. Typical languages do both. There is a particular tool — coroutines — that is missing from mainstream programming languages. The availability of coroutines in these languages would help significantly in a wide range of applications, including markup language processing and the text processing operations usually associated with it. This presentatation describes what coroutines are, what their role is, various attempts at implementing them, and what it will take to make them available.
Keywords: Processing; Programming
| XML Source | PDF (for print) | Author Package | Typeset PDF |
The general thesis of this presentation is that to be effective tools, programming languages need to fit the needs of the applications implemented in them. This can be done in two ways: being specialized to an application domain, or being general.
A programming language can directly implement required functionality. This is the role of domain-specific or specialized languages. A programming language can provide the tools with which required functionality can readily be implemented. This is the role of a general-purpose language. In real life, almost every programming language is a mixture of specific and general features:
Early computers were primarily used to process numbers in scientific, engineering, or commercial contexts. Programming languages had two goals:
Early programming languages include:
Increased availability of more computer resources — most notably the introduction of more powerful graphics facilities — inspired two complementary trends in the ’80s and ’90s:
Object-oriented programming represents a major step away from a numerically-oriented approach to computer programming, towards a more general approach to computer-stored data. It is a major programming feature, in that it has a significant impact on how one writes programs.
That’s where things now stand, more-or-less, at least with respect to “main-stream” programming languages: C, C++, Java, and C#. Specialized programming languages are still around — Python, Perl, etc., and, of course, OmniMark — which support alternative styles of programming.
The distinction betwen general and specialized programming languages continues to exist, and probably always will. Specialized languages can afford to be experimental — you can always go back to the main-strem languages when they don’t do the job. General-purpose languages are conservative — you do similar things in similar ways in C++, Java, and C#, with variations only in the details. Interesting features can be found in most specialized programming languages — features that we’d like to see in general-purpose languages, but which those languages’ inate conservatism keeps from joining the main-stream. However, coroutines are such a basic functionality, of such general use, that they should be part of all main-stream general-purpose languages.
The particular thesis of this presentation is that coroutines should be part of main-stream programming languages:
For the text-processing and markup-language processing community, coroutines are especially important — there are many examples where they are the best way of doing things.
The following illustrations attempt to explain the what and why of coroutines.
Many information-processing applications actually consist of two or more processes, with one process producing data and another making use of or consuming it:

(The idea here is that the individual processes travel from left to right as they do their work — it might have helped if there were arrows on the blue lines pointing to the right.)
There’s nothing particularly markup-processing-specific about this. Rather, it describes not only the relationship between a markup parser and its client, but many of the other related activities, such as filtering and converting data both on input and output. The producer-conumer relationship between parts of a program, or between programs, is very general:

There are two key points to note about coroutines. First of all, the processes in all these illustrations are synchronous. This means that there is only one process running at a time, and control is passed back and forth between them explicitly. There’s no explicit syncronization mechanism required when using coroutines.
Secondly, the relationship between processes is intrinsicly symmetric. For example, this is the first illustration flipped upside-down — in spite of the relableing, it describes the same processes and the same relationships:

Coroutines differ from functions in their implementation in requiring a different organization of the information usually stored on the computer’s stacks. A typical function-based program has one stack:

Each entry on the stack typically corresponds to the invocation of a function. “Frames” are pushed on the stack with each call, and popped off with each return from a function:

Computer architectures since the early ’70s have supported this kind of program organization by providing machine registers and instructions that directly model and manipulate a stack.
The relationship between called functions and their callers is primarily encapsulated in what happens on the stack. Coroutines require two or more stacks, because each process has its own path through the program’s logic:

The relationship between coroutines cuts across stack boundaries. Within both the producer and the consumer, functions may be called and may return, with data passed between the coroutines at any stage:

The single most important producer-consumer relationship for markup language processing is that between a markup parser (an XML parser, for example) and the application that processes the marked-up data. For example:

The ideal arrangement is that illustrated above, where the processing application can think of itself as being in charge, requesting information from the markup parser when needed. In practice, with XML parsers implementing the SAX interface, for example, it’s the markup parser that’s in charge:

Worse — and here’s the key problem we have when using programming languages like C, C++, C#, and Java — is that the true picture is more like the following:

What’s going on is that the SAX parser calls an application-supplied method for each piece of element information. It’s a function-call relationship, not a coroutine relationship.
Program state can be represented as data or as control-flow state. A function/method-based process requires all persistent state to be represented as data. In the SAX parser case, the applcation has to store away where it is in objects between each piece of element information. It can’t just ask the SAX parser for the next piece of information. In contrast, a coroutine-based SAX-like interface would allow persistent application state to be represented both as control-flow state and as data state.
Coroutines are not limited to two-process relationships. It’s often useful to have one or more filtering/conversion processes intervening between primary producer and consumer processes:

And communication between coroutines is not limited to pairs. It’s commonly the case that some data requires an intervening processor, and some not:

Producer-consumer relationships are typically a property of large (i.e., real-world) programs, so any small examples of coroutining code will require a bit of imagination and faith on the part of the reader. Here’s a very basic example of what programming with coroutines in C++ would look like. Here’s the “mainprogram” — which is itself a coroutine — everything is — that creates a new coroutine, and then starts talking to it:
Coroutine* the_other = new Coroutine (myself);
for (;;)
{
the_other->resume ();
if (the_other.terminated ()) break;
// Use the data produced by the coroutine.
}And here’s the coroutine’s code:
void my_coroutine (Coroutine* invoker)
{
for (;;)
{ // Do some stuff to produce some data.
invoker->resume ();
}
// The invoker resumes when the coroutine exits.
}The key thing to note about the two coroutines is their symmetry: they both are sitting in “for” loops while communicating with each other. Here’s a slightly larger example, giving in outline, what using a coroutine-based use of a markup parser would look like:
// The markup parser's client application.
void main ()
{
Parser* markup_parser = new MarkupParser ();
for (;;)
{
markup_parser->resume ();
if (markup_parser.terminated ())
break;
switch (markup_parser->eventType ())
{
case elementStart:
// invoke a function/method for the element
} // etc. ...
}
}
// The markup parser itself.
void chapterElementHandler (Parser* parser)
{
parser.resume ();
if (parser->eventType == elementStart ()
&& strcmp (parser->elementName (), "TITLE") == 0)
// process chapter title
for (;;)
{
parser.resume ();
if (parser->eventType == elementEnd)
return;
// process other chapter content
}
}The SAX model of markup processing uses control-flow state on the XML parser side of things, and uses data state on the client processing side of things. An alternative processing model is the DOM model of markup processing, which uses data state on both sides — the markup parser builds a tree, and the client application uses it. This makes sense for a wide range of applications. But where it makes sense is where the nature of the data encoded in the XML markup is in some “natural” way best represented by a tree structure. It’s the data, at the end of the day, that determines what kind of processing best suits it. Where you’ve got serial, narratively structured data, a more serial or streaming form of processing is more appropriate. This is where a SAX-like model is best, and where coroutining is desirable.
There are a number of ways of going about getting coroutining. First of all, you can use threads. Threads do the job. Each thread has its own stack — that’s the key thing.
But threads are an expensive way of managing a tightly-coupled relationship between processes. Threads are asynchronous, whereas coroutines are synchronous. Besides the fact that coroutines can be easier to use — their user doesn’t have to think about the asynchronicity issues — there’s a difference in the processing cost of managing aynchronicity. We’ve measured an up to a 12:1 difference in performance in tests.
Another approach to implementing coroutines is to do it oneself. After all, it doesn’t take much low-level code. Here’s what it takes to switch between coroutines on an Intel platform:
void CoroutineSwitch (coro_stack_info* fromCoroutine,
coro_stack_info* toCoroutine)
{
__asm {
push ebx
push esi
push edi
mov eax,fromCoroutine
mov dword ptr [eax],esp
mov eax,toCoroutine
mov esp,dword ptr [eax]
pop edi
pop esi
pop ebx
}
};(As a side note, the code illustrates a motivation for coroutines; switching happens fast — as fast as calling and returning from a function.)
(As another side note, the above code is what’s expected by the Microsoft Assembler. Following is the same thing — though not quite the same code — for “gcc”, as used under Linux.)
void CoroutineSwitch (coro_stack_info* fromCoroutine,
coro_stack_info* toCoroutine)
{
asm ("movl %ebx,-4(%ebp)");
asm ("movl %esi,-8(%ebp)");
asm ("movl %edi,-12(%ebp)");
asm ("movl 8(%ebp),%eax"); /* fromCoroutine */
asm ("movl %esp,(%eax)");
asm ("movl 12(%ebp),%eax"); /* toCoroutine */
asm ("movl (%eax),%ebp");
asm ("movl -4(%ebp),%ebx");
asm ("movl -8(%ebp),%esi");
asm ("movl -12(%ebp),%edi");
};As noted above, this form of coroutine switching performed up to 12 times faster than using threads. But the experiment failed — when used in a program that also used threading, it caused the program to crash under both Windows and Linux. It turns out that the Linux run-time library uses the stack pointer to identify which thread it’s in when accessing “thread-local data”. Moving the stack pointer around the way the above code does confuses its access attempts. Windows run-time library seems to use the stack pointer to determine the edge of the “heap” when allocating memory (“new” or “malloc”), and moving the stack pointer around seems to make it think that memory has become corrupted.
There have been other attempts to directly implement coroutining facilities for C and C++:
One approach to coroutining is for a service to implement it itself. The Apache C++ SAX Parser implements coroutining in its “progressive parse” functionality — call the “ParseFirst” method to create the parser coroutine, and repeatedly call “ParseNext” to resume it. But “ParseNext” doesn’t return all of the interesting events — in particular, the opening and closing of input streams are not returned as “ParseNext” events. So it’s useful for some purposes, but not for all. But it is definitely a step forward. And it’s a very expensive way in general of providing such functionality — it puts a large burden on the shoulders of the programmer.
Another way of implementing symmetric processes is to go beyond coroutining. Stackless programming languages that don’t use a “stack” at all for invoking functions. In such languages, you can suspend a process at any time and then return to it later. “Stackless Python” is the currently popular embodiment of this idea, although it’s been around for a long time: a programming language called Oregano did it in about 1971, for example.
Lots of programming languages support coroutines. Icon and Scheme are only two of many currently in use. But they’re all special purpose languages. None of the “mainstream” languages support coroutines. The argument of this presentation is that we need this functionality in the mainstream languages because they are used a lot and are still the best way of writing applications in many contexts.
We need coroutines in our main-stream programming languages. Why? because:
Why shouldn’t be just ignore the current so-called “main-stream” languages and go with languages that do implement coroutines? Because:
Why should the markup community worry about this stuff? Isn’t it the job of some other community? Maybe yes, but we can’t afford to wait for them:
The only folk with an interest in getting something done are the user community. It will take a while. But we’ve got to start somewhere.
So how are we going to start the ball rolling? Well: