Where does the Racket web server store continuations?

When you do web programming with continuations, where does Racket store them? They’re kinda big, aren’t they? Wouldn’t they tend to bog down the server after a while?

When you use continuations with the Racket web server, you’re regaining the upper hand in the inversion of control problem in web programming. Because HTTP is stateless, the usual approach is that the client (browser, web app, whatever is consuming the HTTP responses you’re generating) holds the state and drives your application. With continuations, you turn this ship around and start to reclaim more of the state yourself. That is, the server starts to carry more responsibility: your running web program stores its state directly on the server. Think of state here, roughly, as the values of variables and the current call stack.

So where is it? How does the server carry that state?

When you use Racket web programming primitives like send/suspend, send/suspend/dispatch, and so on, you generate URLs which, if followed by your user, lead to your code being resumed at the place where the continuation was generated.

(Just to be clear: continuations are an optional feature of Racket web programming. If you’re reading this article, I assume you’re interested in continuations for web programming.)

There are a couple of ways to answer this question, corresponding to the two ways Racket currently implements continuations for the web server: stateful and stateless servlets.

Stateful servlets

With stateful servlets, whenever you use send/suspend & friends, continuations are stored in main memory on the server.

The nice thing about stateful servlets is that you’re programming straight Racket. There are no fancy tricks; the whole Racket language is available to you, and you can program with abandon. It’s comfortable and familiar.
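To make this concrete, here’s a minimal sketch of a stateful servlet that asks for two numbers and multiplies them. The helper name ask-number and the page layout are my own assumptions for illustration; only send/suspend, response/xexpr, request-bindings, and extract-binding/single come from the web server libraries.

```racket
#lang racket
(require web-server/servlet)
(provide start)

;; Hypothetical helper: show a form and wait for the user to submit it.
;; send/suspend captures the current continuation, keeps it in memory,
;; and passes us a URL (k-url) that resumes the program right here.
(define (ask-number label)
  (define req
    (send/suspend
     (lambda (k-url)
       (response/xexpr
        `(html (body (form ([action ,k-url])
                       ,label ": "
                       (input ([name "number"]))
                       (input ([type "submit"])))))))))
  ;; Fall back to 0 when the input is not a number.
  (or (string->number
       (extract-binding/single 'number (request-bindings req)))
      0))

;; The servlet reads like an ordinary, sequential program.
(define (start req)
  (define a (ask-number "first"))
  (define b (ask-number "second"))
  (response/xexpr
   `(html (body ,(number->string (* a b))))))
```

To try it, require web-server/servlet-env and call (serve/servlet start #:port 12345 #:servlet-path "/"). Every form shown to the user corresponds to one continuation held in the server’s memory.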

The main sticking point with stateful servlets is their poor scalability. Memory consumption in applications written as stateful servlets can be rather high, even for quite simple programs with barely any state. Depending on your application, you may be able to rein in some of this memory profligacy and reclaim some memory using techniques like send/finish. But you’ll probably find memory consumption going up and up anyway, and find Racket reaping your continuations due to memory pressure.

Stateless servlets

With stateless servlets, state is, well, missing. When you use send/suspend and friends, the current continuation is serialized and stuffed into a URL, and that’s it.

Stateless servlets, and the associated web-server languages, are a brilliant technique that tackles head-on the memory consumption problem typically encountered with stateful servlets. Memory consumption is reduced dramatically.

Nothing comes for free, though. The web-server language has some limitations. What you don’t see is that your web programs get transformed in such a way as to make the continuations compactly serializable. This means that, under some conditions, you may need to abstain from certain features of Racket. Some constructs need to be adapted: normal structs, for instance, need to be replaced by serializable structs (if you want them to be part of your state).
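The struct point is easy to see outside the web server, using plain racket/serialize. This is only a sketch; the struct names plain-cart and cart are made up for the example.

```racket
#lang racket
(require racket/serialize)

;; An ordinary struct: its instances cannot be serialized.
(struct plain-cart (items))

;; A serializable struct: instances can be turned into plain data
;; and rebuilt later.
(serializable-struct cart (items) #:transparent)

(define c (cart '("apple" "pear")))
(define data (serialize c))     ; an S-expression describing the value
(define c2 (deserialize data))  ; a fresh cart with the same contents

(serializable? (plain-cart '())) ; #f
(serializable? c)                ; #t
(equal? c c2)                    ; #t
```

A value that ends up inside a continuation of a stateless servlet has to survive this kind of round trip, which is why normal structs don’t qualify.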

Moreover, it’s not necessarily correct to say that, with stateless servlets, there’s absolutely no state whatsoever. State gets stored somewhere. There are essentially two possibilities:

  • encode the continuation entirely in a URL;
  • hash the serialized continuation, put the hash in the URL, and store the serialization on disk.

The concept of a stuffer is relevant here. Stuffers determine how a continuation is packed into (and later unpacked from) the URLs generated when you use send/suspend and friends.
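You can poke at a stuffer directly, without running a server. As I understand the web-server/stuffers library, a stuffer is a struct holding two functions, reachable via stuffer-in and stuffer-out. The sketch below packs and unpacks an ordinary list (standing in for a continuation) with serialize-stuffer; the names pack, unpack, and state are mine.

```racket
#lang racket
(require web-server/stuffers)

;; A stuffer bundles two functions: `in` packs a value, `out` unpacks it.
(define pack   (stuffer-in serialize-stuffer))
(define unpack (stuffer-out serialize-stuffer))

;; A stand-in for a continuation: any serializable value will do here.
(define state (list "first" 123))

(define packed (pack state))   ; bytes holding the serialized form

(bytes? packed)                ; #t
(equal? (unpack packed) state) ; #t
```

In a real servlet, the bytes produced by the `in` side are what ends up encoded in the URL.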

Let’s look at these two possibilities in detail.

URL carries all state

With stateless servlets, you can have the URL carry all state. This means that the server stores nothing at all. Neither on disk nor in memory. The URL contains enough data to completely determine what state the program was in when it generated the URL.
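Here’s a small sketch of what such a servlet might look like. It is not the program that produced the URLs in this article; count-clicks is a hypothetical example, and I explicitly pick serialize-stuffer so that nothing but serialization happens. The counter n lives in the continuation, so it travels inside every generated URL.

```racket
#lang web-server
(require web-server/http
         web-server/stuffers)
(provide interface-version stuffer start)

;; Mark this module as a stateless servlet.
(define interface-version 'stateless)

;; Serialize the continuation straight into the URL; nothing is kept
;; on the server, neither in memory nor on disk.
(define stuffer serialize-stuffer)

;; Hypothetical example: count how often the link has been followed.
(define (count-clicks n)
  (send/suspend
   (lambda (k-url)
     (response/xexpr
      `(html (body (p ,(format "Clicks so far: ~a" n))
                   (a ([href ,k-url]) "Click again"))))))
  (count-clicks (add1 n)))

(define (start req)
  (count-clicks 0))
```

One way to run it is from another module: require this file together with web-server/servlet-env, then call (serve/servlet start #:stateless? #t #:stuffer stuffer).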

Here’s a typical example of a URL generated by a stateless servlet:

http://localhost:12345/;((%22c%22%20.%20%220
((3)%204%20(((lib%20%5C%22web-server%2Flang%2F
abort-resume.rkt%5C%22)%20.%20%5C%22lifted%2F4
%5C%22)%20((lib%20%5C%22web-server%2Flang%2Fweb
-cells.rkt%5C%22)%20.%20deserialize-info:frame-
v0)%20(%23%5C%22%2FUsers%2Fjesse%2Flisp.sh%2Fco
ntinuations-on-the-server%2Fmultiply-stateless.
rkt%5C%22%20.%20%5C%22lifted%2F837%5C%22)%20(2
%20.%20%5C%22lifted%2F2631%5C%22))%200%20()%20
()%20(0%20(1%20(h%20-%20()))%20(c%20(v!%20(2%20
(u%20.%20%23%5C%22number%5C%22)%20%5C%22first
%5C%22)%20%23f%20%23f)%20c%20(v!%20(3)%20%23f
%20%23f))))%22))?number=123

(Yes, that’s a URL. I added newlines for, uh, readability. The actual URL is a single line, of course.)

As you may suspect, the continuation was serialized and turned into a ginormous URL. The great thing here is that there truly is no state on the server. The (potentially) bad thing is that the length of the URLs goes up considerably. Thus, if you have lots of URLs, your HTTP response bodies will get big(ger).

Another potential downside is that the state is visible in plain text. If you look carefully, you can see the name of the directory on my laptop where I’m writing this article! The security considerations here are definitely worth thinking about. Shouldn’t we somehow avoid presenting this information in plain text? Yes, probably. But then we come back to the classic problem: if we give our consumer a hashed URL, and they click on it, how do we know what (serialized) continuation that corresponds to? We need to have (you guessed it) some kind of state, server-side, to answer that.

We can do that; let’s see how.

Serialize continuation to disk

The huge URLs in the previous section are the result of serializing a continuation and doing nothing else. That is, we naively stuff the continuation into a URL, and that’s it. There are more sophisticated options when it comes to the so-called stuffers in the Racket web server. One that I recommend is the MD5 stuffer. Given a continuation, we serialize it and hash that using the MD5 hashing algorithm, then store—on disk, on the server—the content of that serialization as Racket data.

URLs generated this way look like this:

http://localhost:12345/;((%22c%22%20.%20
%220010f13b4500e6ea79a7714868498836%22))?number=123

See that MD5 hash tucked away in there?

Using this stuffer produces plain text files whose names are, simply enough, those MD5 hashes. The size of the files is quite small (a few hundred bytes in my experience). A typical example:

#"((3) 4 (((lib \"web-server/lang/abort-resume.rkt\")
. \"lifted/4\") ((lib \"web-server/lang/web-cells.rkt\")
. deserialize-info:frame-v0)
(#\"/lisp.sh/continuations-on-the-server/multiply-stateless.rkt\"
. \"lifted/816\") (2 . \"lifted/2592\")) 0 () () (0
(1 (h - ())) (c (v! (2 (u . #\"number\") \"second\") #f
#f) c (v! (3 47157667) #f #f))))"

(As before, with the bunker-busting URL, I added some newlines.)
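The hash-then-store round trip can be reproduced by hand. The sketch below chains serialize-stuffer with md5-stuffer and pushes an ordinary list through it in place of a real continuation. The directory name continuation-store under the system temp directory is an assumption of mine, and I create it up front because I’m not sure the stuffer does so itself.

```racket
#lang racket
(require web-server/stuffers)

;; Hypothetical storage directory, created if it is missing.
(define store-dir
  (build-path (find-system-path 'temp-dir) "continuation-store"))
(make-directory* store-dir)

;; Serialize first, then replace the bytes by their MD5 hash and keep
;; the original bytes in a file named after that hash.
(define on-disk-stuffer
  (stuffer-chain serialize-stuffer (md5-stuffer store-dir)))

;; A stand-in for a continuation.
(define state (list "second" 47157667))

;; The `in` side returns the key that would go into the URL.
(define key ((stuffer-in on-disk-stuffer) state))

(bytes-length key)                                              ; 32
(file-exists? (build-path store-dir (bytes->string/utf-8 key))) ; #t
(equal? ((stuffer-out on-disk-stuffer) key) state)              ; #t
```

A stateless servlet can use such a chain by passing it to serve/servlet via #:stuffer, or by providing it under the name stuffer from the servlet module.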

Memory usage of the server when generating URLs this way is the same as what was observed for the no-stuffer option. A number of files are created, of course.

What do we learn from this?

With stateful servlets, because continuations are stored on the server, memory consumption will go up. You may be able to control this, to some degree, by quitting the application where appropriate, because that causes continuations to be flushed from memory. But if you have complex logic that allows users to navigate around in complicated ways before exiting the web program, you’ll have to keep these continuations around, and you’ll probably find Racket reaping the continuations for you.

In the case of stateless servlets, you have a couple options. You can go truly stateless, with the result that really long URLs may be generated. One can generate shorter URLs using hashing, but this has the effect that some state needs to be stored on the server. With the MD5 stuffer, the continuation gets serialized to disk—the state needs to be stored somewhere, so if it isn’t in memory then the disk is a sensible next candidate.