Browser extensions are kernel modules for browsers

Back in October, after a few weeks of working on the River Trail Firefox extension, I had a trite epiphany: if a browser is an OS, then extensions are kernel modules for that OS.

Ordinary JavaScript code on an ordinary web page runs “in user space”, without any special privileges. It isn’t supposed to be able to do things like, say, allocate OpenCL buffers. But browser extensions offer a workaround. You can write a browser extension that lets someone click a button on an ordinary, unprivileged web page and, by doing so, run native code on their computer. Whoa!

Moreover, thanks to js-ctypes, you can do this without writing any code that isn’t JavaScript. Who knew? For instance, this is a toy Firefox extension that lets you write a JavaScript program that calls into libc to find out the length of a string:
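As a rough illustration of the js-ctypes half of that idea (a sketch, not the toy extension's actual source), the code below binds libc's `strlen` from JavaScript. It assumes privileged Firefox code in which `ctypes` has been loaded with `Components.utils.import("resource://gre/modules/ctypes.jsm")`. The helper name `bindStrlen` and the library name `"libc.so.6"` (typical on Linux) are assumptions made for the example, and the part that exposes the call to a web page is left out entirely.

```javascript
// Hypothetical helper: binds libc's strlen through js-ctypes.
// `ctypes` is passed in as a parameter; in privileged Firefox code it would
// come from Components.utils.import("resource://gre/modules/ctypes.jsm").
function bindStrlen(ctypes, libraryName) {
  // Load the shared library. The file name is platform-specific.
  const libc = ctypes.open(libraryName);

  // Declare the C signature: size_t strlen(const char *s).
  const nativeStrlen = libc.declare(
    "strlen",            // symbol to look up in the library
    ctypes.default_abi,  // the platform's default calling convention
    ctypes.size_t,       // return type
    ctypes.char.ptr      // argument type
  );

  return {
    // Converting through a string copes with the result being either a
    // plain number or a 64-bit wrapper object.
    strlen: (s) => Number(String(nativeStrlen(s))),
    // Unload the library when finished with it.
    close: () => libc.close(),
  };
}

// Intended use inside an extension (library name assumed, Linux):
//   const libc = bindStrlen(ctypes, "libc.so.6");
//   libc.strlen("hello");
//   libc.close();
```

Passing `ctypes` in as an argument is only a convenience for this sketch: it lets the function be exercised with a stand-in object outside Firefox.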

When I realized that this was even possible, I excitedly told an acquaintance:

A transcript of a chat conversation between Lindsey Kuper and Alex Clemmer on October 8, 2014:

Alex: "You click a button and it runs native code? How horrifying!"

Lindsey: "I know! I don't know why it took me so long to realize this, but browser extensions are like kernel modules."

Alex: "I thought browser extensions were just neat little javascript things that got appended to the bottoms of web pages and executed. It sounds like that must be wrong."

Lindsey: "If web pages are to browsers as userland programs are to OS kernels, then browser extensions are totally kernel modules."

Alex said, “I thought browser extensions were just neat little JavaScript things that got appended to the bottoms of web pages and executed.” That describes the behavior of, say, Greasemonkey user scripts pretty well. But if we think of a browser as an OS, and web apps as the userland programs running on that OS — which I think is a sensible analogy — then, well, browser extensions are kernel modules.

The technique I’m using in the toy Firefox extension above, which makes it possible to run code in a system library from an ordinary web page, is essentially the same as the technique that I use in the River Trail Firefox extension to interact with the OpenCL library on the user’s system. There are two main things River Trail does: it compiles a certain subset of JavaScript to OpenCL C, and it orchestrates interaction with the OpenCL runtime so that the resulting OpenCL C code can be compiled to machine code, sent to a compute device, and run, and the results retrieved. The part of River Trail that does the compilation from JavaScript to OpenCL C runs entirely “in user space” — it’s just an ordinary, unprivileged JavaScript program. It only has to resort to calling “into the kernel” — or into the extension — at the times when it actually needs to interact with the underlying OpenCL platform. And so the extension component of River Trail can be relatively small: the River Trail library is 10,000 lines of code, not counting tests, but the whole River Trail extension is under 1,000 lines.

We could have implemented everything the library does as part of the extension, too, but there’s no need. Keeping the extension small is desirable, since it’s running in a privileged context, and keeping the amount of privileged code small means that there are, we hope, fewer bugs that can do bad things.
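To make the split concrete, here is a toy sketch of the shape described above: a small "privileged" object with one narrow entry point, and a larger "unprivileged" library that does everything it can on its own side before crossing the boundary. All of the names (`makeExtension`, `makeLibrary`, `execute`, `mapScale`, `toyNative`) are hypothetical, and no OpenCL is involved; the "native" runner is a plain JavaScript stand-in.

```javascript
// Privileged side: stands in for the extension. It exposes exactly one
// operation and validates its arguments before handing them on.
function makeExtension(runNative) {
  return Object.freeze({
    execute(kernelSource, data) {
      if (typeof kernelSource !== "string" || !Array.isArray(data)) {
        throw new TypeError("execute expects (string, array)");
      }
      return runNative(kernelSource, data);
    },
  });
}

// Unprivileged side: stands in for the library. Work that needs no
// privileges (here, generating source text) stays on this side.
function makeLibrary(extension) {
  return {
    mapScale(data, factor) {
      const kernelSource = `out[i] = in[i] * ${Number(factor)};`;
      // The only point where the library crosses into privileged code.
      return extension.execute(kernelSource, data);
    },
  };
}

// Toy stand-in for native execution: reads the factor back out of the
// generated source and applies it to every element.
const toyNative = (source, data) => {
  const factor = Number(/\* (-?[\d.]+);$/.exec(source)[1]);
  return data.map((x) => x * factor);
};

const library = makeLibrary(makeExtension(toyNative));
console.log(library.mapScale([1, 2, 3], 3));
```

The privileged object here is a dozen lines and could be audited on its own, which is the property the paragraph above is after.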

Since starting to work on River Trail, I notice this two-level architecture everywhere I look. We can even see the pattern repeated at the level of OpenCL. A given OpenCL implementation can probably run mostly in user space; for instance, the part that compiles an OpenCL C program to machine code for a given GPU is likely all in user space. But to actually execute that machine code, the OpenCL implementation most likely has to talk to some graphics driver that runs in kernel space. This interaction may take place at a very low level; the kernel-level driver may not even know anything about OpenCL. Likewise, the River Trail browser extension doesn’t know or care that the OpenCL C code it’s dealing with started out as JavaScript before the River Trail library got hold of it.

This lack of assumptions can give us modularity and the ability to reuse components for unforeseen purposes. OpenCL doesn’t have to target that particular graphics driver; it can, at least in theory, interact with any graphics driver that presents a certain interface. Conversely, the kernel-level driver may be generic enough (or robust enough) to handle all kinds of stuff coming from user space, not just OpenCL. Coming back to the analogy with browsers and browser extensions, the River Trail library could talk to, for instance, WebCL instead of the River Trail extension. And, although this wasn’t the originally intended purpose, it’s absolutely possible for JS libraries other than the River Trail library to talk to the River Trail extension and use it however they want.
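The interchangeability works in both directions, which a small sketch can show. The contract below (`run(kernel, data)`) and every name in it are invented for illustration; they are not the River Trail or WebCL APIs. Two backends satisfy the same contract, and two unrelated clients are written only against that contract, so any client can be paired with any backend.

```javascript
// Assumed contract for this sketch: a backend is any object with
// run(kernel, data), where kernel is applied to each element of data.
const loopBackend = {
  run(kernel, data) {
    const out = new Array(data.length);
    for (let i = 0; i < data.length; i++) out[i] = kernel(data[i]);
    return out;
  },
};

const mapBackend = {
  run: (kernel, data) => data.map((x) => kernel(x)),
};

// Two unrelated clients. Neither knows which backend it will be given,
// and neither backend knows where its kernels come from.
const doubler = (backend, data) => backend.run((x) => x * 2, data);
const incrementer = (backend, data) => backend.run((x) => x + 1, data);

// Every client/backend pairing works without changes to either side.
for (const backend of [loopBackend, mapBackend]) {
  console.log(doubler(backend, [1, 2, 3]), incrementer(backend, [1, 2, 3]));
}
```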

Thanks to Dan Luu, Darius Bacon, Stan Schwertly, Eiríkr Åsheim, Colin Barrett, Jesse Ruderman, Greg Pfeil, Julia Evans, and Jamey Sharp for reading drafts of this post. The last part in particular owes a huge debt to Jamey!