Quick summary
The web is single-threaded. This makes it increasingly hard to write smooth and responsive apps. Workers have a bad rep, but can be an important and useful tool in any web developer’s toolbelt for these kinds of problems. Let’s get up to speed on Workers on the Web!
I’m weary of always comparing the web to so-called “native” platforms like Android and iOS. The web is streaming, meaning it has none of the resources locally available when you open an app for the first time. This is such a fundamental difference, that many architectural choices from native platforms don’t easily apply to the web — if at all.
But regardless of where you look, multithreading is used everywhere. iOS empowers developers to easily parallelize code using Grand Central Dispatch, Android does this via its new, unified task scheduler WorkManager, and game engines like Unity have job systems. The reason these platforms not only support multithreading but also make it as easy as possible is always the same: to ensure your app feels great.
In this article I’ll outline my mental model of why multithreading is important on the web, I’ll give you an introduction to the primitives that we as developers have at our disposal, and I’ll talk a bit about architectures that make it easy to adopt multithreading, even incrementally.
The Problem Of Unpredictable Performance
The goal is to keep your app smooth and responsive. Smooth means having a steady and sufficiently high frame rate. Responsive means that the UI responds to user interactions with minimal delay. Both of these are key factors in making your app feel polished and high-quality.
According to RAIL, being responsive means reacting to a user’s action in under 100ms, and being smooth means shipping a stable 60 frames per second (fps) when anything on the screen is moving. Consequently, we as developers have 1000ms/60 = 16.6ms to produce each frame, which is also called the “frame budget”.
I say “we”, but it’s really the browser that has 16.6ms to do everything required to render a frame. We developers are only directly responsible for one part of the workload that the browser has to deal with. That work consists of (but is not limited to):
Detecting which element the user may or may not have tapped;
firing the corresponding events;
and compositing those layers into the final image the user sees on screen;
(and more …)
Quite a lot of work.
The browser has to go through a variety of work for each frame it puts on the screen.
At the same time, we have a widening performance gap. The top-tier flagship phones are getting faster with every new generation that’s released. Low-end phones, on the other hand, are getting cheaper, making the mobile internet accessible to demographics that previously maybe couldn’t afford it. In terms of performance, these phones have plateaued at the performance of a 2012 iPhone.
Note: RAIL has been a guiding framework for 6 years now. It’s important to note that 60fps is really a placeholder value for whatever the native refresh rate of the user’s display is. For example, some of the newer Pixel phones have a 90Hz screen and the iPad Pro has a 120Hz screen, reducing the frame budget to 11.1ms and 8.3ms respectively.
To complicate things further, there is no good way to determine the refresh rate of the device that your app is running on apart from measuring the amount of time that elapses between requestAnimationFrame() callbacks.
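A minimal sketch of that measurement, assuming the median gap between consecutive callbacks is a fair stand-in for the frame duration. The names sampleFrameTimestamps and estimateRefreshRate are made up for this illustration, and the result is only an estimate: a busy main thread inflates the gaps.

```javascript
// Collects `count` timestamps from consecutive requestAnimationFrame() callbacks.
// `raf` is injectable so the function can also be exercised outside a browser.
function sampleFrameTimestamps(count, raf = globalThis.requestAnimationFrame) {
  return new Promise((resolve) => {
    const timestamps = [];
    function onFrame(now) {
      timestamps.push(now);
      if (timestamps.length < count) {
        raf(onFrame);
      } else {
        resolve(timestamps);
      }
    }
    raf(onFrame);
  });
}

// Turns timestamps (in ms) into an estimated refresh rate in Hz.
// The median gap is used so that a few dropped frames don't skew the result.
function estimateRefreshRate(timestamps) {
  const gaps = [];
  for (let i = 1; i < timestamps.length; i++) {
    gaps.push(timestamps[i] - timestamps[i - 1]);
  }
  gaps.sort((a, b) => a - b);
  const median = gaps[Math.floor(gaps.length / 2)];
  return 1000 / median;
}

// Usage in a browser:
// sampleFrameTimestamps(60).then((ts) => console.log(estimateRefreshRate(ts)));
```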
The usual advice here is to “chunk your code” or its sibling phrasing “yield to the browser”. The underlying principle is the same: To give the browser a chance to ship the next frame you break up the work your code is doing into smaller chunks, and pass control back to the browser to allow it to do work in-between those chunks.
There are multiple ways to yield to the browser, and none of them are great. A recently-proposed task scheduler API aims to expose this functionality directly. However, even if we had an API for yielding like await yieldToBrowser() (or something of the sort), the technique itself is flawed: To make sure you don’t blow through your frame budget, you need to do work in small enough chunks that your code yields at least once every frame.
At the same time, code that yields too often can cause the overhead of scheduling tasks to become a net-negative influence on your app’s overall performance. Now combine that with the unpredictable performance of devices, and we have to arrive at the conclusion that there is no correct chunk size that fits all devices. This is especially problematic when trying to “chunk” UI work, since yielding to the browser can render partially complete interfaces that increase the total cost of layout and paint.
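To make the chunking idea concrete, here is a rough sketch that processes a list in time-boxed chunks and yields via setTimeout() between them. The helper names and the 5ms default budget are assumptions chosen for illustration; as argued above, no single budget suits every device.

```javascript
// Yields control back to the event loop. setTimeout() is only one of several
// imperfect ways to do this.
function yieldToEventLoop() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Applies `fn` to every item, pausing whenever the current chunk has used up
// `budgetMs` milliseconds so that other tasks (like rendering) can run.
async function processInChunks(items, fn, budgetMs = 5) {
  const results = [];
  let chunkStart = performance.now();
  for (const item of items) {
    results.push(fn(item));
    if (performance.now() - chunkStart >= budgetMs) {
      await yieldToEventLoop();
      chunkStart = performance.now();
    }
  }
  return results;
}
```

A smaller budget yields more often and keeps frames on schedule at the price of more scheduling overhead; a larger budget does the opposite.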
const worker = new Worker("./worker.js");
Before we get more into that, it’s important to note that Web Workers, Service Workers and Worklets are similar, but ultimately different things for different purposes:
A SharedWorker is a special Web Worker, in that multiple tabs or windows of the same origin can reference the same SharedWorker. The API is pretty much impossible to polyfill and has only ever been implemented in Blink, so I won’t be paying any attention to it in this article.
While Workers are the “thread” primitive of the web, they are very different from the threads you might be used to from C++, Java & co. The biggest difference is that the required isolation means workers don’t have access to any variables or code from the page that created them or vice versa. The only way to exchange data is through message-passing via an API called postMessage, which will copy the message payload and trigger a message event on the receiving end. This also means that Workers don’t have access to the DOM, making UI updates from a worker impossible — at least without significant effort (like AMP’s worker-dom).
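The copying behavior can be illustrated without spinning up a worker: postMessage serializes its payload with the structured clone algorithm, which structuredClone() exposes directly in recent browsers and Node.js. The following sketch uses it as a stand-in for what the receiving side would see in event.data; the appState object is a made-up example.

```javascript
const appState = { level: 3, board: [[0, 1], [1, 0]] };

// In a real app this would be `worker.postMessage(appState)`, and the worker
// would read the copy from `event.data` in its "message" handler.
const received = structuredClone(appState);

// Changes on the receiving side never reach the sender's object.
received.board[0][0] = 9;

console.log(appState.board[0][0]); // 0
console.log(received.board[0][0]); // 9
```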
Web Workers are fully supported in every browser since IE10.
Support for Web Workers is nearly universal, considering that even IE10 supported them. Their usage, on the other hand, is still relatively low, and I think to a large extent that is due to the unusual ergonomics of Workers.
Concurrency Model #1: Actors
My personal preference is to think of Workers like Actors, as they are described in the Actor Model. The Actor Model’s most popular incarnation is probably in the programming language Erlang. Each actor may or may not run on a separate thread and fully owns the data it is operating on. No other thread can access it, making synchronization mechanisms like mutexes unnecessary. Actors can only send messages to each other and react to the messages they receive.
As an example, I often think of the main thread as the actor that owns the DOM and consequently all the UI. It is responsible for updating the UI and capturing input events. Another actor could be in charge of the app’s state. The DOM actor converts low-level input events into app-level semantic events and sends them to the state actor. The state actor changes the state object according to the event it has received, potentially using a state machine or even involving other actors. Once the state object is updated, it sends a copy of the updated state object to the DOM actor. The DOM actor now updates the DOM according to the new state object. Paul Lewis and I once explored actor-centric app architecture at Chrome Dev Summit 2018.
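A hedged sketch of the state actor described above. The event names ("increment", "reset") and the shape of the state object are invented for this example; the point is only that the actor owns its state, reacts to messages, and sends copies outward.

```javascript
// Creates a state actor. `send` is whatever delivers a message to the DOM
// actor; inside a worker that would be `self.postMessage`.
function createStateActor(send) {
  let state = { count: 0 }; // owned exclusively by this actor

  return function onMessage(event) {
    switch (event.type) {
      case "increment":
        state = { ...state, count: state.count + event.amount };
        break;
      case "reset":
        state = { ...state, count: 0 };
        break;
      default:
        return; // ignore messages this actor does not understand
    }
    send({ ...state }); // hand out a copy, never the original
  };
}

// Wiring inside a dedicated worker (skipped when run anywhere else):
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  const onMessage = createStateActor((state) => self.postMessage(state));
  self.addEventListener("message", (event) => onMessage(event.data));
}
```

On the main thread, the DOM actor would translate a click into something like worker.postMessage({ type: "increment", amount: 1 }) and re-render whenever a new state object arrives.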
Of course, this model doesn’t come without its problems. For example, every message you send needs to be copied. How long that takes not only depends on the size of the message but also on the device the app is running on. In my experience, postMessage is usually “fast enough”, but there are certain scenarios where it isn’t. Another problem is to strike the balance between moving code to a worker to free up the main thread, while at the same time having to pay the cost of communication overhead and the worker being busy with running other code before it can respond to your message. If done without care, workers can negatively affect UI responsiveness.
Additionally, postMessage is a fire-and-forget messaging mechanism with no built-in understanding of request and response. If you want to employ a request/response mechanism (and in my experience most app architectures inevitably lead you there), you’ll have to build it yourself. That’s why I wrote Comlink, which is a library that uses an RPC protocol under the hood to make it seem like the objects from a worker are accessible from the main thread and vice versa. When using Comlink, you don’t have to deal with postMessage at all. The only artifact is that due to the asynchronous nature of postMessage, functions don’t return their result, but a promise for it instead. In my opinion, this gives you the best of the Actor Model and Shared Memory Concurrency.
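To show what “building it yourself” involves, here is a minimal request/response layer on top of postMessage. It is a sketch under stated assumptions, not how Comlink is implemented: the message shape ({ id, method, args } and { id, result } or { id, error }) and the function names are invented for this example. An endpoint is anything with postMessage() and addEventListener("message", ...), such as a Worker or the worker’s global scope.

```javascript
// Caller side: returns a request() function that resolves with the reply.
function createRequester(endpoint) {
  let nextId = 0;
  const pending = new Map(); // request id -> { resolve, reject }

  endpoint.addEventListener("message", (event) => {
    const { id, result, error } = event.data;
    const settle = pending.get(id);
    if (!settle) return; // not a reply to one of our requests
    pending.delete(id);
    if (error !== undefined) {
      settle.reject(new Error(error));
    } else {
      settle.resolve(result);
    }
  });

  return function request(method, ...args) {
    const id = nextId++;
    return new Promise((resolve, reject) => {
      pending.set(id, { resolve, reject });
      endpoint.postMessage({ id, method, args });
    });
  };
}

// Callee side: looks up the requested handler and posts back the outcome.
function serveRequests(endpoint, handlers) {
  endpoint.addEventListener("message", async (event) => {
    const { id, method, args } = event.data;
    try {
      const result = await handlers[method](...args);
      endpoint.postMessage({ id, result });
    } catch (err) {
      endpoint.postMessage({ id, error: String(err) });
    }
  });
}
```

On the main thread this would be used as const request = createRequester(worker), with serveRequests(self, { add: (a, b) => a + b }) inside the worker, so that await request("add", 2, 3) yields the sum.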
Comlink wraps a worker and gives you access to the exposed values.
Concurrency Model #2: Shared Memory
A SharedArrayBuffer (SAB), like an ArrayBuffer, is a linear chunk of memory that can be manipulated using Typed Arrays or DataViews. If a SAB is sent via postMessage, the other end does not receive a copy of the data, but a handle to the exact same memory chunk. Every change done by one thread is visible to all other threads. To allow you to build your own mutexes and other concurrent data structures, Atomics provide all sorts of utilities for atomic operations or thread-safe waiting mechanisms.
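The following sketch shows the basic building blocks in a single thread: two Typed Array views on one SharedArrayBuffer, atomic updates, and a try-lock built from Atomics.compareExchange(). The memory layout (a lock flag at index 0, a counter at index 1) and the helper names are assumptions made for this example; in a real app the second view would live in a worker.

```javascript
// Room for two 32-bit integers: a lock flag at index 0 and a counter at index 1.
const sab = new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT);

// Two views on the same memory. In a real app the second view would be
// created inside a worker after the SAB arrived via postMessage.
const mainView = new Int32Array(sab);
const workerView = new Int32Array(sab);

const LOCK = 0;
const COUNTER = 1;

// Tries to take the lock by atomically flipping it from 0 (free) to 1 (held).
function tryLock(view) {
  return Atomics.compareExchange(view, LOCK, 0, 1) === 0;
}

// Releases the lock so another thread can take it.
function unlock(view) {
  Atomics.store(view, LOCK, 0);
}

Atomics.add(mainView, COUNTER, 5);   // atomic read-modify-write
Atomics.add(workerView, COUNTER, 2); // same memory, seen through the other view

console.log(Atomics.load(mainView, COUNTER)); // 7
```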
In 2019, my team and I published PROXX, a web-based Minesweeper clone that was specifically targeting feature phones. Feature phones have a small resolution, usually no touch interface, an underpowered CPU, and no proper GPU to speak of. Despite all these limitations, they are increasingly popular as they are sold for an incredibly low price and they include a full-fledged web browser. This opens up the mobile web to demographics that previously couldn’t afford it.
PROXX running on a Nokia 8110 (“Banana phone”).
To make sure that the game was responsive and smooth even on these phones, we embraced an Actor-like architecture. The main thread is responsible for rendering the DOM (via preact and, if available, WebGL) and capturing UI events. The entire app state and game logic is running in a worker which determines whether you just stepped on a black hole and, if not, how much of the game board to reveal. The game logic even sends intermediate results to the UI thread to give the user a continuous visual update.
The UI continually updates even if the worker is still busy figuring out the final state of the game field.
Other Benefits
I have talked about the importance of smoothness and responsiveness and how workers help you achieve these goals more easily. Something that has only been explored at the surface is that Web Workers can also help your app consume less of your user’s battery. By making use of more CPU cores in parallel, the CPU might be able to use “high performance” mode more sparingly, consuming less power overall. David Rousset from Microsoft did an exploration into power consumption of web apps.
Adopting Web Workers
If you made it here, you hopefully have a better understanding of why workers can be useful. Now the next obvious question is: How?
Workers have not seen large adoption, so there also isn’t a lot of experience and architecture around workers. It can be hard to tell ahead of time which parts of your code are worth moving to a worker. Rather than advocating for one specific architecture over another, I have had good experiences with an approach that allows incremental adoption of workers:
Most of us already build our apps using modules as the basic primitive, as that is what most bundlers use to perform bundling and code splitting. The main trick is to be strict about separating your UI code from the pure computation parts. This reduces the number of modules that depend on main-thread-only APIs like the DOM, and every module without such a dependency can do its work in a worker.
Additionally, try to rely on synchronicity as little as possible, allowing you to adopt asynchronous patterns like callbacks and async/await later on with ease. With this in place, you can move modules from the main thread to a worker using Comlink and measure to see if it impacts your performance positively or negatively.
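One way to follow that advice is to give pure-computation modules a promise-returning API from day one. The sketch below uses a made-up summarize function; because callers already await the result, the implementation could later move behind a worker (for example via Comlink’s wrap() and expose()) without touching the call sites.

```javascript
// Pure computation: no DOM, no main-thread-only APIs.
// Declared async even though it does not need to be yet.
async function summarize(values) {
  const sum = values.reduce((total, value) => total + value, 0);
  return {
    count: values.length,
    mean: values.length > 0 ? sum / values.length : 0,
  };
}

// UI code only ever sees a promise, so it does not care where the work runs.
// `output` is any object with a textContent property, such as a DOM element.
async function renderSummary(values, output) {
  const { count, mean } = await summarize(values);
  output.textContent = `${count} values, mean ${mean}`;
}
```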
For adopting workers in an existing code base, things can be a bit trickier. Invest some time to critically analyze which parts of your code need to depend on the DOM or other main-thread-only APIs. If possible, remove these dependencies through refactoring and incrementally adopt the model above.
In either case, the key part is to make the impact of off-main-thread architecture measurable. Don’t assume (or guess) that something is faster or slower once in a worker. Browsers sometimes work in mysterious ways where something that seems like an optimization has the opposite effect. It is important to get numbers to make informed decisions!
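A small timing helper is often enough to get first numbers. This sketch assumes that running a task several times and comparing medians is good enough for a rough comparison; timeTask and median are hypothetical names, and the approach measures how long the task takes end to end, not how many frames were dropped while it ran.

```javascript
// Runs `task` (sync or async) `runs` times and returns the durations in ms.
async function timeTask(task, runs = 10) {
  const durations = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await task();
    durations.push(performance.now() - start);
  }
  return durations;
}

// Summarizes durations with the median, which is less sensitive to outliers
// than the mean.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[middle]
    : (sorted[middle - 1] + sorted[middle]) / 2;
}
```

Timing the same task once on the main thread and once through a worker, on a range of real devices, gives a first indication of whether the move paid off.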
Web Workers And Bundlers
Webpack: For Webpack v4, the worker-loader plugin made Webpack understand Workers. Since Webpack v5, Webpack understands the Worker constructor automatically and can even share modules between the main thread and worker to avoid double-loading.
Rollup: For Rollup, I wrote rollup-plugin-off-main-thread, which should make workers work out-of-the-box.
Parcel: Parcel deserves a special shout-out as both v1 and v2 support Workers out of the box with no additional configuration.
With all of these bundlers, it’s common to develop your app using ES Modules. However, that in itself brings another problem.
Web Workers And ES Modules