Lazy loading in Webpack

In the previous article we described the basics of lazy loading in React. Now we're diving a bit deeper to explore the logic that lies behind the handy API.

Two APIs

Let's recall the syntax of a React.lazy call:

const View = React.lazy(() => import('./view.js'));

There are two loosely related APIs used here.

The first is React.lazy itself, which is handled by React. The second is the so-called “dynamic” import, which is handled by webpack. To understand both APIs, let’s start with webpack’s, because it’s more complex.

Webpack code splitting terms

Before we start, let’s define two important terms.

First, there is a module. A module is a piece of code that developers import into their project. Here’s an example of a module import:

import React from 'react';

It does not matter what we use for importing: import or require, ESM or CJS. Both bring a “foreign” module into the current one.

During development a module is just a file. In a bundle, however, a module is the combination of a function that contains the code of that file and the ID of this module.

Then, there is a chunk. It is a file that contains modules. Your project may contain only one chunk (usually called app.js or main.js) or a lot of them.

Yeah, even though we say “vendor bundle,” it’s actually a “vendor chunk” from webpack’s point of view. That’s why the setting that disables vendor bundle creation is part of SplitChunksPlugin.

Like modules, every chunk has its own unique ID.

That’s it, modules and chunks. It’s important to understand the difference between modules and chunks, because webpack works with both and usually in a similar way. Here’s a picture to summarize:

[Figure: A scheme of webpack converting modules into chunks]

Types of chunks

For simplicity let’s define two types of chunks: entry chunks and regular ones.

Entry chunks contain the modules we set as entry points in the webpack configuration. Webpack adds the code it needs to keep everything running right into these chunks. Module registration, caching, chunk loading, and so on: all the functions for these tasks live here.

Regular chunks are all the rest, that is, those that are not entry chunks. The code of regular chunks is much easier to understand, so let’s start with them.

Regular chunks

A regular chunk does a pretty simple job: it registers itself and the modules it contains, and that’s it.

Technically a chunk looks like an object where keys are module IDs, and values are the modules themselves. Also a chunk contains its own name and a small piece of code for registration.

To get the full picture let’s describe entry chunks.

Entry chunks

An entry chunk contains everything webpack needs to work, hence it’s more complex than a regular chunk.

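First of all, an entry chunk defines a chunks registry. The chunks registry looks like a plain array with the loaded chunks and their modules inside. There are some tricks, but we will describe them later. Anyway, you may look at the registry on any site built with webpack. Just check the window.webpackJsonp array (builds made with webpack 5 use a name that starts with webpackChunk, like the webpackChunkmain shown later in this article):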

[Figure: A screenshot of webpack's pretty-printed chunks registry]

Then, an entry chunk defines a modules registry. It looks even simpler than the chunks registry: it’s a plain object where keys are module IDs and values are the modules themselves.

In addition to the modules registry, an entry chunk defines a modules cache registry. A module's cache is the result of evaluating the module. Usually it is an object with an exports field that, in turn, holds the value the module exports. The cache is useful because webpack tries not to evaluate a module twice.

The require function is also defined by an entry chunk. When the require function is called to import a module it tries to find the module's export in the cache. If there’s nothing there, the function finds the module in the registry, evaluates it, saves the result into the cache and returns it.

Finally, there is a function called ensure that is also defined by the entry chunk. This function downloads the missing modules when they are requested. This is the key function in the whole lazy loading scenario, so let’s see how it works.

Webpack's 'require' ensuring

Let’s say we’re lazy loading a view in our app. It may look like this:

const Blog = React.lazy(() => import('./views/blog.js'))

As we said at the start of the article, the import call here is called “dynamic.” Webpack translates it into code like this:

const Blog = React.lazy(() => ensure('views_blog_js').then(require.bind(require, './views/blog.js')))

Here views_blog_js is the ID of the chunk that webpack has generated for our view, and ./views/blog.js is the ID of the module that will be required after the chunk is downloaded.

It turns out all this lazy loading is simple. First, something calls the function that may trigger additional loading. In our case React.lazy is the caller.

Then, the function calls ensure with the name of the chunk that should be downloaded. ensure starts the download and returns a promise that resolves when the download is done.

Finally, because require has been attached to the promise with then, it is called when the promise is fulfilled. The result of the require call becomes the result of the whole promise chain. React.lazy uses this result, because it’s the module it expects to get.

'require' vs. 'ensure'

The call of require does not do any asynchronous work. It does not even know about the downloading process; it just expects the module to be in the modules registry or the modules cache registry. All the magic is done by the ensure function.

Here’s how ensuring works. Every chunk has its own “status” and this status is stored in the chunks registry. If the status says that the chunk was downloaded earlier, then ensure simply resolves and does nothing. If the status says that the chunk is downloading right now, then ensure waits till it’s done and resolves.

But if the chunks registry does not know about the requested chunk, then ensure calculates the path of the chunk, adds a <script> tag to <head>, sets its src attribute to the path, and waits for the load or error event of this script. After a successful load, ensure resolves the promise and lets require do its job; if the script fails to load, the promise is rejected instead.
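Here is a minimal sketch of those three cases. It is not webpack's real runtime: chunkStatus, ensureChunk, and loadScript are names invented for this example, the path scheme is an assumption, and loadScript is a stand-in that resolves immediately instead of adding a real <script> tag.

```javascript
// Chunk statuses: undefined = unknown, a Promise = downloading, 0 = downloaded.
const chunkStatus = {};

// Hypothetical stand-in for "add a <script> tag and wait for load or error".
// In a browser it would create the element and listen for its events.
let downloads = 0;
function loadScript(url) {
  downloads += 1;
  return Promise.resolve(url);
}

function ensureChunk(chunkId) {
  const status = chunkStatus[chunkId];

  // The chunk was downloaded earlier: resolve and do nothing.
  if (status === 0) return Promise.resolve();

  // The chunk is downloading right now: wait for the same promise.
  if (status !== undefined) return status;

  // The chunk is unknown: calculate its path and start the download.
  const url = `/static/${chunkId}.js`; // assumed path scheme
  const pending = loadScript(url).then(
    () => {
      chunkStatus[chunkId] = 0;
    },
    (error) => {
      delete chunkStatus[chunkId]; // assumption: forget the chunk so it can be retried
      throw error;
    },
  );
  chunkStatus[chunkId] = pending;
  return pending;
}
```

Calling ensureChunk('views_blog_js') several times in a row starts only one download, because every later call reuses the stored promise or sees the downloaded status.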

What’s important here is that there’s no communication between the chunk that requests the download and the chunk being downloaded. The first one initiates the download using the ensure function, waits till it’s done, and then tries to find the modules it needs in the modules registry. Here’s how the ensure call looks:

[Figure: A scheme of the interaction between the ensure call, the browser, the code of new chunks, and webpack's registries]

The browser triggers the script’s load event after the script has been executed. This guarantees that the chunk has been loaded and the modules it contains have been added to the registry, so require can safely go and find them:

[Figure: A scheme of the interaction between the require call and webpack's registries]

React lazy loading

React.lazy simply takes the function we pass, waits for the promise it returns to resolve, then takes the default field from the resulting object and uses it as a component. Yeah, there is a lot of work behind the scenes, but that work belongs to the React renderer, not to React.lazy itself.

Because the React renderer has to wait for a promise to resolve, we must provide a fallback: something that the renderer may show while waiting. But you probably know this from the previous article.

What you may not know, but what is probably obvious by now, is that we can use React.lazy not only for loading views but for any async action. We just have to mimic the module signature, that is, return an object with a default field.

Here we go:

import React from 'react';
import ReactDOM from 'react-dom';

const blogComponent = { default: () => <h1>Blog</h1> };

const Blog = React.lazy(
  () => new Promise(resolve =>
    setTimeout(() => resolve(blogComponent), 3000)
  ),
);

const appNode = document.getElementById('app');

const App = () => (
  <React.Suspense fallback={<p>Loading</p>}>
    <Blog/>
  </React.Suspense>
)

ReactDOM.render(<App/>, appNode);

This code will show “Loading” for three seconds and then will replace it with the “Blog” headline.

Instead of using setTimeout here, we could actually go to a server, fetch some data and return the Blog component after that. The combination of React.lazy, React.Suspense, and the trick with the default field makes it possible to encapsulate the logic of showing a preloader on a different level.
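As a rough sketch of that idea, the loader below waits for some data and only then resolves with a module-like object. The names fetchPosts and loadBlog are hypothetical, and fetchPosts fakes the server response so the example runs on its own; a real app would probably call fetch there. The component returns a plain string, which React accepts as render output, to keep the sketch free of JSX.

```javascript
// Hypothetical data source that stands in for a request to a server.
function fetchPosts() {
  return Promise.resolve([
    { id: 1, title: 'Hello' },
    { id: 2, title: 'World' },
  ]);
}

// Loads the data first, then resolves with an object that mimics a module:
// its default field holds the component, which is what React.lazy reads.
function loadBlog(getPosts) {
  return getPosts().then((posts) => ({
    default: function Blog() {
      return posts.map((post) => post.title).join(', ');
    },
  }));
}

// Intended usage with React (not executed in this sketch):
// const Blog = React.lazy(() => loadBlog(fetchPosts));
```

With this shape, the React.Suspense fallback stays on screen until both the data and the component are ready.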

But you should not go this tricky way, because the React team is going to make it much easier! The React team plans to make it possible to use React.Suspense for any asynchronous action in React 18.

Bonus: what happens if a regular chunk is downloaded earlier than an entry one

“Alright,” you might say, “but if an entry chunk prepares an environment for the regular ones, what happens when one of these regular chunks is downloaded earlier than an entry one?”

Excellent question!

The reason why everything works well is a small technical trick. As we said earlier in this article, an entry chunk defines the chunks registry. This registry looks like an array that is added to window. Because the registry has to be globally available, its name is predefined, so a regular chunk can create the registry itself if it does not exist yet. Here’s how it looks in the code:

(self.webpackChunkmain = self.webpackChunkmain || [])
  .push([
    ['views_blog_js'],
    {
      './views/blog.js': (module, exports, require) => {
        /* module code */
      },
    }
  ]);

webpackChunkmain is the name of the chunks registry. self is used here instead of window, probably to make the code compatible with workers.

As you see, the regular chunk creates the registry if needed and then pushes itself there. But that’s not the trick. The trick is in the way an entry chunk behaves:

var chunkRegistry = self.webpackChunkmain = self.webpackChunkmain || [];

chunkRegistry.forEach(jsonpCallback.bind(null, 0));

chunkRegistry.push = jsonpCallback.bind(null, chunkRegistry.push.bind(chunkRegistry));

First, the entry chunk checks if an array for the registry has been created earlier (by any of the regular chunks). Then it runs a callback for each item of the array, which registers modules, sets statuses for chunks, etc.

Finally, the chunk rewrites the push method of this array to run the callback on every push. It means that when the next regular chunks push themselves into the array, the callback will intercept them and register their modules.
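To watch the trick work outside of a browser, here is a small runnable sketch. The scope object stands in for self, and the body of jsonpCallback, the registeredModules and loadedChunks containers, and the chunk contents are all assumptions made for the demo; webpack's real callback does more than this.

```javascript
const scope = {}; // stands in for `self`
const registeredModules = {}; // simplified modules registry
const loadedChunks = []; // IDs of chunks the callback has seen

// A regular chunk that arrived before the entry chunk: it creates the
// array itself and pushes its [chunkIds, modules] pair with the native push.
(scope.webpackChunkmain = scope.webpackChunkmain || []).push([
  ['early_chunk'],
  { './early.js': () => 'early' },
]);

// Registers a chunk's modules, then forwards to the original push, if any.
function jsonpCallback(originalPush, [chunkIds, modules]) {
  Object.assign(registeredModules, modules);
  loadedChunks.push(...chunkIds);
  if (originalPush) originalPush([chunkIds, modules]);
}

// What the entry chunk does once it finally runs.
const chunkRegistry = (scope.webpackChunkmain = scope.webpackChunkmain || []);
chunkRegistry.forEach(jsonpCallback.bind(null, 0));
chunkRegistry.push = jsonpCallback.bind(null, chunkRegistry.push.bind(chunkRegistry));

// A regular chunk that arrives later is intercepted by the replaced push.
scope.webpackChunkmain.push([['late_chunk'], { './late.js': () => 'late' }]);
```

After this runs, both ./early.js and ./late.js are in registeredModules, even though only the second chunk was pushed after the entry chunk took over the array.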

Pretty easy, but beautiful.

The phantom menace