
Ron Valstar
front-end developer

Prerendering with JSDOM

Ever since Googlebot could render JavaScript (about ten years ago) I stopped worrying about prerendering content in the SPA that is my site.
I really cannot be bothered with SEO. But one thing that always bugged me was that RSS previews did not show any content, only the initial index.html as it is before any JavaScript has run.

There were some half-hearted attempts to use Netlify and/or Prerender.io, but I never really got it working properly.

Doing it myself

Since I sometimes enjoy hour-long train rides, and I can develop this site on Termux using Vim, I thought I'd have another go at it using JSDOM. My experience with JSDOM is in the context of unit testing. And although I firmly believe DOM-related testing should be done in a browser, prerendering might just work perfectly with JSDOM.

Getting started

Way back before JavaScript, pages were written to the server in the folder represented by location.pathname. From that folder the index.html would be served.
For prerendering we will just use that old DirectoryIndex behaviour and create a shitload of folders, each with its own index.html.

After building the site we'll use JSDOM to fake a browser environment, load a route, and write the changed HTML structure to the filesystem.
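The folder layout described above can be sketched as a small path helper. This is a minimal sketch, not the code this site uses: the name outputFileFor and the './dist' default are assumptions made for illustration.

```javascript
// Hypothetical helper: map a route to the index.html that a
// DirectoryIndex-style server would serve for it.
function outputFileFor(pathname, distDir = './dist') {
  // drop leading and trailing slashes so '/' and '/blog/' behave the same
  const route = pathname.replace(/^\/+|\/+$/g, '')
  // the root route has no folder of its own, so empty parts are filtered out
  return [distDir, route, 'index.html'].filter(Boolean).join('/')
}

console.log(outputFileFor('/'))                   // ./dist/index.html
console.log(outputFileFor('/blog/prerendering/')) // ./dist/blog/prerendering/index.html
```

Writing the file is then a matter of creating the folder (for instance with fs.mkdirSync and its recursive option) and saving the serialised HTML to that path.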

Creating a JSDOM instance requires an HTML string and, optionally, a location (which is useful in our case).
The HTML is used as the starting point.

import {JSDOM} from 'jsdom'

const dom = new JSDOM(html, { url })

Next to figure out is what code to run to render any page. Just running the regular main-js might seem smart but it loads a lot of things we do not really need for a first render. We wouldn't want to remove the no-js class in the prerender. We don't need the header experiments to run, we don't need SVG icons. Arguably, we don't even need to render any components yet. We just need to render content.

So we do a dynamic import for all the views, and run the open method from the router.

Mind you, this site is vanilla JavaScript with a simple router. But I guess you'd just have to call createRoot for React, or bootstrapApplication in Angular, and conditionally hide certain functionality by setting globalThis.prerendering prior to initialisation.

Errors everywhere

At this point errors started being thrown around. Most of these stemmed from window properties not being accessible on globalThis. You see, JSDOM is not a mock implementation of the entire document object model; most web APIs are left out for brevity. So after JSDOM instantiation some more preparation is required.

For instance when you do document.querySelector you simply assume document to be a global.
Simply merging window into globalThis might seem like a good plan but it fails so often that it is easier to set specific properties one by one.
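Setting properties one by one could look something like the sketch below. The helper name exposeGlobals and the list of property names are assumptions for illustration; a plain object stands in for dom.window so the example runs without JSDOM installed.

```javascript
// Hypothetical helper: copy a whitelist of properties from a window-like
// object onto a target (globalThis in a real prerender script).
function exposeGlobals(win, names, target = globalThis) {
  for (const name of names) {
    if (!(name in win)) continue // skip whatever the window does not provide
    // defineProperty rather than assignment, in case the target already
    // defines the name as a getter-only property
    Object.defineProperty(target, name, {
      value: win[name],
      configurable: true,
      enumerable: true,
      writable: true
    })
  }
  return target
}

// Stand-in for dom.window
const fakeWindow = { document: { title: 'stub' }, location: { pathname: '/' } }
const target = exposeGlobals(fakeWindow, ['document', 'location', 'localStorage'], {})
console.log(Object.keys(target)) // [ 'document', 'location' ]
```

In the prerender script the call would be along the lines of exposeGlobals(dom.window, ['document', 'location']), leaving the third argument at its globalThis default.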

Then there is the missing fetch. Since we're running on NodeJS and the files are already on disk, we'll have to write an adapter from fs.readFileSync to fetch.
Something like this:

import {readFileSync} from 'fs'

// serve site-relative requests straight from the build output
globalThis.fetch = file=>{
  const path = './dist'+file
  const body = readFileSync(path)
  return Promise.resolve(new Response(body))
}

There are also some methods that simply require a 'noop', and we're almost good to go.
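A noop stub can be as small as the sketch below. The method names are placeholders, not the list this site actually needed; which methods require a stub depends on what your own code calls.

```javascript
const noop = () => {}

// Hypothetical helper: give missing methods a noop, leave real ones alone
function stubMissing(target, names) {
  for (const name of names) {
    if (typeof target[name] !== 'function') target[name] = noop
  }
  return target
}

// Stand-in for dom.window: one method is missing, one already exists
const win = stubMissing({ alert: () => 'real' }, ['scrollTo', 'alert'])
console.log(win.scrollTo()) // undefined
console.log(win.alert())    // real
```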

NodeJS doesn't do raw imports

Webpack was still using raw-loader to import and place SVG symbol definitions. This does not work in NodeJS directly as it does not understand the !!raw-loader! syntax.

This is easily remedied by replacing the raw-loader with a custom build step generating a js file that exports the SVG data as a string.
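The core of such a build step is turning the SVG markup into module source text. A minimal sketch, assuming a hypothetical svgToModuleSource helper; reading the SVG file and writing the resulting js file are left out.

```javascript
// Hypothetical build helper: turn raw SVG markup into ESM source text
function svgToModuleSource(svg) {
  // JSON.stringify takes care of escaping quotes, backslashes and newlines
  return `export default ${JSON.stringify(svg)}\n`
}

const source = svgToModuleSource('<svg><symbol id="icon-a"/></svg>')
console.log(source)
// export default "<svg><symbol id=\"icon-a\"/></svg>"
```

The generated file can then be imported like any other module, in Webpack as well as in NodeJS.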

More importantly, the amount of SVG data multiplied by the number of pages adds up to a substantial number of extra MBs (27.7 to be exact). The SVG data is really not needed in the prerendered HTML.

So we'll only include it conditionally, by checking the globalThis.prerendering boolean we set at the start of our prerendering code.
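That check could look like the sketch below. Only globalThis.prerendering comes from the text; the withIcons function and the markup strings are made up for the example.

```javascript
// Hypothetical render step: leave the SVG symbols out while prerendering
function withIcons(content, svgSymbols) {
  return globalThis.prerendering ? content : svgSymbols + content
}

globalThis.prerendering = true // set at the start of the prerender script
const prerendered = withIcons('<main>Hello</main>', '<svg>icons</svg>')

globalThis.prerendering = false // the regular browser situation
const live = withIcons('<main>Hello</main>', '<svg>icons</svg>')

console.log(prerendered) // <main>Hello</main>
console.log(live)        // <svg>icons</svg><main>Hello</main>
```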

Isolation

All this worked for a single route but not for multiple. I had hoped it would work by simply setting the JSDOM URL and calling the router again, or reinitialising JSDOM, but alas.

Luckily the answer is the solution to another problem. Prerendering over 300 pages takes some time. A logical pattern would be to use workers, since these can run in parallel. An added benefit is the isolated scope in which to call the dynamic imports.
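A worker per page could be wrapped in a promise roughly as follows. This is a sketch under assumptions, not the createWorker this site uses: the worker body is inlined through the eval option purely to keep the example in one file, and it only reports back instead of rendering anything.

```javascript
// Inlined worker body; with `eval: true` it runs as a CommonJS script.
// A real worker would create the JSDOM instance and run the router here.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads')
  parentPort.postMessage({ uri: workerData.uri, bytes: workerData.html.length })
`

// Hypothetical createWorker: resolves with whatever the worker posts back
async function createWorker(uri, html) {
  // dynamic import so the sketch runs from both ESM and CommonJS files
  const { Worker } = await import('node:worker_threads')
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { uri, html } })
    worker.once('message', resolve)
    worker.once('error', reject)
  })
}

createWorker('/about', '<html></html>').then(console.log)
// { uri: '/about', bytes: 13 }
```

Every worker starts with a fresh module scope, which is what gives the dynamic imports their isolation.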

Now all I had to do was figure out the number of workers to run and create a dynamic equivalent of Promise.all.

n - 1

Stack Overflow says the answer is n - 1 (with n the number of CPU cores), so we'll go with that. Accounting for the fact that on some systems cpus() returns an empty array, we use a minimum of four and get something like:

import {cpus} from 'os'
const maxWorkers = Math.max(4, cpus().length-1)

(using ESM syntax here, otherwise use require)

Promise.all, but for dynamic arrays

We cannot instantiate all workers at once, so we need to implement something like a dynamic Promise.all.

I would have used a real JavaScript generator but ESM NodeJS seems to have issues with function* { yield }. We can achieve the same functionality with nested functions and a bit more code.

const generator = getWorkerGenerator(pages,html)
await dynamicPromiseAll(generator, maxWorkers)

function getWorkerGenerator(uris,html){
  const _uris = uris.slice(0)
  return function(){
    const uri = _uris.pop()
    return uri&&createWorker(uri,html)
  }
}

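function dynamicPromiseAll(generator, max){
  return new Promise((resolve)=>{
    let num = 0
    for(let i=0;i<max;i++) addPromise()
    // nothing to run at all: resolve right away
    if (num===0) resolve()
    function resolvePromise(){
      num--
      // start the next job, or resolve once the last one has finished
      addPromise()
        ||num<=0
        &&resolve()
    }
    function addPromise(){
      const promise = generator()
      if (promise) {
        promise
          // log failures first, so a rejected job still frees its slot
          .catch(console.log.bind(console,'Catch err'))
          .then(resolvePromise)
        num++
      }
      return promise
    }
  })
}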
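For comparison, the same idea can be written with async functions. This is an alternative sketch, not the code used for this site: runPool and the timer-based fake jobs are made up for the example, with next playing the role of the function returned by getWorkerGenerator.

```javascript
// Alternative sketch: `max` async lanes keep pulling jobs until none are left
async function runPool(next, max) {
  async function lane() {
    for (let job = next(); job; job = next()) {
      // a failing job is logged and the lane moves on to the next one
      await job.catch(err => console.log('Catch err', err))
    }
  }
  await Promise.all(Array.from({ length: max }, lane))
}

// Fake jobs that track how many run at the same time
const uris = ['/a', '/b', '/c', '/d', '/e']
let running = 0
let peak = 0
const next = () => {
  const uri = uris.pop()
  if (!uri) return undefined
  running++
  peak = Math.max(peak, running)
  return new Promise(resolve => setTimeout(resolve, 10)).then(() => { running-- })
}

runPool(next, 2).then(() => console.log('peak concurrency', peak))
// peak concurrency 2
```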

Result

The result is an extra 287 files with a total size of 5.2 MB.

        before   after
size    8.6 MB   13.8 MB
files*  399      686

* not counting the 4854 files generated by the static search.

Too long, didn't read

You don't need cloud services for prerendering (which require middleware/htaccess routing configuration).
Using JSDOM you can easily set up a build script that renders HTML for every page.
You will need to extend JSDOM somewhat, to handle XHR for instance. You should also be mindful of how much you render in terms of file size.

You can check out my source here.