Exploring the core of Cloudflare Workers, workerd

Or say, host your over-engineered worker locally with an even more over-engineered setup • January 17, 2024

Introduction

I assume you already know Cloudflare Workers and the serverless paradigm. If not, Cloudflare Workers is a platform for building and deploying serverless applications written in JavaScript/TypeScript (or compiled to WASM, for example from Rust) without configuring or maintaining infrastructure like you usually would. As a customer, you are charged on a per-use basis.

Cloudflare Workers runs on Cloudflare’s global network in hundreds of cities worldwide, making it excellent to build small-scale applications on and expand further without much effort.

This is powered by their open-source core, workerd, which is also used by their own development CLI, wrangler (which uses miniflare internally, if you want a TypeScript API), but we're not interested in that today.

What we are interested in, however, is workerd, and diving into how it works.

Setup

TIP

You can find the repository for this here: taskylizard/workerd-demo

We are going to use Bun to install some dependencies and build a URL Shortener worker which will use KV to manage our slugs.

Start by creating a new project, using hono for our worker code:

sh
bun create hono
# Follow instructions, use the bun template
cd workerd-demo

Next, install workerd from npm, which ships precompiled binaries (you can compile from source if you want, of course), and also @cloudflare/workers-types so that KV types like KVNamespace are available:

sh
bun add workerd
bun add -D @cloudflare/workers-types

We are all set now.

Building our worker

It's time to build our URL Shortener worker. Of course, you could use an existing worker if you have one.

It will have two routes: POST /create, which creates a redirect from a slug and a destination, and GET /:slug, which matches any slug and returns a redirect or an error as appropriate.

Let's begin on src/index.ts:

ts
import { Hono } from 'hono'
import { validator } from 'hono/validator'
import { logger } from 'hono/logger'

const app = new Hono<{ Bindings: { kv: KVNamespace } }>()
app.use('*', logger())

interface RequestBody {
  slug?: string
  destination?: string
}

Here, we add a KVNamespace binding named kv to Hono's Bindings generic, which gives us a typed c.env in our handlers, and then declare a RequestBody interface for our /create endpoint. I added logging from hono/logger for inspection.

ts
app.post(
  '/create',
  validator('json', (value: RequestBody, c) => {
    const properties = ['slug', 'destination']
    if (properties.some((v) => !(v in value))) {
      return c.text(`${properties.find((v) => !(v in value))} is missing.`, 400)
    }
    return value
  }),
  async (c) => {
    const { slug, destination } = await c.req.json()
    await c.env.kv.put(slug, destination)
    return c.text(`Created redirect: ${slug} -> ${destination}`)
  }
)

The /create endpoint takes the two body properties mentioned earlier, and we destructure them out of the request body. As a sanity check (even if this only runs on our own computers), if one of them is missing, we return a 400 error naming the missing property. We then do a simple put on our KV storage and return a message on success.
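
To make the sanity check easier to poke at outside of hono, here is a minimal sketch of the same "which property is missing?" logic as a standalone function. The name findMissingKey and the REQUIRED_KEYS constant are my own, purely for illustration; they are not part of hono or of the worker above.

```typescript
// Hypothetical helper, not part of hono: returns the first required key
// that is absent from the parsed JSON body, or null when all are present.
const REQUIRED_KEYS = ['slug', 'destination'] as const

function findMissingKey(body: Record<string, unknown>): string | null {
  // `in` only checks that the key exists, so { slug: '' } still passes.
  const missing = REQUIRED_KEYS.find((key) => !(key in body))
  return missing ?? null
}

console.log(findMissingKey({ slug: 'test' })) // destination
```

Note that this, like the validator in the worker, only checks for presence. An empty string or a non-URL destination would still get through.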

INFO

Note that the functions need to be async, as KV operations are async.

ts
app.get('/:slug', async (c) => {
  const { slug } = c.req.param()
  const destination = await c.env.kv.get(slug)
  if (destination === null) return c.text('Could not find that slug.', 404)

  return c.redirect(destination)
})

Now, here's our slug route. First, we extract the incoming slug param by destructuring it out of c.req.param(), which hono infers as a string from the route path. Then we look it up in KV: if it does not exist, we return a 404 error, otherwise we redirect with hono's c.redirect().
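
If you want to reason about the lookup without starting workerd, the route's decision can be sketched against an in-memory stand-in for KV. Everything here (KVLike, createMemoryKV, lookupSlug) is a hypothetical name of mine, and KVLike covers only the two methods the worker calls, not the real KVNamespace interface.

```typescript
// Minimal stand-in for the two KV methods the worker uses. This is an
// assumption for illustration, not the real KVNamespace interface.
interface KVLike {
  get(key: string): Promise<string | null>
  put(key: string, value: string): Promise<void>
}

// In-memory implementation so the logic can run without workerd.
function createMemoryKV(): KVLike {
  const store = new Map<string, string>()
  return {
    async get(key) {
      return store.get(key) ?? null
    },
    async put(key, value) {
      store.set(key, value)
    },
  }
}

type LookupResult =
  | { status: 302; location: string }
  | { status: 404; message: string }

// Mirrors the route: a missing key yields 404, anything else a redirect.
async function lookupSlug(kv: KVLike, slug: string): Promise<LookupResult> {
  const destination = await kv.get(slug)
  if (destination === null) {
    return { status: 404, message: 'Could not find that slug.' }
  }
  return { status: 302, location: destination }
}
```

The explicit null check matters because KV's get resolves to null for a missing key rather than throwing.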

And with this, we are pretty much done. Add the following to our package.json:

json
"scripts": {
    "start": "workerd serve config.capnp",
    "build": "bun build --format esm --outfile dist/worker.mjs src/index.ts"
  },

Here's the full worker code:

src/index.ts
ts
import { Hono } from 'hono'
import { validator } from 'hono/validator'
import { logger } from 'hono/logger'

const app = new Hono<{ Bindings: { kv: KVNamespace } }>()
app.use('*', logger())

interface RequestBody {
  slug?: string
  destination?: string
}

app.get('/:slug', async (c) => {
  const { slug } = c.req.param()
  const destination = await c.env.kv.get(slug)
  if (destination === null) return c.text('Could not find that slug.', 404)

  return c.redirect(destination)
})

app.post(
  '/create',
  validator('json', (value: RequestBody, c) => {
    const properties = ['slug', 'destination']
    if (properties.some((v) => !(v in value))) {
      return c.text(`${properties.find((v) => !(v in value))} is missing.`, 400)
    }
    return value
  }),
  async (c) => {
    const { slug, destination } = await c.req.json()
    await c.env.kv.put(slug, destination)
    return c.text(`Created redirect: ${slug} -> ${destination}`)
  }
)

export default app

Setting up workerd

workerd uses a Cap’n Proto file to define its configuration: which services to run, how they are linked, which resources they use, and so on. (Many parts are undocumented, so experimentation is key.)

In workerd terms, each function we want to execute is a worker, as we are going to see in our config.capnp:

config.capnp
capnp
using Workerd = import "/workerd/workerd.capnp";

const config :Workerd.Config = (
    services = [
        (name = "main", worker = .worker),
        (name = "kv", disk = ( path = "kv", writable = true, allowDotfiles = false ) )
    ],

    sockets = [
        # Serve on :8000
        ( name = "http",
            address = "*:8000",
            http = (),
            service = "main"
        ),
    ]
);

const worker :Workerd.Worker = (
    compatibilityDate = "2023-12-01",

    modules = [
        ( name = "dist/worker.mjs", esModule = embed "dist/worker.mjs" ),
    ],
    bindings = [
        ( name = "kv", kvNamespace = ( name = "kv" ) ),
    ],
);

Lots of things are happening, so let's go through them one by one.

TIP

If you want to see the technical details of the config, see node_modules/workerd/workerd.capnp.

Worker Config

capnp
const worker :Workerd.Worker = (
    compatibilityDate = "2023-12-01",

    modules = [
        ( name = "dist/worker.mjs", esModule = embed "dist/worker.mjs" ),
    ],
    bindings = [
        ( name = "kv", kvNamespace = ( name = "kv" ) ),
    ],
);

This is a constant that sets up a worker service (everything is a service, even KV as you will find out later).

First is our compatibilityDate, which pins the worker to the runtime behavior as of that date, so later backwards-incompatible changes do not affect it. Nothing too extraordinary.

Next is our modules array, which can house several scripts. Here, we add our built worker via Cap'n Proto's embed keyword, which inlines the contents of the built file into the config when the config is loaded.

Lastly, there is the bindings array, which is what Cloudflare calls the mechanism for attaching their various Workers-related products, like KV, D1, and R2, to your worker. We give the binding the name kv (the name the worker sees on c.env) and point its kvNamespace at the service named kv.

Workerd Config

capnp
const config :Workerd.Config = (
    services = [
        (name = "main", worker = .worker),
        (name = "kv", disk = ( path = "kv", writable = true, allowDotfiles = false ) )
    ],
    sockets = [
        # Serve on :8000
        ( name = "http",
            address = "*:8000",
            http = (),
            service = "main"
        ),
    ]
);

This is where the fun really begins. We need only two parts of configuration here, services and sockets (boring stuff).

The services array, as I hinted previously, houses all your Workers-related services. First is our worker const, registered under the name main so we can reference it from our sockets config. Then we add a kv service. Luckily, workerd supports backing a service with a directory on disk, which we use as KV storage: each key becomes a file name and the value is stored as the file's contents. We set the path to kv, make it writable, and disable allowDotfiles.
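
To picture the key-to-file mapping, here is a toy in-memory model of it. This is emphatically not how workerd implements its disk service; ToyDiskKV and its methods are hypothetical names, and the dotfile handling is my assumption that, with allowDotfiles disabled, keys starting with a dot are simply refused.

```typescript
// Toy model only: an in-memory "directory" standing in for the kv/ folder.
// The real work is done by workerd's disk service, not by code like this.
class ToyDiskKV {
  private files = new Map<string, string>()
  private root: string
  private allowDotfiles: boolean

  constructor(root: string, allowDotfiles: boolean) {
    this.root = root
    this.allowDotfiles = allowDotfiles
  }

  // Map a key to the file path it would occupy under the root directory.
  pathFor(key: string): string {
    return `${this.root}/${key}`
  }

  // Assumption: dot-prefixed keys are refused when allowDotfiles is false.
  put(key: string, value: string): boolean {
    if (!this.allowDotfiles && key.startsWith('.')) return false
    this.files.set(this.pathFor(key), value)
    return true
  }

  get(key: string): string | null {
    if (!this.allowDotfiles && key.startsWith('.')) return null
    return this.files.get(this.pathFor(key)) ?? null
  }
}

const disk = new ToyDiskKV('kv', false)
disk.put('test', 'https://bignutty.xyz')
console.log(disk.pathFor('test')) // kv/test
```

In the real setup, the equivalent of pathFor('test') is a file called test inside the kv directory next to config.capnp, which you can open and inspect after creating a slug.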

The sockets array, while not very interesting, defines how your services are exposed, and it can serve multiple workers on different ports. We name our socket "http" and have it listen on port 8000 on all interfaces (*:8000). The http = () field is left at its defaults, which is nothing much unless you want to expose it remotely. Finally, we bind the socket to our main service.

And we are pretty much done now.

Starting up

Run bun run build to build our worker, then run bun start to start workerd, which will load the worker code that was just built. Now our worker is running and we can test it with any HTTP tool like curl, HTTPie, Hoppscotch, etc. I'll use xh.

To create a slug called test:

sh
λ xh :8000/create slug=test destination="https://bignutty.xyz"
HTTP/1.1 200 OK
Content-Length: 46
Content-Type: text/plain;charset=UTF-8

Created redirect: test -> https://bignutty.xyz

And to test it:

sh
λ xh :8000/test
HTTP/1.1 302 Found
Content-Length: 0
Location: https://bignutty.xyz
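
If you would rather script these requests than use an HTTP client, the /create call boils down to a JSON POST. Here is a small sketch that builds the arguments you would hand to fetch. buildCreateRequest is a hypothetical helper of mine, and http://localhost:8000 assumes the socket config from earlier.

```typescript
// Hypothetical helper: builds the arguments for fetch() that match what
// `xh :8000/create slug=... destination=...` sends (a JSON POST body).
interface CreateRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

function buildCreateRequest(baseUrl: string, slug: string, destination: string): CreateRequest {
  return {
    // Strip trailing slashes so we never produce "//create".
    url: `${baseUrl.replace(/\/+$/, '')}/create`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ slug, destination }),
    },
  }
}

// Usage against a running workerd instance (assumes port 8000 as configured):
// const { url, init } = buildCreateRequest('http://localhost:8000', 'test', 'https://bignutty.xyz')
// const res = await fetch(url, init)
```

Keeping the request building separate from the fetch call means you can check the URL and body without the server running.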

And finally, if you really care, you can compile all this into a binary (though a fat binary, over 100MB!):

sh
λ bunx workerd compile config.capnp > worker
λ ./worker

There you have it! Your very own Cloudflare Workers, running using the core workerd runtime (that totally isn't the same thing as wrangler, I swear). I hope you enjoyed this blog post as much as I did.