Stories by Llorx on Medium

Your Node.js Benchmarks Are (Probably) Invalid

Llorx — Tue, 04 Nov 2025 16:40:06 GMT

⚙️ The Problem with Node.js Benchmarks

Node.js and its engine, V8, perform a lot of behind-the-scenes optimizations. That “magic” makes your JavaScript run fast — but it can also completely distort your benchmark results.

When I was developing pacopack, I wanted to squeeze every bit of performance out of Node.js so I used benchmark.js to measure critical parts of the code.

Then something weird happened: I simply reordered the tests, and the results changed. Significantly.

wut

🧙‍♂️ V8’s Hidden Optimizations

Depending on what you’re testing, Node.js may reuse internal caches, and V8 applies just-in-time (JIT) optimizations.

This means one test can influence the next — what I like to call optimization pollution.

At first, I tried running each test individually to isolate them. It worked… but it broke the convenience of using benchmark suites. So I built a solution that handled isolation automatically.

That’s how iso-bench was born.

🧩 Example

Let’s start with a simple benchmark suite using benchmark.js:

import Benchmark from "benchmark";
// The test functions
const functions = {
  read: function (buf) {
    return buf.readUint8(0);
  },
  direct: function (buf) {
    return buf[0];
  },
  read_again: function (buf) {
    return buf.readUint8(0);
  },
};
// Prepare buffers
const buffers = new Array(1000).fill(0).map(() => {
  const buf = Buffer.allocUnsafe(1);
  buf[0] = Math.floor(Math.random() * 0xff);
  return buf;
});
// The benchmark suite
const suite = new Benchmark.Suite();
for (const [type, fn] of Object.entries(functions)) {
  suite.add(`${type}`, () => {
    for (let i = 0; i < buffers.length; i++) {
      fn(buffers[i]);
    }
  });
}
suite.on("cycle", (event) => console.log(String(event.target))).run({ async: true });

The function read_again is identical to read — the only difference is that it runs after direct.

Results:

read       x 1,531,039
direct     x 216,381
read_again x 312,087 // slower than "read"? It’s the same code!

🤨 Why is the same function suddenly slower?

🔁 Changing the Order Changes Everything

If we just swap the order and put direct first:

const functions = {
  direct: function (buf) { // Now "direct" goes before "read"
    return buf[0];
  },
  read: function (buf) {
    return buf.readUint8(0);
  },
  read_again: function (buf) {
    return buf.readUint8(0);
  },
};

It yields:

direct     x 1,599,998 // "direct" is 7x faster than before?
read       x 163,890   // slower than before?
read_again x 184,126   // slower than before too??

WHAT.

The only thing that changed was the order of the tests, not the code itself. This is the “optimization pollution” I was talking about.

🧱 The Fix: Run Benchmarks in Isolation

iso-bench solves this by running each benchmark in a separate process, keeping them fully isolated from one another.

Example:

import { IsoBench } from "iso-bench";
// The test functions
const functions = {
  read: function (buf) {
    return buf.readUint8(0);
  },
  direct: function (buf) {
    return buf[0];
  },
  read_again: function (buf) {
    return buf.readUint8(0);
  },
};
// Prepare buffers
const buffers = new Array(1000).fill(0).map(() => {
  const buf = Buffer.allocUnsafe(1);
  buf[0] = Math.floor(Math.random() * 0xff);
  return buf;
});
// The benchmark suite
const bench = new IsoBench();
for (const [type, fn] of Object.entries(functions)) {
  bench.add(`${type}`, () => {
    for (let i = 0; i < buffers.length; i++) {
      fn(buffers[i]);
    }
  });
}
bench.consoleLog().run();

It will yield these results:

read       - 1,709,528 op/s
direct     - 1,713,134 op/s
read_again - 1,701,391 op/s

Which makes way more sense.

🧠 How It Works

Here’s what happens under the hood:

The main process sets up the suite but doesn’t execute the test functions directly.
For each test, a new Node.js process is spawned.
Each process runs the same entry point and is told which single test to run, so that test executes alone in that process.
The results are sent back to the main process for aggregation.
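
To make the idea concrete, here is a minimal sketch of process-level isolation. It is not how iso-bench is implemented: the `benchInFreshProcess` helper is hypothetical, and instead of re-running the same entry point it passes the function source to a fresh `node -e` child, which is a simplification that only works for self-contained functions.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: times a function (given as source text) inside a
// brand-new Node.js process, so no JIT state is shared between benchmarks.
function benchInFreshProcess(fnSource, durationMs = 200) {
  const childScript = `
    const fn = (${fnSource});
    const buf = Buffer.from([42]);
    let ops = 0;
    const start = process.hrtime.bigint();
    const deadline = start + BigInt(${durationMs}) * 1000000n;
    while (process.hrtime.bigint() < deadline) {
      for (let i = 0; i < 1000; i++) fn(buf); // batch calls between clock reads
      ops += 1000;
    }
    const elapsedSeconds = Number(process.hrtime.bigint() - start) / 1e9;
    process.stdout.write(JSON.stringify({ opsPerSecond: Math.round(ops / elapsedSeconds) }));
  `;
  // No shell is involved: the script is passed as a single argument to `node -e`.
  const child = spawnSync(process.execPath, ["-e", childScript], { encoding: "utf8" });
  if (child.status !== 0) {
    throw new Error(`benchmark process failed: ${child.stderr}`);
  }
  return JSON.parse(child.stdout); // the "main process" collects the result
}

const tests = {
  read: (buf) => buf.readUint8(0),
  direct: (buf) => buf[0],
};
const results = {};
for (const [name, fn] of Object.entries(tests)) {
  results[name] = benchInFreshProcess(fn.toString());
}
console.log(results);
```

Because every test gets its own process, the order of the entries in `tests` no longer matters.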

This ensures that each test runs in a completely clean context, unaffected by V8 optimizations or cache warmups from previous tests.

And as always — it’s lightweight and has zero dependencies. 🙂

Install it with:

npm install iso-bench

👉Documentation: https://github.com/Llorx/iso-bench

🧭 TL;DR

If you’re benchmarking Node.js code, don’t trust your numbers blindly. V8’s optimizations can change your results depending on test order.

Use iso-bench to run each benchmark in isolation and finally get results you can believe, without second-guessing whether optimizations are skewing them.

How I Created My Own Testing Framework — An Opinionated Way of Testing

Llorx — Sat, 21 Jun 2025 09:53:22 GMT

Motivation

When I started doing tests — although they were big monolithic tests doing multiple actions and assertions under the same test — I loved testing. I could implement a full class without running any other part of the software, and when I connected the class to the remaining code, it worked on the first try. That feeling is awesome!

Me implementing tests

But when I had to modify or add tests, it was a bit painful. Something was off. Testing should make your life easier, not harder.

After reading a lot about testing, I stumbled upon this incredible guide about the Arrange-Act-Assert pattern.

I loved the premise: make tests easy to process. Define explicit sections for arranging the test, doing an action, and asserting the result. So, I started adding // Arrange, // Act, and // Assert comments all around my tests.

I had mixed feelings doing this. Something was still off. I didn’t like using comments as section dividers. In my opinion, comments should explain code that is difficult to understand by itself. I read comments only when I don’t fully understand something, so my brain subconsciously ignores them until needed. It still wastes brain cycles to read the test and understand its sections.

With this in mind, arrange-act-assert was born.

Optimize Brain Cycles

Another thing I love is good practices enforced by software design, so I needed a library that forced me to create explicit arrange, act, and assert sections — rather than relying on optional comments.

Take this test, for example, written using the Node.js test runner:

import test from "node:test";
import Assert from "node:assert";

import { MyFactory } from "./MyFactory";
import { MyBase } from "./MyBase";

test("should do that thing properly", () => {
    const baseOptions = {
        a: 1,
        b: 2,
        c: 3,
        d: 4
    };
    const base = new MyBase(baseOptions);
    test.after(() => base.close());
    base.setData("a", 2);
    const factory = new MyFactory();
    test.after(() => factory.dispose());
    const processor = factory.getProcessor();
    const data = processor.processBase(base);
    Assert.deepStrictEqual(data, {
        a: 2,
        b: 27
    });
});

You’ll notice how you have to spend brain cycles to understand what’s going on. Whether it takes many or few, some are wasted.

To improve this, I added // Arrange, // Act, and // Assert comments:

test("should do that thing properly", () => {
    // Arrange
    const baseOptions = { a: 1, b: 2, c: 3, d: 4 };
    const base = new MyBase(baseOptions);
    test.after(() => base.close());
    base.setData("a", 2);
    const factory = new MyFactory();
    test.after(() => factory.dispose());
    const processor = factory.getProcessor();

    // Act
    const data = processor.processBase(base);

    // Assert
    Assert.deepStrictEqual(data, { a: 2, b: 27 });
});

This helps differentiate the sections and even discourages mistakes like mixing the Act and Assert steps:

Assert.deepStrictEqual(processor.processBase(base), {...}); // ❌ Bad

But I still had to look for the comments. Worse, I could still do weird things — like add multiple acts in a single test.

Enforcing Better Structure

So, I created a library where that same test looks like this:

import test from "arrange-act-assert";
import Assert from "node:assert";

import { MyFactory } from "./MyFactory";
import { MyBase } from "./MyBase";

test("should do that thing properly", {
    ARRANGE(after) {
        const baseOptions = { a: 1, b: 2, c: 3, d: 4 };
        const base = after(new MyBase(baseOptions), base => base.close());
        base.setData("a", 2);
        const factory = after(new MyFactory(), factory => factory.dispose());
        const processor = factory.getProcessor();
        return { base, processor };
    },
    ACT({ base, processor }) {
        return processor.processBase(base);
    },
    ASSERT(data) {
        Assert.deepStrictEqual(data, { a: 2, b: 27 });
    }
});

Yes, I hear you: “Ugh, those uppercase section names!” But that’s the whole point: they’re visible, they’re obvious, THEY’RE UPPERCASE — so your brain spends almost zero cycles identifying them.

I even considered adding a lowercase variant, but this is an opinionated testing library — and in my opinion, uppercase is way more noticeable.

This structure also helps you clearly distinguish the method you’re testing (processBase() in ACT) from the expected result ({ a: 2, b: 27 } in ASSERT).

Also, the library enforces one ACT per test. That’s a design feature to encourage cleaner, single-purpose tests.

Smarter Resource Cleanup

Consider this code using Node’s test runner:

test("should do that thing properly", () => {
    const base = new MyBase();
    const factory = new MyFactory();
    test.after(() => {
        base.close();
        factory.dispose();
    });
    [...]
});

If base.close() fails, then factory.dispose() will never be called. That’s a leak!

With arrange-act-assert, the design nudges you toward a single cleanup callback for each resource:

test("should do that thing properly", {
    ARRANGE(after) {
        const base = after(new MyBase(), base => base.close());
        const factory = after(new MyFactory(), factory => factory.dispose());
        [...]
    },
    [...]
});

Each disposable is tightly coupled to its cleanup logic, reducing the chance of unintended leaks.
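
To show why one callback per resource matters, here is a minimal sketch of an `after`-style registry that keeps running the remaining cleanups when one of them throws. It is an illustration only, not arrange-act-assert's actual code; `createCleanupRegistry` and `runAll` are hypothetical names.

```javascript
// Hypothetical per-resource cleanup registry (not the library's implementation).
function createCleanupRegistry() {
  const cleanups = [];
  return {
    // Registers `resource` together with its cleanup and returns the resource.
    after(resource, cleanup) {
      cleanups.push(() => cleanup(resource));
      return resource;
    },
    // Runs cleanups in reverse registration order, collecting errors instead of stopping.
    runAll() {
      const errors = [];
      while (cleanups.length > 0) {
        const cleanup = cleanups.pop();
        try {
          cleanup();
        } catch (error) {
          errors.push(error);
        }
      }
      return errors;
    },
  };
}

const registry = createCleanupRegistry();
const log = [];
registry.after({ name: "base" }, (base) => log.push(`${base.name} closed`));
registry.after({ name: "factory" }, () => {
  throw new Error("dispose failed");
});
const cleanupErrors = registry.runAll();
console.log(log, cleanupErrors.length); // the base is still closed even though dispose threw
```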

Human-Error-Safe Snapshots

Tests exist because humans make mistakes — so the tools should also be designed to reduce human errors.

With this in mind, I introduced a new approach to snapshot testing: one that forces validation and reduces accidental acceptance of wrong outputs.

Snapshot testing is pretty straightforward:

test("should do that thing properly", {
    ARRANGE(after) {
        [...]
        return whatever;
    },
    SNAPSHOT(whatever) {
        return whatever.methodToSnapshot();
    }
});

How is this safer?

On the first run, the library creates an unconfirmed snapshot. If you run the test again as-is, it fails because the snapshot is still unconfirmed, so you must run it with --confirm-snapshots to explicitly confirm it. In that run, the library asserts the output against the unconfirmed snapshot and creates a confirmed snapshot.

You can’t confirm and create a snapshot in the same run, and you can’t confirm a snapshot if it returns a different result than the previous unconfirmed one. This two-pass design drastically reduces the chance of forgetting to validate what was snapshotted.
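
The two-pass flow can be modeled as a tiny state machine. The sketch below is a hypothetical in-memory model of the rules described above, not the library's implementation: `checkSnapshot`, the `Map` store, and the returned status strings are all assumptions made for illustration.

```javascript
// Hypothetical in-memory model of the two-pass snapshot rules.
function checkSnapshot(store, name, actual, { confirm = false } = {}) {
  const value = JSON.stringify(actual);
  const saved = store.get(name);
  if (saved === undefined) {
    // First run: record the output. It is never confirmed in the same run,
    // even if the confirm flag was passed.
    store.set(name, { value, confirmed: false });
    return "created-unconfirmed";
  }
  if (saved.value !== value) {
    // A different result can neither pass nor be confirmed.
    throw new Error(`snapshot "${name}" does not match the stored output`);
  }
  if (!saved.confirmed) {
    if (!confirm) {
      throw new Error(`snapshot "${name}" is still unconfirmed`);
    }
    saved.confirmed = true;
    return "confirmed";
  }
  return "matched";
}

const store = new Map();
const steps = [];
steps.push(checkSnapshot(store, "thing", { a: 1 }, { confirm: true }));
try {
  checkSnapshot(store, "thing", { a: 1 });
} catch (error) {
  steps.push("failed-unconfirmed");
}
steps.push(checkSnapshot(store, "thing", { a: 1 }, { confirm: true }));
steps.push(checkSnapshot(store, "thing", { a: 1 }));
console.log(steps.join(" -> "));
```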

Addendum

That’s pretty much it! It also supports code coverage, parallelism, and, as always, has zero dependencies 😉.

If you like the Arrange-Act-Assert pattern, I’d love for you to try it out.

👉 GitHub Repository
📦 npm install arrange-act-assert

How I created this Crash-Safe Persistence System in Node.js — Avoid this common practice

Llorx — Wed, 28 May 2025 23:31:20 GMT

How I created this Crash-Safe Persistence System in Node.js — Avoid this common practice (With animations 😊)

The problem

As a developer, you may find yourself (if it hasn’t happened already) needing to store a small cache or some kind of local state that must persist across runs. The usual approach to store data is to use a database, but sometimes that’s overkill, especially if you just need to store a little bit of data. In these cases, you might end up just saving the data in a local file and rewriting that file when the data changes. And this is the flaw.

The flaw

Imagine that you manage a parking lot. Someone arrives with an orange car and wants to park it, so you send him to slot 1, he parks, and leaves. Later, this same person returns with a blue car and wants to switch vehicles. You tell him to unpark the orange car and then park the blue car in the same slot 1. In the best-case scenario, this will work, but if something happens while he is unparking the orange car (the worker on duty suddenly has to leave, for example), the new worker who takes over will find no cars in any slot and will have no idea what the situation was.

That’s exactly what can happen when you overwrite data in a file. If a crash or power outage occurs mid-write, you can end up with a partially written file, which is often unreadable. Best-case scenario? You detect the corruption and reset the file, only losing your data. Worst-case? You read a mix of old and new data, leading to a false sense of correctness.

The solution

Continuing the parking lot analogy, one common way to overcome this is to park the blue car in slot 2, and only when you are sure that the blue car is correctly parked, unpark the orange car. This is known as copy-on-write.

Actually, there’s more to it: we need integrity checks to avoid loading invalid data, versioning to avoid duplicates, space reclaiming to avoid files growing forever, and, last but not least, data relocation to shrink this reclaimed space if needed. All these methods are applied by this Node.js persistency library and will be explained in detail in this post.

The details

The main feature to avoid data corruption during a write is to apply copy-on-write. Instead of overwriting existing data, we always append a new version:

import * as Fs from "node:fs/promises";

const FILE = "data.bin"; // Illustrative path; use whatever file holds your data

async function appendData(key: string, value: any) {
    const data = Buffer.from(JSON.stringify({ key, value }));
    const dataSize = Buffer.allocUnsafe(4);
    dataSize.writeUint32BE(data.length);
    await Fs.appendFile(FILE, dataSize); // Save the data size as 4 bytes
    await Fs.appendFile(FILE, data); // Save the data
}
await appendData("blue", { value: "X" }); // write "blue"
await appendData("red", { value: "Y" });  // write "red"
await appendData("blue", { value: "Z" }); // write a new "blue" instead of overwriting the old one

But because the code will get more complex over time, I’m going to use visuals instead. This script can be visualized like so:

With this simple trick, if the last blue “value: Z” write fails, you still have the previous blue “value: X” entry untouched. I’m sure that you’ve noticed a big problem with this: The data file will grow infinitely, but worry not, as we are going to implement a space reclaiming system.
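
Reading such a file back could look like the sketch below. It works on an in-memory buffer to stay self-contained, and `encodeEntry` and `loadEntries` are hypothetical helpers that mirror the length-prefixed format used by `appendData`: later entries replace earlier ones, and a truncated tail (an interrupted last write) is simply ignored.

```javascript
// Encodes one entry as a 4-byte big-endian length followed by the JSON payload.
function encodeEntry(key, value) {
  const data = Buffer.from(JSON.stringify({ key, value }));
  const size = Buffer.allocUnsafe(4);
  size.writeUInt32BE(data.length);
  return Buffer.concat([size, data]);
}

// Replays the log in order: later entries win, and a cut-off tail is skipped.
function loadEntries(file) {
  const entries = new Map();
  let offset = 0;
  while (offset + 4 <= file.length) {
    const size = file.readUInt32BE(offset);
    const end = offset + 4 + size;
    if (end > file.length) break; // the last write never finished
    const { key, value } = JSON.parse(file.subarray(offset + 4, end).toString());
    entries.set(key, value);
    offset = end;
  }
  return entries;
}

const log = Buffer.concat([
  encodeEntry("blue", { value: "X" }),
  encodeEntry("red", { value: "Y" }),
  encodeEntry("blue", { value: "Z" }),
]);
const complete = loadEntries(log);
const interrupted = loadEntries(log.subarray(0, log.length - 3)); // simulate a crash mid-write
console.log(complete.get("blue"), interrupted.get("blue"));
```

With the full log, "blue" resolves to the "Z" entry; with the truncated log, it falls back to the untouched "X" entry.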

After we ensure that the new blue “value: Z” entry is fully written, we reclaim the old blue “value: X” space, so when new data is added, it overwrites the unneeded old blue “value: X” entry:

But, how do we ensure that the data is written to the storage device? That’s a simple question but has a pretty complex answer, so we are going to summarize it with two features: fsync and time.

When you tell the OS to write data to the storage device, the OS will cache this data and send it to the storage device at some point in the future. To overcome this, we signal the OS to flush the data to the storage device with Linux’s fsync or Windows’ FlushFileBuffers, both of which Node.js wraps in its fs.fsync() API.
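
As a minimal sketch, a write followed by a flush could look like this. It uses the synchronous `fs` variants for brevity, writes to a temporary directory, and `appendDurably` is a hypothetical helper name, not part of any library.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Appends `data` and asks the OS to flush its cache for this file to the device.
function appendDurably(filePath, data) {
  const fd = fs.openSync(filePath, "a");
  try {
    fs.writeSync(fd, data);
    fs.fsyncSync(fd); // without this, the data may sit in the OS cache for a while
  } finally {
    fs.closeSync(fd);
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "fsync-demo-"));
const file = path.join(dir, "data.bin");
appendDurably(file, Buffer.from("hello"));
appendDurably(file, Buffer.from(" world"));
const contents = fs.readFileSync(file, "utf8");
fs.rmSync(dir, { recursive: true, force: true }); // clean up the demo directory
console.log(contents);
```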

But there’s a twist: Storage devices usually have an internal cache. Even after flushing the data, they might delay writing data physically. That’s very device-specific and different manufacturers offer different methods to avoid losing this cache, like using a capacitor to store enough power to flush the internal cache during a power outage. But still, to be safe, we assume the worst. The only way to ensure that it’s safe to reclaim old data is with time. When we want to reclaim old data, we can wait a specified amount of time to ensure that storage devices have finally persisted the new data. After some reading, I think that 15 minutes is, by far, way more than needed to assume the data is safely persisted, even on older or poorly designed devices:
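
One way to model the reclaim delay is a queue of superseded regions that only become reusable once enough time has passed. This is a sketch under assumptions: the `ReclaimQueue` class and its field names are hypothetical, and timestamps are passed in explicitly so the logic is easy to follow.

```javascript
const RECLAIM_DELAY_MS = 15 * 60 * 1000; // the 15 minutes suggested above

class ReclaimQueue {
  constructor(delayMs = RECLAIM_DELAY_MS) {
    this.delayMs = delayMs;
    this.pending = []; // { offset, size, supersededAt }, oldest first
  }

  // Call this once the newer version of an entry has been written and flushed.
  markSuperseded(offset, size, now) {
    this.pending.push({ offset, size, supersededAt: now });
  }

  // Returns (and removes) the regions that are old enough to be overwritten.
  takeReclaimable(now) {
    const ready = [];
    while (this.pending.length > 0 && now - this.pending[0].supersededAt >= this.delayMs) {
      ready.push(this.pending.shift());
    }
    return ready;
  }
}

const queue = new ReclaimQueue();
const t0 = 0;
queue.markSuperseded(0, 32, t0); // the old "blue" entry, replaced at t0
const tooEarly = queue.takeReclaimable(t0 + 60 * 1000); // one minute later
const afterDelay = queue.takeReclaimable(t0 + RECLAIM_DELAY_MS); // fifteen minutes later
console.log(tooEarly.length, afterDelay.length);
```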

But, what can we do with this empty space? Just fill it with new entries? But what if you deleted a large section and don’t expect to write that much? Worry not again, as we can implement a resilient compacting system by following the same principles as before.

When a space is reclaimed, either because the reclaim delay has elapsed or because the user deleted the data directly, we can move entries from the end of the storage into this free space and then truncate the storage to reduce its size.

But we don’t just move the data. To avoid data loss during compaction we need to follow the same principles defined above, so first we copy the last entry to the free location, and only after we are sure that the data is written (reclaim delay), we can free the old entry that we copied:

When deleting entries, keep in mind that you’re also removing all of their previous versions that haven’t yet passed the reclaim delay:

But, what would happen if I’m overwriting a reclaimed space and a power outage happens? I will read mixed data when I open the file again! Yes, that’s why we added integrity checks. Every time we store data, a hash is computed and stored alongside the data, so when we load the data we will know if the data is valid. If any invalid data is found, compaction is applied:
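
A minimal sketch of such an integrity check follows. The hash algorithm (SHA-256) and the record layout (hash first, then payload) are assumptions made for illustration; the text above does not say which hash or layout the library uses.

```javascript
const { createHash } = require("node:crypto");

const HASH_BYTES = 32; // size of a SHA-256 digest

// Stores the hash of the payload in front of the payload itself.
function seal(payload) {
  const hash = createHash("sha256").update(payload).digest();
  return Buffer.concat([hash, payload]);
}

// Returns the payload, or null when the stored hash does not match it.
function unseal(record) {
  if (record.length < HASH_BYTES) return null;
  const storedHash = record.subarray(0, HASH_BYTES);
  const payload = record.subarray(HASH_BYTES);
  const actualHash = createHash("sha256").update(payload).digest();
  return storedHash.equals(actualHash) ? payload : null;
}

const record = seal(Buffer.from("blue=Z"));
const intact = unseal(record);

const torn = Buffer.from(record); // copy, then damage the last byte
torn[torn.length - 1] ^= 0xff; // simulates a half-overwritten entry
const damaged = unseal(torn);

console.log(intact.toString(), damaged);
```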

When entries are moved or rewritten, how do we know which version is the newest? Each entry has a version number. When loading, we simply keep the entry with the highest version (and then we apply compaction, as always):
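
Picking the newest version while loading can be sketched as follows. The `{ key, version, value }` entry shape and the `resolveLatest` name are hypothetical; the entries are assumed to arrive in file order.

```javascript
// Keeps, for every key, the entry with the highest version number.
function resolveLatest(entries) {
  const latest = new Map();
  for (const entry of entries) {
    const current = latest.get(entry.key);
    // Strict ">" means the first entry seen wins when versions are equal.
    if (current === undefined || entry.version > current.version) {
      latest.set(entry.key, entry);
    }
  }
  return latest;
}

const loaded = resolveLatest([
  { key: "blue", version: 3, value: "Z" }, // moved to the front by compaction
  { key: "red", version: 2, value: "Y" },
  { key: "blue", version: 1, value: "X" }, // stale copy not yet reclaimed
]);
console.log(loaded.get("blue").value, loaded.get("red").value);
```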

After all this, we still have a small problem: each entry could have a different size, so when we save our data, we need to somehow save the data size so we know where this entry ends and the next entry begins. But what would happen if an entry is corrupted? We won’t know the entry size, so we won’t know where the next entry begins and we will lose all the entries after this corrupted one:

To overcome this situation, we have two files: one for the metadata, where each entry has a fixed size, and another for the data that just stores bytes:

If an entry in the metadata file is invalid, we just skip ahead by its fixed size to encounter the next entry, ensuring we don’t lose everything after it:
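
Here is a sketch of scanning fixed-size metadata records and skipping a corrupted one. The 16-byte layout (4-byte checksum, then version, data offset, and data size) and the truncated SHA-256 checksum are assumptions for illustration, not the library's actual format.

```javascript
const { createHash } = require("node:crypto");

const RECORD_SIZE = 16; // assumed layout: checksum(4) + version(4) + dataOffset(4) + dataSize(4)

// Illustrative checksum: the first 4 bytes of a SHA-256 digest.
function checksum(body) {
  return createHash("sha256").update(body).digest().subarray(0, 4);
}

function encodeRecord({ version, dataOffset, dataSize }) {
  const body = Buffer.alloc(12);
  body.writeUInt32BE(version, 0);
  body.writeUInt32BE(dataOffset, 4);
  body.writeUInt32BE(dataSize, 8);
  return Buffer.concat([checksum(body), body]);
}

// Walks the file one fixed-size slot at a time, so a bad slot never hides the next one.
function scanRecords(file) {
  const valid = [];
  for (let offset = 0; offset + RECORD_SIZE <= file.length; offset += RECORD_SIZE) {
    const stored = file.subarray(offset, offset + 4);
    const body = file.subarray(offset + 4, offset + RECORD_SIZE);
    if (!stored.equals(checksum(body))) continue; // corrupted slot: skip exactly one record
    valid.push({
      version: body.readUInt32BE(0),
      dataOffset: body.readUInt32BE(4),
      dataSize: body.readUInt32BE(8),
    });
  }
  return valid;
}

const metadata = Buffer.concat([
  encodeRecord({ version: 1, dataOffset: 0, dataSize: 10 }),
  encodeRecord({ version: 2, dataOffset: 10, dataSize: 20 }),
  encodeRecord({ version: 3, dataOffset: 30, dataSize: 5 }),
]);
metadata.fill(0xff, RECORD_SIZE, RECORD_SIZE + 8); // corrupt the middle record
const survivors = scanRecords(metadata).map((record) => record.version);
console.log(survivors);
```

Records 1 and 3 are still found because the scanner always advances by `RECORD_SIZE`, regardless of what the corrupted slot contains.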

Having two files, compaction is handled individually on each file. To compact the data file we move the data, create a new entry and reclaim the previous one:

Sometimes we end up with a space in the metadata file (not the data file) that we need to compact without moving any data. It’s easier, as we only need to copy the last metadata entry pointing to the very same data location. When two metadata entries with the same version are found, the leftmost one takes precedence:

The conclusion

With all this, we’ve built a very resilient local persistence system with:

💾 Data durability to avoid losing information in worst-case scenarios.
🛡️ Data integrity to avoid loading invalid entries.
♻️ Space reclaiming to avoid infinite growth.
📉 Automatic compaction to shrink the file sizes when needed.

You can find all of these features implemented in Node.js, without dependencies 😊:

👉 https://github.com/Llorx/persistency

npm install persistency

There are other approaches too, like write-ahead logging (WAL). Maybe I’ll create a Node.js “persistency-wal” library and a detailed post on how it works 😉.