chearon.net

A few years ago I was reading about Chrome’s new layout architecture when a paragraph stopped me in my tracks:

Rather than a tree structure with boxes and pointers, we represent inline content in a flat list representing the tree. The primary benefit is that a flat list representation for inlines is fast, useful for inspecting or querying inline data-structures, and memory efficient.

I knew from working with contentEditable and native text selection APIs that tree data structures are difficult to use for text, which is naturally more linear. They continue:

The flat list is created for each inline formatting context in the order of a depth-first search of its inline layout subtree. Each entry in the list is a tuple of (object, number of descendants).

This sounded great to me, because a visit of every node of the tree in pre-order is a simple for loop over the whole list! I will show that loop later on.

Even better is the lack of pointers. If a tree can be represented as a flat list of nodes who need only an integer to indicate their descendants, then the tree can be freed in one call, there are no pointers that can be dangling, and only one lifetime to worry about.

Not only is the memory layout compact, which is good for CPU cache, it also has a minimal footprint, since each node has only one integer to track children instead of a child list or pointers, both of which have much higher memory costs multiplied by the number of nodes. It’s data-oriented design for tree data structures.

Dropflow is written in TypeScript, so I’m not concerned with lifetimes, but I have been reducing its memory footprint aggressively while preparing data structures for a future port to Zig¹.

Now, Chrome is not using this representation for the CSS Box tree. They use three pointers for parent, next, and previous, as well as a vector for children. The flat data structure is only used for the layout output. But I ended up using this data structure basically everywhere.

Pre-order visitation

Let’s say you have the following HTML and you pass it into dropflow:

<html><!-- Box 1 -->
  <div><!-- Box 2 -->
    <!-- Box 3 is created anonymously and starts here -->
      <em>I'm in box 4!</em>
      <strong>I'm in box 5!</strong>
    <!-- /Box 3 -->
  </div>
  <div></div> <!-- Box 6 -->
</html>

If you read the document start to finish and note only the opening tags, you’ve noted them in pre-order visitation. There is also an implicit box opened after a block-level element (<div>) opens before inline content (<em>)².

Go down the tree in the following diagram and you’ll get the same order, jumping back up through the parents to finish their children whenever you finish a child list:

If box 3 had a sibling after it, then it would be the 6th box, and box 6 would become box 7. Let’s move on to the data structures.

The Baseline Data Structure

Up until recently the data structure dropflow used to represent a box looked like the following.

class Box {
  id: string;
  children: Box[];
}

The id is an instance counter used for logging. I’ve long wanted this id to increase relative to the document itself, so that when the layout is regenerated, the ids are the same as last time. This will be relevant later.

Note that there are no parent or sibling pointers, because they can both be inferred. A design principal I follow is to never store redundant information.

Using our previous example, the tree would have been stored like this:

Box([ // Box 1 (<html>)
  Box([ // Box 2 (<div>
    Box([ // Box 3 (anonymous)
      Box(['I\'m in box 4!']),
      Box(['I\'m in box 5!'])
    ])
  ]),
  Box([]) // Box 6 (<div>)
])

Observations: that’s 6 arrays, one per box.

The Baseline Iterator

Dropflow usually iterates with a stack instead of recursion for better performance:

const stack: (Box | {sentinel: true})[] = [subject];
const parents: Box[] = [];

while (stack.length) {
  const item = stack.pop()!;

  if ('sentinel' in item) {
    const parent = parents.pop()!;
    // process parent post-order
  } else {
    // process item pre-order
    parents.push(item);
    stack.push({sentinel: true});
    for (let i = box.children.length - 1; i >= 0; i--) {
      stack.push(box.children[i]);
    }
  }
}

Observations: I don’t like having to have the sentinel on the stack, and I don’t like the verbosity of pushing the children on the stack in the correct order.

The New Data Structure

To keep track of children in the one giant tree array, a Box needs at least one integer to mark the end of its descendants. What I ended up with is a slice into the tree array:

class Box {
  treeStart: number;
  treeFinal: number;
}

I had first tried to copy Google by using a length, but I ended up using an inclusive index, treeFinal, to mark the last child. You’ll see why in the next section.

The treeStart is not strictly necessary since you’ll have the index of the box from iterating the tree. However it makes it easier to have methods on the box which need to access children — they’d otherwise need to be passed their index. And it acts as an identifier: the old id property is both removed and improved: it stays stable across reflows!

Here’s how our example tree looks using this new data structure:

const tree = [
  Box(0, 5), // Box 1 (<html>)
    Box(1, 4), // Box 2 (<div>)
      Box(2, 4), // Box 3 (anonymous)
        Box(3, 3, 'I\'m in box 4!'),
        Box(4, 4, 'I\'m in box 5!'),
    Box(5, 5) // Box 6 (<div>)
];

This time, we only use one array. Walking this data structure is different than usual recursive or stack-based iteration:

Just iterate the list start to end to visit boxes in DFS pre-order.
If a box has no children, its treeStart and treeFinal are the same.
To skip over a child, just move to its treeFinal + 1.

Observations: we’ve removed an array instance per tree node. That can be huge. An empty array takes 88 bytes in Firefox, so a document with 100,000 empty boxes would take at least 8MiB in the old implementation and only 242KiB in the new implementation!³

The New Iterator

Now to show why I used an index instead of a length for treeFinal, which has to do with the parents stack. By using an end index, it’s very easy to compare the parent at the top of the stack with the current index:

const parents: Box[] = [];

for (let i = subject.treeStart; i <= subject.treeFinal; i++) {
  const box = tree[i];
  // process box pre-order

  parents.push(box);
  while (parents[parents.length - 1].treeFinal === i) {
    const parent = parents.pop()!;
    // process parent post-order
  }
}

Observations: I like this code. It doesn’t involve the awkward sentinel, it naturally iterates the children in logical order, and it’s smaller. It’s also faster, though not a lot faster in JS.

This being the real world, there are drawbacks to this solution:

Earlier, I mentioned how to skip over a node by moving to its treeFinal + 1. I found that in practice, it is easier to mess this up than the old way. If you’ve pushed onto parents, you have to be careful to do this before the inner while loop, or not push at all.
The tree (global list of Box) must be passed around to every function. This is sort of an anti-object-oriented approach, which to me is a good thing, but we do lose some convenience.

Implementation

I first tried to use this data structure only for inline content, while still having the Box[] list on the block parts of the tree. This ended up slowing down all of my performance tests significantly due switching iterators whenever we switch between block and inline content.

So I decided to make the most invasive change: use it everywhere. It took me several months, but in the end, CPU cycles went down on all of my performance tests.

I have one test that reflows and paints a 14,000 line syntax-highlighted JS file. This went from 68 MiB peak memory to 61 MiB, and from 3.77G cycles to 3.61G (about 880 ms to 857 ms) on my 2024 MacBook Pro. The CPU improvements are nothing to write home about, but the memory and foundation for a future port to Zig are looking pretty good!

Commit