> A lot of what you're saying is about the history of DOM. That's because my who...

bzbarsky · on Nov 1, 2016

> Calls to DOM from JS are a lot more expensive than calls inside the engine.

Sort of. Calls to DOM from JS are a few dozen machine instructions for the call itself in modern browsers; a little more if lots of arguments are involved. The slowness is what the DOM implementation actually has to _do_ as a result of the call. If you write a loop in which you repeatedly modify styles and then ask for geometry information, then the only options an implementation has are to provide stale geometry information (the virtual DOM approach!) or end up with that loop being a lot slower than asking for all the geometry information up front and then doing your modifications.

We can have a useful discussion about whether it should be possible to ask for stale geometry information, but that has nothing to do with calls from JS into the DOM per se; a DOM implemented in JS but exposing the same API as the current DOM would have _exactly_ the same problem in that regard.
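
To make the read/write interleaving point concrete, here is a toy sketch in plain JS. It is not a real browser: `ToyDocument`, `setWidth`, and `getOffset` are hypothetical names, and the model assumes layout is recomputed lazily on the first geometry read after a style change. It only counts how many layout passes each loop shape forces.

```javascript
// Toy model (not a real browser): layout is recomputed lazily, the first
// time geometry is read after a style change.
class ToyDocument {
  constructor(count) {
    this.widths = new Array(count).fill(100); // style input per box
    this.offsets = new Array(count).fill(0);  // computed geometry
    this.dirty = true;
    this.layoutPasses = 0;
  }
  setWidth(i, w) {
    this.widths[i] = w;
    this.dirty = true; // cheap: only marks layout as stale
  }
  getOffset(i) {
    if (this.dirty) this.layout(); // expensive: forces a full layout pass
    return this.offsets[i];
  }
  layout() {
    let x = 0;
    for (let i = 0; i < this.widths.length; i++) {
      this.offsets[i] = x;
      x += this.widths[i];
    }
    this.dirty = false;
    this.layoutPasses++;
  }
}

// Interleaved: modify a style, then ask for geometry, on every iteration.
function interleaved(doc) {
  for (let i = 0; i < doc.widths.length; i++) {
    doc.setWidth(i, 50);
    doc.getOffset(i);
  }
  return doc.layoutPasses;
}

// Batched: ask for all geometry up front, then do all the modifications.
function batched(doc) {
  const offsets = [];
  for (let i = 0; i < doc.widths.length; i++) offsets.push(doc.getOffset(i));
  for (let i = 0; i < doc.widths.length; i++) doc.setWidth(i, 50);
  return doc.layoutPasses;
}

const interleavedPasses = interleaved(new ToyDocument(1000));
const batchedPasses = batched(new ToyDocument(1000));
console.log(interleavedPasses, batchedPasses); // 1000 1
```

The toy "DOM" here is written entirely in JS and still pays one layout pass per iteration in the interleaved loop, which is the point being made: the cost comes from the work the API contract requires, not from crossing a JS/native boundary.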

maga · on Nov 2, 2016

Yes, indeed, DOM is doing a lot of heavy lifting and the cost of the call itself might be small in comparison. And you're raising a valid point with geometry changes which are indeed a terrible thing to do in a loop due to recalculations/reflows they cause.

What first got me thinking about the cost of DOM calls were benchmarks we did back when we were building polyfills for IE6-IE8 to support new HTML5 APIs. One example in particular: mimicking data attributes. Our lib would hold a mapping between DOM elements and attached data attributes (as simple objects), and it would be significantly faster than using native element.dataset. While dataset isn't as simple as a plain JS object, it's not much more complicated either, and to my knowledge changing it doesn't cause reflows or repaints, so I expected it to be slower but not by much; hence, I concluded that the additional speed came from avoiding the DOM call itself. Since then I've been cautious about DOM and tried to cache whatever came from it and was supposed to be used repeatedly.
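
A minimal sketch of that kind of element-to-data mapping, written for a modern engine. It assumes `WeakMap`, which IE6-IE8 did not have, so this is not the mechanism the original polyfill used; `data` and `fakeElement` are hypothetical names for illustration.

```javascript
// Side table from element -> plain object, kept outside the DOM.
const store = new WeakMap();

// Returns the plain-object "data bag" for an element, creating it on first use.
function data(el) {
  let bag = store.get(el);
  if (!bag) {
    bag = {};
    store.set(el, bag);
  }
  return bag;
}

// Any object can be a WeakMap key; a real element would work the same way.
const fakeElement = {};
data(fakeElement).userId = 42;
console.log(data(fakeElement).userId); // 42
```

Reads and writes here are ordinary property accesses on a plain object, and nothing is reflected back into the element's attributes.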

bzbarsky · on Nov 3, 2016

Ah, element.dataset is an interesting case. It's a _lot_ slower than normal DOM calls, because it needs to map a different underlying data structure: the data in the dataset needs to be reflected in the element attributes. And the set of names it exposes is not fixed.

So you have to implement it as a Proxy, so it can capture arbitrary property assignments, including for properties it doesn't have yet, and do the corresponding setAttribute calls. Unfortunately, once you're a Proxy your gets end up somewhat slow too. Partly this is because JITs haven't optimized proxies that much, and partly it's because they're rather hard to optimize in the best of circumstances.

I expect that the actual implementations of dataset are not as fast as they could be if they used scripted and inlinable proxy handlers _and_ the JITs had implementations for those. But there's still a lot more work involved in dataset than a plain object, especially if you have a small number of property names in practice so the plain object doesn't have to convert to dictionary mode or anything like that. Even if dataset were implemented on top of a pure-JS DOM implementation (such implementations exist), it would be a lot slower than just a simple object, unfortunately.
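
A rough sketch of what a Proxy-backed dataset might look like in pure JS. This is not any browser's actual implementation: `ToyElement`, `toAttrName`, and `makeDataset` are hypothetical, the name conversion is simplified, and a real dataset would also need `has`, `deleteProperty`, and `ownKeys` traps.

```javascript
// Hypothetical stand-in for an element: just an attribute map.
class ToyElement {
  constructor() {
    this.attrs = new Map();
  }
  getAttribute(name) {
    return this.attrs.has(name) ? this.attrs.get(name) : null;
  }
  setAttribute(name, value) {
    this.attrs.set(name, String(value));
  }
}

// Simplified conversion: "userId" -> "data-user-id".
function toAttrName(prop) {
  return 'data-' + prop.replace(/[A-Z]/g, (c) => '-' + c.toLowerCase());
}

// Every get and set goes through a trap, a name conversion, and an
// attribute call; nothing is stored on the proxy target itself.
function makeDataset(el) {
  return new Proxy({}, {
    get(_target, prop) {
      if (typeof prop !== 'string') return undefined;
      const value = el.getAttribute(toAttrName(prop));
      return value === null ? undefined : value;
    },
    set(_target, prop, value) {
      if (typeof prop !== 'string') return false;
      el.setAttribute(toAttrName(prop), value);
      return true;
    },
  });
}

const el = new ToyElement();
const dataset = makeDataset(el);
dataset.userId = 42;                          // arbitrary, previously unseen name
console.log(el.getAttribute('data-user-id')); // logs 42 (stored as the string "42")
el.setAttribute('data-role', 'admin');
console.log(dataset.role);                    // logs admin
```

Even in this stripped-down form, a single property read involves a trap call, a string conversion, and an attribute lookup, compared with one property access on a plain object.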

maga · on Nov 3, 2016

That explains it, along with the fact that getAttribute('data-*')/setAttribute() seem to be faster in microbenchmarks[1] than getting/setting properties on element.dataset. Thanks for clearing this up, I'd have never guessed that about dataset!

[1] https://jsperf.com/dataset-vs-getattribute-and-setattribute/...

bzbarsky · on Nov 3, 2016

Yeah, a get on a dataset has to do strictly more work than getAttribute: it has to convert the string it has to a "data-whatever" string and then call getAttribute.

But note that in microbenchmarks you are likely to get some confusing effects. For example, in Firefox this microbenchmark:

  document.getElementById('test').getAttribute('data-set')

will get optimized as follows:

1) getAttribute is known to be side-effect free when called with a string argument and its return value is not used, so the call can be dead-code eliminated.

2) getElementById is known to be side-effect free when called with a string argument and its return value is not used after step 1, so the call can be dead-code eliminated.

3) The get of the "document" property is known to be side-effect free and its return value is not used after step 2, so the get can be dead-code eliminated.

So in the end the microbenchmark is measuring how fast the browser can increment a loop counter, and that only because we haven't bothered to try dead-code eliminating that. That's why you get numbers in the billions of operations per second range (comparable to the CPU clock speed; always a dead giveaway that your thing got optimized out). ;)

Note that Firefox will also perform loop-hoisting on all of the above if possible, so even if the return value were assigned somewhere that would not matter: the whole thing would just get hoisted out of the loop.

The setAttribute and "dataset.set = stuff" benchmarks don't have these problems, because those operations are clearly not side-effect free. A sufficiently advanced JIT might be able to determine that earlier iteration assignments are dominated by later ones and eliminate them, but now we're talking quite hard work on the part of the browser.
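
For the read-only case, one common mitigation is to vary the input on each iteration and fold every result into a value that is used afterwards. The sketch below shows the idea with a plain `Map` standing in for an element; `bench`, `names`, and the stand-in `getAttribute` are hypothetical, and nothing here guarantees that a given JIT will not still find something to optimize away.

```javascript
// Hypothetical stand-in for an element with a few data attributes.
const attrs = new Map([
  ['data-a', 'one'],
  ['data-b', 'three'],
  ['data-c', 'eleven'],
]);
const names = [...attrs.keys()];
const getAttribute = (name) => attrs.get(name);

function bench(fn, iterations) {
  let sink = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    // The argument changes every iteration, so the call is not loop-invariant,
    // and the result feeds `sink`, so it is not trivially dead code.
    sink += fn(names[i % names.length]).length;
  }
  const elapsedMs = Date.now() - start;
  return { sink, elapsedMs };
}

const result = bench(getAttribute, 300000);
// Using `sink` after the loop keeps the measured work observable.
console.log(result.sink, result.elapsedMs + ' ms');
```

If the reported rate still lands near the CPU clock speed, the rule of thumb from the comment above applies: the work was probably optimized out anyway.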

maga · on Nov 4, 2016

Yes, after watching so many of Vyacheslav Egorov's (from V8) talks on microbenchmarking, I'm convinced that JITs are doing some serious witchcraft unmeasurable by microbenchmarks; yet googling relevant tests on jsperf is a tough habit to shake off, as this example shows.

While we are on the subject, can I ask if there are any performance advantages of using data attributes over custom attributes for storing data in an element? In other words, can adding non-standard attributes to an element cause any deoptimizations? I know that JS engines use hidden classes/object shapes to optimize JS objects, and I assumed something similar might be the case for DOM elements; in that case, adding a non-standard attribute would mean deoptimization.

moron4hire · on Nov 1, 2016

That is incorrect. React is faster than destroying the document and rebuilding it from scratch on every data change. It is slower than making your own stateful edits.

maga · on Nov 1, 2016

True, but the culprit is still DOM calls being expensive. Virtual DOM minimizes those calls through various optimizations, including not destroying elements unnecessarily as you said, but it's not limited to it.

Consider the following scenario. You have two handlers on an event that both change the DOM and may end up cancelling each other out. With the "raw" DOM you'll end up calling the DOM at least twice (and changing it twice if no debouncing is used), whereas a virtual DOM takes both calls against, well, its "virtual" DOM (which is a lot cheaper), and by the time it gets to its next cycle of updating the "real" DOM it may not need to change anything, or may need to do so only once. These optimizations are hard to implement without resorting to a virtual DOM of one sort or another.
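
A toy sketch of that scenario, assuming a much-simplified deferred-write layer rather than anything resembling React's actual diffing. `ToyElement` and `BatchedView` are hypothetical names, and `flush()` is called by hand here where a real library would schedule it.

```javascript
// Hypothetical stand-in for a real element; counts the "expensive" writes.
class ToyElement {
  constructor() {
    this.className = '';
    this.writes = 0;
  }
  setClassName(value) {
    this.className = value;
    this.writes++;
  }
}

class BatchedView {
  constructor(el) {
    this.el = el;
    this.committed = el.className; // what the real element currently has
    this.pending = this.committed; // the cheap "virtual" copy
  }
  set(value) {
    this.pending = value; // plain assignment, no call into the element
  }
  flush() {
    // Touch the real element only if the end state actually differs.
    if (this.pending !== this.committed) {
      this.el.setClassName(this.pending);
      this.committed = this.pending;
    }
  }
}

const el = new ToyElement();
const view = new BatchedView(el);

// Two handlers for the same event that cancel each other out.
const handlers = [
  () => view.set('highlighted'),
  () => view.set(''),
];
handlers.forEach((handler) => handler());
view.flush();
console.log(el.writes); // 0
```

Calling `el.setClassName` directly from both handlers would have produced two writes; here the update cycle sees that the end state matches the committed state and skips the element entirely.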

moron4hire · on Nov 1, 2016

Funny, I've gotten by fine for the last twenty years without it.

maga · on Nov 1, 2016

Well, I'm with you since I'm still not using React. But one has to acknowledge the point it makes about DOM handling.