> A lot of what you're saying is about the history of DOM.
That's because my whole point was to elaborate on the statement I made in the first comment:
>The legitimate grievances associated with JS over the years were due to the browser APIs and first of all DOM,
People who were complaining about web dev and JS over the years were mostly doing it because of the browser APIs and DOM, hence my recollection of recent history. And of course I do point out what was fixed, and I said from the start that it's far better now than it was before. I'm rooting for VanillaJS after all.
I cannot comment on other GUI toolkits since I don't work with them. You may be right that they too are low level, and I wholeheartedly hope you're right about DOM being better, because I hope DOM with Web Platform replaces them all one day. That doesn't exempt DOM from criticism though.
Technically DOM still doesn't have an API for bulk element creation, since innerHTML/insertAdjacentHTML comes as a separate API which is still a working draft. But that's just details.
There is another problem somewhat related to DOM being low level. Although it has to do with the browser implementations rather than the standard, it's still a problem for the end user, and the user is going to blame it on the web/JS being bad in general. That is performance. As was mentioned here, the DOM is separate from the JS engine in browsers. Calls to DOM from JS are a lot more expensive than calls inside the engine. This gave rise to the whole Virtual DOM movement we see taking over web dev. It's not just the verbosity of low-level DOM that pushes people towards React and such, but the fact that at the end of the day their approach of mimicking DOM in the engine and limiting the DOM calls turns out to be more performant than the low-level manipulations we do directly on the DOM.
> Calls to DOM from JS are a lot more expensive than calls inside the engine.
Sort of. Calls to DOM from JS are a few dozen machine instructions for the call itself in modern browsers; a little more if lots of arguments are involved. The slowness is what the DOM implementation actually has to _do_ as a result of the call. If you write a loop in which you repeatedly modify styles and then ask for geometry information, then the only options an implementation has are to provide stale geometry information (the virtual DOM approach!) or end up with that loop being a lot slower than asking for all the geometry information up front and then doing your modifications.
We can have a useful discussion about whether it should be possible to ask for stale geometry information, but that has nothing to do with calls from JS into the DOM per se; a DOM implemented in JS but exposing the same API as the current DOM would have _exactly_ the same problem in that regard.
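To make the write-then-read loop concrete, here is a toy model in plain JS. It is not a real DOM: `FakeDocument`, `FakeBox`, and `layoutCount` are made-up names, and the only assumption is that a geometry read after a style write forces a layout pass.

```javascript
// Toy model (not a real DOM): layoutCount counts how often a stale layout
// has to be recomputed before a geometry read can be answered.
class FakeDocument {
  constructor(count) {
    this.layoutDirty = false;
    this.layoutCount = 0;
    this.boxes = Array.from({ length: count }, () => new FakeBox(this));
  }
}

class FakeBox {
  constructor(doc) {
    this.doc = doc;
    this.width = 100;
  }
  // Style write: cheap by itself, it only marks the layout as stale.
  setWidth(px) {
    this.width = px;
    this.doc.layoutDirty = true;
  }
  // Geometry read: must return fresh data, so it forces a layout if stale.
  getWidth() {
    if (this.doc.layoutDirty) {
      this.doc.layoutCount += 1;
      this.doc.layoutDirty = false;
    }
    return this.width;
  }
}

// Write, then read, on every iteration: each read forces a layout.
function resizeInterleaved(doc) {
  const seen = [];
  for (const box of doc.boxes) {
    box.setWidth(150);
    seen.push(box.getWidth());
  }
  return seen;
}

// Ask for all the geometry up front, then do all the modifications.
function resizeBatched(doc) {
  const before = doc.boxes.map((box) => box.getWidth());
  for (const box of doc.boxes) box.setWidth(150);
  return before;
}

const interleavedDoc = new FakeDocument(5);
const interleavedWidths = resizeInterleaved(interleavedDoc);
const batchedDoc = new FakeDocument(5);
const batchedWidths = resizeBatched(batchedDoc);
```

In this model the interleaved loop pays for one layout per box, while the batched version pays for none until something reads geometry again. Note that neither cost comes from the call boundary itself.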
Yes, indeed, DOM is doing a lot of heavy lifting and the cost of the call itself might be small in comparison. And you're raising a valid point with geometry changes which are indeed a terrible thing to do in a loop due to recalculations/reflows they cause.
What first got me thinking about the cost of DOM calls were benchmarks we did back when we were building polyfills for IE6-IE8 to support new HTML5 APIs. One example in particular: mimicking data attributes. Our lib would hold a mapping between DOM elements and attached data attributes (as simple objects) and it would be significantly faster than using native element.dataset. While dataset isn't as simple as a plain JS object, it's not much more complicated either, and to my knowledge changing it doesn't cause reflows or repaints, so I expected it to be slower but not by very much; hence, I concluded that the additional speed came from avoiding the DOM call itself. Since then I've been cautious about DOM and tried to cache whatever came from it and was supposed to be used repeatedly.
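For illustration, a side table of that kind might look roughly like this today. This is a sketch, not the original library: `elementData` and `dataFor` are hypothetical names, it uses a WeakMap (which IE6-IE8 did not have), and a plain object stands in for a DOM element.

```javascript
// Side table: element -> plain data object. Assumed modern equivalent of the
// mapping described above; nothing here touches element attributes.
const elementData = new WeakMap();

// Return the data object for an element, creating it on first use.
function dataFor(element) {
  let data = elementData.get(element);
  if (data === undefined) {
    data = {};
    elementData.set(element, data);
  }
  return data;
}

// A plain object stands in for a DOM element in this sketch.
const fakeElement = { tagName: "DIV" };
dataFor(fakeElement).userId = 42;
```

Reads and writes here are ordinary property accesses on a plain object, and nothing is reflected into attributes.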
Ah, element.dataset is an interesting case. It's a _lot_ slower than normal DOM calls, because it needs to map a different underlying data structure: the data in the dataset needs to be reflected in the element attributes. And the set of names it exposes is not fixed.
So you have to implement it as a Proxy, so it can capture arbitrary property assignments, including for properties it doesn't have yet, and do the corresponding setAttribute calls. Unfortunately, once you're a Proxy your gets end up somewhat slow too. Partly this is because JITs haven't optimized proxies that much, and partly it's because they're rather hard to optimize in the best of circumstances.
I expect that the actual implementations of dataset are not as fast as they could be if they used scripted and inlinable proxy handlers _and_ the JITs had implementations for those. But there's still a lot more work involved in dataset than a plain object, especially if you have a small number of property names in practice so the plain object doesn't have to convert to dictionary mode or anything like that.... Even if dataset were implemented on top of a pure-JS DOM implementation (which exist), it would be a lot slower than just a simple object, unfortunately.
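A rough sketch of what a Proxy-backed dataset has to do, next to a plain object. This is not how any browser implements it: `FakeElement`, `toAttrName`, and `makeDataset` are made-up names, and the fake element is just an attribute map so the snippet runs outside a browser.

```javascript
// Minimal stand-in for an element: just an attribute map (not a real DOM node).
class FakeElement {
  constructor() {
    this.attrs = new Map();
  }
  getAttribute(name) {
    return this.attrs.has(name) ? this.attrs.get(name) : null;
  }
  setAttribute(name, value) {
    this.attrs.set(name, String(value));
  }
}

// "userId" -> "data-user-id"
function toAttrName(prop) {
  return "data-" + prop.replace(/[A-Z]/g, (ch) => "-" + ch.toLowerCase());
}

// Every get and set goes through a trap, a name conversion, and an attribute call.
function makeDataset(element) {
  return new Proxy({}, {
    get(_target, prop) {
      if (typeof prop !== "string") return undefined;
      const value = element.getAttribute(toAttrName(prop));
      return value === null ? undefined : value;
    },
    set(_target, prop, value) {
      if (typeof prop !== "string") return false;
      element.setAttribute(toAttrName(prop), value);
      return true;
    },
  });
}

const el = new FakeElement();
const dataset = makeDataset(el);
dataset.userId = 42; // ends up as setAttribute("data-user-id", "42")
```

Even in this stripped-down form, a single `dataset.userId` read is a trap call, a string conversion, and a map lookup. The plain-object equivalent is one property load.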
That explains it, along with the fact that getAttribute('data-*')/setAttribute() seem to be faster in microbenchmarks[1] than getting/setting properties on element.dataset. Thanks for clearing this up, I'd have never guessed that about dataset!
Yeah, a get on a dataset has to do strictly more work than getAttribute: it has to convert the string it has to a "data-whatever" string and then call getAttribute.
But note that in microbenchmarks you are likely to get some confusing effects. For example, in Firefox this microbenchmark:
1) getAttribute is known to be side-effect free when called with a string argument and its return value is not used, so the call can be dead-code eliminated.
2) getElementById is known to be side-effect free when called with a string argument and its return value is not used after step 1, so the call can be dead-code eliminated.
3) The get of the "document" property is known to be side-effect free and its return value is not used after step 2, so the get can be dead-code eliminated.
So in the end the microbenchmark is measuring how fast the browser can increment a loop counter, and that only because we haven't bothered to try dead-code eliminating that. That's why you get numbers in the billions of operations per second range (comparable to the CPU clock speed; always a dead giveaway that your thing got optimized out). ;)
Note that Firefox will also perform loop-hoisting on all of the above if possible, so even if the return value were assigned somewhere that would not matter: the whole thing would just get hoisted out of the loop.
The setAttribute and "dataset.set = stuff" benchmarks don't have these problems, because those operations are clearly not side-effect free. A sufficiently advanced JIT might be able to determine that earlier iteration assignments are dominated by later ones and eliminate them, but now we're talking quite hard work on the part of the browser.
Yes, after watching so many of Vyacheslav Egorov's (from V8) talks on microbenchmarking, I'm convinced that JITs are doing some serious witchcraft unmeasurable by microbenchmarks; yet googling relevant tests on jsperf is a tough habit to shake off, as this example shows.
While we are on the subject, can I ask if there are any performance advantages to using data attributes over custom attributes for storing data in an element? In other words, can adding non-standard attributes to an element cause any deoptimizations? I know that JS engines use hidden classes/object shapes to optimize JS objects, and I assumed something similar might be the case for DOM elements; if so, adding a non-standard attribute would mean a deoptimization.
That is incorrect. React is faster than destroying the document and rebuilding it from scratch on every change. It is slower than making your own stateful edits.
True, but the culprit is still DOM calls being expensive. Virtual DOM minimizes those calls through various optimizations, including not destroying elements unnecessarily as you said, but it's not limited to that.
Consider the following scenario. You have two handlers on an event that both change the DOM and may end up cancelling each other out. With the "raw" DOM you'll end up calling DOM at least twice (and changing it twice if no debouncing is used), whereas with Virtual DOM both calls go to, well, its "virtual" DOM (which is a lot cheaper), and by the time it gets to its next cycle of updating the "real" DOM it may not need to change anything, or may only need to do it once. These optimizations are hard to implement without resorting to a virtual DOM of one sort or another.
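A toy version of that scenario, to show the shape of the idea. This is not how React or any real virtual DOM library works internally: `realNode`, `setVirtual`, and `flush` are made-up names, and the "real" node is a plain object that counts its writes.

```javascript
// Toy "real" node: counts how many times it is actually written to.
const realNode = { text: "idle", writes: 0 };
function writeReal(text) {
  realNode.text = text;
  realNode.writes += 1;
}

// Toy "virtual" state: plain JS, cheap to change as often as needed.
let pendingText = realNode.text;
function setVirtual(text) {
  pendingText = text;
}

// Called once per update cycle: touch the real node only if something differs.
function flush() {
  if (pendingText !== realNode.text) writeReal(pendingText);
}

// Two handlers for the same event that cancel each other out.
function handlerA() {
  setVirtual("loading");
}
function handlerB() {
  setVirtual("idle");
}

handlerA();
handlerB();
flush(); // nothing differs from the real node, so no real write happens
```

With direct edits, the same two handlers would have written to the real node twice.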