itertools.groupby isn't really the groupBy operation that people would normally expect. It looks like it would do a SQL-style "group by" where it categorizes elements across the collection, but really it only groups adjacent elements, so you end up with the same group key multiple times, which can cause subtle bugs. From my experience, it's more common for it to be misused than used correctly, so at my work we have a lint rule disallowing it. IMO this surprising behavior is one of the unfriendliest parts of Python when it comes to collection manipulation.
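To make the adjacent-only behavior concrete, here is a minimal sketch. The `words` list and the first-letter key are made up for illustration; only `itertools.groupby` itself comes from the standard library.

```python
from itertools import groupby

# Hypothetical sample data: the "a" words are NOT all adjacent.
words = ["apple", "avocado", "banana", "apricot", "blueberry"]

# Group by first letter WITHOUT sorting first.
runs = [(key, list(group)) for key, group in groupby(words, key=lambda w: w[0])]

# The key "a" shows up twice, because "apricot" is separated from the other a-words.
print(runs)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]

# The subtle bug: turning the runs into a dict silently keeps only the LAST run per key.
as_dict = dict(runs)
print(as_dict)
# {'a': ['apricot'], 'b': ['blueberry']}
```

Nothing raises here, which is probably why this kind of misuse tends to slip through review.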
It's common in other languages as well (at least Haskell) and a bit surprising at first. However, `.sortBy(fn).groupBy(fn)` is easy and of similar efficiency, and when you actually need the adjacent-only `groupBy()` you're happy it's there.
A bit more expressive overall.
At least it is better than lodash's useless groupBy, which creates this weird key-value mapping, loses order, converts keys to strings, and whatnot.
Yep, that's a good example of what I refer to as IKEA-assembling your groupby. You need to put something like 3 parts together before it does what you want, and they aren't that intuitive (or they only are in retrospect).
The resulting groups are also iterators which are exhaustible. It's good if you're running group by on a huge dataset to save some memory, but for everyday operations it's another trap to fall into.
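A rough sketch of the exhaustible-iterator trap, using made-up `data`. The Python docs note that the groups share the underlying iterator with the `groupby` object, so a group stops being usable once `groupby` has advanced past it.

```python
from itertools import groupby

# Hypothetical sample data, already in runs.
data = [1, 1, 2, 2, 3]

# Trap 1: stash the group objects and read them later.
# By then groupby has moved on, so the earlier groups come back empty.
stashed = [group for _key, group in groupby(data)]
late = [list(g) for g in stashed]

# Trap 2: iterate the same group twice inside the loop.
second_passes = []
for _key, group in groupby(data):
    first_pass = list(group)   # consumes the group
    second_passes.append(list(group))  # nothing left to read

# Safer: materialize each group right away, while it is still current.
materialized = [(key, list(group)) for key, group in groupby(data)]
print(materialized)
# [(1, [1, 1]), (2, [2, 2]), (3, [3])]
```

Materializing with `list(group)` gives up the memory savings, so it is a trade-off rather than a free fix.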
Yes, for itertools.groupby to work as most people would expect, the data needs to be sorted by the grouping key first. That sort costs O(n log n), which can be a significant performance hit on large inputs.
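A minimal sketch of the sort-then-group pattern, next to a one-pass dict alternative for comparison. The `words` list and the `first_letter` helper are hypothetical, and the `defaultdict` version is a different technique, not something `itertools` provides.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical sample data and key function.
words = ["apple", "avocado", "banana", "apricot", "blueberry"]

def first_letter(word):
    return word[0]

# Option 1: sort by the SAME key, then group. The sort makes this O(n log n).
sorted_groups = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}
print(sorted_groups)
# {'a': ['apple', 'avocado', 'apricot'], 'b': ['banana', 'blueberry']}

# Option 2 (swapped-in technique): a single pass with a dict of lists.
# No sort is needed, but keys must be hashable and all groups are held in memory.
buckets = defaultdict(list)
for word in words:
    buckets[first_letter(word)].append(word)
print(dict(buckets))
# {'a': ['apple', 'avocado', 'apricot'], 'b': ['banana', 'blueberry']}
```

Because `sorted` is stable, items within each group keep their original relative order in both versions.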
https://docs.python.org/3/library/itertools.html#itertools.g...