One of my less well received interview questions is: "Okay, we both know theoret...

bermanoid · on June 15, 2011

If we're talking about actual sorting algorithms that aren't pathologically inefficient (i.e. if we're not allowing algorithms that deliberately make negative progress or do random stuff instead of trying to find the solution) I'd think the "pessimal" (which, TBH, I'm not 100% clear on the definition of, whether we're talking worst case runtime, worst best-case runtime, average, etc.) deterministic solution would have to be essentially to enumerate the permutation group of N elements, checking each time whether the permutation creates a sorted list or not. There are factorial(N) members in this group, which is pretty bad worst-case running time for a sort.

There are many ways to enumerate permutations, and we could probably pick a terrible one there, too, especially if we use a really naive implementation without any memoization or anything like that we could probably waste all sorts of time.

Bonus points for freshly generating the whole list of permutations fresh for each isSorted(applyPermutation(i,list)) test.

I think to do much worse you'd probably have to start cheating, throwing in garbage calculations that are merely used to waste time, not steps that anyone would actually think of using to progress towards the solution (i.e. I could certainly dream up a nasty brute-force problem to solve for the index integers used in the inner loop, but that's in the category of deliberate time wasting) - the nice thing about this solution is that it's 100% conceivable that someone that wasn't thinking straight could come up with it and implement it just like this. Hell, a poorly written Prolog would probably come close to implementing this algorithm if the sorting problem was specified in the right (wrong?) way...

There's also always SlowSort: http://c2.com/cgi/wiki?SlowSort, which works by finding the maximum element, then recursively sorting the rest of the list. That wouldn't be so terrible if it weren't for the twist: SlowSort finds each maximum element is by splitting the list into half, sorting both lists (using SlowSort, of course) to read off the maxima, and then picking the bigger one. I think this still runs faster than N!, but its runtime is guaranteed, whereas when generating permutations you might accidentally hit the right one in the first step (could always work around that by calculating all the lists first and then checking them, but again, that seems like cheating).

skrebbel · on June 15, 2011

I think you get the blank stares because half the interviewees don't know what "pessimal" means.

cubicle67 · on June 15, 2011

Cmd+Ctrl+d: no entries found

I guess it's a mix of optimal and pessimistic. Pessimistically optimal ("This may be optimal, but I doubt it") or Optimally pessimistic ("This is absolutely the worst possibly outcome ever") :)

Leynos · on June 15, 2011

The latter:

https://secure.wikimedia.org/wiktionary/en/wiki/pessimal

The other time I've seen a similar derivation used is in "The Story of Mel"

'Mel called the maximum time-delay locations the "most pessimum". '

amalcon · on June 15, 2011

We're clearly not talking about comparison sorts here, which blows the options wide open. O(keyspace) sorts (especially a naive counting-sort) are truly terrible for short lists with large keyspaces.

Absolute worst non-probabilistic for large inputs? Enumerate all possible lists given the keyspace, check if it contains the same values as the input, check if it's sorted. This is O((k^n)n!) worst-case (and I think average as well), where k is the size of the keyspace and n is the size of the input. Even if you take the obvious optimizations (enumerate only sorted lists, enumerate only lists of the correct length, etc) it's still terrible. The former improves it substantially, but not enough to finish before the heat-death of the universe on a substantial input size. The latter has no effect on the complexity.

edit: Corrected the complexity. It's even worse.

skimbrel · on June 15, 2011

Bogobogosort (http://www.dangermouse.net/esoteric/bogobogosort.html) would be a good candidate except that it uses random permutation and therefore might not terminate. Maybe if you replaced the random step with a deterministic one that always yielded a not-yet-seen permutation...

sanj · on June 15, 2011

I always like the "shaking box" algorithm:

1. Check if list ordered. If it is, we're done!

2. Randomize order of list. Go back to step 1.

N!

TeMPOraL · on June 15, 2011

Worst case O(\inf), it's known as 'random sort' or 'bogosort'.

Funny thing, it can theoretically be O(1) on quantum computers, assuming the multiple-universe interpretation of quantum mechanics: http://en.wikipedia.org/wiki/Bogosort#Quantum_bogosort ;).

Dove · on June 16, 2011

My favorite presentation (http://c2.com/cgi/wiki?QuantumBogoSort) goes approximately like this:

1. Randomize the list. 2. If the list is not sorted, destroy the universe.

random42 · on June 16, 2011

Is randomizing a list of n entities once is an O(n) operation. not sure how it would be done in O(1) on quantum computers.

TeMPOraL · on June 17, 2011

The link that Dove gave suggests there's an idea of quantum randomization that will split our universe into n! universes in constant time.

spatulon · on June 15, 2011

This is usually known as bogosort. http://en.wikipedia.org/wiki/Bogosort

skimbrel · on June 15, 2011

This is more commonly known as bogosort (http://en.wikipedia.org/wiki/Bogosort), and it's a good start for running time but isn't guaranteed to terminate. See my other comment for a link to something that goes one better than bogosort on running time.

X-Istence · on June 16, 2011

If the shuffle is truly random eventually you will find that the list is sorted and the bogosort will terminate. If your shuffle is not truly random then it may likely never terminate.

skimbrel · on June 16, 2011

Not true at all.

Given a uniform distribution over k discrete values (in this case permutations of the list), the expected probability of the next permutation is 1/k. This is not the same as saying that if you take k samples from the distribution, you are guaranteed to receive exactly one of each value. For any finite number n of samples, no matter how large, the probability of getting a given value tends towards 1 as n grows, but it only approaches it asymptotically. While the probability of not seeing any one value from the domain grows vanishingly small as n gets much larger than the number of possible values, it never hits zero. So even with a stochastically random source of list permutations, we cannot place a finite upper bound on a time in which bogosort is guaranteed to terminate.

It gets even worse with pseudorandom number generators — these display cyclical behavior over time (any good one will take quite a while to show it, but they all will), and you might have picked a seed for your PRNG that generates a cycle that never includes the sorted permutation.

So no, bogosort is not guaranteed to terminate. In practice, it has a reasonable chance of terminating for most inputs, and it will do so with an expected (not guaranteed!) runtime in O(n!) for a list of length n.

danvet · on June 16, 2011

Actually this is about one of the first things you learn in in a probability theory course, and the probability that bogosort completes in finite time is 1.

Now of course, there's infinitely many conceivable scenarios where bogosort never completes, but it doesn't matter. The theoretical foundation (Lebesgue measure theory) to make this mathematically rigid is a bit complex. An example to illustrate:

Assume you have're looking at a 2-dimensional space (over the real numbers) and you want to measure the area of subsets of it. E.g. the area of a square defined as:

{ (x,y) \in \R | 0 <= x <= 1 and 0 <= y <=1 }

Now the area of that is 1. Then carve out the y-axis like this:

{ (x,y) \in \R | 0 < x <= 1 and 0 <= y <=1 }

It's obviously a smaller set than the square above, but the set that's missing doesn't have any area (it's a line with no width). So it's sane to assume that the area of the new subset is still 1.

[btw: Lebesgue measure theory throws in a few more simple assumption and then derives a very nice theory for computing areas, and in extension integration of functions.]

Now probabilities are the same: Instead of points in the plane you have different runs of bogosort and the measure of the "area" for all events is defined to be 1. But you still can carve out complete "lines" (i.e. sets with even infinte many different runs of bogosort) that don't have any influence on the value of your total "area", i.e. probability.

To conclude, your in /pratice/ actually holds up in theory, too ;-)

Edit: Lebesgue measure is actually only the application onto real spaces, the fundamental thing that also applies to probabilities is just called measure theory. Mixed that up a bit.

skimbrel · on June 16, 2011

I was working from this definition of "almost surely": http://en.wikipedia.org/wiki/Almost_surely

If I understand that correctly, we're dealing with a case analogous to the coin-flipping scenario.

And since, as you admit, there are infinitely many scenarios where a bogosort never completes, we cannot place a finite upper bound on its running time for all cases, which is what complexity theory's big-O notation is all about. As I stated, the average-case complexity for bogosort is in O(n!), but the worst-case complexity isn't bound by any function.

smosher · on June 16, 2011

That's not right. True random sources make no guarantees, and nothing says they need to be exhaustive.

biot · on June 15, 2011

Start by encoding each item to sort into an expression consisting of factors of prime numbers so that it's reversible. Conway chain notation of the factors would make a nice pessimistic next step, though that might violate your termination requirement as there won't be enough Planck volumes in the known universe to represent anything beyond the first few terms. :)

jerf · on June 15, 2011

The word to google is "simplexity", though that gets used for other less amusing things too, so adding the word "sort" helps. Two examples:

http://web.cs.wpi.edu/~dd/2223/broder_pessimal_algorithms.pd...

http://www.hermann-gruber.com/data/fun07-final.pdf

sesqu · on June 15, 2011

I don't like this question. Comparison- or value-based? Worst in the average or the best case? What's to stop me from mapping the inputs into, say, busy beaver space?

kabdib · on June 16, 2011

Just the fact that you object and have something interesting to say puts you ahead of ninety percent of the candidates I see.

[One guy, no foolin', told a cow-orker that a byte, on an Intel platform, contained four bits. Wow.]

sesqu · on June 17, 2011

In an actual interview situation, I'd be likely to just choke, because it's a question I can't get a good handle on. I hope you prod & probe when asking that question.

mjw · on June 16, 2011

Ask a silly question, get a silly answer:

    for i = 0 to ackerman(list.length, list.length)
      pass

    return quicksort(list);