You do not need a database. It is trivial (and correct) to create a ~'<x>-status' topic. On the forward arc you are reliably propagating job requests (acked). On the backflow, the processing status of each job is posted for anyone interested. You can even propagate retries, etc. It is an MQ, and RabbitMQ shines at defining complex dispatch topologies.
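To make the forward arc / backflow idea concrete, here is a minimal in-process sketch. It is not RabbitMQ: the `Broker` class, the `resize` and `resize-status` topic names, and the message shape are all made up for illustration, and there are no acks or durability here. It only shows the shape of the flow: requests go out on one topic, status comes back on a sibling topic.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-memory stand-in for a message broker (no acks, no persistence)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver synchronously to every handler registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


def run_demo():
    broker = Broker()
    seen_statuses = []

    def worker(job):
        # Backflow: the worker reports progress on the "<x>-status" topic.
        broker.publish("resize-status", {"job_id": job["job_id"], "status": "started"})
        broker.publish("resize-status", {"job_id": job["job_id"], "status": "done"})

    broker.subscribe("resize", worker)                     # forward arc consumer
    broker.subscribe("resize-status", seen_statuses.append)  # anyone interested

    broker.publish("resize", {"job_id": 7})                # forward arc: job request
    return seen_statuses
```

With a real broker the status consumer would be a separate process bound to its own queue, and retries could be republished on the forward topic.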
Yeah but they already have a database, so it's not like they're adding a database to the system. And (as the article says) the database already contains state, so it makes way more sense to remove RMQ and hold all the state in the database (like they did).
Because a queuing system offers a different thing than a (relational) database.
You can build a queuing system with a database, but you have to build it yourself. Some of the features and constraints of the database might even make your life harder than it has to be.
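For a sense of what "building it on a database" involves, here is a rough sketch using SQLite from the Python standard library. The `jobs` table, its columns, and the function names are assumptions for illustration, not anything from the article. The part you have to get right yourself is the claim step: it must be atomic so two workers never pick up the same row. (On Postgres this is typically done with `SELECT ... FOR UPDATE SKIP LOCKED` instead of the write lock used here.)

```python
import sqlite3


def open_queue() -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued')"
    )
    return conn


def enqueue(conn: sqlite3.Connection, payload: str) -> int:
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection):
    # BEGIN IMMEDIATE takes the write lock up front, so the select and the
    # update happen as one unit and no other writer can claim the same job.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
        return row  # (id, payload), or None when the queue is empty
    except Exception:
        conn.execute("ROLLBACK")
        raise


def complete(conn: sqlite3.Connection, job_id: int) -> None:
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

Even this toy version leaves out retries, visibility timeouts for crashed workers, and cleanup of finished rows, which is the work a library or a dedicated queue would otherwise do for you.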
Instead, view it like this: there is a need for a queuing system and a job system. Either or both can be implemented using a database for certain concerns, but each can also be a custom implementation. It's not a great idea to mix the two things unless the operational and infrastructure costs and complexity outweigh the benefits of a clear separation.
There are libraries that implement queueing on top of databases that require very little setup by the user. For example https://github.com/timgit/pg-boss
Right. However, please don't forget that there are very often inherent limitations regarding scaling or availability, and also that many fully fledged message queues come with a lot of perks like access management and administration / debugging tooling and interfaces.
I'm not saying that libraries like pg-boss and co. cannot sometimes replace a full queue implementation. But the tradeoffs need to be clear.
It's all a matter of how much throughput you need. A queuing system can handle, on the same hardware, orders of magnitude more messages than a traditional SQL database that writes rows to disk in a page-oriented fashion.
If your load is, say, a few hundred writes/second, stick with the database only, and it will be much simpler.
No, because you only need to read and write ids and maybe timestamps to your db, both of which are trivially indexed, rather than the whole blob of your message payload.
In many cases, the message payload is (or should be) an ID anyway. It's seldom desirable for the message payload to include a copy of an external source of truth, because it can cause sync issues. There are exceptions, of course.
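A small sketch of the sync issue, with a plain dict standing in for the database and a `deque` standing in for the queue (both are illustrative stand-ins, as are the `orders` records and field names). One message carries a copy of the record, the other carries only the ID; the record changes between enqueue and processing.

```python
from collections import deque


def run_demo():
    # Hypothetical source of truth, e.g. an orders table.
    orders = {101: {"total": 50}}

    copy_queue = deque()  # messages carry a snapshot of the record
    id_queue = deque()    # messages carry only the record's ID

    copy_queue.append({"order_id": 101, "total": orders[101]["total"]})
    id_queue.append({"order_id": 101})

    # The order is edited after the messages were enqueued.
    orders[101]["total"] = 75

    # Worker using the embedded copy sees the stale value.
    stale_total = copy_queue.popleft()["total"]

    # Worker using the ID looks up the current state when it runs.
    msg = id_queue.popleft()
    fresh_total = orders[msg["order_id"]]["total"]

    return stale_total, fresh_total
```

The ID-only message also keeps whatever you store per job small, at the cost of an extra lookup when the worker runs.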