
at which point, if you've got to use a DB to track status, really why bother with the queuing system?


You do not need a database. It is trivial (and correct) to create something like an '<x>-status' topic. In the forward arc you are reliably propagating job requests (acked). In the backflow, the processing status of each job is posted for anyone interested. You can even propagate retries, etc. It is an MQ, and RabbitMQ shines at defining complex dispatch topologies.
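
A minimal sketch of that shape, using in-memory Python queues as stand-ins for the request queue and the status topic. The names (`requests`, `status`, `submit`, `worker`, `latest_states`) are made up for illustration, and with RabbitMQ these would be a real queue and exchange rather than `queue.Queue` objects:

```python
import queue

# Stand-ins for broker objects (assumption: a real setup would use RabbitMQ).
requests = queue.Queue()   # forward arc: job requests
status = queue.Queue()     # backflow: "<x>-status" events

def submit(job_id, payload):
    requests.put({"job_id": job_id, "payload": payload})
    status.put({"job_id": job_id, "state": "queued"})

def worker(handler, max_retries=2):
    """Drain the request queue, publishing a status event for every transition."""
    while not requests.empty():
        job = requests.get()
        attempt = job.get("attempt", 0)
        status.put({"job_id": job["job_id"], "state": "running", "attempt": attempt})
        try:
            handler(job["payload"])
        except Exception as exc:
            if attempt < max_retries:
                # Propagate the retry as a new request instead of losing the job.
                requests.put({**job, "attempt": attempt + 1})
                status.put({"job_id": job["job_id"], "state": "retrying", "error": str(exc)})
            else:
                status.put({"job_id": job["job_id"], "state": "failed", "error": str(exc)})
        else:
            status.put({"job_id": job["job_id"], "state": "done"})
        finally:
            requests.task_done()  # plays the role of the ack in this sketch

def latest_states():
    """Any interested consumer can fold the status stream into a current view."""
    view = {}
    while not status.empty():
        event = status.get()
        view[event["job_id"]] = event["state"]
    return view
```

Whoever cares about job status just consumes the status stream and keeps the last event per job, which is what `latest_states` does here.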


Yeah but they already have a database, so it's not like they're adding a database to the system. And (as the article says) the database already contains state, so it makes way more sense to remove RMQ and hold all the state in the database (like they did).


Because a queuing system offers a different thing than a (relational) database.

You can build a queuing system with a database, but you have to do that. Some of the features and constraints of the database might even make your life harder than it has to be.
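
To give a rough idea of what "you have to do that" means, here is a small sketch of a job queue on top of SQLite using Python's standard `sqlite3` module. The table layout and the function names (`open_queue`, `enqueue`, `claim`, `complete`) are assumptions for illustration only. In Postgres the claim step is usually done with `SELECT ... FOR UPDATE SKIP LOCKED` instead of a global write lock:

```python
import sqlite3
import time

def open_queue(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are started and ended explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            state TEXT NOT NULL DEFAULT 'queued',
            claimed_at REAL
        )""")
    return conn

def enqueue(conn, payload):
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid

def claim(conn):
    """Atomically claim the oldest queued job, or return None if there is none."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE state = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET state = 'running', claimed_at = ? WHERE id = ?",
                (time.time(), row[0]),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise

def complete(conn, job_id):
    conn.execute("UPDATE jobs SET state = 'done' WHERE id = ?", (job_id,))
```

Even this toy version leaves out retries, visibility timeouts for crashed workers, and cleanup of finished rows, which is the part you end up building yourself.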

Instead, view it like this: there is a need for a queuing system and a job system. Either or both can be implemented using a database for certain concerns, but it can also be a custom implementation. It's not a great idea to mix the two things unless the operational and infrastructure costs and complexity outweigh the benefits of a clear separation.


There are libraries that implement queueing on top of databases that require very little setup by the user. For example https://github.com/timgit/pg-boss


Right. However, please don't forget that there are very often inherent limitations regarding scaling or availability, and also that many fully fledged message queues come with a lot of perks like access management and administration / debugging tooling and interfaces.

I'm not saying that libraries like pg-boss and co. cannot sometimes replace a full queue implementation. But the tradeoffs need to be clear.


I’ve had a positive experience with Procrastinate.

https://procrastinate.readthedocs.io/en/stable/


When you’re dealing with billions of messages, I think queuing systems may be tuned more for it?

I’d like to hear why people chose Kafka over some RDBMS tables.


Billions of jobs with hours long time to complete seems like something no one would have the resources for.


Just wait till a startup convinces someone to give them money for a system that allows anyone, anywhere to queue up infinite loops XD


I'm going to be honest, I think some 80% of Kafka users are overengineering it.

Kafka has specific use cases, but it seems a lot of people just go "ok use Kafka here" and wait for the load that rarely comes.


In an online environment, having your system go down during those rare times will eventually cost you your business.


Sure, then Kafka stays online and the rest doesn't because you forgot some details.


Performance and distributed nature.


Queues are good for connecting separated systems (like 2 or more separate companies).


It's all a matter of how much throughput you need. A queuing system can handle, on the same hardware, orders of magnitude more than a traditional SQL database that writes rows to disk in a page-oriented fashion.

If your load is, say, a few hundred writes/second, stick with the database only, and it will be much simpler.


how does that help if you still have to have a DB tracking status? you still need the same order-of-magnitude of DB throughput


No, because you only need to read and write ids and maybe timestamps to your db, both of which are trivially indexed, rather than the whole blob of your message payload.


In many cases, the message payload is (or should be) an ID anyway. It's seldom desirable for the message payload to include a copy of an external source of truth, because it can cause sync issues. There are exceptions, of course.
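
A tiny sketch of the difference, with a plain dict standing in for the external source of truth (`ORDERS`, `make_message`, and `handle` are hypothetical names, not part of any queue library):

```python
import json

# Hypothetical source of truth; in practice this would be a database or service.
ORDERS = {42: {"status": "paid", "total": 1999}}

def make_message(order_id):
    # Only the ID travels through the queue, not a copy of the record.
    return json.dumps({"order_id": order_id})

def handle(message, store=ORDERS):
    order_id = json.loads(message)["order_id"]
    order = store.get(order_id)  # read current state at processing time
    if order is None:
        return None  # the record went away after the message was queued
    return f"order {order_id}: {order['status']}"
```

If the order changes between enqueue and processing, the consumer sees the current state instead of a stale copy baked into the message.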


I don’t think it should be an ID - these platforms are really made for creating distributed event-driven systems.


The idea is your task should just run off an ID, no point passing all that data around.


How do you get the data relevant to that ID?

If the answer is a call to a shared database, you might as well not have RabbitMQ.



