It looks like the NoSQL movement was just a fad and plenty of startups got burne...

dsmithatx · on Aug 16, 2017

I worked with one of the largest ad publishing companies. They wanted to track data about every client served an ad. This was generating over 1.2 terabytes of data per hour when the MySQL master started to max out. We had the largest multi-core system available. It was going to cost my client $30k to upgrade to SSD drives to get more out of MySQL. Also note that we had to store this data on an expensive SAN in order to feed it to MySQL or PostgreSQL at a reasonable rate.

I had just learned of MongoDB and went to school at 10gen for their sysadmin class. On a Friday, I talked to the developer about storing the data in NoSQL using a small sharded cluster. Monday morning he asked me to set up a MongoDB cluster. Tuesday we moved over from MySQL. They ended up using much smaller servers, got rid of the 3PAR and epsilon SANs, and saved tons of money.

My point is that there are certain situations where NoSQL is still the answer unless you can cluster your SQL write server. I've moved on from working with ad publishing clients, but I'm sure there are other places where SQL databases are not adequate.

NoSQL might be, or have been, a fad, but like any tool, it works when used for the right job.

qaq · on Aug 16, 2017

A large-scale production ad system that was doing 1.2 TB per hour in writes was migrated over from MySQL to MongoDB in a roughly 24-hour window? Cool story, bro.

DenisM · on Aug 16, 2017

Good point. How is it possible to move that much data from a SAN to a Mongo cluster in such a short time window?

qaq · on Aug 16, 2017

And rewrite and validate all the code that was accessing MySQL to use MongoDB.

mcintyre1994 · on Aug 16, 2017

If more comments were like their story, this would be a much higher-value comments section.

qaq · on Aug 16, 2017

How can more made-up stories result in a higher-value comments section?

tehlike · on Aug 16, 2017

Sounds like the wrong tool. What you were producing was probably log data, which is immutable. There are far more efficient (storage, CPU) write-only stores.

skrebbel · on Aug 16, 2017

Such as?

hvidgaard · on Aug 16, 2017

Consider what a database does. It provides ACID properties and the ability to query data. If all you need is to write data, the fastest thing you can do is write it directly to disk, without the overhead a database comes with.

Use a load balancer in front of a farm of cheap logging machines, and aggregate the data you need for analysis onto a suitable machine.
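As a rough illustration of "write it directly to disk", here is a minimal sketch of an append-only event log using newline-delimited JSON. The function names, field names, and file name are all hypothetical, and a real logging machine would presumably keep the file open and batch its writes rather than reopening it per event.

```python
import json
import os
import tempfile


def append_event(log_path: str, event: dict) -> None:
    """Append one event as a JSON line: no index, no transaction, no query layer."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(event, separators=(",", ":")) + "\n")


def read_events(log_path: str):
    """Yield events back in write order, e.g. for a later aggregation pass."""
    with open(log_path, "r", encoding="utf-8") as log_file:
        for line in log_file:
            yield json.loads(line)


# Hypothetical ad-impression events; the field names are made up for illustration.
log_path = os.path.join(tempfile.mkdtemp(), "impressions.log")
append_event(log_path, {"client": "a1", "ad": 17, "ts": 1502841600})
append_event(log_path, {"client": "b2", "ad": 17, "ts": 1502841601})
events = list(read_events(log_path))
```

Records are never updated in place, which is what makes this cheaper than going through a general-purpose database.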

avereveard · on Aug 16, 2017

They'd still need to query their stuff, I guess, so you'd need to throw something in there to aggregate the logs and get the metrics they're tracking out of them. For most metrics that can totally be done in streaming, without needing to go through the logs every time.
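A minimal sketch of what aggregating "in streaming" could look like: each event updates a running counter as it arrives, so a metric can be read at any time without rescanning the logs. The class, the event shape, and the per-click price are assumptions made up for this example.

```python
from collections import Counter


class ClickAggregator:
    """Keeps running per-advertiser click counts; raw events are not retained."""

    def __init__(self) -> None:
        self.clicks_per_advertiser = Counter()

    def observe(self, event: dict) -> None:
        # Only clicks count toward this metric; other event types are ignored.
        if event["type"] == "click":
            self.clicks_per_advertiser[event["advertiser"]] += 1

    def billable_cents(self, advertiser: str, price_per_click_cents: int) -> int:
        return self.clicks_per_advertiser[advertiser] * price_per_click_cents


aggregator = ClickAggregator()
stream = [
    {"type": "impression", "advertiser": "acme"},
    {"type": "click", "advertiser": "acme"},
    {"type": "click", "advertiser": "globex"},
    {"type": "click", "advertiser": "acme"},
]
for event in stream:
    aggregator.observe(event)

acme_bill = aggregator.billable_cents("acme", 25)
```

Memory use here grows with the number of advertisers rather than the number of events, which is the point of aggregating on the way in.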

hvidgaard · on Aug 16, 2017

With that amount of data, streaming and only saving aggregated data is the only sane way. At 1.2 TB/hour there is a limit to how much historical data can be saved anyway, and we're talking about roughly 30% utilization of a 10 Gbps network interface, so it's beyond what single machines can handle for most use cases.
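The back-of-the-envelope numbers can be checked with a few lines of arithmetic. This assumes decimal units (1 TB = 10^12 bytes) and ignores protocol overhead.

```python
BYTES_PER_HOUR = 1.2e12       # 1.2 TB/hour, decimal terabytes (assumption)
LINK_BITS_PER_SECOND = 10e9   # 10 Gbps network interface

bytes_per_second = BYTES_PER_HOUR / 3600
bits_per_second = bytes_per_second * 8
utilization = bits_per_second / LINK_BITS_PER_SECOND
bytes_per_day = BYTES_PER_HOUR * 24

print(f"{bytes_per_second / 1e6:.0f} MB/s")      # 333 MB/s
print(f"{bits_per_second / 1e9:.2f} Gbit/s")     # 2.67 Gbit/s
print(f"{utilization:.0%} of the link")          # 27% of the link
print(f"{bytes_per_day / 1e12:.1f} TB/day")      # 28.8 TB/day
```

So the sustained write rate is about 27% of a 10 Gbps link, and a single day of raw data is close to 29 TB.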

tehlike · on Aug 16, 2017

Query? Most likely not. At least not in the traditional "let's create a dashboard on the fly" sense.

avereveard · on Aug 16, 2017

Query as in 'how much do I bill this guy for his clicks'. It doesn't have to be SQL or on the fly, of course.

foxh0und · on Aug 16, 2017

Was $30k a lot of money for one of the largest ad publishers?

roel_v · on Aug 16, 2017

OTOH, if you're a consultant, and you can say 'hey guys I can save you 30k, my fee is only 15k', and you can do it in a week - I mean, there are weeks where I bill less than 15k...

quickthrower2 · on Aug 16, 2017

3k a day consulting. I'm in!

dx034 · on Aug 16, 2017

How can MongoDB use that much less in such a situation? Especially prior to WiredTiger?

An optimised schema in a relational database should be close to the minimum possible storage.

0xbear · on Aug 16, 2017

The largest ad system ever, Google AdSense, used MySQL until circa 2014 when it moved to a completely custom DB backend, F1. F1 is also a SQL database, however.

Google does use Bigtable and such where appropriate, but for anything more complex than dumb key/value you can't do much better than regular DBs. Some people think they can, but 99% of the time they're mistaken.

stuffedBelly · on Aug 16, 2017

Startups got burnt by NoSQL because they wanted to use it for everything instead of thinking over use cases that fit. It's a good thing that this fad went away and now people would be more likely to educate themselves about relational vs. non-relational db before jumping into the fire pit.

romanovcode · on Aug 16, 2017

There are good reasons to use NoSQL, for example to replace EAV. But doing everything on NoSQL is definitely not a good use of it, so most companies opted for a two-database solution (SQL + NoSQL) and did not get burned.

The thing, however, is that since Postgres released indexed JSONB support (which is actually faster than MongoDB), there is absolutely no point in opting for a two-database solution and making things harder for no reason.

TL;DR

Use Postgres.

RandalSchwartz · on Aug 17, 2017

Or, get the best of both worlds. ToroDB puts a mongo wire protocol in front of Postgres, which outperforms mongo significantly on the same hardware. Plus, you can get read-only views on the Pg side to join with traditional relational data.

pritambaral · on Aug 17, 2017

> Or, get the best of both worlds. ToroDB puts a mongo wire protocol in front of Postgres

Postgres alone is already the best of both worlds. With ToroDB, I am restricted to the MongoDB way of dealing with my data; with Postgres, I can mix SQL and NoSQL however I like, even in a single, simple SELECT query.
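To illustrate the idea of mixing relational columns and JSON documents in a single SELECT, here is a small sketch. Note the swap: it uses SQLite's `json_extract` function through Python's built-in `sqlite3` module as a stand-in so that it runs without a server, and it assumes a SQLite build with the JSON functions enabled. It is not Postgres; there the payload would be a `jsonb` column queried with operators such as `->` and `->>`. The tables and data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A plain relational table plus a table holding schemaless JSON payloads.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers(id),"
    " payload TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "acme"), (2, "globex")],
)
conn.executemany(
    "INSERT INTO events (customer_id, payload) VALUES (?, ?)",
    [
        (1, '{"kind": "click", "ad": {"id": 17}}'),
        (1, '{"kind": "impression", "ad": {"id": 17}}'),
        (2, '{"kind": "click", "ad": {"id": 42}}'),
    ],
)

# One SELECT that joins on a relational key and filters on a JSON field.
rows = conn.execute(
    """
    SELECT c.name, json_extract(e.payload, '$.ad.id') AS ad_id
    FROM events AS e
    JOIN customers AS c ON c.id = e.customer_id
    WHERE json_extract(e.payload, '$.kind') = 'click'
    ORDER BY c.name
    """
).fetchall()
```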

deepGem · on Aug 16, 2017

I have to agree that there was a lot of noise in the nosql world and a lot of people got carried away by the ease of use metric. I guess people were lured by the prospect of not having to write SQL joins.

That said I found nosql databases extremely helpful for storing and querying large unstructured data. Mostly because it was really hard to build relations and to store this data in tables. Think Wikipedia for instance. Since then my way of choosing databases is to try and model data into a relational db as much as possible and if that doesn't work out choose a nosql equivalent.

hvidgaard · on Aug 16, 2017

With Postgres you don't need to use a different database unless your workload is "big data". You can store relational data alongside unstructured JSON and query both.

gaius · on Aug 16, 2017

99% of "Big Data" workloads will run happily in Postgres. There are petabyte Postgreses (Postgri?) in production, and have been for a decade now...

overcast · on Aug 16, 2017

Which is why it amazes me that RethinkDB never eclipsed all of the other "nosql" offerings. A relational, realtime document store. You get the benefits of both sides.

webscalist · on Aug 16, 2017

lol trello: https://news.ycombinator.com/item?id=3485186

myth_drannon · on Aug 16, 2017

well, there are so many... using mongodb for CRUD apps and realizing they can't do joins.

paulie_a · on Aug 16, 2017

Sadly some do not come to that realization and manage to implement a join "solution" anyways.

redwood · on Aug 16, 2017

that comment is over 2000 days old. wow

giancarlostoro · on Aug 16, 2017

Depends on what you're doing. We've used it on at least one production product at work where it works fine and probably will not be replaced. (If it's not broken don't fix it TM)

fulafel · on Aug 16, 2017

DynamoDB seems to be the flavour of the day for this, people are using it even in cases where a non-sharded SQL system would work fine.

frik · on Aug 16, 2017

Can we get revived WebSQL support?

We can thank a few Mozilla devs, who have long since moved on, for the fact that WebSQL got stopped. It is supported in WebKit and Blink (100% of mobile devices, 80% of notebooks), but not in Firefox. Instead this NoSQL IndexedDB was introduced. Let's get over the NoSQL fad and support SQL in the web browser!