It looks like the NoSQL movement was just a fad. Plenty of startups got burned by it, and some are still stuck with this tech, writing inefficient workarounds for things that come out of the box with regular SQL databases.
I worked with one of the largest ad publishing companies. They wanted to track data about every client that was served an ad. This was generating over 1.2 terabytes of data per hour when the MySQL master started to max out. We already had the largest multi-core system available. It was going to cost my client $30k to upgrade to SSD drives to get more out of MySQL. Also note that we had to store this data on an expensive SAN in order to feed it to MySQL or PostgreSQL at a reasonable rate.
I had just learned of MongoDB and had taken 10gen's sysadmin class. On a Friday I talked to the developer about storing the data in NoSQL using a small sharded cluster. Monday morning he asked me to set up a MongoDB cluster. Tuesday we moved over from MySQL. They ended up using much smaller servers, got rid of the 3par and epsilon SANs, and saved tons of money.
My point is that there are certain situations where NoSQL is still the answer, unless you can cluster your SQL write server. I've moved on from working with ad publishing clients, but I'm sure there are other places where SQL databases are not adequate.
NoSQL might be, or might have been, a fad, but like any tool, it works when used for the right job.
A production, large-scale ad system doing 1.2 TB per hour in writes was migrated from MySQL to MongoDB in a roughly 24-hour window? Cool story, bro.
Sounds like the wrong tool. What you were producing was probably log data, which is immutable. There are far more efficient (in storage and CPU) write-only stores.
Consider what a database does. It provides ACID properties and the ability to query data. If all you need is to write data, the fastest way to do it is to write directly to disk, without the overhead a database comes with.
Use a load balancer in front of a farm of cheap logging machines, and aggregate the data you need for analysis onto a suitable machine.
They'd still need to query their stuff, I guess, so you'd need to throw something in there to aggregate the logs and get the metrics they're tracking out of them. For most metrics, that can totally be done in streaming, without going through the logs every time.
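As a rough illustration of the streaming approach described above, here is a minimal sketch that keeps only hourly counters and never stores the raw events. The event shape (a Unix timestamp plus an ad ID) and the names `aggregate_impressions` and `sample_events` are assumptions made up for this example, not details from the system discussed in the thread.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical event shape: (unix_timestamp, ad_id). Not taken from the thread.
Event = Tuple[int, str]


def aggregate_impressions(events: Iterable[Event]) -> Counter:
    """Count impressions per (hour bucket, ad_id) in a single pass.

    Only the counters are kept in memory; each raw event is discarded
    as soon as it has been counted.
    """
    counts: Counter = Counter()
    for ts, ad_id in events:
        hour = ts - ts % 3600  # start of the hour this event falls in
        counts[(hour, ad_id)] += 1
    return counts


def sample_events() -> Iterator[Event]:
    """A tiny stand-in for the real event stream."""
    yield (0, "ad-a")
    yield (10, "ad-b")
    yield (3599, "ad-a")
    yield (3600, "ad-a")


totals = aggregate_impressions(sample_events())
for (hour, ad_id), n in sorted(totals.items()):
    print(hour, ad_id, n)
```

The memory footprint grows with the number of distinct (hour, ad) pairs rather than with the number of events, which is the point of aggregating in the stream.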
With that amount of data, streaming and only saving aggregated data is the only sane way. At 1.2 TB/hour there is a limit to how much historical data can be saved anyway, and we're talking about roughly 30% utilization of a 10 Gbps network interface, so it's beyond what a single machine can handle for most use cases.
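The back-of-the-envelope numbers behind that estimate can be checked with a few lines. This sketch assumes decimal terabytes (1 TB = 10^12 bytes) and a perfectly steady write rate, neither of which is stated in the thread.

```python
# Assumption: 1.2 TB means 1.2e12 bytes, written at a constant rate.
BYTES_PER_HOUR = 1.2e12
LINK_GBITS_PER_SECOND = 10  # the 10 Gbps interface mentioned above

bytes_per_second = BYTES_PER_HOUR / 3600
gbits_per_second = bytes_per_second * 8 / 1e9
utilization = gbits_per_second / LINK_GBITS_PER_SECOND

# Raw storage needed if nothing is aggregated or discarded.
tb_per_day = 1.2 * 24

print(f"{bytes_per_second / 1e6:.0f} MB/s")
print(f"{gbits_per_second:.2f} Gbit/s")
print(f"{utilization:.0%} of the link")
print(f"{tb_per_day:.1f} TB/day")
```

Under these assumptions the sustained rate is about 333 MB/s, or about 2.67 Gbit/s, which is a bit under the 30% figure, and keeping the raw data would take 28.8 TB per day.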
OTOH, if you're a consultant, and you can say 'hey guys I can save you 30k, my fee is only 15k', and you can do it in a week - I mean, there are weeks where I bill less than 15k...
The largest ad system ever, Google AdWords, used MySQL until around 2012, when it moved to a completely custom DB backend, F1. F1 is also a SQL database, however.
Google does use Bigtable and such where appropriate, but for anything more complex than dumb key/value you can't do much better than regular DBs. Some people think they can, but 99% of the time they're mistaken.
Startups got burnt by NoSQL because they wanted to use it for everything instead of thinking through which use cases fit. It's a good thing that this fad went away; people are now more likely to educate themselves about relational vs. non-relational DBs before jumping into the fire pit.
There are good reasons to use NoSQL, for example to replace EAV. But doing everything in NoSQL is definitely not a good use, so most companies opted for a two-database solution (SQL + NoSQL) and did not get burned.
The thing, however, is that since Postgres released indexed JSONB support (which is actually faster than MongoDB), there is absolutely no point in opting for a two-database solution and making things harder for no reason.
Or, get the best of both worlds. ToroDB puts a mongo wire protocol in front of Postgres, which outperforms mongo significantly on the same hardware. Plus, you can get read-only views on the Pg side to join with traditional relational data.
> Or, get the best of both worlds. ToroDB puts a mongo wire protocol in front of Postgres
Postgres alone is already the best of both worlds. With ToroDB, I am restricted to the MongoDB way of dealing with my data; with Postgres, I can mix SQL and NoSQL however I like, even in a single, simple SELECT query.
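To make the "single SELECT" idea concrete, here is a small runnable sketch. It uses Python's bundled SQLite and its `json_extract` function purely as a stand-in so the example needs no server (this assumes the SQLite build includes the JSON functions, which most current builds do). In Postgres the document column would be `jsonb` and the lookups would be written with the `->>` operator instead. The `customers` and `events` tables and their contents are invented for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# A plain relational table next to a table holding schemaless JSON documents.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE events (customer_id INTEGER, payload TEXT)")

conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, json.dumps({"type": "click", "ad": "banner-1"})),
        (1, json.dumps({"type": "view", "ad": "banner-1"})),
        (2, json.dumps({"type": "click", "ad": "banner-2", "referrer": "news"})),
    ],
)

# One SELECT that joins relational rows and filters on fields inside the JSON.
rows = conn.execute(
    """
    SELECT c.name, json_extract(e.payload, '$.ad') AS ad
    FROM events AS e
    JOIN customers AS c ON c.id = e.customer_id
    WHERE json_extract(e.payload, '$.type') = 'click'
    ORDER BY c.name
    """
).fetchall()

print(rows)
```

The documents do not share a fixed schema (only one has a `referrer` field), yet they can be joined against the relational table in the same query.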
I have to agree that there was a lot of noise in the NoSQL world, and a lot of people got carried away by the ease-of-use argument. I guess people were lured by the prospect of not having to write SQL joins.
That said, I found NoSQL databases extremely helpful for storing and querying large unstructured data, mostly because it was really hard to build relations and store this data in tables. Think Wikipedia, for instance. Since then, my way of choosing a database is to try to model the data in a relational DB as much as possible, and if that doesn't work out, choose a NoSQL equivalent.
With Postgres you don't need to use a different database unless your workload is "big data". You can store relational data alongside unstructured JSON and query both.
Which makes it amazing to me that RethinkDB never eclipsed all of the other "NoSQL" offerings. A relational, realtime document store. You get the benefits of both sides.
Depends on what you're doing. We've used it on at least one production product at work where it works fine and probably will not be replaced. (If it's not broken don't fix it TM)
We can thank a few Mozilla devs, who have long since moved on, for the fact that WebSQL got stopped: it is supported in WebKit and Blink (100% of mobile devices, 80% of notebooks), but not in Firefox. Instead, the NoSQL IndexedDB was introduced. Let's get over the NoSQL fad and support SQL in the web browser!