Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I for one am excited to hear about this move and was wondering if it would ever happen. This will help keep SQL Server alive and growing (I think) and who knows, maybe bits and pieces will even become open source some day (that's optimistic perhaps).

I am an equal opportunity consumer of database technology - something we all should be (e.g. use the right tool for the job when appropriate). I have used MANY different databases, open source and commercial, and there are better ones and worse ones across the spectrum. I think it is naive to criticize this move in any way - it is giving more choice so should be supported.

That said, SQL Server has a low barrier to entry, a very good implementation of the SQL language, an advanced optimizer, strong storage engine, management tools and a lot of nice BI features. Although you don't hear about it much, SQL Server also has Parallel Data Warehouse (PDW) which is their MPP database - ironically which used to be built using Ingres and Java of all things.

In my experience, SQL Server <-> PostgreSQL is the closest commercial to open source comparison (core database engine-wise) I can think of. This makes sense when you consider - both of them started as Ingres! Of course, Postgres and SQL Server have totally different user experiences (database management, tooling, etc.) and many different features.



Would you yourself actually choose SQL Server on Linux for a project?

I'm a big fan of SQL Server but the main thing that hurts it is the cost. Licensing our two servers today would cost a quarter of a million dollars. The OS cost is a rounding error. If I'm paying that much money for SQL Server I don't think I'd choose the Linux version for at least another 5 years from now. It needs to prove itself and there's almost no benefit to offset the risk.


I would never use MS SQL Server unless it is open source. Otherwise you get into vendor lock-in. Microsoft is cunning - they want to tie Linux-based businesses to their (expensive) products for good. But fortunately there is a viable open source alternative - PostgreSQL. http://www.pg-versus-ms.com


How are the clustering options coming along with PostgreSQL? The last time I looked into there weren't any multi-master solutions out of the box, but there were a bunch of separate implementations that mostly seemed to be half-baked.

https://wiki.postgresql.org/wiki/Replication,_Clustering,_an...

It's my understanding that the Postgres team has decided to change their historical stance of considering clustering solutions outside the scope of the core project, but I'm unclear exactly how far that has come.

The variety of different 3rd party implementations in various states of maturity make this a bit tedious to sort through.

That said, I always pick PG for my personal projects. I just have a hard time getting sign-off on it from my corporate overlords for anything at work.


http://www.postgresql.org/docs/9.4/static/different-replicat...

Async Multimaster is in the list of replication solutions.

I have not personally implemented it myself.

Edit, As I am reading this more closely it may be listing third party implementations and is not 'out of box'


Not that url again. It's RIDICULOUSLY biased and comes up every time SQL Server comes up on HN.


Out of curiosity, how does OSS prevent vendor lock-in? Are you talking more in terms of support?


It's very simple. With FLOSS you are not tied to one software distributor and you can freely modify and distribute the software by yourself. It is not about support, it is about freedom to run, copy, distribute, study, change and improve the software.


Anyone can support the OSS product to utmost level by studying the source code (and possibly fixing/modifying it).

> Are you talking more in terms of support?

Support is not tied to the product's license (which is common for proprietary software).


The sad truth is that "anyone can" is frequently still "nobody did, so you're on your own". This well applies to even the most popular pieces of FLOSS.

That's still significantly better than "noone but a single vendor can and you're at their mercy", but just being FLOSS doesn't mean it can't suffer from product lock-in, just that with the hurdle being complexity and not copyright.


There's always friction, yes, but rarely a lock or you'd simply go into the code and remove it.

Natural complexity, when the domain is simply complex to model, can't be gotten rid of regardless of the product.


Ah, the lock is not something that check the license and tells "oh no you're not allowed". It's a combination of not having something, and complexity to add it. Sometimes, that's because you need something tricky. Sometimes, that's because the project lacks any documentation, is written in a pretty low-level (verbose) language that obscures the high-level picture, and it's hard to figure out how it works. Sometimes, that's because FLOSS != KISS, and there is a lot of FLOSS software that has its own unique proprietary protocols and data formats, and does things in a manner that's completely incompatible and non-interoperable with any other system. I have no clue about MySQL or PostgreSQL internals (actually, I heard a word they're good in this regard), but is the case for various FLOSS projects - I've dug into some and cursed their developers.

If you overcame that and made a patch, but it's not accepted by the upstream - there are maintenance costs. How long would you be able to maintain your own fork of a large software project, keeping up with the upstream? I've tried with a few small ones (say, at a small ISP company, I just needed some custom pppd patches for in-house billing logic), and found it to be not a pleasant experience. I believe it's not just because I'm a lazy ass - I saw tons of forks on GitHub that were slowly rotting away, maintained for some time but eventually forsaken by their authors and far behind their upstreams.


That's what a lock is - something intentional. Everything else is natural complexity. Nobody made their code complex because they felt like it.

> If you overcame that and made a patch, but it's not accepted by the upstream

As for how well the FLOSS projects are managed, how's your luck at debugging let alone getting a patch accepted in MS-SQL, for instance?

> I saw tons of forks on GitHub that were slowly rotting away

I saw people with food, who weren't eating it because they were full, and I decided that food was stupid. Right?

> How long would you be able to maintain your own fork of a large software project, keeping up with the upstream?

Far longer than a binary patch... You know, to block a certain cipher suite or something in a closed-source product.


> how's your luck at debugging let alone getting a patch accepted in MS-SQL, for instance?

Duh, none or nearly none, of course.

I'm not saying that proprietary binary blobs are better. There are exceptions, but generally they're significantly worse in this regard.

I'm only saying that just because software is FLOSS doesn't mean one can successfully hack on it and adapt it to their needs, or easily migrate to other FLOSS solutions. Sometimes, they'll be stuck for a while - I think that's also can be considered as a lock-in.

> Nobody made their code complex because they felt like it.

Maybe. But there's a term "over-engineering" too. It's not that someone decides to make things unnecessarily complex, but sometimes they really do.


> Would you yourself actually choose SQL Server on Linux for a project?

It will significantly reduce the licensing implications of using SQL Server for a small project. I know people who use MSSQL in their day job but postgres/other for personal projects and one of the reasons is needing to license Windows if a personal project becomes public and needs hosting. This way a small project can use SQL Server Express on Linux just as cost free as progres (or mysql if you must) on Linux.

The choice of OS will no longer dictates the choice of DB, that decision can be entirely made on the needs of the project and its target users.

Small projects using Express edition may grow into larger ones needed standard/enterprise features, and people exposed to SQL Server through projects using it on Linux are less likely to dismiss ti later because the know postgres/other only. I'm not sure how much we'll see brand new larger projects use SQL Server on Linux any time soon though, but I imagine "get them in at the ground floor and create familiarity as early as possible" is the intention here much like cheap educational licensing for Windows, Office, & full VS and the availability of VS Express.


BizSpark is the way Microsoft encourages side projects/startups to use MS SQL, but yes, this removes a hurdle, and separating the choice of OS from the choice of DB is good.

That said, I use MS SQL both at work and on my personal projects thanks to BizSpark.

There's also Azure, which, to be fair, is still more expensive than an equivalent AWS stack with Postgre or something.


Many of the stability issues I've seen with SQL server have been the fault of the OS or Windows drivers

Conversations I had with some senior SQL server team members in the last two years suggested they were ready for no reboot patching but were limited by the OS and felt it was holding them back. If it wasn't for Windows, SQL server uptime would be much better (was the implication)

I'm looking forward to trialing it.


For smaller businesses they can join Bizspark and download SQL Server free from msdn and keep it even after the 3 year program ends.


In theory. In practice our hosting provider couldn't do this for us because of their own licencing (or it was too difficult for them to work out).

And if you are a start-up, you will upgrade your DB, in which case you might only get a few years free anyway.


Host companies have to license you a copy through the Microsoft SPLA program.


I think this will be a great option for existing SQL Server users wanting to move to Linux. Looks like MSFT is slowly giving up on Windows Server to some extent. They might be seeing more money in the Cloud business compared to the Server OS business.


Or rather hope or predict to see more money. There is a lot of wishful thinking involved in the current shift to the cloud, both from providers and their customers.


> Would you yourself actually choose SQL Server on Linux for a project?

Well, I might KEEP my SQL Server, but switch from Windows to Linux as the host OS - to save $$ on licensing and hardware.


Wasn't the core of SQL Server actually Sybase?


Kind of. Sybase and Microsoft actually partnered for a while but parted ways in the mid-90's. https://en.wikipedia.org/wiki/Adaptive_Server_Enterprise

(I actually remember using Sybase SQL server for a while until the tool that would eventually be called Enterprise Manager was released.)


> I think it is naive to criticize this move in any way - it is giving more choice so should be supported.

It's naive to support it just because of the assumption that more choice must be good.

On the contrary the past few years on Windows have been full of data corruption bugs in SQL 2012 and 2014; and while 2016 looks amazing with new features Microsoft has continued to push ahead instead of just - you know - fixing the fucking bugs we all keep reporting and upvoting on Connect and getting ignored.

More choice is IMHO not a good thing. It's a bad thing. In the same way people who "multi task" are more often than not churning out shit work and running on luck - I'd prefer Microsoft focus on what they know best.


You think that allowing corporations to move their SQL server investment off of Windows and onto Linux as a precursor to moving off of SQL server, rather than forcing them to change DBMS and OS at the same time is bad? That's what choice means.


Those who are moving off of MS SQL are "lost customers" and I don't think MS would be spending extra resources on facilitating their transition. More likely they had customers willing to switch to/stay on MS SQL but needing Linux for some unrelated reasons.


While I do think this is a good thing, I've run very good MySQL servers (well, in so far as MySQL is ever very good) on Windows, so it was already possible to do that kind of migration.


Coming from mostly a Linux/MySQL background, I've had reason to use MSSQL for the last few years. It's really solid, and seems to optimize queries well, which is good, because there's a few things that I still find really annoying, such as no GROUP_CONCAT. There is a more general method to do the same thing, but wow is that method cumbersome in comparison (although undoubtedly more powerful).


> optimize queries well

There are a few obvious cases which don't work (e.g. scalar UDFs aren't inlined - always use table UDFs), but the performance battle Oracle and Microsoft had circa 2000 really shows. The vast majority of the time you can code your intent and the query optimizer will simply do the right thing.

The main reason I would choose SQL over any of the competition is DataDude/VSSQL. We started using it where I work (~5100 schema objects) and it's revolutionary. T-SQL becomes a first-class language in Visual Studio:

- Full editor integration: intellisense and errors as you type.

- Build system: build (including static analysis) and deploy from Visual Studio. MSBuild-based integration for CI.

- Schema delta: you write/update your schema as though you are writing it for the first time and, as part of the build, the ALTER script will be generated for you. You rarely have to write migration scripts yourself (I've only had to when migrating data across columns/tables).

I have been thinking about solving this for PostgreSQL because I'm now completely unable to work without it - I just wish I had the time. Any other form of SQL development now feels like VBScript development in Notepad.


Links to more information about this? I've found some information about VSDBPro but it all is dated 2008. I like the sounds of this, but it seems dated...

Update: Seems like this has become SQL Server Data Tools (but not the BI tools of the same name?). I'll have to look into this more.


As far as I remember it has only ever been optional in VS2010. It was introduced by default in 2013 (.dbproj) and then reworked in 2015 (.sqlproj). All you need to get this going is: Visual Studio > File > New > Project > Other Languages > SQL Server.

You can also get it to reverse-engineer an existing database into a project[2].

Getting it to work for CI isn't turn-key. When you build a SQL/DB project it results in both a CREATE/ALTER script (depending on whether you do a diff build or a CREATE build) as well as a schema file (.dbschema). You need to track previous versions of the dbschema yourself (we use branching to track this file across releases), so that it can it has the base for diffing during the build.

It's a good idea to add a database reference to [sys] and [master] in all your projects (which is not done by default).

Knowing the right keywords I Googled a bit for you:

[1]: https://visualstudiomagazine.com/articles/2015/01/01/visual-...

[2]: https://www.youtube.com/watch?v=kWKllVyozOg (talks about SQLAzure, but these tools do work on-premise)

[3]: http://www.codeproject.com/Tips/998465/Creating-a-SQL-Server...

[4]: https://blogs.msdn.microsoft.com/ssdt/2014/07/24/sql-server-... (if you use TFS CI, otherwise you can invoke MSBUILD directly from whatever CI you use)


This is also my big complaint when it comes to MS SQL -- there has been precious little development on their SQL language. It's almost like they're in a complete feature-freeze when it comes to the SQL language.

MS SQL is missing some very basic SQL functionality.


SQL Server was based on Sybase SQL Server. https://en.wikipedia.org/wiki/Microsoft_SQL_Server

Microsoft rewrote the code so there was no more Sybase code, but it still followed Transact SQL standards. T-SQL is not as complete as PL/SQL and others. It is just that Microsoft is that 800 pound gorilla so people use their products.

My last job I did migration from Excel and MS-Access to SQL Server using VB 6.0 code and ADO recordsets to copy and sync up data. They key was to use a column named date modified that had a time stamp when it was last modified to figure out which record was the newest and back up the old record to a history table in case it needed to be reverted later on.


I thought Oracle or even db2 would be the Gorillas for serious corporate/goverment databases, and MySql would be the Gorilla for cheap + web stuff.

I suppose Microsoft is pretty popular in beetween those 2 markets.


Oracle IS the gorilla, but with their recent pricing/licensing moves, more large orgs are exploring moving to SQL Server, especially ones that are already a Microsoft shop. My company is doing just that.


Why not explore PostgreSQL? You would be free of licensing/pricing moves.


Not GP, but I've found SQL Server significantly easier to both install and manage. Performance-wise I've yet to encounter a significant difference.


What in particular do you miss from other SQL languages?


Date handling, simple things like postgres date_trunc('month', now()) and especially looping over records in stored procedures without using cursors (that are prone to crashes).

I work with both on a daily basis and there is a mile of difference. Yes I can accomplish whatever I want in sql server but there are so many small things that could improve.

Some of other pet peevees are: You can't use a lot of left joins, it slows down the query dramatically. If there is only one to one relation, use subselects instead.

Thou shall not do queries like this: select foo into baz from bar where fooid not in (select fooid from bar) They are the death of the engine.

And many "unwritten rules" like that.

Postgres just accepts the queries and run them..


SELECT DATEPART(month,GETUTCDATE()) ? Either you are using a very old version or not using SQL Server at all cause I pretty much do everything you are complaining about on a daily basis and on a very large set of data.


Your query only extracts the month integer from a date, it does not floor it to the month. The equivalent for sql server is this ugly thing: select dateadd(month,datediff(month,0,getdate()),0)

We are on SQL server 2014. I have had a query go from 2-3 hours down to 7 minutes by replacing left joins with subselects within the query for instance (and yes all the indices are in place as recommended by sql server query planer)

The one with a insert based on a query on the same table I have never managed to run at all and I have to use a temporary table as an intermediary instead.


The last one might be heped by trace flag 4199. You can apply it to individual queries with option (querytraceon 4199):

https://blogs.msdn.microsoft.com/psssql/2010/09/01/slow-quer...


I have seen Sql server query planer do stupid shit too. Almost always Statistics where to blame. Are you sure those where up to date? For some reason the "auto update statistics" functionality doesn't seem to work reliably, you have to update them in a maintenance job to fix this.


UPSERT (or similar logic, not merge).


Oh, then that's your problem. MERGE is standard, and has clear semantics. UPSERT is nonstandard, and semantics vary among implementations.

Microsoft doesn't need to make SQL Server look like MySQL. Quite the other way around.


There's not really a standard anymore for UPSERT to be added to. SQL has fragmented and since the last few iterations of the standard allowed for "additional non standard features", adding an UPSERT is perfectly compliant with the SQL standards.


UPSERT should be a standard, why isn't it?

And it's very useful in a ton of situations which is why so many other SQL dialects have added it. It's not like SQL Server is completely standards compliant so I'd rather have the functionality then have to use MERGE because it's "standard".


Agreed on GROUP_CONCAT. MSSQL has been missing that for a long time and it is a very reasonable and handy function to have.

It sounds like you are talking about the XML conversion method for concatenating groups. There is one other way around it using custom .NET CLR functions, but it is definitely cumbersome as well.


Yes, I'm referring to the XML conversion method. I'm aware you can register your own .NET CLR functions, but truth be told, it's a database for our POS application (which we purchased from a vendor), so I try to mess with the database as little as possible beyond reading the data through our own internal application so we can do useful things with it. I try to stay away from messing with the schema or even providing extra routines (I can't be sure anything would survive upgrades, which are regular and I don't want to deal with that).


Can you share any links which cover the more general and powerful approach in detail?


>as Ingres

I've read that SQL Server code was originally based on Sybase code they licensed. Not sure.


MS SQL Server was based on Sybase 4.2 as I recall, which is the Sybase release that predated introduction of SMP support. (Internally known as the "boat anchor" though I doubt that name appeared on the contract to sell the code.) Microsoft basically rewrote it, so I would guess there is little original Sybase code left. You can still see the the heritage clearly in features like T-SQL dialect and the Tabular Data Stream wire protocol, both of which were developed at Sybase.

Sybase is not in anyway related to Ingres that I know of other than the fact that Bob Epstein and Mike Stonebraker worked on the Ingres project at Cal. Sybase in fact was considerably different from Ingres in many ways such as use of SQL instead of QUEL, network access to the SQL Server, and specialized features like stored procedures. The Sybase founders did not come directly from Cal or even Ingres but rather from Britton Lee.


Yes, true, and Sybase is a descendant of Ingres.


Intellectually, somewhat, perhaps. Sybase was a distinct implementation. When it came on the market, Ingres IIRC was still based on QUEL. Sybase had its new T-SQL language and a novel design that made it feasible to use for OLTP. At that time, Ingres and Oracle were usually devoted to decision support, and OLTP was backed by things like VSAM.


Noted, thanks. I did think that might be the case, but I was thinking of direct parent. Anyway, good to know. Stonebraker et al?


You are actually not far off. Sybase was founded by Bob Epstein, who was Stonebraker's student at Berkeley and helped build the original INGRES system.


And Microsoft SQL Server is a descendant on Sybase since SQL Server 6, after Microsoft purchased the source code.


> SQL Server <-> PostgreSQL is the closest commercial to open source comparison (core database engine-wise) I can think of.

Can you store / query JSON-like data in SQL server? I'd like to be able to store a structured object and query a keypath.


MS SQL does support XML types / allow you to use XPath in your queries. To my knowledge it doesn't do JSON; but its easy to convert between the two.

http://blog.sqlauthority.com/2012/04/27/sql-server-introduct...



I wish people would stop saying this. The JSON support in SQL Server 2016 is far from being comparable to the JSON support in pgsql.


Well actually it "supports" JSON as NVARCHAR. There is no JSON type and no JSONB type that supports indexing like in Postgres.


Nice to know, they query syntax is, unfortunately only slightly nicer than the XML options... at least at first glance.


As the two other comments note, SQL Server 2016 will support JSON, however the extent of the support is significantly less compared to Postgresql.

This is unfortunate as I'd love to have better document-db style functionality in SQL Server.


2016 supports JSON.


It has terrible support.


Dumping a load of JSON into an RDBMS makes CJ Date cry


And it probably makes Fabian Pascal go postal :)


I too am excited, as I was faced with the challenge of building a rails app with a SQL server database backend. Sure there is a solid adapter, but it wasn't perfect, and we couldn't effectively run tests on it.


It won't be generally available until mid-2017.

I've built a Rails app atop a large legacy SQL Server, and I really haven't run into anything that wouldn't run. (aside from `rails db`)


Is this the same DB that backs the StackOverFlow/Exchange network of sites?


Yes


Looks like basically everything will be on Linux at SE now. ;-)


Stack Overflow dev here.

We don't really care that much about the OS -- never did, never will. A windows license is very cheap, and we only need maybe 20 windows licenses to run the whole network.

The much bigger cost are SQL Server licenses, for which we care very much about!

All in all we'll keep on using SQL, but on the most performant platform -- my bet is that it's going to be Windows for a while though.

Regarding .Net Core: the platform is not ready yet (e.g. lack of support for Security IIRC), but we'll adopt it when it is -- again on the fastest platform.

Both in the case of SQL Server and .Net core, we are working publicly on GitHub with Microsoft by testing and providing advice where needed to make sure the next version of our ecosystem is the best possible.


Their web/service tier are still windows, doesn't seem that likely that they'd rewrite them in .netcore fast enough to be off windows completely in the near future.


Any idea how MSSQL was deployed on RDS in AWS prior to this? Was AWS spinning up some custom Windows instances behind RDS?

Or was the RDS implementation part of how MS figured out how to release this??


SQL Server RDS is built on Windows. Albeit a scripted to the 9's adaptation to make it act more Linux like (e.g., RDS can't connect to your AD, so they had to make some sacrifices).

The big difference with RDS is that you don't get the full gambit of clustering capabilities that SQL server supports. This is mainly due to the fact they don't have a shared storage model that SQL cluster failover requires.


EBS is certainly capable of implementing the shared storage that the database requires, and the quorum driver for wolfpack ought to implementable without a physical scsi device backing it.


At least until recently the MS SQL Server clustering was based on mirroring, which is capable but deprecated in future SQL Server releases in favor of AlwaysOn clustering.


The syntax of MSSQL is very behind the competition and still based on Sybase. Even basic string functions aren't available or have weird arguments.

If you aren't locked into their eco-system stay clear away from it. Otherwise you will regret it, as the licensing costs are very expensive and per CPU core (think Oracle). Better switch to MySQL or Postgres or Lucene, etc.


There is more to a SQL server than the language.

For example, OLAP, distributed transactions, cluster scalability and so on.


You get the same features from other databases.


If we are speaking about Oracle, Informix, DB2, Sybase, yes I do agree.


yes, but there are others too. And MySQL and Postgres support some of those functionality as well.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: