Update: Aaron responded to my post (please see comments below) so I've made a minor edit to the post.
At this morning's first Azure talk at VSLive Las Vegas, I thought I heard Aaron Skonnard state that Azure Table Storage was irrelevant, to be replaced by SQL Data Services (his views are in the comments). I have to say, that probably didn't really help with attendance at my after lunch talk on ATS, but it did get me thinking even more about why ATS is important and why SQL Server has scalability issues.
Let me state, first and for the record, SQL Server scales. There's a limit to how much it can scale before you need to add clustering (which is a whole different can of worms) but SQL Server scalability is all related to cost, throwing more and faster hardware at the problem. The number of transactions you can process in a second, the number of disk reads, etc. SQL Server can handle a lot of transactions if it is properly designed. But SQL Server scales through hardware. Azure Table Storage scales through simplicity, meaning you can do more transactions on with ATS than SQL Server on equivalent hardware.
All or most of the management that's a part of the SQL Server RDBMS is done away with. Even if they are using SQL Server on the back end, they are still using a lot less of the management that accompanies a traditional SQL Server architecture.
Consider, if you will, the true cost of an insert into a table in a well-normalized database. Insert a new row of data into a well-normalized database and you have executed a transaction. You only need to use BEGIN TRANS when you're executing a set of SQL Statements. Even without BEGIN TRANS, SQL Server treats that single INSERT as a transaction that begins when you execute the insert and ends when SQL Server has finished. Have you ever considered what goes into that single transaction? If we dig beneath the covers, you'll find something like this:
- Verify the Primary Key is unique
- Check the physical page where you need to insert the row and insert the data
- If there isn't enough room on the page, allocate a new page, possibly splitting the data across the old and new pages
- For each index associated with the table, add a new entry in the appropriate location
- If there isn't enough room on the page where the index entry is going to go, allocate a new page, possibly splitting data across the old and new pages
- For each foreign key relationship defined for the table, verify that the foreign key information is correct
At any stage in this process, for any point of failure, SQL Server needs to ROLLBACK the transaction. According to Microsoft, this means:
ROLLBACK TRANSACTION erases all data modifications made from the start of the transaction or to a savepoint. It also frees resources held by the transaction.
In other words, erase the insert as though it never happened. Which really means that you can't do anything with the table that I'm inserting to (and possibly some tables that I have foreign key relationships with) until SQL Server either commits or rolls back the transaction. Which is to say, SQL Server blocks other processes. Now, SQL Server can do all of the above steps blazingly quickly and DBAs and SQL Developers know a lot of tricks for tweaking the last bit of work out of SQL Server. But that's indicative of why SQL Server, which may work fine for 80% of the cases, can have scalability issues. Sure, it will scale, but the cost in hardware and extremely talented consultants and DBAs who have to monitor the server cluster like a mother hen brooding over her eggs can be exorbitant.
Azure Table Storage, on the other hand, is simpler. It doesn't have the management overhead. It puts data in, updates data, deletes data, merges data. But it doesn't index, it doesn't validate, it doesn't handle many of the things that SQL Server does so well for us. Not doing that, however, allows it to scale so that the same hardware does more. If I can get more out of less hardware, there's a cost component involved. Granted, there is a cost component for development that's not required for SQL Server, but I suspect that there will be a set of classes that folks develop to handle that as best practices are discovered.
I don't think ATS will replace SQL Server, but I also don't think SQL Data Services will replace the need for ATS. I think both have their place in the proper situations and the near future will help determine which is appropriate for different scenarios.