Nov 17, 2010

I was at the Melbourne SQL Server user group last night, listening to Greg Linwood (of MyDBA) present on solid state disks (SSDs).  It was a very informative session, with Greg going into detail on how SSDs work and the specific things you have to keep in mind when using them, such as providing sufficient OS memory to manage the SSD’s allocations, and over-provisioning so there is enough spare space to absorb the fact that updates are written to a new location and the old space is not freed synchronously.

The first of the three main points of the session was that SQL Server is a read-oriented system, in that to perform any write operation, the rows to be updated or deleted must first be located and read in from disk.  In particular, indexes are the objects most likely to benefit from being on SSDs, as their access pattern is completely random – an insert into a clustered index may predictably go to the end of the table via an IDENTITY value, but the non-clustered indexes will experience inserts into the middle of their B-tree structures.
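
As a toy illustration of that access pattern (my own sketch, not from the talk): rows inserted with a sequential IDENTITY key always append to the end of the clustered ordering, while a non-clustered index on an unrelated column sees the same rows land at effectively random positions in its sorted structure.

```python
import bisect
import random

random.seed(42)

# Hypothetical example: rows arrive with a sequential IDENTITY key,
# but the non-clustered index is ordered by an unrelated column
# (here, a random "name"), so each insert lands at an arbitrary
# position in the sorted structure.
clustered = []        # ordered by id: appends only
name_index = []       # ordered by name: mostly mid-structure inserts
mid_inserts = 0

for row_id in range(1, 101):
    name = f"user-{random.randrange(10_000):04d}"
    clustered.append(row_id)                    # always at the end
    pos = bisect.bisect_left(name_index, name)  # effectively random position
    if 0 < pos < len(name_index):
        mid_inserts += 1
    name_index.insert(pos, name)

print(f"{mid_inserts} of 100 name-index inserts landed mid-structure")
```

On a spinning disk each of those mid-structure inserts risks a seek to a different page; on an SSD the random placement costs essentially nothing extra.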

The second main point was that enterprise-level SSDs, such as Fusion IO’s offerings, provide a large reduction in space requirements (the devices can be packed much more densely) and cooling requirements (there are no moving parts, so things don’t get hot), and consequently lower power requirements.  There is also less need to RAID SSDs together – individual hard disks fail frequently due to their mechanical nature, but SSDs are less likely to experience this.  Of course, the entire SSD could fail, but this is akin to losing your RAID controller.  If your transaction logs are mirrored on a hard disk array (which is ideally suited to sequential write operations), then the SSD’s error checking and correction capabilities (Hamming codes, giving single error correction and double error detection, which is superior to RAID-5’s single error detection) may allow you to trust your data files to a single device.  I should point out that this is not a recommendation – it is up to you to perform your own risk assessment and contingency planning.
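
To make that SECDED (single error correction, double error detection) behaviour concrete, here is a minimal sketch of an extended Hamming(8,4) code – my own illustration, since real SSD controllers use much larger codes, but the correct-one-flip/detect-two-flips property works the same way:

```python
def encode(data4):
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    c = [0] * 8                       # c[0] = overall parity, c[1..7] = Hamming(7,4)
    c[3], c[5], c[6], c[7] = data4
    c[1] = c[3] ^ c[5] ^ c[7]         # parity over positions with bit 0 set
    c[2] = c[3] ^ c[6] ^ c[7]         # parity over positions with bit 1 set
    c[4] = c[5] ^ c[6] ^ c[7]         # parity over positions with bit 2 set
    c[0] = c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]
    return c

def decode(c):
    """Return ('ok' | 'corrected' | 'double-error', data bits or None)."""
    syndrome = 0
    for i in range(1, 8):             # XOR of positions holding a 1
        if c[i]:
            syndrome ^= i
    overall = 0
    for bit in c:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        return "ok", [c[3], c[5], c[6], c[7]]
    if overall == 1:                  # odd number of flips: single error, fixable
        c = c[:]
        c[syndrome] ^= 1              # syndrome 0 means the overall parity bit itself
        return "corrected", [c[3], c[5], c[6], c[7]]
    return "double-error", None       # even flips, nonzero syndrome: detect only

word = encode([1, 0, 1, 1])
bad = word[:]
bad[5] ^= 1
print(decode(bad))                    # single flip is corrected
bad[6] ^= 1
print(decode(bad))                    # two flips are detected, not corrected
```

RAID-5’s single parity stripe can only detect one error (and reconstruct only when the failed disk identifies itself), which is why SECDED comes out ahead in this comparison.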

In terms of cost, hard disks beat SSDs when comparing single units, but the superior performance of SSDs, coupled with the lower space, cooling and power requirements, can make SSDs significantly cheaper overall.  Compare the TPC’s current TPC-C benchmarks – the best-performing hard disk solution comes in at 6 million transactions per minute with $20 million in storage costs, while the best-performing SSD solution comes in at 10 million transactions per minute with $3 million in storage costs. (It’s worth noting that both of these are DB2 systems, not SQL Server, but the RDBMS isn’t as important here.)
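
Doing the arithmetic on those quoted figures shows why the per-unit price is misleading – storage cost per transaction-per-minute is roughly an order of magnitude lower for the SSD system:

```python
# Back-of-the-envelope check of the TPC-C figures quoted above
# (numbers as cited in the talk, rounded).
hdd_tpm, hdd_storage_cost = 6_000_000, 20_000_000   # tx/min, USD
ssd_tpm, ssd_storage_cost = 10_000_000, 3_000_000

hdd_dollars_per_tpm = hdd_storage_cost / hdd_tpm    # ~$3.33 of storage per tpm
ssd_dollars_per_tpm = ssd_storage_cost / ssd_tpm    # $0.30 of storage per tpm

print(f"HDD: ${hdd_dollars_per_tpm:.2f}/tpm, SSD: ${ssd_dollars_per_tpm:.2f}/tpm")
```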

SSDs are definitely the next big thing, and are now considered stable and mainstream enough to improve what can be considered the biggest bottleneck in database systems today – disk I/O.  Greg presented a captivating session and definitely cleared up a lot of misconceptions around SSD technology and its use with database systems.