Barry Burke, the Storage Anarchist, has written an interesting roundup (“don’t miss the amazing vendor flash dance”) covering the flash strategies of some players in the server and storage spaces. Sun’s position on flash comes out a bit mangled, but Barry can certainly be forgiven for missing the mark since Sun hasn’t always communicated its position well. Allow me to clarify our version of the flash dance.
Barry’s conclusion that Sun sees flash as well-suited for the server isn’t wrong — of course it’s harder to drive high IOPS and low latency outside a single box. However, we’ve also demonstrated not only that we see a big role for flash in storage, but that we’re innovating in that realm with the Hybrid Storage Pool (HSP), an architecture that seamlessly integrates flash into the storage hierarchy. Rather than a Ron Popeil-esque sales pitch, let me take you through the genesis of the HSP.
The HSP is something we started to develop a bit over two years ago. By January of 2007, we had identified that a ZFS intent-log device using flash would greatly improve the performance of the nascent Sun Storage 7000 series in a way that was simpler and more efficient than some other options. We started getting our first flash SSD samples in February of that year. With SSDs on the brain, we started contemplating other uses and soon came up with the idea of using flash as a secondary caching tier between the DRAM cache (the ZFS ARC) and disk. We dubbed this the L2ARC.
At that time we knew that we’d be using mostly 7200 RPM disks in the 7000 series. Our primary goal with flash was to greatly improve the performance of synchronous writes and we addressed this with the flash log device that we call Logzilla. With the L2ARC we solved the other side of the performance equation by improving read IOPS by leaps and bounds over what hard drives of any rotational speed could provide. By August of 2007, Brendan had put together the initial implementation of the L2ARC, and, combined with some early SSD samples — Readzillas — our initial enthusiasm was borne out. Yes, it’s a caching tier so some workloads will do better than others, but customers have been very pleased with their results.
These two distinct uses of flash comprise the Hybrid Storage Pool. In April 2008 we gave our first public talk about the HSP at the IDF in Shanghai, and a year and a bit after Brendan’s proof of concept we shipped the 7410 with Logzilla and Readzilla. It’s important to note that this system achieves remarkable price/performance through its marriage of commodity disks with flash. Brendan has done a terrific job of demonstrating the performance enabled by the HSP on that system.
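In ZFS terms, the two halves of the HSP map onto dedicated `log` and `cache` vdevs in a pool. A minimal sketch of how such a pool is assembled (the device names are illustrative, not from any actual 7000-series configuration):

```shell
# Create a pool backed by commodity 7200 RPM disks
zpool create tank mirror c0t0d0 c0t1d0

# Add a write-optimized flash SSD as the ZFS intent-log device
# ("Logzilla"): synchronous writes commit to flash, not spinning disk
zpool add tank log c2t0d0

# Add a read-optimized flash SSD as an L2ARC cache device
# ("Readzilla"): a second caching tier between the DRAM ARC and disk
zpool add tank cache c2t1d0
```

ZFS then manages both flash tiers automatically; no application or administrator involvement is needed to decide what lands on flash versus disk.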
While we were finishing the product, the WSJ reported that EMC was starting to incorporate flash drives into their products. I was somewhat deflated initially, until it became clear that EMC’s solution didn’t integrate flash into the storage hierarchy nearly as seamlessly or elegantly as we had with the HSP; instead they had merely replaced their fastest, most expensive drives with faster and even more expensive SSDs. I’ll disagree with the Storage Anarchist’s conclusion: EMC did not start the flash revolution, nor are they leading the way (though I don’t doubt they are, as Barry writes, “Taking Our Passion, And Making It Happen”). EMC has, though, done a great service to the industry by extolling the virtues of SSDs and, presumably, to EMC customers by providing a faster tier for HSM.
In the same article, Barry alludes to some of the problems with EMC’s approach using SSDs from STEC:
STEC rates their ZeusIOPS drives at something north of 50,000 read IOPS each, but as I have explained before, this is a misleading number because it’s for 512-byte blocks, read-only, without the overhead of RAID protection. A more realistic expectation is that the drives will deliver somewhere around 5-6000 4K IOPS (4K is a more typical I/O block size).
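To see why the 512-byte figure flatters the drive, compare the bandwidth implied by each number (my arithmetic, using the figures Barry quotes):

```shell
# Quoted figure: 50,000 IOPS at 512-byte blocks
echo "512B: $(( 50000 * 512 / 1000000 )) MB/s"   # prints "512B: 25 MB/s"

# Realistic figure: ~5,500 IOPS at 4K blocks
echo "4K:   $(( 5500 * 4096 / 1000000 )) MB/s"   # prints "4K:   22 MB/s"
```

In bandwidth terms the two figures are nearly identical; the headline IOPS number mostly reflects the tiny block size rather than a tenfold difference in what the drive can actually deliver.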
The Hybrid Storage Pool avoids the bottlenecks associated with a tier 0 approach, drives much higher IOPS, scales, and makes highly efficient, economical use of resources from DRAM to flash to disk. Further, I think we’ll be able to debunk the notion that the enterprise needs its own class of flash devices by architecting commodity flash into an enterprise solution. There are a lot of horses in this race; Barry has clearly already picked his, but the rest of you may want to survey the field.
6 Responses
From reading Barry’s roundup it would appear that not only has he chosen his horse, but that he’s planted squarely on its back wearing an EMC jersey.
Adam is (understandably) being a tad modest here about both of the key observations, but just to set the record straight: it was Adam who initially suggested the use of flash in lieu of NVRAM for ZIL performance, and it was he again who (after we had decided to use write-optimized flash for the ZIL) wondered if we couldn’t use flash on the read side as well. There is also a degree to which necessity (or at least, opportunity) was the mother of this invention: in toro (a.k.a. the 7410), we had six empty 2.5" SATA drive bays that were just begging to be filled with something — and it was Adam who asked "why not flash-based SSDs?" Now that flash is in vogue, these innovations may seem obvious or self-evident in retrospect, but having been there at the moment, I can say with confidence that they were much subtler than they might now appear!
@Chris Thanks for the tip; I posted an update:
http://dtrace.org/blogs/ahl/more_anarchy
@Bryan Thank you as always for helping to write an accurate history, and for the kind words.
I wrote a blog entry a while ago about how smart it is to replace Hard Disk Drives with SSDs in traditional arrays.
http://blogs.sun.com/studler/entry/why_you_should_avoid_placing
@Anatol That’s an interesting post, but I’m not sure I agree with your assertions. In particular, I think there’s a place for flash both in servers and in storage boxes. The former may provide better performance for some applications, but the latter gives seamless, OS-agnostic integration and the usual benefits of consolidation and virtualization.
@Adam
I think we don’t disagree with each other. The key message in my blog entry was to avoid running SSDs behind traditional RAID controller technology; those controllers are not built for this. Controllers can benefit from flash, but not by just replacing traditional disks with flash disks; it takes more than that.
My most recent blog entry about flash mentions a few ideas for where and how you could use flash. But you need to build new "systems/solutions" to take full advantage of flash technology.
http://blogs.sun.com/studler/entry/open_flash