Adam Leventhal's blog

Search
Close this search box.

Category: Fishworks

Today we shipped our 2009.Q3 release. Amidst the many great new features, enhancements and bug fixes, we’ve added new storage profiles for triple-parity RAID and three-way mirroring. Here’s an example on a 9 JBOD system of what you’ll see in the updated storage configuration screen:

Note that the new Triple parity RAID, wide stripes option replaces the old Double parity RAID, wide stripes configuration. With RAID stripes that can easily be more than 40 disks wide, and resilver times that can be quite long as a result, we decided that the additional protection of triple-parity RAID trumped the very small space efficiency of double-parity RAID.

Ryan Matthews has updated the space calculator for the 7310 and 7410 to include the new profiles. Download the new update and give it a shot.

At the Flash Memory Summit today, Sun’s own Michael Cornwell delivered a keynote excoriating the overall direction of NAND flash and SSDs. In particular, he spoke of the “lithography death march” as NAND vendors push to deliver the most cost-efficient solution while making huge sacrifices in reliability and performance.

On Wednesday, August 12, I’ll be giving two short talks as part of sessions on flash-enabled power savings and data center applications:

In the evening from 7:30 to 9:00, I’ll be hosting a table discussion of software as it pertains to flash. We’ll be talking about uses of flash such as the Hybrid Storage Pool, how software can enable the use of MLC flash in the enterprise, the role of the flash translation layer, and anything else that comes up.

Today we’re introducing a new member to the Sun Unified Storage family: the Sun Storage 7310. The 7310 is a scalable system from 12TB with a single half-populated J4400 JBOD up to 96TB with 4 JBODs. You can combine two 7310 head units to form a cluster. The base configuration includes a single quad-core CPU, 16GB of DRAM, a SAS HBA, and two available PCIe slots for NICs, backup cards, or the Fishworks cluster card. The 7310 can be thought of as a smaller capacity, lower cost version of the Sun Storage 7410. Like the 7410 it uses high density, low power disks as primary storage and can be enhanced with Readzilla and Logzilla flash accelerators for high performance. Like all the 7000 series products, the 7310 includes all protocols and software features without license fees.

The 7310 is an entry-level clusterable, scalable storage server, but the performance is hardly entry-level. Brendan Gregg from the Fishworks team has detailed the performance of the 7410, and has published the results of those tests on the new 7310. Our key metrics are cached reads from DRAM, uncached reads from disk, and writes to disk all over two 10GbE links with 20 client systems. As shown in the graph, the 7310 is an absolute champ, punching well above its weight. The numbers listed are in units of MB/s. Notice that the recent 2009.Q2 software update brought significant performance improvements to the 7410, and that the 7310 holds its own. For owners of entry-level systems from other vendors, check for yourself, but the 7310 is a fire-breather.

Added to the low-end 7110, the dense, expandable 7210, the high-end clusterable, expandable 7410, the 7310 fills an important role in the 7000 series product line: an entry-level clusterable, expandable system, with impressive performance, and an attractive price. If the specs and performance have piqued your interest, try out the user interface on the 7000 series with the Sun Storage 7000 simulator.

As flash memory has become more and more prevalent in storage from the consumer to theenterprise people have been charmed by the performance characteristics, but get stuck on the longevity. SSDs based on SLC flash are typically rated at 100,000 to 1,000,000 write/erase cycles while MLC-based SSDs are rated for significantly less. For conventional hard drives, the distinct yet similar increase in failures over time has long been solved by mirroring (or other redundancy techniques). When applying this same solution to SSDs, a common concern is that two identical SSDs with identical firmware storing identical data would run out of write/erase cycles for a given cell at the same moment and thus data reliability would not be increased via mirroring. While the logic might seem reasonable, permit me to dispel that specious argument.

The operating system and filesystem

From the level of most operating systems or filesystems, an SSD appears like a conventional hard drive and is treated more or less identically (Solaris’ ZFS being a notable exception). As with hard drives, SSDs can report predicted failures though SMART. For reasons described below, SSDs already keep track of the wear of cells, but one could imagine even the most trivial SSD firmware keeping track of the rapidly approaching write/erase cycle limit and notifying the OS or FS via SMART which would in turn the user. Well in advance of actual data loss, the user would have an opportunity to replace either or both sides of the mirror as needed.

SSD firmware

Proceeding down the stack to the level of the SSD firmware, there are two relevant features to understand: wear-leveling, and excess capacity. There is not a static mapping between the virtual offset of an I/O to an SSD and the physical flash cells that are chosen by the firmware to record the data. For a variety of reasons — flash call early mortality, write performance, bad cell remapping — it is necessary for the SSD firmware to remap data all over its physical flash cells. In fact, hard drives have a similar mechanism by which they hold sectors in reserve and remap them to fill in for defective sectors. SSDs have the added twist that they want to maximize the longevity of their cells each of which will ultimately decay over time. To do this, the firmware ensures that a given cell isn’t written far more frequently than any other cell, a process called wear-leveling for obvious reasons.

To summarize, subsequent writes to the same LBA, the same virtual location, on an SSD could land on different physical cells for the several reasons listed. The firmware is, more often than not, deterministic thus two identical SSDs with the exact same physical media and I/O stream (as in a mirror) would behave identically, but minor timing variations in the commands from operating software, and differences in the media (described below) ensure that the identical SSDs will behave differently. As time passes, those differences are magnified such that two SSDs that started with the same mapping between virtual offsets and physical media will quickly and completely diverge.

Flash hardware and physics

Identical SSDs with identical firmware, still have their own physical flash memory which can vary in quality. To break the problem apart a bit, an SSD is composed of many cells, and each cell’s ability to retain data slowly degrades as it’s exercised. Each cell is in fact a physical component of an integrated circuit composed. Flash memory differs from many other integrated circuits in that it requires far higher voltages than others. It is this high voltage that causes the oxide layer to gradually degrade over time. Further, all cells are not created equal — microscopic variations in the thickness and consistency of the physical medium can make some cells more resilient and others less; some cells might be DOA, while others might last significantly longer than the norm. By analogy, if you install new light bulbs in a fixture, they might burn out in the same month, but how often do they fail on the same day? The variability of flash cells impacts the firmware’s management of the underlying cells, but more trivially it means that two SSDs in a mirror would experience dataloss of corrsponding regions at different rates.

Wrapping up

As with conventional hard drives, mirroring SSDs is a good idea to preserve data integrity. The operating system, filesystem, SSD firmware, and physical properties of the flash medium make this approach sound both in theory and in practice. Flash is a new exciting technology and changes many of the assumptions derived from decades of experience with hard drives. As always proceed with care — especially when your data is at stake — but get the facts, and in this case the wisdom of conventional hard drives still applies.

On the heels of the 2009.Q2.0.0 release, we’ve posted an update to the Sun Storage 7000 simulator. The simulator contains the exact same software as the other members of the 7000 series, but runs inside a VM rather than on actual hardware. It supports all the same features, and has all the same UI components; just remember that an actual 7000 series appliance is going to perform significantly better than a VM running a puny laptop CPU. Download the simulator here.

The new version of the simulator contains two enhancements. First, it comes with the 2009.Q2.0.0 release pre-installed. The Q2 release is the first to provide full support for the simulator, and as I wrote here you can simply upgrade your old simulator. In addition, while the original release of the simulator could only be run on VMware we now support both VMware and VirtualBox (version 2.2.2 or later). When we first launched the 7000 series back in November, we intended to support the simulator on VirtualBox, but a couple of issues thwarted us, in particular lack of OVF support and host-only networking. The recent 2.2.2 release of VirtualBox brought those missing features, so we’re pleased to be able to support both virtualization platforms.

As OVF support is new in VirtualBox, here’s a quick installation guide for the simulator. After uncompressing the SunStorageVBox.zip archive, select “Import Appliance…”, and select “Sun Storage VirtualBox.ovf”. Clicking through will bring up a progress bar. Be warned: this can take a while depending on the speed of your CPU and hard drive.

When that completes, you will see the “Sun Storage VirtualBox” VM in the VirtualBox UI. You may need to adjust settings such as the amount of allocated memory, or extended CPU features. Run the VM and follow the instructions when it boots up. You’ll be prompted for some simple network information. If you’re unsure how to fill in some of the fields, here are some pointers:

  • Host Name – whatever you want
  • DNS Domain – “localdomain”
  • Default Router – the same as the IP address but put 1 as the final octet
  • DNS Server – the same as the IP address but put 1 as the final octet
  • Password – whatever you want and something you can remember

When you complete that form, wait until you’re given a URL to copy into a web browser. Note that you’ll need to use the version of the URL with the IP address (unless you’ve added an entry to your DNS server). In the above example, that would be: https://192.168.56.101:215/. From the web browser, complete the appliance configuration, and then you can start serving up data, observing activity with Storage Analytics, and kicking the tires on a functional replica of a 7000 series appliance.

Today we released version 2009.Q2.0.0, the first major software update for the Sun Storage 7000 series. It includes a bunch of new features, bug fixes, and improvements. Significantly for users of the Sun Storage 7000 simulator, the virtual machine version of the 7000 series, this is the first update that supports the VMs. As with a physical 7000 series appliance, upgrade by navigating to Maintenance > System, and click the + icon next to Available Updates. Remember not to ungzip the update binary — the appliance will do that itself. We’ll be releasing an update VM preinstalled with the new bits so stay tuned.

Note: There were actually two releases of the VMware simulator. The first one came right around our initial launch, and the version string is ak-2008.11.07. This version cannot be upgraded so you’ll need to download the updated simulator whose version is ak-2008.11.21. As noted above, we’ll soon be releasing an updated VM with 2009.Q2.0.0 (ak-2009.04.10.0.0) preinstalled.

Recent Posts

January 22, 2024
January 13, 2024
December 29, 2023
February 12, 2017
December 18, 2016
August 9, 2016

Archives

Archives