August 2010 – Adam Leventhal's blog

The future of Solaris

August 27, 2010

In 2005, Sun released the source code to Solaris, described then as the company’s crown jewel. Why do this? The simplest answer is that Solaris had been losing ground to an open source competitor in Linux. Losing ground was a symptom of economics. Students who had once been raised on Solaris were being inculcated with Linux knowledge. The combination of Linux and x86 was good enough and significantly cheaper; new companies for whom the default had once been Sun/Solaris/SPARC were instead building on x86/Linux. OpenSolaris, along with x86 support, was specifically intended to address this trend. Indeed, the codename for OpenSolaris was “tonic”: the tonic for Solaris’ problems.

To that end, OpenSolaris was on reasonably stable footing: open source had become expected for an operating system, source code availability was a benefit to traditional enterprise users (especially with the advent of DTrace), and the community would attract new users. But then Solaris lost the plot. Users chose Solaris because it is a — or perhaps the — enterprise operating system. OpenSolaris was intended to broaden the appeal, but that notion was taken to such extremes as to lose sight of the traditional customers of Solaris, and, indeed, the focus that makes Solaris both unique and great.

OpenSolaris June 14, 2005 – August 13, 2010

We launched Solaris 10 in 2004 with an impressive list of features (ZFS, DTrace, Zones, SMF, FMA, Fire Engine), all highly relevant for enterprise users. You can find a company that has bet its business on the success of each of those features. In the wake of OpenSolaris, the decision was made (and here I can no longer use the active voice because by then I had left to start Fishworks elsewhere at Sun) to have an explicit focus on building an operating system for developers, which is to say, for their laptops. This was an error, but a predictable one. Once Solaris was free to download and use, revenue recognition for the Solaris organization, which has always been difficult to measure, became even more indirect. The metrics were changed: the targets for management bonuses became not revenue, or enterprise users, but downloads. Directly or indirectly, much of the focus for the Solaris organization shifted to address that straightforward goal. The mistake was that OpenSolaris didn’t need to find users; they found Solaris. In trying to build a community, the new direction for OpenSolaris weakened the very principles upon which a thriving community would have been based.

The very name “OpenSolaris” got confused, diluted, and polluted. OpenSolaris was a source repository, a community, and a distro (although purists still insist that Indiana is the appropriate name for that part) intended to “close the familiarity gap” with Linux. Moreover, new projects that shifted efforts away from enterprise uses (read: paying customers) to focus on the laptop also rallied under the banner of “OpenSolaris”. In a way, Oracle’s acquisition of Sun saved Solaris from itself; the marching orders became much clearer: address enterprise users, ship Solaris 11 (something that should not have taken 6 years). As for OpenSolaris, that decision too was likely simple for Oracle, never an overt fan of open source. Had “OpenSolaris” simply meant a code base and user community, I think there’s a good chance it would have been allowed to live. Burdened, however, with the baggage of the Indiana distro and sundry projects incomprehensible to Oracle management, OpenSolaris was in a politically untenable position. Mike’s “Friday the 13th memo” merely made it official: Solaris was to be closed source once more.

Sun’s efforts with OpenSolaris were, at best, a mixed success. Quietly, however, an ecosystem of companies grew out of the technologies in OpenSolaris. Notably Joyent uses Zones and DTrace as significant differentiators; Nexenta builds very heavily on ZFS; as I’ve mentioned, Delphix, my new employer, builds on OpenSolaris as well. There are many more that I know about, and still more that I don’t. These companies chose OpenSolaris so they could use the innovative technologies that simply aren’t available anywhere else. And they did so in spite of a common trend towards Linux with its familiarity, and broad compatibility — the innovation in Solaris was more valuable and, in some cases, enabling for the company’s business.

illumos August 3, 2010 –

The danger for those companies has long been that Oracle would pull the rug out from under them; only the foolish had no contingency plan. The options were to give up on Solaris or maintain a fork. Happily illumos has stepped in to offer a third path. Garrett D’Amore and Nexenta graciously started the illumos project to carry the OpenSolaris torch. It is an ostensible fork of OpenSolaris (can you fork a dead project?), but more importantly a mechanism by which companies building on those component technologies can pool their resources, amortize their costs, and build a community by and for the downstream users who are investing in those same technologies. Rather than being operated by a single corporate interest, its steward will be a 501(c)(3) non-profit in the model of the Mozilla Foundation.

I was pleased to announce at tonight’s SVOSUG meeting that I’ll be joining the illumos developer council; I was delighted to accept when Garrett offered me the position. My bias for illumos is that the main repository will focus on reliability, performance, and compatibility while taking a conservative approach to new features and functionality. As much as possible, I’d like the downstream users (the distributions, appliances, and platforms) to make the decisions appropriate to their uses and only adopt large-scale changes into the trunk when there’s broad consensus among them. The goal must be to build a project that is readily useful to everyone and to allow our collective efforts to be shared as easily as possible.

What’s the future of Solaris? For many it will be Solaris 11 in late 2011. But for others, it will be illumos either as the firmware for an appliance (not unlike what we built at Fishworks in the 7000 series), the platform for your web applications, or as a general purpose operating system. The innovation in Solaris has always flowed from the creative individuals working on the project. Keep your eyes on illumos; Oracle ending OpenSolaris is no surprise, but in doing so they have broken their own monopoly on Solaris and Solaris talent.

Joining Delphix

August 24, 2010

As I wrote about last time, I’ve left Oracle. What I was looking for in my next gig was technology that excites me, excellent management, and a chance to build something significant and successful. I’m confident that I’ve found those things with Delphix.

In the established database market, Delphix creates a virtualization layer that simplifies the management of data and reduces duplication and waste. Why’s that interesting? The most important data is in databases, so building a layer between data and storage is incredibly powerful. The software to achieve that can then grow in a variety of directions, from data analysis and tuning, optimization at the level of the operating system and file system, to integration up the stack. The notion of storage virtualization is a popular, albeit vague, one. Delphix brings both a concrete definition and value as well as a unique, application-centric focus.

Delphix builds on top of OpenSolaris, which was, of course, another compelling reason for me to join. The Solaris group constructed a platform unique in its facilities for developers and in its comprehensive manageability. As I looked at various prospective employers I came to an even better appreciation of how tough it would be to work in an environment without DTrace, and mdb, and pstack, and libumem, and SMF, and FMA, etc. etc. Of course, Oracle has now withdrawn support for OpenSolaris, but we won’t be going it alone (stay tuned for more on that).

It’s that combination of technology that’s interesting both at a high level and in the details, a management team that’s experienced and hungry, innovation in a market where we can have a lasting impact, and an initial product that proves the potential yet with many hard problems still left to solve. But it’s the people who build a company; Delphix both has a great team and a commitment to assembling talent second to none. I’m excited to get started (… after a couple of weeks of much needed decompression).

Leaving Oracle

August 18, 2010

I joined the Solaris Kernel Group in 2001 at what turned out to be a remarkable place and time for the industry. More by luck and intuition than by premonition, I found myself surrounded by superlative engineers working on revolutionary technologies that were the products of their own experience and imagination rather than managerial fiat. I feel very lucky to have worked with Bryan and Mike on DTrace; it was amazing that just down the hall our colleagues reinvented the operating system with Zones, ZFS, FMA, SMF and other innovations.

With Solaris 10 behind us, lauded by customers and pundits, I was looking for that next remarkable place and time, and found it with Fishworks. The core dozen or so are some of the finest engineers I could hope to work with, but there were so many who contributed to the success of the 7000 series: the executives who enabled our fledgling skunkworks in its nascent days, our Solaris colleagues building fundamental technologies like ZFS, COMSTAR, SMF, networking, I/O, and IPS, and the OpenStorage team who toiled to bring a product to market, educating us without deflating our optimism in the process.

I would not trade the last 9 years for anything. There are many engineers who never experience a single such confluence of talent, organizational will, and success; I’m grateful to my colleagues and to Sun for those two opportunities. Now I’m off to look for my next remarkable place and time beyond the walls of Oracle. My last day will be August 20th, 2010.

Thank you to the many readers of this blog. After six years and 130 posts I’d never think of giving it up. You’ll be able to find my new blog at dtrace.org/blogs/ahl (comments to this post are open there); I can’t wait to begin chronicling my next endeavors. You can reach me by email here: my initials at alumni dot brown dot edu. I look forward to your continued comments and emails. Thanks again!

Fishworks history of SSDs

August 17, 2010

This year’s Flash Memory Summit got me thinking about our use of SSDs over the years at Fishworks. The picture on the left is a visual history of SSD evals in rough chronological order from the oldest at the bottom to the newest at the top (including some that have yet to see the light of day).

Early Days

When we started Fishworks, we were inspired by the possibilities presented by ZFS and Thumper. Those components would be key building blocks in the enterprise storage solution that became the 7000 series. An immediate deficiency we needed to address was how to deliver competitive performance using 7,200 RPM disks. Folks like NetApp and EMC use PCI-attached NV-DRAM as a write accelerator. We evaluated something similar, but found the solution lacking because it had limited scalability (the biggest NV-DRAM cards at the time were 4GB), consumed our limited PCIe slots, and required a high-speed connection between nodes in a cluster (e.g. IB, further eating into our PCIe slot budget).

The idea we had was to use flash. None of us had any experience with flash beyond cell phones and USB sticks, but we had the vague notion that flash was fast and getting cheaper. By luck, flash SSDs were just about to be where we needed them. In late 2006 I started evaluating SSDs on behalf of the group, looking for what we would eventually call Logzilla. At that time, SSDs were getting affordable, but were designed primarily for environments such as military use where ruggedness was critical. The performance of those early SSDs was typically awful.

Logzilla

STEC (still Simpletech in those days) realized that their early samples didn’t really suit our needs, but they had a new device (partly due to the acquisition of Gnutech) that would be a good match. That first sample was fibre-channel and took some finagling to get working (memorably it required a metric screw of an odd depth), but the Zeus IOPS, an 18GB 3.5″ SATA SSD using SLC NAND, eventually became our Logzilla (we’ve recently updated it with a SAS version for our updated SAS-2 JBODs). Logzilla addressed write performance economically and scalably, in a way that also simplified clustering; the next challenge was read performance.

Readzilla

Intent on using commodity 7,200 RPM drives, we realized that our random read latency would be about twice that of 15K RPM drives (duh). Fortunately, most users don’t access all of their data randomly (regardless of how certain benchmarks are designed). We already had much more DRAM cache than other storage products in our market segment, but we thought that we could extend that cache further by using SSDs. In fact, the invention of the L2ARC followed a slightly different thought process: seeing the empty drive bays in the front of our system (just two were used as our boot disks) and the piles of SSDs lying around, I stuck the SSDs in the empty bays and figured out how we’d use them.
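The "about twice" figure for random reads can be checked with a little arithmetic. The sketch below is a rough illustration in Python that considers only rotational latency (on average, a random read waits half a revolution); it assumes seek and transfer times are ignored, which real drives would also differ in. The function name is made up for this example.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds: the time for half a revolution."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


# Assumption: rotational latency only; seek and transfer times are ignored.
lat_7200 = avg_rotational_latency_ms(7_200)
lat_15k = avg_rotational_latency_ms(15_000)
ratio = lat_7200 / lat_15k

print(f"7,200 RPM: {lat_7200:.2f} ms")  # 4.17 ms
print(f"15K RPM:   {lat_15k:.2f} ms")   # 2.00 ms
print(f"ratio:     {ratio:.2f}x")       # 2.08x
```

By this measure alone, a 7,200 RPM drive waits a little over twice as long as a 15K RPM drive for the platter to come around, which is the gap a larger cache has to hide.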

It was again STEC who stepped up to provide our Readzilla, a 100GB 2.5″ SATA SSD using SLC flash.

Next Generation

Logzilla and Readzilla are important features of the Hybrid Storage Pool. For the next generation expect the 7000 series to move away from SLC NAND flash. It was great for the first generation, but other technologies provide better $/IOPS for Logzilla and better $/GB for Readzilla (while maintaining low latency). For Logzilla we think that NV-DRAM is a better solution (I reviewed one such solution here), and for Readzilla MLC flash has sufficient performance at much lower cost and ZFS will be able to ensure the longevity.

Farewell to Bryan Cantrill

August 12, 2010

I’ve been expecting this automated mail for a while now, but it was disheartening nonetheless:

List:       dtrace-discuss
Member:     bryan.cantrill@eng.sun.com
Action:     Subscription disabled.
Reason:     Excessive or fatal bounces.

bmc_v_onion — Bryan Cantrill, VP of Engineering at Joyent, earning $15.

As one of the moderators of the DTrace discussion list, I see people subscribe and unsubscribe. Bryan has, of course, left Oracle and joined Joyent to be their VP of engineering.

Bryan is a terrific engineer, and I count myself lucky to have worked with him for the past nine years, first on DTrace and then on Fishworks. He taught me many things, but perhaps most important was his holistic view of engineering that encompasses all aspects of making a product successful including docs, pricing, talks, papers, and, of course, excellent code. Now Bryan is off to cut through the layers of software that make up the cloud. Far from leaving the DTrace community, he’s going to take DTrace to new places and I look forward to seeing the fruits of his labor as he sinks his teeth into a new onion of abstractions.

… and, Robin, Bryan’s certainly a smart guy, but “the smart guy behind Dtrace [sic]”?? Just don’t refer to me and Mike as “the dumb guys behind DTrace” okay?
