Adam Leventhal's blog

APFS in Detail: Encryption, Snapshots, and Backup

June 19, 2016

This series of posts covers APFS, Apple’s new filesystem announced at WWDC 2016. See the first post for the table of contents.

Encryption

Encryption is clearly a core feature of APFS. This comes from diverse requirements from the various devices, for example multiple keys within file systems on the iPhone or per-user keys on laptops. I heard the term “innovative” quite a bit at WWDC, but here the term is aptly applied to APFS. It supports several different encryption choices for a file system:

Unencrypted
Single-key for metadata and user data
Multi-key with different choices for metadata, files, and even sections of a file (“extents”)

Multi-key encryption is particularly relevant for portables where all data might be encrypted, but unlocking your phone provides access to an additional key and therefore additional data. Unfortunately this doesn’t seem to be working in the first beta of macOS Sierra (specifying fileEncryption when creating a new volume with diskutil results in a file system that reports “Is Encrypted” as “No”).

Related to encryption, I noticed an undocumented feature while playing around with diskutil (which prompts you for interactive confirmation of the destructive power of APFS unless this is added to the command-line: -IHaveBeenWarnedThatAPFSIsPreReleaseAndThatIMayLoseData; I’m not making this up). APFS (apparently) supports constant time cryptographic file system erase, called “effaceable” in the diskutil output. This presumably builds a secret key that cannot be extracted from APFS and encrypts the file system with it. A secure erase then need only delete the key rather than needing to scramble and re-scramble the full disk to ensure total eradication. Various iOS docs refer to this capability requiring some specialized hardware; it will be interesting to see what the option means on macOS. Either way, let’s not mention this to the FBI or NSA, agreed?
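To make the idea of constant-time erase concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the `EffaceableVolume` class and `keystream_xor` helper are hypothetical names, and the SHA-256 keystream stands in for a real, vetted cipher. It says nothing about how APFS implements the feature; it only shows why discarding a key can substitute for overwriting the data.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || offset) blocks.

    Illustration only; a real file system would use a vetted cipher.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class EffaceableVolume:
    """Hypothetical volume whose contents are readable only with its key."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # generated here, never exported
        self._ciphertext = b""

    def write(self, plaintext: bytes) -> None:
        self._ciphertext = keystream_xor(self._key, plaintext)

    def read(self) -> bytes:
        if self._key is None:
            raise PermissionError("volume has been erased")
        return keystream_xor(self._key, self._ciphertext)

    def secure_erase(self) -> None:
        # Work is independent of volume size: drop the key and
        # leave the (now undecryptable) ciphertext where it is.
        self._key = None


volume = EffaceableVolume()
volume.write(b"family photos " * 1000)
volume.secure_erase()  # the ciphertext remains, but nothing can decrypt it
```

The erase touches 32 bytes regardless of how much data the volume holds, which is the whole appeal compared with scrubbing every block.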

Snapshots and Backup

APFS brings a much-desired file system feature: snapshots. A snapshot lets you freeze the state of a file system at a particular moment and continue to use and modify that file system while preserving the old data. It does so in a space-efficient fashion where, effectively, changes are tracked and only new data takes up additional space. This has the potential to be extremely valuable for backup by efficiently tracking the data that has changed since the last backup.
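A toy model may help show why snapshots are cheap. The sketch below is a hypothetical `ToyFileSystem` in Python, not a description of APFS internals: it assumes whole-file blocks and a simple name-to-block map, and it illustrates the general copy-on-write idea that a snapshot copies only the map while new writes land in new blocks.

```python
from typing import Optional


class ToyFileSystem:
    """Hypothetical block store where snapshots share unchanged blocks."""

    def __init__(self) -> None:
        self.blocks = {}      # block id -> contents
        self.live = {}        # file name -> block id
        self.snapshots = {}   # snapshot label -> frozen name -> block id map
        self._next_id = 0

    def write(self, name: str, data: bytes) -> None:
        # Never overwrite in place: new data always gets a new block.
        self.blocks[self._next_id] = data
        self.live[name] = self._next_id
        self._next_id += 1
        self._free_unreferenced()

    def snapshot(self, label: str) -> None:
        # Copies only the name -> block map, not the data itself.
        self.snapshots[label] = dict(self.live)

    def read(self, name: str, snapshot: Optional[str] = None) -> bytes:
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]

    def _free_unreferenced(self) -> None:
        # A block survives as long as the live view or any snapshot uses it.
        in_use = set(self.live.values())
        for table in self.snapshots.values():
            in_use.update(table.values())
        for block_id in list(self.blocks):
            if block_id not in in_use:
                del self.blocks[block_id]


fs = ToyFileSystem()
fs.write("a.txt", b"one")
fs.write("b.txt", b"two")
fs.snapshot("monday")
fs.write("a.txt", b"ONE")  # only this change consumes a new block
```

After the last write the store holds three blocks rather than four: `b.txt` is shared between the live view and the `monday` snapshot, and only the changed file costs extra space. A backup tool could, in principle, compare two such maps to find exactly what changed.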

ZFS includes snapshots and serialization mechanisms that make it efficient to back up file systems or transfer file systems to a remote location. Will APFS work like that? Probably not, answered Dominic Giampaolo, APFS lead developer. ZFS sends all changed data while Time Machine can have exclusion lists and the like. That seems surmountable, but we’ll see what Apple does. APFS right now is incompatible with Time Machine because it lacks directory hard links, the fairly disgusting mechanism that Time Machine relies on and that likely contributes to its questionable reliability. Hopefully APFS will create some efficient serialization for Time Machine backup.

While Eric Tamura, APFS dev manager, demonstrated snapshots at WWDC, the required utilities aren’t included in the macOS Sierra beta. I used DTrace (technology I’m increasingly amazed that Apple ported from OpenSolaris) to find a tantalizingly-named new system call fs_snapshot; I’ll leave it to others to reverse engineer its proper use.

Management

APFS brings another new feature known as space sharing. A single APFS “container” that spans a device can have multiple “volumes” (file systems) within it. Apple contrasts this with the static allocation of disk space to support multiple HFS+ instances, which seems both specious and an uncommon use case. Both ZFS and btrfs have a similar concept of a shared pool of storage with nested file systems for administration and management.
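The contrast with static allocation can be sketched in a few lines of Python. The `Container` and `Volume` classes below are hypothetical and ignore everything real about APFS on-disk layout; they only model the accounting, where every volume draws from one shared pool of free blocks instead of a fixed partition.

```python
class Container:
    """Hypothetical pool of blocks shared by every volume inside it."""

    def __init__(self, total_blocks: int) -> None:
        self.free_blocks = total_blocks
        self.volumes = {}

    def add_volume(self, name: str) -> "Volume":
        volume = Volume(name, self)
        self.volumes[name] = volume
        return volume


class Volume:
    """Hypothetical file system that allocates from its container's pool."""

    def __init__(self, name: str, container: Container) -> None:
        self.name = name
        self.container = container
        self.used_blocks = 0

    def allocate(self, blocks: int) -> None:
        if blocks > self.container.free_blocks:
            raise OSError(f"container is full; {self.name} cannot grow")
        self.container.free_blocks -= blocks
        self.used_blocks += blocks


container = Container(total_blocks=100)
system = container.add_volume("system")
vm = container.add_volume("vm")
system.allocate(70)  # would not fit in a fixed 50-block partition
vm.allocate(20)
```

With two static 50-block partitions the first allocation would fail even though half the disk sat empty; with a shared pool, a volume is limited only by what the whole container has left.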

Speaking with Dominic and other members of the APFS team, we discussed how volumes are the unit by which users can control things like snapshots and encryption. You’d want multiple volumes to correspond with different policies around those settings. For example, while you might want to snapshot and back up your system each day, the massive /private/var/vm/sleepimage (for saving memory when hibernating) should live on its own and not be backed up.

Space sharing is more like an operational detail than a game-changing feature. You can think of it like special folders with snapshot and encryption controls… which is probably why Apple’s marketing department has yet to make me a job offer. ~~Unfortunately this feature isn’t working in the macOS Sierra beta, so I wasn’t able to have more than one volume per container.~~ Adding new volumes can fail with an opaque error (-69625 mean anything to you?), but using a larger disk image resolves the problem.

Next in this series: Space Efficiency and Clones

APFS in Detail: Overview

June 19, 2016

Apple announced a new file system that will make its way into all of its OS variants (macOS, tvOS, iOS, watchOS) in the coming years. Media coverage to this point has been mostly breathless elongations of Apple’s developer documentation. With a dearth of detail I decided to attend the presentation and Q&A with the APFS team at WWDC. Dominic Giampaolo and Eric Tamura, two members of the APFS team, gave an overview to a packed room; along with other members of the team, they patiently answered questions later in the day. With those data points and some first hand usage I wanted to provide an overview and analysis both as a user of Apple-ecosystem products and as a long-time operating system and file system developer.

I’ve divided my review into several sections that span a few posts. I’d encourage you to jump around to topics of interest or skip right to the conclusion (or to the tweet summary). Highest praise goes to encryption; ire to data integrity.

Basics

APFS, the Apple File System, was itself started in 2014 with Dominic as its lead engineer. It’s a stand-alone, from-scratch implementation (an earlier version of this post noted a dependency on Core Storage, but Dominic set me straight). I asked him about looking for inspiration in other modern file systems such as BSD’s HAMMER, Linux’s btrfs, or OpenZFS (Solaris, illumos, FreeBSD, Mac OS X, Ubuntu Linux, etc.), all of which have features similar to what APFS intends to deliver. (And note that Apple built a fairly complete port of ZFS, though Dominic was apparently not part of the group advocating for it.) Dominic explained that, as a self-described file system guy (he built the file system in BeOS, unfairly relegated to obscurity when Apple opted to purchase NeXTSTEP instead), he was aware of them but didn’t delve too deeply for fear, he said, of tainting himself.

Dominic praised the APFS testing team as being exemplary. This is absolutely critical. A common adage is that it takes a decade to mature a file system. And my experience with ZFS more or less confirms this. Apple will be delivering APFS broadly with 3-4 years of development so will need to accelerate quickly to maturity.

Paying Down Debt

HFS was introduced in 1985 when the Mac 512K (of memory! Holy smokes!) was Apple’s flagship. HFS+, a significant iteration, shipped in 1998 on the G3 PowerMacs with 4GB hard drives. Since then storage capacities have increased by factors of 1,000,000 and 1,000 respectively. HFS+ has been pulled in a bunch of competing directions with different forks for different devices (e.g. the iOS team created their own HFS variant, working so covertly that not even the Mac OS team knew) and different features (e.g. journaling, case insensitive). It’s old; it’s a mess; and, critically, it’s missing a bunch of features that are really considered the basic cost of doing business for most operating systems. Wikipedia lists nanosecond timestamps, checksums, snapshots, and sparse file support among those missing features. Add to that the obvious gap of large device support and you’ve got a big chunk of the APFS feature list.

APFS first and foremost pays down the unsustainable technical debt that Apple has been carrying in HFS+. (In 2001 ZFS grew from a similar need where UFS had been evolved since 1977.) It unifies the multifarious forks. It introduces the expected features. In general it first brings the derelict building up to code.

Compression is an obvious gap in the APFS feature list that is common in many file systems. It’s conceptually quite easy, I told the development team (we had it in ZFS from the outset), so why not include it? To appeal to Dominic’s BeOS nostalgia I even recalled my job interview with Be in 2000 when they talked about how compression actually improved overall performance since data I/O is far more expensive than computation (obvious now, but novel then). The Apple folks agreed, and—in typical Apple fashion—neither confirmed nor denied while strongly implying that it’s definitely a feature we can expect in APFS. I’ll be surprised if compression isn’t included in its public launch.
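The claim that compression can speed things up is easy to check with back-of-the-envelope arithmetic. The Python sketch below uses a hypothetical `read_seconds` helper and made-up numbers (a 100 MB/s disk, a 2:1 compression ratio, and a decompressor that handles 1,000 MB/s); none of these figures come from Apple, Be, or any measurement.

```python
def read_seconds(size_mb, disk_mb_per_s, ratio=1.0, decompress_mb_per_s=None):
    """Time to read size_mb of logical data stored at a given compression ratio."""
    io_time = (size_mb / ratio) / disk_mb_per_s  # fewer bytes come off the disk
    if decompress_mb_per_s is None:
        cpu_time = 0.0                           # stored uncompressed
    else:
        cpu_time = size_mb / decompress_mb_per_s
    return io_time + cpu_time


# Hypothetical figures chosen only to illustrate the trade-off.
plain = read_seconds(1000, disk_mb_per_s=100)
packed = read_seconds(1000, disk_mb_per_s=100, ratio=2.0, decompress_mb_per_s=1000)
print(plain, packed)  # 10.0 6.0
```

Under these assumptions the compressed read finishes in 6 seconds instead of 10: halving the I/O saves 5 seconds and decompression costs only 1. The advantage shrinks as storage gets faster relative to the CPU, so the real answer depends on the device.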

Next in this series: Encryption, Snapshots, and Backup

ZFS: Apple's New Filesystem That Wasn't

June 15, 2016

Prologue (2006)

I attended my first WWDC in 2006 to participate in Apple’s launch of their DTrace port to the next version of Mac OS X (Leopard). Apple completed all but the fiddliest finishing touches without help from the DTrace team. Even when they did meet with us we had no idea that they were mere weeks away from the finished product being announced to the world. It was a testament both to Apple’s engineering acumen as well as their storied secrecy.

At that same WWDC Apple announced Time Machine, a product that would record file system versions through time for backup and recovery. How were they doing this? We were energized by the idea that there might be another piece of adopted Solaris technology. When we launched Solaris 10, DTrace shared the marquee with ZFS, a new filesystem that was to become the standard against which other filesystems are compared. Key among the many features of ZFS were snapshots that made it simple to capture the state of a filesystem, send the changes around, recover data, etc. Time Machine looked for all the world like a GUI on ZFS (indeed the GUI that we had imagined but knew to be well beyond the capabilities of Sun).

Of course Time Machine had nothing to do with ZFS. After the keynote we rushed to an Apple engineer we knew. With shame in his voice he admitted that it was really just a bunch of hard links to directories. For those who don’t know a symlink from a symtab this is the moral equivalent of using newspaper as insulation: it’s fine until the completely anticipated calamity destroys everything you hold dear.

So there was no ZFS in Mac OS X, at least not yet.

Not So Fast (2007)

A few weeks before WWDC 2007 nerds like me started to lose their minds: Apple really was going to port ZFS to Mac OS X. It was actually going to happen! Beyond the snapshots that would make backing up a cinch, ZFS would dramatically advance the state of data storage for Apple users. HFS was introduced in System 2.1 (“System” being what we called “Mac OS” in the days before operating systems gained their broad and ubiquitous sex appeal). HFS improved upon the Macintosh File System by adding—wait for it—hierarchy! No longer would files accumulate in a single pile; you could organize them in folders. Not that there were many to organize on those 400K floppies, but progress is progress. And that filesystem has limped along for more than 30 years, nudged forward, rewritten to avoid in-kernel Pascal code (though retaining Pascal-style, length-prefixed strings), but never reimagined or reinvented. Even in its most modern form, HFS lacks the most basic functionality around data integrity. Bugs, power failures, and expected and inevitable media failures all mean that data is silently altered. Pray that your old photos are still intact. When’s the last time you backed up your Mac? I’m backing up right now just like I do every time I think about the neglectful stewardship of HFS.

ZFS was to bring to Mac OS X data integrity, compression, checksums, redundancy, snapshots, etc., etc., etc. But while energizing Mac/ZFS fans, Sun CEO Jonathan Schwartz had clumsily disrupted the momentum that ZFS had been gathering in Apple’s walled garden. Apple had been working on a port of ZFS to Mac OS X. They were planning on mentioning it at the upcoming WWDC. Jonathan, brought into the loop either out of courtesy or legal necessity, violated the cardinal rule of the Steve Jobs-era Apple. Only one person at Steve Jobs’s company announces new products: Steve Jobs. “In fact, this week you’ll see that Apple is announcing at their Worldwide Developer Conference that ZFS has become the file system in Mac OS 10,” mused Jonathan at a press event, apparently to bolster Sun’s own credibility.

Less than a week later, Apple spoke about ZFS only when it became clear that a port was indeed present in a developer version of Leopard albeit in a nascent form. Yes, ZFS would be there, sort of, but it would be read-only and no one should get their hopes up.

Ray of Hope (2008)

By the next WWDC it seemed that Sun had been forgiven. ZFS was featured in the keynotes, it was on the developer disc handed out to attendees, and it was even mentioned on the Mac OS X Server website. Apple had been working on their port since 2006 and now it was functional enough to be put on full display. I took it for a spin myself; it was really real. The feature that everyone wanted (but most couldn’t say why) was coming!

The Little Engine That Couldn’t (2009)

By the time Snow Leopard shipped only a careful examination of the Apple web site would turn up the odd reference to ZFS left unscrubbed. Whatever momentum ZFS had enjoyed within the Mac OS X product team was gone. I’ve heard a couple of theories and anecdotes from people familiar with the situation; first some relevant background.

Sun was dying. After failed love affairs with IBM and HP (the latter formed, according to former Sun CEO, Scott McNealy, by two garbage trucks colliding), Oracle scooped up the aging dame with dim prospects. The nearly yearlong process of closing the acquisition was particularly hard on Sun, creating uncertainty around its future and damaging its bottom line. Despite the well-documented personal friendship between Steve Jobs and Oracle CEO, Larry Ellison (more on this later), I’m sure this uncertainty had some impact on the decision to continue with ZFS.

In the meantime Sun and NetApp had been locked in a lawsuit over ZFS and other storage technologies since mid-2007. While Jonathan Schwartz had blogged about protecting Apple and its users (as well as Sun customers of course), this likely led to further uncertainty. On top of that, filesystem transitions are far from simple. When Apple included DTrace in Mac OS X a point in favor was that it could be yanked out should any sort of legal issue arise. Once user data hit ZFS it would take years to fully reverse the decision. While the NetApp lawsuit never seemed to have merit (ZFS uses unique and from-scratch mechanisms for snapshots), it indisputably represented risk for Apple.

Finally, and perhaps most significantly, personal egos and NIH (not invented here) syndrome certainly played a part. I’m told by folks in Apple at the time that certain leads and managers preferred to build their own rather than adopting external technology, even technology that was best of breed. They pitched their own project, an Apple project, that would bring modern filesystem technologies to Mac OS X. The design center for ZFS was servers, not laptops (and certainly not phones, tablets, and watches); their argument was likely that it would be better to start from scratch than adapt ZFS. Combined with the uncertainty above and, I’m told, no shortage of political savvy, their arguments carried the day. Licensing FUD was thrown into the mix; even today folks at Apple see the ZFS license as nefarious and toxic in some way whereas the DTrace license works just fine for them. Note that both use the same license with the same grants and same restrictions. Maybe the technical arguments really were overwhelming (note however that ZFS was working internally on the iPhone), and maybe the risks really were insurmountable. I obviously have my own opinions, and think this was a great missed opportunity for the industry, but I never had the burden of weighing the totality of the facts and deciding. Nevertheless Apple put an end to its ZFS work; Apple’s from-scratch filesystem efforts were underway.

The Little Engine That Still Couldn’t (2010)

Amazingly that wasn’t quite the end for ZFS at Apple. The architect for ZFS at Apple had left, the project had been shelved, but there were high-level conversations between Sun and Apple about reviving the port. Apple would get indemnification and support for their use of ZFS. Sun would get access to the Apple File Protocol (AFP—which, ironically, seems to have been collateral damage with the new APFS), and, more critically, Sun’s new ZFS-based storage appliance (which I helped develop) would be a natural server and backup agent for millions of Apple devices. It seemed to make some sort of sense.

The excruciatingly, debilitatingly slow acquisition of Sun finally closed. The Apple-ZFS deal was brought for Larry Ellison’s approval, the firstborn child of the conquered land brought to be blessed by the new king. “I’ll tell you about doing business with my best friend Steve Jobs,” he apparently said, “I don’t do business with my best friend Steve Jobs.”

(Amusingly the version of the story told quietly at WWDC 2016 had the friends reversed with Steve saying that he wouldn’t do business with Larry. Still another version I’ve heard calls into question the veracity of their purported friendship, and has Steve instead suggesting that Larry go f*ck himself. Normally the iconoclast, that would, if true, represent Steve’s most mainstream opinion.)

And that was the end.

Epilogue (2016)

In the 7 years since ZFS development halted at Apple, they’ve worked on a variety of improvements in HFS and Core Storage, and hacked at at least two replacements for HFS that didn’t make it out the door. This week Apple announced their new filesystem, APFS, after 2 years in development. It’s not done; some features are still in development, and they’ve announced the ambitious goal of rolling it out to laptop, phone, watch, and tv within the next 18 months. At Sun we started ZFS in 2001. It shipped in 2005 and that was really the starting line, not the finish line. Since then I’ve shipped the ZFS Storage Appliance in 2008 and Delphix in 2010 and each has required investment in ZFS / OpenZFS to make them ready for prime time. A broadly featured, highly functional filesystem takes a long time.

APFS has merits (more in my next post), but it will always disappoint me that Apple didn’t adopt ZFS irrespective of how and why that decision was made. Dedicated members of the OpenZFS community have built and maintain a port. It’s not quite the same as having Apple as a member of that community, embracing and extending ZFS rather than building their own incipient alternative.

Finding What's Next

May 13, 2016

After nearly nine years at Sun and then six at Delphix I’m looking for the next technology, team, and market to dive into. I’ve had the extremely good fortune of working with three groups (the DTrace team, Fishworks at Sun, and Delphix) that featured top-tier technologists working on differentiated products, each of them a wonderful place and time. Most recently, as CTO of Delphix, I grew the engineering team from a tiny seed, and was fortunate enough to be joined by so many people from my past including some of my best friends and long-term colleagues. I’m forever grateful to the team and to the founder, Jed Yueh, who convinced me to join, knowing better than I the team, company, and products we could build.

CTO is a full-time job; there weren’t many spare cycles to contemplate much beyond the exigencies of the business. So I’m excited to join Sutter Hill Ventures as an Entrepreneur in Residence, an amazing vantage point to find the next thing.

EIR

There doesn’t seem to be a single, clear answer about the job of an EIR. In general though it seems to involve assessing companies and markets, and exploring new possibilities. Unsurprisingly, the Wikipedia page for EIR lists “Artist in Residence” under the “See Also” section: the specific deliverables are of lesser emphasis than the collaboration and community of investors, entrepreneurs and technologists. My focus is to evaluate a variety of technologies (and there’s certainly no shortage), find patterns of business pain, and to land on a problem that excites me (and can excite others) to charge headlong into a new venture.

Sutter Hill

I’m particularly excited to be at Sutter Hill working with Mike Speiser. Sutter has produced companies I admire such as Pure Storage, Snowflake, and SiFive. Mike has a great understanding of what it takes to build enterprise and systems software where I’ve spent my career, and he has a remarkably compelling and sanguine view of company building. At Sutter I’ve found a great, diverse set of technologists from product, to tech, recruiting, and design. I’ve never learned as much or been as impressed as in these initial days at Sutter.

As much as I’m enjoying Sutter Hill I hope to be on my way to the next thing—soon, but not too soon. For now though I’ve been thoroughly enjoying my new colleagues and unique vantage point. You can reach me at ahl at shv dot com.

Big News for ZFS on Linux

March 7, 2016

Canonical announced a few weeks ago that ZFS will be included in the next release of Ubuntu Linux, on by default and fully supported. And it’s no exaggeration when Dustin Kirkland describes ZFS as “one of the most exciting new features Linux has seen in a very long time.” In the words of our 47th Vice President, this is a big F—ing deal. Ubuntu is an extremely popular Linux distribution, particularly so for servers, and while the Linux ecosystem doesn’t want for variety when it comes to filesystem choices, there is not a clear champion when it comes to stability, functionality, and performance. By throwing their full weight behind ZFS, Canonical brings the Linux community an enterprise-class, modern filesystem, stable, but still under active development, designed to perform well for a variety of workloads.

What’s ZFS?

ZFS was originally developed at Sun Microsystems for the Solaris operating system. Some of the most demanding production environments have depended on ZFS for well over a decade. At its core are the principles of data integrity, ease of use, and performance.[1] Most notably, ZFS has first-class support for arbitrary numbers of snapshots and writable clones, serialization for replication, compression, and data repair. I’ve contributed code to ZFS at Sun, then Oracle, and to OpenZFS after Oracle abandoned the project in 2010. I’ve also built two products around ZFS: the ZFS Storage Appliance, a NAS box, and Delphix, a copy data virtual appliance.

Why ZFS?

While the distinguishing features of ZFS are broadly useful, they have become specifically relevant in a containerizing world. Users need to save, clone, and replicate containers at will; ZFS provides key facilities for doing so. Containers and ZFS are a fantastic match, something I’ve seen my friends at Joyent demonstrate decisively for the past decade. Ubuntu has selected the most capable technology for our modern computing ecosystem.

No good deed…

So high fives and bro hugs all around, right? Not quite. Enter the licensing boogie man. The Linux kernel is licensed under the GPL v2; OpenZFS is licensed under the CDDL. Both are open source, true, but some contend that they are incompatible. Most folks in the tech world—myself among them—spend somewhere in the vicinity of no time at all considering the topic. Far from ignoring it, Canonical had their lawyers review the licenses and deemed their use of Linux and OpenZFS to be in compliance. I’m not a lawyer; I don’t have an informed opinion. But there are those who vehemently disagree with Canonical. Notably the Software Freedom Conservancy whose mission is to “promote, improve, develop, and defend Free, Libre, and Open Source Software (FLOSS) projects” has posted a lengthy wag of the finger at Canonical. Note that none of this has been specifically tested in the courts so it’s currently just a theoretical disagreement between lawyers (and in many cases, people who engage in lawyerly cosplay).

The wisdom of the crowd has proposed a couple of solutions:

“What if we ask Oracle super nicely?”

Oracle holds a copyright on most of OpenZFS since it was forked from the original ZFS code base. It would be within their rights to decide to relicense ZFS under the GPL. Problem solved! No way and not quite. Starting with the easier problem there are many other copyright holders in OpenZFS. It’s not an impossibly large list, but why would they bother? What benefits would they reap when even goodwill isn’t noticeably on offer? And it is the height of delusion to think that Oracle would grow ears to hear, a heart to care, and a brain to decide. Oracle explicitly backed away from OpenSolaris, shutting down the project in 2010. They do not want to encourage open source use of its component technologies. While open source is arguably the most significant shift in technology over the last decade (Stephen O’Grady’s The Software Paradox is a must-read), large companies and startups continue to be terrified, confused, and irrational when it comes to open source. Oracle ain’t coming to help.[2]

“Let’s re-write it! How hard could it be?”

It’s hard to dignify this with an explanation, but OpenZFS is the product of 100s of person-years. It contains some of the most sophisticated mechanisms I’ve seen designed, by some of the world’s most capable engineers. Re-writing it would probably be no easier than writing it the first time. By way of commentary, this is what makes NIH so distressing. Too often technologies are copied poorly instead of being used and improved, or understood and replaced with something truly superior.

Now what?

Now that you understand a bit of the context here’s my suggestion: consider the licenses, but focus on the technology. Canonical has (one would presume) chosen to include OpenZFS because it offers the best solution to Ubuntu users. Containers and ZFS are highly complementary with further room to grow together. As with anything, evaluate the technology, evaluate the risks, and move on. Ignore pedants who would deride your pragmatic use of technology as heretical or immoral.

I personally could not be more excited by the announcement. The Ubuntu community is going to have built-in support for a filesystem that’s better and more capable than anything they’ve had in the past. The OpenZFS community is going to have a ton more users, more interest, and more drivers for innovation. Both are going to be stronger together.

[1] “ZFS crashed on me once!” Me too, more than once. “ZFS was slow for me!” That happens. “[some other Linux filesystem] is better!” Could be, but I doubt it. I’m not denying the events, but this kind of Inhofian logic doesn’t nudge ZFS from its perch.

[2] In 2011 the source code for Oracle’s new Solaris 11 operating system appeared on the web replete with CDDL (open source) license notification. For all appearances this looked like open source code, a new version of OpenSolaris. The community asked for clarification: were these stolen goods or something given away intentionally? Was Solaris 11 free and open? Even then Oracle declined to comment.