The big boys better be flash friendly

An interesting article came up in the news this week. The article, from the ever popular The Register, mentioned three up-and-coming storage stars, Nimble Storage, Tintri and Tegile, and their assault on a flash strategy "blind spot" of the big boys, notably EMC and NetApp.

I have known about Nimble Storage and Tintri for a couple of years now, and I did take some time to read up on their storage technology offerings. Tegile was new to me; it appeared on my radar after SearchStorage.com announced it as the Gold Winner of the enterprise storage category for 2012.

The Register article intrigued me because it implied that traditional storage vendors such as EMC and NetApp are probably applying a "band-aid" when putting together their flash storage strategies. Typically, I see these strategic concepts introduced by these two vendors:

  1. Have a server-side cache strategy by putting a PCIe flash card in the hosting server
  2. Have a network-based all-flash caching area
  3. Have a PCIe-based flash card in the storage system
  4. Have solid state drives (SSDs) in the disk shelf enclosures

In (1), EMC has VFCache (the server-side caching software has since been renamed XtremSW Cache and is being repackaged under the Xtrem brand name) and NetApp has its FlashAccel solution. Previously, as I was informed, FlashAccel was using the FusionIO ioTurbine solution, but just days ago NetApp expanded its FlashAccel solution to include the LSI Nytro WarpDrive as well. The main objective of a server-side caching strategy using flash is to accelerate mostly read-based I/O operations for specific application workloads at the server side.

We are also beginning to see all-flash appliances appearing as a component in the network, as in item (2). EMC calls this a Caching Area Network (CAN), meant to accelerate read and write I/Os downstream from the servers. EMC's Project Thunder will be addressing just that, as seen in an old PPT slide below:

I have not kept in touch with the availability of EMC's Project Thunder, but it is likely that the solution will be out this year. In the same beat, NetApp has quietly acquired CacheIQ to accelerate NAS performance, and it could probably build upon CacheIQ's RapidCache technology to address non-NAS requirements in the future.

In item (3) above, NetApp has been beating its drum for years with the FlashCache solution. Initially known as PAM cards, this storage-side PCIe card accelerates read I/O operations and is part of NetApp's Virtual Storage Tier (VST) strategy.

The last item, item (4), basically introduces SSDs into the disk shelves of NetApp, EMC and the other storage vendors. These are typically used either as an extension of the storage system's memory, as seen in EMC's VNX FastCache technology, or as a Tier 0 level in an automated storage tiering offering, such as EMC FAST2.

But the argument of the article is that both NetApp and EMC are in a quandary. Their storage operating systems and corresponding filesystems are "ancient" from a flash perspective. They were designed and developed for block-based disk devices, to make a set of disks look like one disk to the hosting operating systems or applications. Technically, these storage filesystems are "not optimized" for flash.

NAND Flash is at present the most popular of all the solid state storage technologies. And NAND Flash has a few idiosyncrasies of its own that are probably not anticipated by traditional block-based filesystems. These idiosyncrasies include:

  • Blocks in the flash memory cells have to be explicitly erased before they can be written to. An erase resets every bit in the block to 1, and a subsequent program operation can only flip bits from 1 to 0; so overwriting even a single byte means erasing the entire block first. This Block Erasure behaviour affects the write performance of NAND Flash (see the sketch after this list).
  • Flash memory imposes no seek time, but most block-based filesystems assume disk seek time as part of their design. Random access in flash memory can therefore be affected by the Block Erasure behaviour (see previous point), as well as by the copy-on-write nature of flash-friendly filesystems, which write updates to fresh blocks rather than returning to modify the old blocks in place. I wrote about the effects of copy-on-write filesystems with SSDs more than a year ago.
  • Wear leveling, where filesystem considerations have to be made to address flash memory cells dying from being overwritten at the same location repeatedly. Wear-leveling algorithms include spreading and balancing writes across the mappable surface of the flash memory, keeping additional flash memory as spare blocks, error correction and so on.
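
To make the Block Erasure point concrete, here is a toy sketch in Python (purely my own illustration, with made-up names like NandBlock, and not any vendor's implementation). An erase resets a whole block to 1s, and a program operation can only clear bits to 0, which is why an in-place rewrite forces a whole-block erase:

```python
BLOCK_SIZE = 8  # bytes per block, kept tiny for illustration

class NandBlock:
    def __init__(self):
        self.data = [0xFF] * BLOCK_SIZE   # fresh (erased) block: all bits 1
        self.erase_count = 0

    def erase(self):
        # Block Erasure: the whole block goes back to all 1s at once.
        self.data = [0xFF] * BLOCK_SIZE
        self.erase_count += 1

    def program(self, offset, value):
        # Programming can only flip bits from 1 to 0, never 0 back to 1.
        if value & (~self.data[offset] & 0xFF):
            raise ValueError("cannot set a 0 bit back to 1 without an erase")
        self.data[offset] &= value

block = NandBlock()
block.program(0, 0x0F)            # OK: clears the top four bits (1 -> 0)
try:
    block.program(0, 0xF0)        # refused: would need 0 -> 1 transitions
except ValueError as err:
    print("rewrite refused:", err)
block.erase()                     # erase the whole block first...
block.program(0, 0xF0)            # ...then the new value can be programmed
print("erase count so far:", block.erase_count)   # -> 1
```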

With the rapid rise in the adoption of flash-based storage devices, the article argues that it would be hard for EMC or NetApp to rip out and replace their ("ancient and less flash-friendly") filesystems in their enterprise storage arrays. EMC has quickly addressed this gap by acquiring XtremIO last year, while NetApp announced its flash strategy with the FlashRay architecture back in November 2012.

Side note: I have not spent time studying EMC XtremIO or NetApp FlashRay, although I am happy to have known Shahar Frank, one of the founders of XtremIO, through my blog. Note to self -> do some reading there 😉

As I scour the web for flash-friendly filesystems, all signs point to log-based filesystems. It appears that log-based filesystems have most of the attractive properties of a flash-friendly filesystem. While I do not understand the design of log-based filesystems in depth, they appear to have characteristics that fit the idiosyncrasies of NAND Flash described above.

The log-based filesystem treats all the freely writable storage capacity as a circular log, which has a head and a tail. The filesystem writes forward in a sequential, continuous stream (known as the log) at the head, but must reclaim the used-but-obsolete blocks at the tail of the circular log. This behaviour mimics part of a wear-leveling implementation: updates go to fresh blocks in the flash memory, while block erasure and garbage collection happen at the tail of the log.
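
Below is a minimal sketch of this circular-log behaviour, assuming a fixed pool of blocks and a naive cleaner that copies still-live data forward to the head. The names (CircularLog, _reclaim_tail and so on) are mine for illustration and do not come from any real log-structured filesystem:

```python
from collections import deque

NUM_BLOCKS = 8          # tiny pool of writable blocks, for illustration

class CircularLog:
    def __init__(self):
        self.log = deque()                    # tail at the left, head at the right
        self.free = deque(range(NUM_BLOCKS))  # blocks available for writing
        self.live = {}                        # key -> block holding its latest copy

    def write(self, key, payload):
        while not self.free:
            self._reclaim_tail()              # clean at the tail to make room
        block_no = self.free.popleft()
        self.log.append((block_no, key, payload))  # always append at the head
        self.live[key] = block_no             # older copies of this key are now obsolete

    def _reclaim_tail(self):
        block_no, key, payload = self.log.popleft()
        self.free.append(block_no)            # block can be erased and reused
        if self.live.get(key) == block_no:
            # Still live: copy the data forward to a fresh block at the head.
            new_block = self.free.popleft()
            self.log.append((new_block, key, payload))
            self.live[key] = new_block

log = CircularLog()
for i in range(20):
    log.write("file%d" % (i % 3), "version %d" % i)  # overwrites land on fresh blocks
print(log.live)   # each key points at a recent block near the head
```

Notice that an "overwrite" never touches the old block; the stale copy simply waits at the tail to be reclaimed, which is exactly the write pattern NAND Flash prefers.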

The heavy lifting of wear leveling, error checking, erasing unused or obsolete blocks and garbage collection can be handled by the flash controller. Read I/O operations can be accelerated with flash memory caching mechanisms such as storage-side PCIe cards and in-storage SSDs, leaving the log-based filesystem to concentrate on optimizing sequential writes to the flash storage blocks.
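
As a hedged sketch of one of those controller responsibilities, here is a common wear-leveling idea: always hand out the least-erased free block first, so that erases spread evenly across the flash. This is my own toy allocator, not any controller's actual firmware:

```python
import heapq

class WearLevelAllocator:
    """Hand out the least-worn free block first, so erases spread evenly."""

    def __init__(self, num_blocks):
        # Min-heap of (erase_count, block_no): lowest erase count pops first.
        self.free = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.free)

    def allocate(self):
        erase_count, block_no = heapq.heappop(self.free)
        return block_no, erase_count

    def release(self, block_no, erase_count):
        # The block is erased before reuse, so its erase count goes up by one.
        heapq.heappush(self.free, (erase_count + 1, block_no))

alloc = WearLevelAllocator(4)
wear = {b: 0 for b in range(4)}
for _ in range(100):
    block, count = alloc.allocate()
    alloc.release(block, count)
    wear[block] = count + 1
print(wear)   # wear is spread evenly: {0: 25, 1: 25, 2: 25, 3: 25}
```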

Linux is already building a flash-friendly kernel with the implementation of a Flash Translation Layer (FTL) to address the idiosyncrasies of flash memory. Here’s a diagram I found from Usenix that shows the FTL implementation:
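
In principle, an FTL maps the logical block addresses the filesystem sees onto physical flash pages, redirecting every overwrite to a fresh page so that everything above it still sees an ordinary block device. Here is a simplified model of that idea (my own illustration, not the Linux kernel's actual code):

```python
class SimpleFTL:
    """Toy Flash Translation Layer: logical block address -> physical page."""

    def __init__(self, num_pages):
        self.mapping = {}       # LBA -> physical page number
        self.pages = {}         # physical page -> data
        self.stale = set()      # pages holding obsolete data, awaiting GC
        self.next_free = 0
        self.num_pages = num_pages

    def write(self, lba, data):
        if self.next_free >= self.num_pages:
            raise RuntimeError("device full: garbage collection needed")
        if lba in self.mapping:
            self.stale.add(self.mapping[lba])  # old copy becomes stale, not rewritten
        self.pages[self.next_free] = data      # out-of-place write to a fresh page
        self.mapping[lba] = self.next_free
        self.next_free += 1

    def read(self, lba):
        return self.pages[self.mapping[lba]]

ftl = SimpleFTL(num_pages=8)
ftl.write(0, "a")                 # LBA 0 -> page 0
ftl.write(0, "b")                 # "overwrite": LBA 0 -> page 1, page 0 marked stale
ftl.write(1, "c")                 # LBA 1 -> page 2
print(ftl.read(0))                # -> "b": the host sees a normal block device
print("stale pages:", ftl.stale)  # -> {0}: to be erased by garbage collection later
```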

There is no doubt that flash-based storage will continue to grow, and flash-friendly filesystems should not be too far behind. And the big boys are definitely not going to let the new boys like Nimble Storage, Tintri and Tegile eat into their flash storage pie!

NOTE: My understanding of log-based filesystems is based on my own comprehension and interpretation. I do not claim to be an expert in this field, and this is an opportunity for me to learn and share.
