Xtreme future?

EMC's acquisition of XtremIO sent shockwaves across the industry. The acquisition, reportedly costing EMC USD$430 million, has been covered by several news outlets.

The news of EMC’s would-be acquisition was an open secret a few weeks ago, and rumour has it that NetApp was eyeing XtremIO as well. It looks like EMC has beaten NetApp to it yet again.

The interesting part was, of course, the price. USD$430 million is a very high price to pay for a stealthy, 2-year-old company which has had 2 rounds of funding totaling USD$25 million. Why such a large amount?

XtremIO has a talented team of engineers, the notable ones being Yaron Segev and Shahar Frank. They have their background in InfiniBand, and Shahar Frank was the chief architect of Exanet scale-out NAS (which was acquired by Dell). However, as quoted by 451Group, XtremIO is building an all-flash SAN array that “provides consistently high performance, high levels of flash endurance, and advanced functionality around thin provisioning, de-dupe and space-efficient snapshots”.

Furthermore, XtremIO has developed a real-time inline deduplication engine that does not degrade performance. It does this by spreading the write I/Os over the entire array. There is little information about this deduplication engine, but I bet XtremIO has developed a real-time, inherent deduplication file system that spreads all the I/Os to balance the wear-leveling as well as to scale performance. I bet XtremIO will dedupe everything that it stores, and that this deduplication file system is a B+ tree, copy-on-write file system with a super-duper efficient hashing algorithm for address mapping (pointers). Ok, ok, I am getting carried away here, because it is likely that I will be wrong, but I can imagine, can’t I?
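To make that guess a little more concrete, here is a minimal Python sketch of hash-based inline deduplication. It is purely illustrative and says nothing about what XtremIO actually built: the `DedupStore` class, its field names, the use of SHA-256 as the fingerprint and the modulo placement across `num_nodes` are all my own assumptions. The idea is simply that the content hash serves both as the dedupe key and as the placement key, so duplicate blocks cost no extra flash writes and unique blocks scatter across the array.

```python
import hashlib


class DedupStore:
    """Toy content-addressed block store (hypothetical, for illustration only)."""

    def __init__(self, num_nodes=4):
        self.num_nodes = num_nodes
        self.blocks = {}          # fingerprint -> {"node", "data", "refs"}
        self.volume_map = {}      # logical block address -> fingerprint
        self.physical_writes = 0  # writes that actually reach the media

    def node_for(self, fingerprint):
        # Placement comes from the content hash, not the logical address,
        # so unique blocks spread roughly evenly across the nodes.
        return int(fingerprint[:8], 16) % self.num_nodes

    def write(self, lba, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        previous = self.volume_map.get(lba)
        if previous == fingerprint:
            return  # same content already mapped at this address
        if fingerprint in self.blocks:
            # Duplicate content: only metadata changes, no media write.
            self.blocks[fingerprint]["refs"] += 1
        else:
            self.blocks[fingerprint] = {
                "node": self.node_for(fingerprint),
                "data": data,
                "refs": 1,
            }
            self.physical_writes += 1
        self.volume_map[lba] = fingerprint
        if previous is not None:
            self._release(previous)

    def _release(self, fingerprint):
        # Drop one reference and reclaim the block when nobody points at it.
        entry = self.blocks[fingerprint]
        entry["refs"] -= 1
        if entry["refs"] == 0:
            del self.blocks[fingerprint]

    def read(self, lba):
        return self.blocks[self.volume_map[lba]]["data"]


store = DedupStore()
for lba in range(3):
    store.write(lba, b"A" * 4096)   # three logical writes, identical content
store.write(3, b"B" * 4096)
print(store.physical_writes)        # 2
```

Four logical writes become two physical ones here. A real array would also have to handle hash collisions, persistence and metadata consistency, none of which this sketch attempts.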

I am reading the XtremIO whitepaper “Flash Implications in Enterprise Storage Array Designs”. I am writing this as I read; therefore, the comments that I have written before this paragraph (especially the deduplication file system in the previous paragraph) have no bearing on what I am writing from now on.

The whitepaper doesn’t give away what the XtremIO technology is. In fact, there’s nothing about XtremIO’s super secretive secret sauce. The whitepaper paints scenarios of the common features (and problems) of enterprise storage systems, and the issues that confront us as storage professionals as we explain our storage technology to potential customers.

I like that whitepaper a lot, because it pricks our conscience about speaking honestly of the pros and cons, the rights and the wrongs.

When NetApp tells you how great primary deduplication is with capacity savings, did they tell you about the performance hit during the deduplication post-processing? When EMC said that their Copy-on-First-Access (COFA) snapshots are great, did they mention how they had to do 2 x Write I/Os in order to complete the snapshot creation? When NetApp touts their Copy-on-Write snapshots (btw, NetApp’s snapshots should be called Redirect-On-Write), did they explain that their snapshots would result in subsequent fragmented I/Os for reads?
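The trade-off between the two snapshot styles is easy to see in a toy model. The Python sketch below is a generic illustration under my own assumptions, not a model of any EMC or NetApp product; the class names and the `writes` counter are invented for this example. The copy-on-write volume pays a second write the first time a block is overwritten after a snapshot, while the redirect-on-write volume pays one write but leaves the live volume's blocks scattered.

```python
class CopyOnWriteVolume:
    """First overwrite after a snapshot copies the old block aside (hypothetical)."""

    def __init__(self, nblocks):
        self.live = [b""] * nblocks
        self.snapshot = None   # lba -> preserved old data, once a snapshot exists
        self.writes = 0        # physical write I/Os issued

    def take_snapshot(self):
        self.snapshot = {}

    def write(self, lba, data):
        if self.snapshot is not None and lba not in self.snapshot:
            self.snapshot[lba] = self.live[lba]  # extra write: save old data
            self.writes += 1
        self.live[lba] = data                    # the write the host asked for
        self.writes += 1

    def read_snapshot(self, lba):
        return self.snapshot.get(lba, self.live[lba])


class RedirectOnWriteVolume:
    """New data goes to a fresh location; the snapshot keeps the old pointer."""

    def __init__(self, nblocks):
        self.store = [b""] * nblocks          # physical blocks
        self.live_map = list(range(nblocks))  # lba -> physical block
        self.snap_map = None
        self.writes = 0

    def take_snapshot(self):
        self.snap_map = list(self.live_map)

    def write(self, lba, data):
        if self.snap_map is not None and self.snap_map[lba] == self.live_map[lba]:
            self.store.append(data)           # redirect to a new block
            self.live_map[lba] = len(self.store) - 1
        else:
            self.store[self.live_map[lba]] = data
        self.writes += 1                      # always exactly one write

    def read_snapshot(self, lba):
        mapping = self.snap_map if self.snap_map is not None else self.live_map
        return self.store[mapping[lba]]


cow, row = CopyOnWriteVolume(4), RedirectOnWriteVolume(4)
for vol in (cow, row):
    for lba in range(4):
        vol.write(lba, b"old")
    vol.take_snapshot()
    vol.write(1, b"new")
    vol.write(2, b"new")

print(cow.writes, row.writes)  # 8 6
print(row.live_map)            # [0, 4, 5, 3]
```

The two overwrites cost four write I/Os in the copy-on-write model and two in the redirect-on-write model, but the live map `[0, 4, 5, 3]` is no longer contiguous, which is the read fragmentation in question.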

The concluding paragraph in the whitepaper speaks of how these present enterprise storage designs affect the need for performance. Most of them do not put performance first, preferring to go for storage efficiency and capacity savings instead. XtremIO wants to change that, and they have listed the following requirements for their solution (copied verbatim from the whitepaper):

  • It must be a scale-out design. Flash simply delivers performance levels beyond the capabilities of any scale-up (i.e. dual controller) architecture.
  • The system must automatically balance itself. If balancing does not take place then adding capacity will not increase performance.
  • Attempting to sequentialize workloads no longer makes sense since flash media easily handles random I/O. Rather, the unique ability of flash to perform random I/O should be exploited to provide new capabilities.
  • Cache-centric architectures should be rethought since today’s data patterns are increasingly random and inherently do not cache well. Likewise, tiering is largely ineffective for any active data set.
  • Any storage feature that performs multiple writes to function must be completely rethought for two reasons. First, the extra writes steal available I/O operations from serving hosts. And second, with flash’s finite write cycles, extra writes must be avoided to extend the usable life of the array.
  • Array features that have been implemented over time are typically functionally separate. For example, in most arrays snapshots have existed for a long time and deduplication (if available) is fairly new. The two features do not overlap or leverage each other in any way. But there are significant benefits to be realized through having a unified metadata model for deduplication, snapshots, thin provisioning, replication, and other advanced array features.

I have underlined the key arguments above. If we take our time to understand the existing storage technology out there, be it NetApp FAS, EMC CX and VNX, Dell Compellent or HDS, and compare it with the points above, a lot of the features that these guys offer will be blown out of the picture.

There will be plenty of arguments about whose technology is better, and it is a no-brainer that many of the traditional enterprise storage vendors are now throwing Flash SSDs at the I/O performance issue. The problem is, they have probably dug a very deep hole going after storage efficiency (thin provisioning, snapshots, deduplication, RAID, cloning and many others) in the past, and they did not give priority to I/O performance. Now that SSDs are tilting the scale towards I/O performance, these guys will have a tough time undoing all that they have done.

This becomes a perfect excuse to buy smaller all-Flash start-ups. EMC has acquired XtremIO. Violin Memory, Virident, Kaminario, SolidFire, PureStorage and NexGen are all bridesmaids waiting to be swept off their feet by a larger suitor. And it is the one with the most financial muscle that is most likely to win.

Let the bridesmaids bidding wars begin!


About cfheoh

I am a technology blogger with 20+ years of IT experience. I write heavily on technologies related to storage networking and data management because that is my area of interest and expertise. I introduce technologies with the objective of getting readers to *know the facts*, and to use that knowledge to cut through the marketing hype, FUD (fear, uncertainty and doubt) and other fancy stuff. Only then will there be progress. I am involved in SNIA (Storage Networking Industry Association) and as of October 2013, I have been appointed as the SNIA South Asia & SNIA Malaysia non-voting representative to the SNIA Technical Council. I was previously the Chairman of SNIA Malaysia until Dec 2012. I have recently joined Hitachi Data Systems as an Industry Manager for Oil & Gas in Asia Pacific. The position does not require me to be super-technical (which is what I love) but it helps develop another facet of my career, which is building communities and partnerships. I think this is crucial and more wholesome than just being technical alone. Given my present position, I am not obligated to write about HDS and its technology, but I am indeed subject to the Social Media Guidelines of the company. Therefore, I would like to make a disclaimer that what I write is my personal opinion, and mine alone. I am responsible for what I say and write, and this statement indemnifies my employer from any damages.