The rise of the specialized appliance

Compute and storage are two components of the IT infrastructure that are surely converging. SAN and NAS are facing their greatest adversary yet, and could be made insignificant if the cloud and virtualization game has its way. This is giving rise to a new breed of solution, a specialized appliance where compute and storage are ONE. Rising from the ashes of shared storage (SAN and NAS, take note), we are beginning to see things going back to the way of direct, internal storage.

There were some scuffles in the bushes about 5 years ago, when Sun (now Oracle) was ahead of its game. The Sun Fire X4500 (aka Thumper) was one of the strong candidates to challenge the SAN/NAS duopoly in this networked storage period. The X4500 integrated the server and storage components together, using ZFS as a file system and volume manager to deliver very high throughput across all the JBOD disks very efficiently. ZFS acted as the RAID, so there was no need for specialized RAID hardware. This proved that a very high performance storage solution can be easily integrated using standard off-the-shelf infrastructure components and the x86 architecture. By combining compute and storage together, there were hints that the industry was about to return to Direct-Attached Storage (DAS) again, despite its perceived weakness against SAN and NAS.

Unfortunately, the applications were not ready for DAS then. Apart from ZFS, applications such as databases, email and file servers were not ready to jump onto the DAS bandwagon and ride into the sunset. But the fairy tale is being retold, and this time, the evidence that DAS could rise again is much stronger.

The catalyst to this disruptive force? Virtualization!

I mentioned that VMware is the silent storage killer a few blogs ago. Needless to say, that ruffled a few feathers among the readers. I have no doubt that virtualization is changing how we storage guys look at SAN and NAS. In a traditional setup, the SAN or NAS is set up to provision LUNs or mount points as the data storage for VMFS volumes in the VMware environment. It is then up to the storage array to provide snapshots, replication, thin provisioning and so on.

Perhaps VMware is nitpicking that managing storage arrays for VMFS volumes is difficult. From the VMware administrator’s view, they are right. They don’t want to know what’s going on below the VM level. All they want is storage, any kind of storage, and VMware will manage the volumes, snapshots, replication and thin provisioning. Indeed they have been doing that since the vStorage API was introduced. In the new release of VMware version 5.0, the ante has been upped even higher, making networked storage less and less significant.

If you want to know about vStorage API and stuff, below is a diagram of the integration of the various components at the VMware API level.

 

VMware can now make direct, internal storage look like shared storage. The Virtual Storage Appliance (VSA) does just that. VMware already has a thriving market from the community and hobbyists for VMware Appliances.

The appliance market has now evolved into new infrastructure too. Using x86 architecture, off-the-shelf infrastructure components (sounds familiar?), companies such as Nutanix and Tintri are taking advantage of this booming trend to introduce specialized VMware appliances as shown in their advertisements on their respective web sites.

Here’s the Nutanix Ad:

 

Here’s the Tintri Ad:

 

Both Tintri and Nutanix are a new breed of appliances – specialized appliances for VMware.

At the same time, other applications are getting these specialized appliances as well. I have mentioned Oracle Exadata many times in the past, and Oracle Exadata is the perfect example of a fine-tuned, hardcore database engine built to make the Oracle database run at the best performance possible.

Likewise HP has announced their E5000 Messaging System for Microsoft Exchange. The E5000 is a specialized appliance optimized and well-tuned for the Microsoft Exchange Server 2010. From the words of HP,

“HP E5000 Messaging System is the industry’s first fully self-contained platform built for the next-generation of Microsoft Exchange to deliver enterprise-class messaging to businesses of all sizes. Built as a turnkey solution that can be up and running in a few hours vs. days, the HP E5000 Messaging System gives business users the experience they want most: large mailboxes, centralized archiving of mailboxes files and 24×7 access from any device. IT staffs benefit the solutions simplicity to setup, scale and manage and to meet new demands affordably. Ideal for multi-site enterprises as well as branch office and remote office environments, each HP Messaging System delivers greater simplicity and accelerates deployment with preconfigured solutions starting at 500 mailboxes up to 3000 mailboxes, while delivering large, 1 to 2.5GB mailbox sizes. Clients can grow by adding storage capacity or more appliances within the environment up from hundreds to thousands of mailboxes.”

What are the specs of this E5000 box, you say? Here you go:

 

And look at Row #2 in the table above … Direct, Internal Disks! Look at Row #4, Xeon CPUs! Both Compute and Storage in the same appliance!

While the HP E5000 announcement was recent, Hitachi Data Systems was already in the game early with their Unified Compute Platform and their Converged Platform for Microsoft Exchange, built on much the same idea – specialized appliances.

Perhaps the HDS solutions aren’t exactly direct, internal storage but the concept is still the same – specialized appliance. HDS Unified Compute Platform (UCP) has these components.

 

HDS Converged Platform for MS Exchange provides their specialized “appliance” with Reference Architectures that can support up to 68,000 Microsoft Exchange mailboxes. Here’s an architecture diagram of their “appliance”:

 

There’s no denying that the networked storage landscape is changing. So are the computing platforms. We are already seeing the compute and storage components being integrated together, tighter than ever. The wave is rising for specialized appliances and it can only get more intense from now on.

No wonder HP’s Converged Infrastructure vision is betting on x86 architecture, simple storage platforms with SAS/SATA disks and Virtualization. Other vendors are doing the same as well – Cisco, NetApp and VMware with their FlexPod solution and EMC with their VBlocks of VMware, Cisco and EMC Storage.

Hail to the Rise of the Specialized Appliance!

HP has a new CEO (again!)

It is past midnight and I can’t sleep. I haven’t been sleeping well lately, so I thought I would catch up with some US news. And lo and behold, another big one showed up on Google News.

HP has fired Leo Apotheker and appointed Meg Whitman, the former boss of eBay, to become the new CEO and President of HP. Leo Apotheker was on the job for about 10 months (Damn!). Such actions shake investor confidence and are not good for the image of the company. If Leo Apotheker wasn’t the right guy, why take him in the first place?

Leo is responsible for HP’s purchase of Autonomy just a month ago and now, the HP vision and direction have to be realigned again.

Here’s one of the news reports from Reuters.

Wait! There’s more confidence shattering news. Excerpt from one of the online news:

 

HP has laid off hundreds of employees in its ill-fated foray 
into the mobile ecosystem. HP is trying to spin off its PC unit
and, at the same time, find a home for its fast deteriorating 
mobile assets, having spent billions of dollars trying to break 
into phones and tablets ($1 billion+ to buy Palm, investments 
in the business and write off of inventory) and then yanking 
the cord approximately 60 days into the adventure.

It is not hard to write not-so-good news about HP. They keep making such discouraging news on their own.

Deduplication – a fancy form of Compression?

One of the things that peeved me at the HP D2D Workshop a few days ago was this heading in the HP PowerPoint slides – “Deduplication – a fancy form of Compression”. Somehow it bothered me.

I have always placed both deduplication and compression into a bucket I called “Data Reduction“. Some vendors might call it Storage Economics, spinning it in a cooler manner. Either way, both reduce the capacity required to store a given amount of data, and this translates into benefits in storage management and on the network. With a smaller data set, less processing and capacity are required, likely speeding up the performance of the storage array. At the same time, the primary data backup set (you know, the data that you back up every night?) becomes smaller, making backup faster (restore is not necessarily faster, because you have to rehydrate the data from its reduced state). Another obvious benefit is the ability to transfer the smaller data set over the network more efficiently, compared to its original state and size, making Disaster Recovery more feasible and so on.

I have always known that deduplication works with data objects using a differential method. Whether the data object is a file or a chunk of a file, deduplication attempts to identify similarities (duplicates), store one copy of that object and have the others reference that single copy. The methods commonly used are hashing and delta differencing. In hashing, MD5 and SHA-1 are the popular algorithms, while in delta differencing, the data objects are compared (usually in a scrutinizing manner) to find the differences. The duplicates or similarities are discarded.
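To make the hashing method a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any vendor's product works: the function names `dedupe` and `rehydrate` are made up, the chunks are fixed-size (real products often use variable-length chunks), and the chunk size is deliberately tiny so the effect is visible.

```python
import hashlib

def dedupe(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and keep one copy of each unique chunk.

    Hypothetical helper for illustration only.
    """
    store = {}   # SHA-1 fingerprint -> the single stored copy of that chunk
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fingerprint = hashlib.sha1(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # duplicates are not stored again
        recipe.append(fingerprint)            # they are only referenced
    return store, recipe

def rehydrate(store, recipe) -> bytes:
    """Rebuild the original data by following the references."""
    return b"".join(store[fingerprint] for fingerprint in recipe)

data = b"ABCDEFGH" * 3 + b"12345678"  # four chunks, three of them identical
store, recipe = dedupe(data)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

The `recipe` list is the "others referencing the single object" part: every duplicate costs only a fingerprint, not another copy of the chunk.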

There are many factors involved in deduplication: the types of data, the processing power required to do the deduplication task, the processing throughput and so on, all resulting in different deduplication ratios and different times required to complete the process. I am not going to delve into that as there are many vendors who will be able to articulate this, such as EMC Data Domain, HP D2D/VLS with its StoreOnce technology, Exagrid, Sepaton, Dell Ocarina Networks, NetApp, EMC Centera, CommVault Simpana, Symantec PureDisk, Symantec NetBackup, EMC Avamar and many more.

Meanwhile, compression (especially most commercial compression technology) is based on dictionary coding, a lossless data reduction algorithm. Note that I am using the term encoding rather than compression because factually, encoding is the right word. You can’t squeeze the data into a smaller size like you do with a real-life object.

The technique works like this.

  1. When being encoded, a bit/byte or a set of bytes is compared to a “dictionary”, which is a pool of “words” in a data structure maintained by the encoding technology.
  2. If a match is found, the bit/byte or set of bytes is substituted by a “word”, usually a much shorter (hence smaller) representation of the bytes being encoded.
  3. As the encoding process continues, more “dictionary words” are built into the “dictionary” based on the bytes already encoded. This is popularly known as the sliding window implementation.
  4. The end result is that the data is highly encoded (heavily replaced) by “dictionary words” and is of a much smaller size.
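The steps above can be sketched in a few lines of Python. This is a textbook-style LZW encoder and decoder written for illustration, not the code of any commercial product. Note that LZW grows an explicit dictionary as it goes, rather than keeping a sliding window over recent bytes, so it is one particular flavour of dictionary coding. The function names are my own.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Replace byte sequences with dictionary codes, learning new words as we go."""
    dictionary = {bytes([i]): i for i in range(256)}  # start with every single byte
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate                   # keep extending the match
        else:
            codes.append(dictionary[current])     # emit the "word" for the match
            dictionary[candidate] = next_code     # learn a new, longer word
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decode(codes: list[int]) -> bytes:
    """Rebuild the same dictionary on the fly and expand the codes."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:                   # word defined by this very step
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)

sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(sample)
print(len(sample), "bytes in,", len(codes), "codes out")
```

The decoder never receives the dictionary. It rebuilds it from the codes themselves, which is why nothing extra has to be stored alongside the encoded data.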

One of the most heavily implemented compression techniques is based on the theory and methodology introduced by Lempel-Ziv and further enhanced by the Lempel-Ziv-Welch trio. A very good explanation of the LZ method can be found here.

Both deduplication and compression have the same objective – that is to reduce the data size for more efficient storage. They approach it from different angles, but they are by no means mutually exclusive. They can be used to complement each other and further reduce the capacity required to store the data.

Deduplication usually works with larger data objects (chunks, files etc) while compression works harder at the lower level (byte range level). Deduplication is heavily deployed in secondary data sets (or backup) because you can find plenty of duplicates while in primary data sets (the data in production), deduplication and compression are deployed, either in a singular fashion or one after another. Deduplication is usually run as Step 1 and then Compression is run in Step 2.
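Here is a rough sketch of that two-step order, dedupe first and compress second, using Python's standard `hashlib` and `zlib` modules. The function names, the 4 KB fixed chunk size and the sample data are assumptions of mine for illustration. Real arrays and backup targets are far more sophisticated.

```python
import hashlib
import zlib

def reduce_data(data: bytes, chunk_size: int = 4096):
    """Step 1: dedupe at chunk level. Step 2: compress only the unique chunks."""
    store = {}   # fingerprint -> compressed unique chunk
    recipe = []  # ordered fingerprints to rebuild the original
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fingerprint = hashlib.sha1(chunk).digest()
        if fingerprint not in store:
            store[fingerprint] = zlib.compress(chunk)  # byte-level reduction
        recipe.append(fingerprint)
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rehydrate: decompress each referenced chunk and stitch them together."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)

# Hypothetical data set: two distinct 4 KB blocks repeated several times.
block_a = bytes(range(256)) * 16
block_b = bytes(reversed(range(256))) * 16
data = block_a * 5 + block_b * 3

store, recipe = reduce_data(data)
stored_bytes = sum(len(chunk) for chunk in store.values())
print(len(data), "bytes in,", stored_bytes, "bytes stored,", len(recipe), "references")
```

Running dedupe first means the compressor only ever sees each unique chunk once, which saves processing as well as capacity.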

So far, the only one that has impressed me for primary data reduction is Ocarina Networks, which uses a 3-step approach: dedupe, compress and then specialized compactors to reduce the data even more. I have seen Ocarina reduce Schlumberger Geoframe and Petrel seismic data by more than 50%. That was impressive!

Having my bothered state satisfied, I guess saying “Deduplication – a fancy form of Compression” is someone else’s cup of tea. I would rather say “Deduplication – a fancy form of Data Reduction Technology” but I am not complaining as much as I did before.

HP StoreOnce technology – job done!

I had the privilege to attend HP’s D2D workshop yesterday, thanks to the invitation of my old friend, Mr. CC Chung. He is Malaysia’s HP StorageWorks Division Country Manager.

I am allowed to assess their D2D solution without fear or favour (I think) and the plush sling bag door gift has nothing to do with my assessment (what do you think? Ha, ha) So here goes.

I based my assessment on these criteria (something I picked up when I was mucking around with Data Domain for 3 months at MTech Security some years ago). The criteria are

  • Hash-based chunking granularity vs Single Instance Store (ala-EMC Centera)
  • Inline or post-processing
  • Source-based or target-based deduplication
  • Forward or reverse referencing (though it has little significance – for now)
  • Global or Local Deduplication

First of all, most people would ask about how well it dedupes and the technical guy’s answer would be “It depends …“. The sales would probably say “YMMV” (can anyone tell me what this acronym is for?). I believe the advertised rate is 20:1, pretty realistic because as we know in the deduplication world, the longer the data is retained, the higher the ratio can get. It also depends on the type of data to be deduped.
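To see why longer retention pushes the ratio up, here is a back-of-the-envelope model in Python. Every number in it is hypothetical: I am assuming a 1 TB full backup taken daily and 2% of the data changing each day, and the function name is my own. It is not HP's sizing method, just simple arithmetic.

```python
def dedupe_ratio(days_retained: int,
                 full_backup_tb: float = 1.0,
                 daily_change: float = 0.02) -> float:
    """Logical data backed up divided by physical data actually stored.

    Assumes one full backup per day and that only changed data is new
    to the deduplication store (hypothetical figures for illustration).
    """
    logical = days_retained * full_backup_tb
    physical = full_backup_tb + (days_retained - 1) * full_backup_tb * daily_change
    return logical / physical

for days in (1, 7, 30, 90):
    print(f"{days:>3} days retained -> about {dedupe_ratio(days):.1f}:1")
```

Change the `daily_change` figure to something closer to a busy database and the ratio drops quickly, which is the "it depends" part of the answer.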

And of course, one of the participants (there are always skeptics) was bickering about how his customer was complaining that the deduplication ratio for a SQL database was lower than what was advertised. My take on this matter – both the customer and the reseller are at fault! The customer happily took what the sales/pre-sales guy said verbatim and expected fantastic results. The reseller did not know the D2D solution well enough and therefore screwed the customer with realistic numbers for the wrong data type.

To me, as Justin (the HP Solution Architect) was presenting the HP D2D solution, I was ticking my check boxes for these criteria. And in my opinion, the HP D2D solution does the job. HP was telling the attendees that they will be surprised to know the end pricing for the D2D solution. I never got to know the figures and I never asked. But when compared to the king of the deduplication devices, Data Domain, it is likely to be lower.

So, here are the ticks to the HP D2D solution

  • In-line deduplication
  • Target-based (of course)
  • Hash-based chunking with variable length for deduplication granularity
  • Local Deduplication

They have several models ranging from the entry-level 2500 series to the 4100 and the 4300 series. After that, HP has another disparate deduplication solution meant for the higher end market called the VLS, and it was not presented in the workshop.

The D2D can be both a VTL and a NAS target dedupe device and the browser-based management GUI was simple and uncluttered. What interested me most was the HP StoreOnce technology, although I did not dig deeper into it. I found a nice video (below) to show a whiteboarding session for HP StoreOnce.

I promise to look deeper into it in a few days’ time. This week has been such a muck for me but overall, it has been turning out well at the end of the day.

Another thing that was interesting was its sparse indexing for the hashes and there were some dedupe vendors already doing the same thing. But, if you know me, I will research this for knowledge and benefit of all.

After the workshop, HP was so kind to give me an update about their Converged vision, how LeftHand, IBRIX, and 3PAR fit into their strategy and more importantly, their story to the storage market. I will speak more about this in the future. Of course, I will not reveal what’s in store for the future of the D2D solution, but all I can say is, I left the workshop feeling that the solution will do what it is supposed to, nothing more, nothing less. And I meant it in a good way.

I still reserve my opinions about HP because a lot of their storage business is still attached to the server side, but hopefully with the upcoming P4000 and P6000 workshops, my opinions may change a little.

VMware – the silent storage killer

When VMware 5.0 was launched last month, I heard the feature called Virtual Storage Appliance (VSA) was finally out and is now being offered as an SMB/SME “storage” solution. In my mind, alarm bells were ringing because in its own stealthy manner, VMware had just become a storage player.

What VMware is offering is “Hey! If you don’t have money to buy your enterprise storage array, don’t worry. Make your own shared storage with our very own VMware VSA“. VSA utilizes the internal disks of the ESX/ESXi host as its shared storage.

VSA is nothing new. For years, LeftHand Networks had one for its engineers to do demos and show the functionality of their solution. EMC had it too, and recently I found out that NetApp has its own VSA, but only resells it through its partner, Fujitsu. I am not 100% sure about the NetApp thing and I need a NetApp guy to verify this.

Smaller players, but not insignificant, such as Nutanix, Nexenta and Tintri are already offering their own versions and implementation of VSA to their customers, each with its own uniqueness and differences. With the release of the VMware VSA into the open, we shall see all the big storage players offering their VSAs to VMware, like natives offering sacrifices to VMware God. Or perhaps, it has already begun. It is ala-Nexus 1000v all over again.

VMware has become a huge juggernaut and it is merely using its advantage to consolidate the storage component under its control. When VMware version 4.0 came out, vStorage API was introduced along with VAAI (vStorage API for Array Integration). VAAI was created to enhance the storage experience by offloading specific storage operations to the native features of that supported storage platform. That’s all I know about VAAI at this moment, but with this feature, the storage array is tightly integrating its platform to VMware, or should I say … quietly ensnared by VMware tentacles of doom! (Evil laugh in the background! Mua ha ha ha ….!)

At the recent VMworld, this storage story was unfurled even more to the world. VASA (vStorage API for Storage Awareness) was recently announced and EMC’s COO Pat Gelsinger spoke about the tighter integration (that word again!) that blurs the administration domains of the VMware admin and the storage admin. Below is a video of Pat Gelsinger talking about VASA (this is a long 55-minute video – click only if you have the time).

Mind you, the entire vStorage API is still evolving as VMware 5.0 rolls out but here’s the thing. VMware has come out and said that the storage world of LUNs, RAID groups and mount points is a level below what the VMware admin should be concerned about. VMware admins handle their storage at the VM level or as VMDKs and therefore, anything below it is of little significance to them. Again, you can see that VMware is using its muscle to say “If you guys want to play, you have to play by my rules“.

So, some new storage announcements came out from VMworld, such as Capacity Pools, I/O Multiplexer, and Storage DRS (Storage Distributed Resource Scheduler), and also an enhanced (probably more storage-resilient) version of SRM (Site Recovery Manager). All these are being managed at a level above the traditional storage admin level and VMware has said that the VMware admin would be able to carve out a VM volume with its own set of default storage properties, defined snapshot retentions, replication and perhaps even compression and deduplication. But all these will be happening at the VM volume or VMDK level, not a level below that.

Details are still sketchy at this point in time and we probably won’t see these GA until probably VMware version 6.0. But the inertia has been rocked quietly and the VMware storage momentum will gain strength as time passes by. We could see that VMware would just need JBOD (just a bunch of disks) because it has its own enterprise storage features through its vStorage APIs or its future storage specifications. We have seen it happening in VSA with VMware offering its own storage.

From the same news, what surprised me was the quote shown below.

The presenters said VMware developed the APIs with EMC, NetApp, Dell,
IBM and Hewlett-Packard,but they began the session with a disclaimer
that none of those vendors has committed to support the APIs in
their arrays.

Why the hell would EMC, NetApp, Dell, IBM and HP do something like that?!! Don’t they know that this could contribute to their insignificance in the future?

I am still perplexed but as the whole thing is still evolving, VMware seems to be the only obvious winner here.

The demise of the IT engineer?

Scott Lowe is one of my favourite virtualization experts. I have 2 of his VMware books and his latest book on VMware 5.0 will be out next month. He is currently the CTO of EMC’s vSpecialist team and in one of his blog entries, he spoke about “The End of the Infrastructure Engineer” or IT Engineer in our local speak.

Last month, I wrote about how the Cloud will be forcing many of us out of our jobs. I mentioned that the emergence of Cloud Computing will supersede the roles of system integrators and resellers, because the Cloud Computing Service Provider will bypass these 2 layers and go direct to the end user or customer. This will render the role of the IT engineer less significant when they are working for the reseller or partner. Scott’s blog goes a step further, saying that the IT engineer role will be gone and they could be forced into the application development space for Cloud Computing.

The gist of my blog last month was to get the IT engineer to think deeper about how they should evolve to adapt to, and adopt, this new Cloud paradigm. In my almost 20 years in the Malaysian IT scene, I have seen the decline of the IT engineer. I don’t see many of the younger generation showing the passion and enthusiasm to enhance their skills and learn more than what is required for their job. This is a sad thing and through my voluntary work with SNIA Malaysia, I hope to get some of the senior engineers (despite all the fancy titles, we are still pretty much engineers) to get off the fence and start a strong IT community on storage networking and data management technologies. I am a strong believer of “If you build it, they will come”.

I agree with what Scott has mentioned, that the role of an IT Engineer will not go away because you will always need an IT Engineer (or Infrastructure Engineer) to manage the infra. But the jobs available for these positions will get scarcer. So, to those IT engineers who are just so-so, (ooops), you are not good enough anymore.

Perhaps it is a chicken-and-egg thing to say that if there’s no market, why should the IT engineer learn something more to be different and enhance himself/herself. But if this chicken-and-egg debate thing was to continue, then we will forever be trapped in a loop that does not change our status in IT. We will be forever in a rut while others continue to pass us by.

I am always amazed by the number of intelligent people drawn to Silicon Valley, and with renowned technology universities such as Stanford, UC Berkeley, MIT and Carnegie Mellon continuing to innovate, we continue to see the birth of better, greater and disruptive ideas coming out of Silicon Valley. The IT community in Silicon Valley is very strong and we continue to get IT people challenging the status quo and being different. And more and more “Silicon Valley”-like communities are being born around the world. Malaysia, in my frank opinion, spends too much time glamourizing (if there’s such a word) IT (or ICT in local Malaysian terminology) and does little to address the core of IT. Our IT people are too complacent and too obedient to be different.

So, here’s my argument to the skeptics of this chicken-and-egg thing. Yes, we only do what we must do to earn our pay for the bread-and-butter stuff in our Malaysian IT, but it is also time to break out from this loop. It’s time to be different, and it’s time to get deeper into IT.

Nothing gives me the creeps more than seeing an IT engineer go out to the customer and start pitching speeds and feeds. Come on, any customer could read that off a brochure or a datasheet! So there is absolutely no value in the IT engineer if they only know how to pitch speeds and feeds. Get to know the solution in depth. Get down into the hardcore of things like the philosophy of the design of the solution. Learn deeper about technology and even better, start thinking of new ways to challenge what’s already out there.

I spend a lot of time learning about file systems in storage networks and that’s my passion. I hope that more IT engineers would break away from the norm to do more. Believe me, as Cloud Computing becomes more prevalent in the Malaysian IT scene, there will be demand for damn good IT engineers, not the ones who know only speeds and feeds.

Using simple MTBF to determine reliability to Finance

The other day, a prospect was requesting quotation after quotation from a friend of mine to make a so-called “apple-to-apple” comparison with another storage vendor. But it was difficult to make that sort of comparison because one guy would propose SAS, and the other SATA and so on. I was roped in by my friend to help. So in the end I asked this prospect which of these 3 criteria mattered most to him – Performance, Capacity or Reliability.

He gave me an answer and the reliability criterion was leading his requirement. Then he asked me if I could help determine in a “quick-and-dirty manner” by using MTBF (Mean Time Between Failure) of the disks to convince his finance about the question of reliability.

Well, most HDD vendors publish their MTBF as a measuring stick for the reliability of their manufactured disks. MTBF is by no means accurate but it is useful to define HDD reliability in a crude manner. If you have seen the components that go into a HDD, you would be amazed that the HDD components go through a tremendously stressed environment. The Read/Write head operates at a flying height (head gap) above the platters thinner than a human hair, and the servo-controlled technology maintains a constant, never-lagging 7200/10,000/15,000 RPM day after day, month after month, year after year. And yet, we seem to take the HDD for granted, rarely thinking how much technology goes into it on a nanoscale. That’s technology at its best – taking something so complex and making it so simple for all of us.

I found that the Seagate Constellation.2 Enterprise-class 3TB 7200 RPM disk MTBF is 1.2 million hours while the Seagate Cheetah 600GB 10,000 RPM disk MTBF is 1.5 million hours. So, the Cheetah is about 25% more reliable than the Constellation.2, right?

Wrong! There are other factors involved. In order to achieve 3TB usable, a RAID 1 (average write performance, very good read performance) would require 2 units of 3TB 7200 RPM disks. On the other hand, using 10,000 RPM disks, with the largest shipping capacity of 600GB, you would need 10 units of such HDDs. RAID-DP (this is NetApp by the way) would give average write performance (better than RAID 1 in some cases) and very good read performance (for sequential access).

So, I broke down the above 2 examples to this prospect (to achieve 3TB usable)

  1. Seagate Constellation.2 3TB 7200 RPM HDD MTBF is 1.2 million hours x 2 units
  2. Seagate Cheetah 600GB 10,000 RPM HDD MTBF is 1.5 million hours x 10 units

By using a simple calculation of

    RF (Reliability Factor) = MTBF/#HDDs

the prospect will be able to determine which of the 2 HDD types above could be more reliable.

In case #1, RF is 600,000 hours and in case #2, the RF is 150,000 hours. Suddenly you can see that the Constellation.2 HDDs, which have a lower MTBF, have a higher RF compared to the Cheetah HDDs. Quick and simple, isn’t it?
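For anyone who wants to plug in their own numbers, the same quick-and-dirty calculation takes only a few lines of Python. The function name is mine, and the MTBF figures and disk counts are the ones quoted above. This is a crude comparison aid, not a proper reliability model.

```python
def reliability_factor(mtbf_hours: float, num_disks: int) -> float:
    """RF = MTBF / #HDDs, the crude measure used in this post."""
    return mtbf_hours / num_disks

# Case 1: Seagate Constellation.2 3TB, MTBF 1.2 million hours, 2 units
constellation_rf = reliability_factor(1_200_000, 2)

# Case 2: Seagate Cheetah 600GB, MTBF 1.5 million hours, 10 units
cheetah_rf = reliability_factor(1_500_000, 10)

print(f"Constellation.2 RF: {constellation_rf:,.0f} hours")
print(f"Cheetah RF:         {cheetah_rf:,.0f} hours")
```

The more spindles you need to reach the same usable capacity, the more chances there are for one of them to fail, and the RF drops accordingly.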

Note that I did not bring SAS versus SATA into the mix because it doesn’t matter here. SAS and SATA are merely data channels that drive data in and out of the spinning HDDs. So, folks, don’t be fooled that a SAS drive is more reliable than a SATA drive. Sometimes, they are just the same old spinning HDDs. In fact, the mentioned Seagate Constellation.2 HDD (3TB, 7200 RPM) comes with both SAS and SATA interfaces.

Of course, this is just one factor in the whole Reliability universe. Other factors such as RAID level, checksum, CRC, and single or dual controllers also determine the reliability of the entire storage array.

In conclusion, we all know that the MTBF alone does not determine the reliability of the solution the prospect is about to purchase. But this is one way you can use to help the finance people to get the idea of reliability.

Gartner figures about the storage market – Half year report

After the IDC report a couple of weeks back, Gartner released their Worldwide External Controller-Based (ECB) Disk Storage Market report last week. The Gartner report mirrors the IDC report, which confirms the situation in the storage market, and it’s good news!

Asia Pacific and Latin America are 2 regions which are experiencing tremendous growth, with 27.9% and 22.4% respectively. This means that the demand for storage networking and data management professionals is greater than ever. I have always maintained that it is important for professionals like us to enhance our technical and technology know-how to ride on the storage growth momentum.

So from the report, there are no surprises. Below is a table that summarizes the Gartner report.

 

As you can see, HP lost market share together with Dell, Fujitsu and Oracle. Oracle is focusing its energies on its Exadata platform (and it’s all about driving more database license sales), and hence their 7000-series is suffering. Fujitsu, despite its partnerships with NetApp and EMC and its own Eternus storage, lost ground as well.

Dell seems to be losing ground too, but that could be the after effects of divorcing EMC after picking up Compellent early this year. Dell should be able to bounce back as there are reports stating that Compellent is picking up a good pace for Dell. One of the reports is here.

The biggest loser of the last quarter is HP. Even though its market share dropped by only 0.3%, things do not seem so rosy, as I have been observing their integration of 3PAR since the purchase late last year. No doubt they are firing on all cylinders, but 3PAR does not seem to be helping HP to gain market share (yet). The mid-tier has to be addressed as well and having the old-timer EVA at the helm is beginning to show split ends. Good for the hairdresser; not good for HP. IBRIX and LeftHand complete most of the HP storage line-up.

HDS is gaining ground as their storage story is beginning to gel quite well. Some great moves, such as consolidating their services business and their Deal Operations Center (DOC) in Kuala Lumpur, make it simpler for customers to do business with them. Every company has its challenges but I am beginning to see quite a bit of traction from HDS in the local business scene.

IBM also increased market share with a 0.2% jump. Rather tepid overall, but I was informed by an IBMer that their DS8000s and XIVs are doing great in the South East Asia region. Kudos, but IBM still has to transform its mid-tier DS4000/5000 business, for which IBM OEMs the storage backend from NetApp Engenio.

EMC and NetApp are the 2 juggernauts. EMC has been king of the hill for many quarters, and I have always been surprised at how nimble EMC is, despite being an 800-pound gorilla. NetApp has proven its critics wrong. For many quarters it has been taking market share, and that is reflected in the Gartner Half Year Report below:

 

There you have it folks. The Gartner WW ECB Disk Storage Report. Again, I just want to mention that this is a wonderful opportunity for us doing storage and data management solutions. The demand is there for experienced and skilled professionals but we have to be good, really good to compete with the rest.

NFS deserves more credit from guys doing virtualization

I was at the RedHat Forum last week when I chanced upon a conversation between an attendee and one of the ECS engineers. The conversation went like this:

Attendee: Is the RHEV running on SAN or NAS?

ECS Engineer: Oh, for this demo, it is running NFS but in production, you should run iSCSI or Fibre Channel. NFS is for labs only, not good for production.

Attendee: I see … (and he went off)

I was standing next to them munching my mini-pizza and thinking to myself, “Oh, come on, NFS is better than that!”

NAS has always played little brother to SAN, but usually for the wrong reasons. Perhaps it is the perception that NAS is low-end and not good enough for high-end production systems. However, this is very wrong, because NAS has been growing at a faster rate than Fibre Channel, while Fibre Channel growth has been tapering and is possibly on the wane. And I have always said that NAS is the better suited protocol when it comes to unstructured data and files, because the NAS protocol is the new storage networking currency of Internet storage and the Cloud (this could change very soon with the REST protocol, but that’s another story). Where else can you find a protocol where sharing is key? iSCSI, even though it has been growing at a faster pace in production storage, cannot be shared easily because it is block-based.

Now back to NFS. NFS version 3 has been around for more than 15 years and has taken its share of bad raps. I agree that this version is still very much in the landscape of most NFS installations. But NFS version 4 is changing all that by taking on the better parts of the CIFS protocol, notably the equivalent of opportunistic locking, or oplocks. In addition to that, it has greatly enhanced its security, incorporating Kerberos-type authentication. As for performance, NFS v4 added the COMPOUND operation, which aggregates multiple operations into a single request.
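
To make the COMPOUND idea a bit more concrete, here is a toy Python sketch. It is my own illustration and not real NFS client code: the function names and the fake server are hypothetical, and the only thing it models is the number of network round trips. LOOKUP, GETATTR and READ are real NFS operation names, but a real NFS v4 COMPOUND request would carry more than these three.

```python
# Toy model only: it counts round trips and does not speak real NFS.

def fake_server(op):
    """Hypothetical server: every operation succeeds except 'BAD'."""
    return "OK" if op != "BAD" else "ERROR"

def run_separately(ops, server):
    """NFS v3 style: each operation travels in its own request."""
    results = [server(op) for op in ops]
    return results, len(ops)          # one round trip per operation

def run_compound(ops, server):
    """NFS v4 style: one request carries all operations.

    The server runs them in order and stops at the first failure.
    """
    results = []
    for op in ops:
        status = server(op)
        results.append(status)
        if status != "OK":
            break
    return results, 1                 # a single round trip

ops = ["LOOKUP", "GETATTR", "READ"]
print(run_separately(ops, fake_server))
print(run_compound(ops, fake_server))
```

In this simplified model the three operations cost three round trips when sent separately and one round trip when compounded, which is where the latency saving comes from on a busy or high-latency network.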

Today, most virtualization solutions from VMware and RedHat work with NFS natively. Note that the Windows CIFS protocol is not supported, only NFS.

This blog entry is not stating that NFS is better than iSCSI or FC, but giving NFS credit where credit is due. NFS is not inferior to these block-based protocols. In fact, there are situations where NFS is better, for instance when expanding an NFS-based datastore on the fly in a VMware implementation. I will use several performance-related examples, since performance is often used as a yardstick when these protocols are compared.

In an experiment conducted by VMware based on version 4.0, with all things being equal, below is a series of graphs that compares these 3 protocols (NFS, iSCSI and FC). Focus on the comparison between NFS and iSCSI rather than FC, because NFS and iSCSI run on Gigabit Ethernet, whereas FC is on a different networking platform (hey, if you got the money, go ahead and buy FC!)

Based on one virtual machine (VM), the Read throughput statistics (higher is better) are:

 

The red circle shows that NFS is up there with iSCSI in terms of read throughput from 4K blocks to 512K blocks. As for write throughput for 1 VM, the graph is shown below:


Even though NFS suffers in write throughput with block sizes smaller than 16KB, its write throughput is better than iSCSI's in the 16K to 32K range and is equal to it in the 64K, 128K and 512K block tests.

The 2 graphs above are of a single VM. But in most real production environments, a single ESX host will run multiple VMs, and here is the throughput graph for multiple VMs.

Again, you can see that in a multi-VM environment, NFS and iSCSI are equal in throughput, dispelling the notion that NFS is not as good in performance as iSCSI.

Oh, you might say that these are just VMs without any OSes or applications running in them. Next, I want to share with you another performance test conducted by VMware, this time for a Microsoft Exchange environment.

The next statistics are produced from an Exchange Load Generator (popularly known as LoadGen) to simulate the load of 16,000 Exchange users running in 8 VMs. With all things being equal again, you will be surprised after you see these graphs.

The graph above shows the average send mail latency of the 3 protocols (lower is better). On the average, NFS has lower latency than iSCSI, better than what most people might think. Another graph shows the 95th percentile of send mail latency below:

 

Again, you can see that NFS’s latency is lower than iSCSI’s. Interesting, isn’t it?

What about IOPS then? In another test with an 8-hour DoubleHeavy LoadGen simulator, the IOPS graphs for all 3 protocols are shown below:

In the graph above (higher is better), NFS performed reasonably well compared to the other 2 block-based protocols, even outperforming iSCSI in this 8-hour load test. Surprising, huh?

As I have shown, NFS is not inferior compared to the block-based protocols such as iSCSI. In fact, VMware in version 4.1 has improved all 3 storage protocols significantly, as mentioned in the VMware paper. The following are quoted from the paper for NFS and iSCSI.

  1. Using storage microbenchmarks, we observe that vSphere 4.1 NFS shows improvements in the range of 12–40% for Reads, and improvements in the range of 32–124% for Writes, over 10GbE.
  2. Using storage microbenchmarks, we observe that vSphere 4.1 Software iSCSI shows improvements in the range of 6–23% for Reads, and improvements in the range of 8–19% for Writes, over 10GbE.

The performance improvement for NFS is significant when the network infrastructure is 10GbE. The write improvement ranges from 32% to 124%! That’s a whopping figure compared to iSCSI, whose write improvement ranged from 8% to 19%. Since both protocols were neck and neck in version 4.0, NFS seems to be taking a bigger lead in version 4.1. With the release of VMware version 5.0 a few weeks ago, we shall know the performance of both NFS and iSCSI soon.
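
For those who like to see the arithmetic, here is a small Python sketch of what those quoted percentage ranges would mean in practice. The percentages come from the quotes above; the baseline of 100 MB/s for both protocols is purely my assumption for illustration (the paper does not give it here), and the function name is my own.

```python
# Improvement ranges (percent) as quoted from the VMware paper, over 10GbE.
IMPROVEMENT = {
    ("NFS", "read"): (12, 40),
    ("NFS", "write"): (32, 124),
    ("iSCSI", "read"): (6, 23),
    ("iSCSI", "write"): (8, 19),
}

# Hypothetical starting point: assume both protocols delivered the same
# throughput before the improvement. This number is NOT from the paper.
BASELINE_MBPS = 100.0

def projected_range(baseline, pct_range):
    """Apply the low and high percentage improvement to a baseline."""
    low, high = pct_range
    return baseline * (100 + low) / 100, baseline * (100 + high) / 100

for (protocol, io_type), pct_range in IMPROVEMENT.items():
    low, high = projected_range(BASELINE_MBPS, pct_range)
    print(f"{protocol:5s} {io_type:5s}: {low:.0f} to {high:.0f} MB/s")
```

Under that assumed equal baseline, NFS writes would land somewhere between 132 and 224 MB/s while iSCSI writes would land between 108 and 119 MB/s. Treat this as a back-of-envelope view only, since the real starting throughput of each protocol on 10GbE would differ.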

To be fair, NFS does take a higher CPU performance hit compared to iSCSI as the graph below shows:

Also note that the load tests are based on NFS version 3. If version 4 had been used, I am sure the performance statistics above would reach a whole new plateau.

Therefore, NFS isn’t inferior at all compared to iSCSI, even in a 10GbE environment. We just have to know the facts instead of brushing off NFS.