VMware – the silent storage killer

When VMware 5.0 was launched last month, I heard the feature called Virtual Storage Appliance (VSA) was finally out and is now being offered as an SMB/SME “storage” solution. In my mind, alarm bells were ringing because in its own stealthy manner, VMware had just become a storage player.

What VMware is offering is “Hey! If you don’t have money to buy your enterprise storage array, don’t worry. Make your own shared storage with our very own VMware VSA”. VSA utilizes the internal disks of the ESX/ESXi host as its shared storage.

VSA is nothing new. For years, LeftHand Networks had one for its engineers to run demos and show the functionality of their solution. EMC had one too, and recently I found out that NetApp has its own VSA, but only resells it through its partner, Fujitsu. I am not 100% sure about the NetApp thing and I need a NetApp guy to verify this.

Smaller but not insignificant players, such as Nutanix, Nexenta and Tintri, are already offering their own versions and implementations of VSA to their customers, each with its own uniqueness and differences. With the release of the VMware VSA into the open, we shall see all the big storage players offering their VSAs to VMware, like natives offering sacrifices to the VMware god. Or perhaps, it has already begun. It is à la Nexus 1000v all over again.

VMware has become a huge juggernaut and it is simply using its advantage to consolidate the storage component under its control. When VMware version 4.0 came out, the vStorage APIs were introduced, with VAAI (vStorage APIs for Array Integration) following in version 4.1. VAAI was created to enhance the storage experience by offloading specific storage operations to the native features of the supported storage platform. That’s all I know about VAAI at this moment, but with this feature, the storage array is tightly integrating its platform with VMware, or should I say … quietly ensnared by VMware’s tentacles of doom! (Evil laugh in the background! Mua ha ha ha ….!)

At the recent VMworld, this storage story was unfurled a little more to the world. VASA (vStorage APIs for Storage Awareness) was recently announced and EMC’s COO Pat Gelsinger spoke about the tighter integration (that word again!) that blurs the administration domains of the VMware admin and the storage admin. Below is a video of Pat Gelsinger talking about VASA (this is a long, 55-minute video, so click only if you have the time).

Mind you, the entire vStorage API is still evolving as VMware 5.0 rolls out, but here’s the thing. VMware has come out and said that the storage world of LUNs, RAID groups and mount points is a level below what the VMware admin should be concerned about. VMware admins handle their storage at the VM level or as VMDKs and therefore anything below that is of little significance to them. Again, you can see that VMware is using its muscle to say “If you guys want to play, you have to play by my rules”.

So, some new storage announcements came out of VMworld, such as Capacity Pools, I/O Multiplexer and Storage DRS (Storage Distributed Resource Scheduler), and also an enhanced (probably more storage-resilient) version of SRM (Site Recovery Manager). All these are managed at a level above the traditional storage admin level, and VMware has said that the VMware admin will be able to carve out a VM volume with its own set of default storage properties, defined snapshot retention, replication and perhaps even compression and deduplication. But all of this will happen at the VM volume or VMDK level, not a level below that.

Details are still sketchy at this point in time and we probably won’t see these go GA until VMware version 6.0. But the inertia has been quietly shaken and the VMware storage momentum will gain strength as time passes. We could see VMware needing just JBOD (just a bunch of disks) because it has its own enterprise storage features through its vStorage APIs or its future storage specifications. We have already seen it happening in VSA, with VMware offering its own storage.

In a related news report, what surprised me was the following quote.

The presenters said VMware developed the APIs with EMC, NetApp, Dell,
IBM and Hewlett-Packard, but they began the session with a disclaimer
that none of those vendors has committed to support the APIs in
their arrays.

Why the hell would EMC, NetApp, Dell, IBM and HP do something like that?!! Don’t they know that this could contribute to their insignificance in the future?

I am still perplexed but as the whole thing is still evolving, VMware seems to be the only obvious winner here.

The demise of the IT engineer?

Scott Lowe is one of my favourite virtualization experts. I have 2 of his VMware books and his latest book on VMware 5.0 will be out next month. He is currently the CTO of EMC’s vSpecialist team and in one of his blog entries, he spoke about “The End of the Infrastructure Engineer” or IT Engineer in our local speak.

Last month, I wrote about how the Cloud will force many of us out of our jobs. I mentioned that the emergence of Cloud Computing will supersede the roles of system integrators and resellers, because the Cloud Computing Service Provider will bypass these 2 layers and go direct to the end user or customer. This will render the role of the IT engineer less significant when they are working for the reseller or partner. Scott’s blog goes a step further, saying that the IT engineer role will be gone and they could be forced into the application development space for Cloud Computing.

The gist of my blog last month was to get the IT engineer to think deeper about how they should evolve to adapt to, and adopt, this new Cloud paradigm. In my almost 20 years in the Malaysian IT scene, I have seen the decline of the IT engineer. I don’t see many of the younger generation showing the passion and enthusiasm to enhance their skills and learn more than is required for their job. This is a sad thing and through my voluntary work with SNIA Malaysia, I hope to get some of the senior engineers (despite all the fancy titles, we are still pretty much engineers) to get off the fence and start a strong IT community on storage networking and data management technologies. I am a strong believer in “If you build it, they will come”.

I agree with what Scott has mentioned, that the role of an IT Engineer will not go away because you will always need an IT Engineer (or Infrastructure Engineer) to manage the infra. But the jobs available for these positions will get scarcer. So, to those IT engineers who are just so-so, (ooops), you are not good enough anymore.

Perhaps it is a chicken-and-egg thing to say that if there’s no market, why should the IT engineer learn something more to be different and enhance himself/herself. But if this chicken-and-egg debate were to continue, we would forever be trapped in a loop that does not change our status in IT. We would be forever in a rut while others continue to pass us by.

I am always amazed by the number of intelligent people drawn to Silicon Valley, and with renowned technology universities such as Stanford, UC Berkeley, MIT and Carnegie Mellon continuing to innovate, we continue to see better, greater and more disruptive ideas coming out of Silicon Valley. The IT community in Silicon Valley is very strong and we continue to see IT people challenging the status quo and being different. And more and more “Silicon Valley”-like communities are being born around the world. Malaysia, in my frank opinion, spends too much time glamourizing (if there’s such a word) IT (or ICT in local Malaysian terminology) and does little to address the core of IT. Our IT people are too complacent and too obedient to be different.

So, here’s my argument to the skeptics of this chicken-and-egg thing. Yes, we only do what we must do to earn our pay for the bread-and-butter stuff in our Malaysian IT, but it is also time to break out of this loop. It’s time to be different, and it’s time to get deeper into IT.

Nothing gives me the creeps more than seeing an IT engineer go out to the customer and start pitching speeds and feeds. Come on, any customer could read that off a brochure or a datasheet! So there is absolutely no value in the IT engineer if they only know how to pitch speeds and feeds. Get to know the solution in depth. Get down into the hardcore of things, like the philosophy behind the design of the solution. Learn more deeply about the technology and even better, start thinking of new ways to challenge what’s already out there.

I spend a lot of time learning about file systems in storage networks and that’s my passion. I hope that more IT engineers will break away from the norm and do more. Believe me, as Cloud Computing becomes more prevalent in the Malaysian IT scene, there will be demand for damn good IT engineers, not the ones who know only speeds and feeds.

Using simple MTBF to determine reliability to Finance

The other day, a prospect was requesting quotation after quotation from a friend of mine to make a so-called “apple-to-apple” comparison with another storage vendor. But it was difficult to make that sort of comparison because one guy would propose SAS, the other SATA, and so on. I was roped in by my friend to help. So in the end I asked this prospect which of these 3 criteria mattered to him most: Performance, Capacity or Reliability.

He gave me an answer and reliability was his leading requirement. Then he asked me if I could help, in a “quick-and-dirty manner”, by using the MTBF (Mean Time Between Failures) of the disks to convince his finance people on the question of reliability.

Well, most HDD vendors publish their MTBF as a measuring stick for the reliability of their disks. MTBF is by no means accurate but it is useful for defining HDD reliability in a crude manner. If you have seen the components that go into an HDD, you would be amazed at the tremendously stressful environment they operate in. The read/write head flies over the platter at a height (head gap) thinner than a human hair, and the servo-controlled technology maintains a constant, never-lagging 7200/10,000/15,000 RPM day after day, month after month, year after year. And yet, we seem to take the HDD for granted, rarely thinking about how much technology goes into it at the nanoscale. That’s technology at its best: taking something so complex and making it so simple for all of us.

I found that the Seagate Constellation.2 Enterprise-class 3TB 7200 RPM disk MTBF is 1.2 million hours while the Seagate Cheetah 600GB 10,000 RPM disk MTBF is 1.5 million hours. So, the Cheetah is about 25% more reliable than the Constellation.2, right?

Wrong! There are other factors involved. In order to achieve 3TB usable, RAID 1 (average write performance, very good read performance) would require 2 units of the 3TB 7200 RPM disks. On the other hand, using 10,000 RPM disks, with the largest shipping capacity of 600GB, you would need 10 units of such HDDs. RAID-DP (this is NetApp, by the way) would give average write performance (better than RAID 1 in some cases) and very good read performance (for sequential access).

So, I broke down the above 2 examples to this prospect (to achieve 3TB usable)

  1. Seagate Constellation.2 3TB 7200 RPM HDD MTBF is 1.2 million hours x 2 units
  2. Seagate Cheetah 600GB 10,000 RPM HDD MTBF is 1.5 million hours x 10 units

By using a simple calculation of

    RF (Reliability Factor) = MTBF/#HDDs

the prospect will be able to determine which of the 2 HDD types above could be more reliable.

In case #1, the RF is 600,000 hours and in case #2, the RF is 150,000 hours. Suddenly you can see that the Constellation.2 HDDs, which have a lower MTBF, have a higher RF than the Cheetah HDDs. Quick and simple, isn’t it?
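The arithmetic can be sketched in a few lines of Python. This is only a rough illustration, not a vendor formula: the function name reliability_factor and the options dictionary are hypothetical names made up for this example, and the MTBF figures and drive counts are the ones listed above. Dividing the MTBF by the number of drives roughly corresponds to the expected time until the first drive in the group fails, assuming identical drives that fail independently at a constant rate.

```python
def reliability_factor(mtbf_hours: float, num_drives: int) -> float:
    """RF = MTBF / #HDDs, the quick-and-dirty figure described above."""
    if num_drives <= 0:
        raise ValueError("num_drives must be positive")
    return mtbf_hours / num_drives


# Figures quoted in the text (3TB usable in both cases): (MTBF hours, drives).
options = {
    "Constellation.2 3TB 7200 RPM": (1_200_000, 2),
    "Cheetah 600GB 10,000 RPM": (1_500_000, 10),
}

rf = {name: reliability_factor(mtbf, n) for name, (mtbf, n) in options.items()}

for name, value in rf.items():
    print(f"{name}: RF = {value:,.0f} hours")
```

With the drive counts listed above, this prints 600,000 hours for the two Constellation.2 drives and 150,000 hours for the ten Cheetah drives. Change the drive count and the RF moves with it, which is the whole point of the exercise.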

Note that I did not bring SAS versus SATA into the mix because it doesn’t matter. SAS and SATA are merely data channels that drive data in and out of the spinning HDDs. So, folks, don’t be fooled into thinking that a SAS drive is more reliable than a SATA drive. Sometimes, they are just the same old spinning HDDs. In fact, the Seagate Constellation.2 HDD mentioned above (3TB, 7200 RPM) comes with both SAS and SATA interfaces.

Of course, this is just one factor in the whole Reliability universe. Other factors such as RAID level, checksum, CRC, and single or dual controllers also determine the reliability of the entire storage array.

In conclusion, we all know that the MTBF alone does not determine the reliability of the solution the prospect is about to purchase. But this is one way you can use to help the finance people to get the idea of reliability.

Gartner figures about the storage market – Half year report

After the IDC report a couple of weeks back, Gartner released their Worldwide External Controller-Based (ECB) Disk Storage Market report last week. The Gartner report mirrors the IDC report, which confirms the situation in the storage market, and it’s good news!

Asia Pacific and Latin America are 2 regions which are experiencing tremendous growth, at 27.9% and 22.4% respectively. This means that the demand for storage networking and data management professionals is greater than ever. I have always maintained that it is important for professionals like us to enhance our technical and technology know-how to ride on the storage growth momentum.

So from the report, there are no surprises. Below is a table that summarizes the Gartner report.

 

As you can see, HP lost market share together with Dell, Fujitsu and Oracle. Oracle is focusing its energies on its Exadata platform (and it’s all about driving more database license sales), and hence their 7000-series is suffering. Fujitsu, despite its partnerships with NetApp and EMC and its own Eternus storage, lost ground as well.

Dell seems to be losing ground too, but that could be the after effects of divorcing EMC after picking up Compellent early this year. Dell should be able to bounce back as there are reports stating that Compellent is picking up a good pace for Dell. One of the reports is here.

The biggest loser of the last quarter is HP. Even though its market share drop is just 0.3%, things do not seem so rosy as I have been observing their integration of 3PAR since the purchase late last year. No doubt they are firing on all cylinders, but 3PAR does not seem to be helping HP to gain market share (yet). The mid-tier has to be addressed as well and having the old-timer EVA at the helm is beginning to show split ends. Good for the hairdresser; not good for HP. IBRIX and LeftHand complete most of HP’s storage line-up.

HDS is gaining ground as their storage story is beginning to gel quite well. On top of that, some great moves consolidating their services business and their Deal Operations Center (DOC) in Kuala Lumpur make it simpler for customers to do business with them. Every company has its challenges but I am beginning to see quite a bit of traction from HDS in the local business scene.

IBM also increased market share with a 0.2% jump. Rather tepid overall but I was informed by an IBMer that their DS8000s and XIVs are doing great in the South East Asia region. Kudos, but again IBM still has to transform its mid-tier DS4000/5000 business, for which IBM OEMs the storage backend from NetApp Engenio.

EMC and NetApp are the 2 juggernauts. EMC has been king of the hill for many quarters, and I have always been surprised at how nimble EMC is, despite being an 800-pound gorilla. NetApp has proven its critics wrong. For many quarters it has been taking market share and that is reflected in the Gartner Half Year Report below:

 

There you have it folks. The Gartner WW ECB Disk Storage Report. Again, I just want to mention that this is a wonderful opportunity for us doing storage and data management solutions. The demand is there for experienced and skilled professionals but we have to be good, really good to compete with the rest.

NFS deserves more credit from guys doing virtualization

I was at the RedHat Forum last week when I chanced upon a conversation between an attendee and one of the ECS engineers. The conversation went like this

Attendee: Is the RHEV running on SAN or NAS?

ECS Engineer: Oh, for this demo, it is running NFS but in production, you should run iSCSI or Fibre Channel. NFS is for labs only, not good for production.

Attendee: I see … (and he went off)

I was standing next to them munching my mini-pizza and thinking, “Oh, come on, NFS is better than that!”

NAS has always played little brother to SAN, but usually for the wrong reasons. Perhaps it is the perception that NAS is low-end and not good enough for high-end production systems. However, this is very wrong because NAS has been growing at a faster rate than Fibre Channel, and at the same time Fibre Channel growth has been tapering and is possibly on the wane. And I have always said that NAS is the better suited protocol when it comes to unstructured data and files because the NAS protocol is the new storage networking currency of Internet storage and the Cloud (this could change very soon with the REST protocol, but that’s another story). Where else can you find a protocol where sharing is key? iSCSI, even though it has been growing at a faster pace in production storage, cannot be shared easily because it is block-based.

Now back to NFS. NFS version 3 has been around for more than 15 years and has taken its share of bad raps. I agree that this version still makes up most of the landscape of NFS installations. But NFS version 4 is changing all that, taking on the better parts of the CIFS protocol, notably delegations, the equivalent of opportunistic locking or oplocks. In addition, it has greatly enhanced its security, incorporating Kerberos-based authentication. As for performance, NFS v4 added the COMPOUND operation, which aggregates multiple operations into a single request.
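To see why aggregating operations helps, here is a toy Python model. It is not an NFS client and uses no real NFS library: ToyServer and its call method are hypothetical names invented for this sketch, and the operation names (PUTROOTFH, LOOKUP, GETATTR) are just labels borrowed from NFS v4. The sketch only counts round trips, one request per operation versus one request carrying all of them, with the server working through the list in order and stopping at the first failure.

```python
class ToyServer:
    """Counts requests; stands in for a file server (illustration only)."""

    def __init__(self):
        self.round_trips = 0

    def call(self, ops):
        """Handle ONE request carrying one or more operations."""
        self.round_trips += 1
        results = []
        for op in ops:
            # "BAD_OP" is a made-up label used to simulate a failing operation.
            status = "ERR" if op == "BAD_OP" else "OK"
            results.append((op, status))
            if status == "ERR":
                break  # later operations in the same request are not run
        return results


ops = ["PUTROOTFH", "LOOKUP", "GETATTR"]

one_by_one = ToyServer()
for op in ops:
    one_by_one.call([op])              # one request per operation

compound = ToyServer()
compound_results = compound.call(ops)  # all operations in a single request

print(one_by_one.round_trips, compound.round_trips)
```

In this toy model the same three operations cost three round trips when sent one at a time and one round trip when sent together. On a real network the saving depends on latency and on how the client groups its operations.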

Today, most virtualization solutions from VMware and RedHat work with NFS natively. Note that the Windows CIFS protocol is not supported, only NFS.

This blog entry is not stating that NFS is better than iSCSI or FC but to give NFS credit where credit is due. NFS is not inferior to these block-based protocols. In fact, there are situations where NFS is better, like for instance, expanding the NFS-based datastore on the fly in a VMware implementation. I will use several performance related examples since performance is often used as a yardstick when these protocols are compared.

In an experiment conducted by VMware based on version 4.0, with all things being equal, below is a series of graphs that compares these 3 protocols (NFS, iSCSI and FC). Note the comparison between NFS and iSCSI rather than FC because NFS and iSCSI run on Gigabit Ethernet, whereas FC is on a different networking platform (hey, if you’ve got the money, go ahead and buy FC!)

Based on one virtual machine (VM), the Read throughput statistics (higher is better) are:

 

The red circle shows that NFS is up there with iSCSI in terms of read throughput from 4K blocks to 512K blocks. As for write throughput for 1 VM, the graph is shown below:


Even though NFS suffers in write throughput with the smaller blocks of less than 16KB, NFS write throughput improves over iSCSI in the 16K to 32K range and is equal in the 64K, 128K and 512K block tests.

The 2 graphs above are of a single VM. But in most real production environments, a single ESX host will run multiple VMs and here is the throughput graph for multiple VMs.

Again, you can see that in a multi-VM environment, NFS and iSCSI are equal in throughput, dispelling the notion that NFS is not as good in performance as iSCSI.

Oh, you might say that these are just VMs without any OSes or any applications running in them. Next, I want to share with you another performance test conducted by VMware for a Microsoft Exchange environment.

The next statistics are produced from an Exchange Load Generator (popularly known as LoadGen) to simulate the load of 16,000 Exchange users running in 8 VMs. With all things being equal again, you will be surprised after you see these graphs.

The graph above shows the average send mail latency of the 3 protocols (lower is better). On the average, NFS has lower latency than iSCSI, better than what most people might think. Another graph shows the 95th percentile of send mail latency below:

 

Again, you can see that NFS’s latency is lower than iSCSI’s. Interesting, isn’t it?

What about IOPS then? In another test with an 8-hour DoubleHeavy LoadGen simulator, the IOPS graphs for all 3 protocols are shown below:

In the graph above (higher is better), NFS performed reasonably well compared to the other 2 block-based protocols, even outperforming iSCSI in this 8-hour load test. Surprising, huh?

As I have shown, NFS is not inferior compared to the block-based protocols such as iSCSI. In fact, VMware in version 4.1 has improved all 3 storage protocols significantly as mentioned in the VMware paper. The following are quoted in the paper for NFS and iSCSI.

  1. Using storage microbenchmarks, we observe that vSphere 4.1 NFS shows improvements in the range of 12–40% for Reads, and improvements in the range of 32–124% for Writes, over 10GbE.
  2. Using storage microbenchmarks, we observe that vSphere 4.1 Software iSCSI shows improvements in the range of 6–23% for Reads, and improvements in the range of 8–19% for Writes, over 10GbE

The performance improvement for NFS was significant when the network infrastructure was 10GbE. A jump of between 32% and 124% for writes! That’s a whopping figure compared to iSCSI, which ranged from 8% to 19%. Since both protocols were neck and neck in version 4.0, NFS seems to be taking a bigger lead in version 4.1. With the release of VMware version 5.0 a few weeks ago, we shall know the performance of both NFS and iSCSI soon.

To be fair, NFS does take a higher CPU performance hit compared to iSCSI as the graph below shows:

Also note that the load tests are based on NFS version 3. If version 4 were used, I am sure the performance statistics above would reach a whole new plateau.

Therefore, NFS isn’t inferior at all compared to iSCSI, even in a 10GbE environment. We just have to know the facts instead of brushing off NFS.

HDS acquires BlueArc … no surprise

After my early morning exercise routine, I sat down with my laptop hoping to start a new blog entry when some HDS news caught my eye. Here’s one of the reports.

It is of no surprise to me because all along, HDS hardly had a competitive, high-end NAS of their own. Their first Linux-based NAS sucked, and HNAS wasn’t really successful either. But their 5-year OEM with BlueArc gave HDS a strong option to be in the NAS space.

As usual, HDS is as cautious as ever. While the 800-pound EMC has been on a shopping spree for the past 3-4 years, and NetApp has acquired a few (note Engenio, Bycast, Akorri, Onaro) along the way, the only notable acquisition made by HDS was Archivas (news here). That was waaaaaay back in 2007. However, what prompted the HDS reaction was a surprise to me. According to Network Computing, it was IBM who wanted to acquire BlueArc, hence triggering HDS to exercise its first right to fork out the dough for BlueArc.

Why does IBM want to acquire BlueArc? IBM is sliding and lacks storage array technology of its own. Only XIV and StorWiz(e) are worth mentioning because their DS-series and N-series belong to NetApp. Their SONAS is pretty much a patchwork of IBM GPFS servers. In fact, according to the same Network Computing article, IBM has terminated their DataDirect Networks storage back-end and just started sourcing the storage back-end from NetApp. It is good money for NetApp, but bad for IBM. Their story doesn’t gel anymore and their platform portfolio staggers as we speak.

This will definitely prompt IBM’s competitors to sharpen their knives. HP is renewing their artillery with 3PAR, LeftHand and also IBRIX, while Dell is coming out with guns blazing from Compellent, EqualLogic, a bit of Exanet and pretty soon, Ocarina Networks (this is a primary storage deduplication technology). Though Dell lost market share in the last IDC figures, most likely because of lost EMC sales, they seem to be looking good with Compellent and EqualLogic. HP is still renewing, and perhaps when they are done ditching their PC business, they should have more focus on the enterprise. Meanwhile, HDS has been winning market share in the last IDC quarter and doing well with their own VSP and AMS series.

HP and Dell have reloaded, and EMC and NetApp are coming into the market as storage juggernauts. IBM cannot afford to sit quietly. How long is IBM prepared to do that as the world passes them by?

As for HDS, they are pitching their story together. AMS on the low and mid-end, VSP on the mid to high end. BlueArc fits into the NAS and scale-out NAS space. Yup, they are getting there.

We do not hear much about BlueArc from HDS Malaysia, but be prepared to know more about them soon. Wonder how HDS would rename BlueArc? H-BLU? H-ARC?

Funny Microsoft Cloud video – has Microsoft seen a mirror lately?

I am no virtualization expert, but any IT guy would be able to tell you how far ahead VMware is in the virtualization space. (note that I am not talking about the cloud space)

Virtualization is the cornerstone of Cloud Computing and everyone is claiming they are Cloud-this or Cloud-that. So is Microsoft, but when it comes to the virtualization game, Microsoft Hyper-V has much catching up to do, if it ever catches up with VMware.

As they have done in the past to other technologies (think Netscape), they cast an industry-wide suffocation strategy that snuffs the lights out of their competitors. But things have changed and new proponents such as Open Source, Mobile Computing and Cloud Computing are not going to be victims of Microsoft’s asphyxiation strategy (come on, Ballmer, try something new). Hence, when I found this video, I found it really funny. It was likely Microsoft poking fun at itself, without knowing it.

In the words of Mahatma Gandhi,

“First they ignore you,

then they laugh at you,

then they fight you,

then you win”

This is what Microsoft does best … throwing dirt at the competitor. Hmmm….

Got invited to HP Malaysia’s workshop … he he!

No, HP probably didn’t read my blogs and this isn’t a knee-jerk reaction from HP about things I have been writing. OK, I didn’t write about HP because I don’t know much about them. But this came as a coincidence as well as an apt title (my bad for the shameless plug for this entry’s title).

In my previous blog entry, I wrote about HP’s future in the latest IDC Q2 market share figures. I was not too enthusiastic about HP’s storage line-up. Today, my old friend Mr. CC Chung, who is HP’s Country Manager for Storage, had tea with me at Bangsar Shopping Center. We were there to discuss HP’s engagement with SNIA when the topic of HP’s storage came up (obviously). Chung said I lack understanding of HP storage solutions, which, I admit, is very true. And so, my friend kindly invited me to a series of HP Storage Solutions workshops, which I accepted with glee and gratitude. Thank you very much, my friend.

Here’s a screenshot of their upcoming workshops:

 

I am seriously looking forward to the workshops and to learning about the vibes of HP Storage Solutions. Too bad there aren’t workshops for HP 3PAR and HP X9000 IBRIX but I am sure this will be the start of my new friendship with HP.

Incidentally, as I was waiting for Chung, I was reading the HWM Magazine August 2011 issue, and lo and behold, Chung was in the news announcing the HP X9000 IBRIX and X5000 G2 Network Storage System. I couldn’t find the HWM article online but I found the next best thing. A similar article (online, of course) appeared at CIO Asia. And with a nice picture of Mr. CC Chung too!