APIs that stick in Storage

Competition in storage networking and data management will only get fiercer. And there will always be the question of open standards APIs versus proprietary APIs, because storage networking and data management technologies constantly have to balance gaining a competitive advantage with proprietary APIs against getting greater market acceptance with open standards APIs.

The flip side is that proprietary APIs could limit and stunt the growth of the solution, even though they offer much better integration and interoperability with complementary solutions. Open standards APIs could turn the entire market into a plain, vanilla one where there is little difference between technology A or B or C or X and, in the long run, could give less incentive for technology innovation.

I am not an API guy. I do not code or do development work on APIs, but I do like APIs (Application Programming Interfaces). I have worked with my fair share of APIs, which can be considered open or proprietary depending on who you talk to. My understanding is that an API might be more open if there are many ISVs, developers and industry supporters endorsing it, each with a valid (and usually profit-related) agenda to make the API open.

I can share some work experience with APIs I have worked with in the past, and give my views on some present cool APIs that are related to storage networking and data management.

One of the API-related projects I did was with the EMC Centera. I was working with Schlumberger to create a file-level archiving/lifecycle management solution for the GeoFrame seismic files with the EMC Centera. This was back in 2008.

EMC Centera does not present itself as a NAS box (even though, I believe, IDC lumps Centera sales numbers into its worldwide NAS market figures, unless my information is out of date) but rather works through ISVs and application-level integration with the EMC Centera API. Here’s a high-level look at how the EMC Centera talks to applications through the API.

Note: EMC Centera can also present a NAS integration interface through the NFS, CIFS, HTTP and FTP protocols, but the customer must bring in (and may have to purchase) the EMC Centera Universal Access software appliance. This is for applications that do not have the level of development and integration needed to interface with the EMC Centera API.


Can VSA help NetApp?

Almost a year ago, I had an interview with VMware Malaysia for a Senior SE position. They wanted a pre-sales guy who knew Oil & Gas and had a strong technology background. I had a strong storage background, and I had been involved in Oil & Gas upstream since my NetApp and EMC days.

I thought I was their guy, having been led to believe so (mostly by my own self-belief). I didn’t get the job, and I never found out why I lost the opportunity. But I remember well that I brashly mentioned to the Australian interviewer over the phone that VMware could become the next “storage technology” company. At that time, VMware had just launched vSphere 5.0 and, along with it, their vSphere Storage Appliance (VSA). This was a turning point for the virtual storage appliance space.

My friend, whose company is a VMware partner, said that the list price for the vSphere VSA was USD5,000.00 a pop. The price wasn’t too bad for small and medium businesses in Malaysia, excluding the hardware and storage capacity cost. But what intrigued me back then was how disruptive this virtual storage appliance concept was.

VMware could potentially take large JBOD farms, each attached to a minimum of 3 physical ESXi nodes, and build shared storage using the vSphere Storage Appliance (VSA). Who needs shared iSCSI or Fibre Channel LUNs anymore if VMware had its way?

But VMware still pretty much depended on its storage partners, especially its master, EMC, and so I believe VMware held back from pushing VSA to allow its storage partner ecosystem to thrive. And for that reason, vSphere Storage APIs such as VAAI and VASA have been developed since vSphere 4 to deepen the integration of these storage vendors’ technologies into the VMware world.

But of course, long before VMware’s VSA venture, HP LeftHand already had one on the cards. The LeftHand Virtual SAN Appliance (also VSA) was already getting rave comments from their partners and customers, who were impressed with how brilliantly it showcased the HP LeftHand storage solution and technology. Eventually, HP recognized the prowess of the LeftHand VSA and started marketing it as HP StoreVirtual VSA. I don’t hear much about the HP LeftHand (since renamed P4000) VSA nowadays, seeing that the HP guys in Malaysia prefer to pitch the physical storage rather than the virtual storage software.

NetApp, back in Q1 of 2012, also decided to go down the virtual storage appliance path, announcing the ONTAP-v to the world here. It was initially resold through the Fujitsu partnership, but the Q1 announcement expanded the ONTAP-v to a larger set of server vendors as shown below. The key requirement is a qualified RAID controller in each server vendor’s hardware.


Is Dell Fluid Enough?

Dell made a huge splash 2 weeks ago in London at their inaugural Dell Storage Forum. They dubbed their storage and management lineup “Fluid Data Architecture”, offering customers the ability to quickly adapt and automate their business when it comes to storage networking and, more importantly, data management.

At the London show, they showcased several key innovations and product developments. Here’s a list of their jewels:

  • DR4000 – an inline, content optimized backup deduplication appliance (based on the acquired technology of Ocarina Networks)
  • Compellent Storage Center 6.0 – a major software release
  • Compellent key technology integration with VMware
  • Optimized object storage for Microsoft SharePoint with the DX6000 Object Storage Platform – DX6000 is OEMed from Caringo
  • Broader support for Dell Force10, PowerConnect and their partner Brocade

The technology from Ocarina Networks is fantastic technology and I have always admired Ocarina. I have written about Ocarina in the past in my previous blog. But I was a bit perplexed why Dell chose to enter the secondary dedupe market with a backup dedupe appliance in the DR4000. They are already a latecomer into the secondary deduplication game and I thought HP was already late with their StoreOnce.

They could have used Ocarina’s technology to trailblaze the primary deduplication market. In my previous blog, I mentioned that primary deduplication hasn’t really taken off in a big way, and Dell, with the technology from Ocarina, could set the standard and establish themselves as the leader of the primary deduplication market space. I was disappointed that they didn’t, at least not yet.

The Compellent Storage Center 6.0 release was a major one and, for better or for worse, it coincided with the departure of Phil Soran, the founder and CEO of Compellent. Phil felt that he could let his baby go, and Dell is certainly making the best of what they can do with Compellent as their flagship data storage product.

The major release included 64-bit support for greater performance and scalability, and also included several key VMware technologies that other vendors already have. The technologies included:

  • VMware vStorage API for Array Integration (VAAI)
  • Storage Replication Adapter plug-in for VMware Site Recovery Manager (SRM)
  • vSphere 5 client plug-in
  • Integration of Enterprise Manager and vSphere

Other storage-related releases (I am not going to talk about Force10 or their PowerConnect solutions here) included Dell offering 16Gbps Fibre Channel switches from Brocade and also their DX6000 Object Storage Platform optimized for Microsoft SharePoint.

I think it is fantastic that Dell is adapting and evolving into a business-oriented, enterprise solution provider, and their acquisitions in the past 3 years – EqualLogic, Exanet, Ocarina Networks, Force10 and Compellent – prove that Dell aims to take market share in the storage networking and data management market. They have key initiatives with CommVault, Symantec, VMware and Microsoft as well. And Michael Dell is becoming quite a celebrity lately, giving Dell the boost it needs to battle in this market.

But the question is, “Is their Fluid Data Architecture fluid enough?” If I were a customer, would I bite?

As a customer, I look for completeness in the total solution, and I cannot fault Dell for having most of the pieces in the solution stack. They have networking in their PowerConnect, Force10 and Brocade. They have SAN in both Compellent and EqualLogic but their unified storage story is still a bit lacking. That’s because we have not seen Dell’s NAS storage yet. Exanet was a scale-out NAS and we have seen little rah-rah about this product.

From a data management perspective, their data protection story gels well with the CommVault and Symantec partnerships, but I feel that Dell sales and SEs (at least in Malaysia) spend too much time touting the Compellent Automated Storage Tiering. I have spoken to folks who have listened to the Dell guys’ pitches and they are too one-dimensional. It’s always about storage tiering and little else about other Compellent technology.

At this point in time, the story that Dell sells here in Malaysia is still disjointed, but they are getting better. And eventually, the fluidity (pun intended ;-)) of their Fluid Data Architecture will improve.

How will Dell fare in 2012? They have taken a beating in the past 2 quarters of IDC’s storage market tracker, losing some percentage points in market share, but I think Dell will continue to tinker to get it right.

2012 will be their watershed year.

Why VAAI?

This is Part 2 of my previous blog about VAAI (vStorage API for Array Integration), with more details about VAAI. VAAI offloads some of the I/O-related functions to the VAAI-enabled storage array, hence giving the hypervisor more compute and memory resources to do its other functions. And the storage array, upon receiving the VAAI command, will execute whatever is required of it.

Why is VAAI important? What does it do that makes it so useful and important to the hypervisor?

VAAI is about a set of new SCSI commands. And there are 3 important ones:

  • WRITE-SAME
  • XSET (XCOPY)
  • ATS

What exactly do these SCSI commands do?

WRITE-SAME is a SCSI command that instructs the storage array to zero the virtual VMDK disks or VMFS LUNs. This usually happens when a guest OS requires a brand new set of virtual disks and the virtual disks have to be initialized. In the past (before VAAI), the hypervisor had to repetitively send 0s to the storage to perform the disk zeroing. As shown in the diagram below, you can see that each zero operation is sent from the hypervisor to the storage.

This back-and-forth of sending 0s and acknowledgments between the hypervisor and the storage is not efficient. With VAAI, the WRITE-SAME command is sent from the hypervisor to the storage array and the storage array does the zeroing on the disks and LUNs. The hypervisor does not intervene in the process until it gets an acknowledgment of its completion. See the diagram below of how VAAI helps in bulk-zeroing of disks and LUNs in the storage array.

The animated GIFs are taken from Luke Reed’s blog, a fantastic read.
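
To put rough numbers on the zeroing offload described above, here is a toy Python model. It is only a sketch under my own assumptions: `ToyArray`, `zero_from_host` and `zero_offloaded` are hypothetical names, not a real VAAI or SCSI interface, and a real array would cap how many blocks one WRITE-SAME can cover. All it does is count how many commands and how many blocks of payload the "hypervisor" has to push to the "array".

```python
from dataclasses import dataclass, field


@dataclass
class ToyArray:
    """Hypothetical stand-in for a storage array (not a real VAAI interface)."""
    num_blocks: int
    blocks: list = field(default_factory=list)
    commands_received: int = 0
    payload_blocks_received: int = 0

    def __post_init__(self):
        # Start with non-zero data so the zeroing is observable.
        self.blocks = [0xFF] * self.num_blocks

    def write(self, lba: int, data: list) -> None:
        """Plain write: the host ships every block of payload."""
        self.commands_received += 1
        self.payload_blocks_received += len(data)
        self.blocks[lba:lba + len(data)] = data

    def write_same(self, lba: int, count: int, pattern: int) -> None:
        """Offloaded write: one pattern block, repeated by the array itself."""
        self.commands_received += 1
        self.payload_blocks_received += 1
        self.blocks[lba:lba + count] = [pattern] * count


def zero_from_host(array: ToyArray, blocks_per_io: int) -> None:
    """Pre-VAAI behaviour: the host sends the zeroes itself, chunk by chunk."""
    for lba in range(0, array.num_blocks, blocks_per_io):
        n = min(blocks_per_io, array.num_blocks - lba)
        array.write(lba, [0] * n)


def zero_offloaded(array: ToyArray) -> None:
    """VAAI-style behaviour: one command, the array does the repetition."""
    array.write_same(0, array.num_blocks, 0)
```

With a 1024-block toy disk and 8 blocks per I/O, `zero_from_host` costs 128 commands and 1024 blocks of payload, while `zero_offloaded` costs 1 command and 1 block. The end state of the disk is the same in both cases.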

The second VAAI SCSI command is XSET, and it performs hardware-accelerated full copy. This command is more commonly known as XCOPY (the SCSI EXTENDED COPY command), and it offloads the process of copying the blocks of data that make up a VMDK file. Such copying operations occur when the hypervisor is doing things like VM cloning, Storage vMotion or VM creation from templates (bulk copying to create many similar VMs in one go).

Again with the courtesy of Luke Reed’s animated GIFs, the diagram below shows a full copy without VAAI

and after implementing VAAI, where the full, bulk copy operation is offloaded to the storage array to execute.
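
The same idea can be sketched for full copy. Again, this is a toy model with hypothetical names (`ToyCopyArray`, `copy_via_host`, `copy_offloaded`) that I made up for illustration, not how any real array or the XCOPY command is implemented. It just counts how many data blocks cross the link between the hypervisor and the array when cloning a "template" LUN.

```python
class ToyCopyArray:
    """Hypothetical array holding named LUNs as Python lists of blocks."""

    def __init__(self, luns):
        self.luns = {name: list(blocks) for name, blocks in luns.items()}
        self.blocks_over_wire = 0  # data blocks crossing the host<->array link

    def read(self, lun, lba, count):
        data = self.luns[lun][lba:lba + count]
        self.blocks_over_wire += len(data)
        return data

    def write(self, lun, lba, data):
        self.blocks_over_wire += len(data)
        self.luns[lun][lba:lba + len(data)] = data

    def extended_copy(self, src, src_lba, dst, dst_lba, count):
        # Only a small copy descriptor would travel from the host; the data
        # itself moves inside the array, so the wire counter is untouched.
        chunk = self.luns[src][src_lba:src_lba + count]
        self.luns[dst][dst_lba:dst_lba + count] = chunk


def copy_via_host(array, src, dst, total_blocks, chunk=16):
    """Pre-VAAI behaviour: every block is read to the host, then written back."""
    for lba in range(0, total_blocks, chunk):
        n = min(chunk, total_blocks - lba)
        array.write(dst, lba, array.read(src, lba, n))


def copy_offloaded(array, src, dst, total_blocks):
    """VAAI-style behaviour: the host describes the copy, the array performs it."""
    array.extended_copy(src, 0, dst, 0, total_blocks)
```

Cloning a 100-block template through the host moves 200 blocks over the wire (100 read plus 100 written), while the offloaded copy moves none in this model. That is the whole point of letting the array do the bulk copy.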

The third and last SCSI command of VAAI is ATS, or hardware-assisted locking. ATS stands for Atomic Test and Set, and the command allows the hypervisor to lock only the required blocks rather than the entire LUN.

Without VAAI, the entire LUN could be temporarily locked by the numerous VMFS operations of one single hypervisor, preventing other hypervisors from accessing the shared LUN. The ATS API offloads lock management from the host to the storage array and keeps the LUN available by locking only the required blocks, not the entire VMFS file system. Please see the pleasing diagrams below of

(without VAAI ATS)

(with VAAI ATS)
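
The difference between locking the whole LUN and locking only the required blocks can also be sketched in a few lines of Python. This is a loose illustration under my own assumptions: `ToyVmfsLun` and its methods are hypothetical, the lock records are just list slots, and the real atomicity is provided by the array itself (this single-threaded toy gets it for free).

```python
class ToyVmfsLun:
    """Hypothetical shared LUN with one lock record per file (e.g. per VMDK)."""

    def __init__(self, num_lock_records):
        self.lock_records = [None] * num_lock_records  # owner of each file lock
        self.reserved_by = None                        # whole-LUN reservation holder

    # Pre-VAAI style: reserve the entire LUN before touching any lock record.
    def reserve(self, host):
        if self.reserved_by not in (None, host):
            return False  # another host holds the LUN; everyone else waits
        self.reserved_by = host
        return True

    def release(self, host):
        if self.reserved_by == host:
            self.reserved_by = None

    # ATS style: test one lock record and set it only if it matches.
    def compare_and_write(self, index, expected, new):
        if self.lock_records[index] != expected:
            return False  # somebody else changed this record first
        self.lock_records[index] = new
        return True
```

With the reservation approach, `lun.reserve("esx1")` succeeds and `lun.reserve("esx2")` then fails even if the second host only wants a completely different file. With the ATS approach, `lun.compare_and_write(0, None, "esx1")` and `lun.compare_and_write(1, None, "esx2")` both succeed, and only a second attempt on the same record is refused.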

And if you want to see the VAAI Hardware Accelerated Full Copy (aka XCOPY) in action, here’s a little video showing how it is done in an EMC environment.

The primary significance and noticeable benefit is definitely performance. The secondary benefit, though not so obvious, is allowing VMware and its hypervisor to scale because it does not get bogged down by some of the I/O functions that it is not meant to do.

There were some new additions to VAAI in vSphere 5.0. According to its FAQ, ESXi 5.0 includes support for NAS Hardware Acceleration with the following primitives:

  • Full File Clone – Like the Full Copy VAAI primitive provided for block arrays, this Full File Clone primitive enables virtual disks to be cloned by the NAS device.
  • Native Snapshot Support – Allows creation of virtual machine snapshots to be offloaded to the array.
  • Extended Statistics – Enables visibility to space usage on NAS datastores and is useful for Thin Provisioning.
  • Reserve Space – Enables creation of thick virtual disk files on NAS.

So, there you have it folks. Why VAAI? Here’s why.

VAAI to go!

First of all, let me apologize. I am guilty of not updating my blogs as regularly as I did in the past. Things got a bit crazy after Christmas and I had to juggle several things that demanded more of my attention, but I am confident things will sort themselves out soon enough.

Today’s topic is about VMware’s VAAI (vSphere vStorage API for Array Integration). This feature was announced more than 3 years ago but was only introduced in vSphere 4.1 in July 2010, and now comes with newer enhancements in the latest release, vSphere 5.0.

What is this VAAI and what does this mean from a storage perspective?

When VMware came into prominence around the version 3.0/3.5 timeframe, the whole world revolved around the ESX hypervisor. It tried to do everything on its own, in its own proprietary way. Given its nascent existence then, ESX had to do what it had to do and control everything within its hypervisor universe. Yes, it was a good move then and it did what it was supposed to do. This was back when server virtualization was in its infancy, and resource requirements were less demanding.

Hence when VMware wanted to initialize VMs, create VMDK files on the datastore, create clones or snapshots, or even execute VMotion and Storage VMotion, it tended to do so at the hypervisor level. For example, when creating virtual disks with VMFS, most of the commands to initialize the disks were issued at the VMFS level. Zeroing the virtual disks meant sending zeroing commands to the actual physical disks on the shared storage. And this would go back and forth, taxing the CPU cycles and memory at the hypervisor layer and sending wasteful and unnecessary zeroes over the network to the storage array. This was inefficient and wasteful, and it degraded performance tremendously, especially at the hypervisor layer (compute and memory).

There are also other operations, such as virtual disk locking, that lock up the entire LUN housing several datastores. Again, not good.

But VMware took off like a rocket, and quickly established itself as a Tier 1, enterprise server virtualization solution addressing the highest demands of the enterprise. It is also defining the future of Cloud Computing, building exorbitant requirements as it pushes forward. And VMware began to realize that if the hypervisor is to scale, it needs to leave the I/O operations to the “experts”, the “experts” here being the respective storage arrays themselves.

So, in version 4.1, VAAI (vStorage API for Array Integration) was introduced as an API suite, following 3 other earlier APIs – vStorage API for Site Recovery Manager (SRM), vStorage API for Data Protection and vStorage API for Multipathing.

In a nutshell, as I have mentioned before, VAAI offloads I/O and storage related operations to the VAAI-capable storage array (leave it to the experts) as shown in the diagram below:

 

Of course, the storage vendors themselves have to rework their array OS layer to integrate with the VAAI API. You can say that the VAAI APIs are “hooks” that enhance storage connectivity and communications with vSphere’s hypervisor. But then again, if you look at it from the other angle, vSphere needs the storage vendors more in order for its universe to scale. Good thing VMware has a big, big market share. Imagine if there were no takers for the VAAI APIs. That would be a strange predicament indeed.

What is the big deal that we get from VAAI? The significant and noticeable benefit is increased performance. By offloading the I/O functionality and operations to the storage array itself, the hypervisor and its compute and memory resources are not bogged down, resulting in higher performance and better response times when serving its VMs and other VM operations.

I am going off to another meeting and I shall write about VAAI in more detail later. Until the next entry, adios and have a great year ahead.

VMware – the silent storage killer

When VMware 5.0 was launched last month, I heard the feature called Virtual Storage Appliance (VSA) was finally out and is now being offered as an SMB/SME “storage” solution. In my mind, alarm bells were ringing because in its own stealthy manner, VMware had just become a storage player.

What VMware is offering is “Hey! If you don’t have money to buy your enterprise storage array, don’t worry. Make your own shared storage with our very own VMware VSA”. VSA utilizes the internal disks of the ESX/ESXi host as its shared storage.

VSA is nothing new. For years, LeftHand Networks had one for its engineers to do demos and show the functionality of their solution. EMC had it too, and recently I found out that NetApp has its own VSA, but only resells it through its partner, Fujitsu. I am not 100% sure about the NetApp thing and I need a NetApp guy to verify this.

Smaller but not insignificant players, such as Nutanix, Nexenta and Tintri, are already offering their own versions and implementations of VSA to their customers, each with its own uniqueness and differences. With the release of the VMware VSA into the open, we shall see all the big storage players offering their VSAs to VMware, like natives offering sacrifices to the VMware God. Or perhaps, it has already begun. It is à la Nexus 1000v all over again.

VMware has become a huge juggernaut and it is merely using its advantage to consolidate the storage component under its control. When VMware version 4.0 came out, the vStorage API was introduced, with VAAI (vStorage API for Array Integration) following in version 4.1. VAAI was created to enhance the storage experience by offloading specific storage operations to the native features of the supported storage platform. That’s all I know about VAAI at this moment, but with this feature, the storage array is tightly integrating its platform with VMware, or should I say … quietly being ensnared by VMware’s tentacles of doom! (Evil laugh in the background! Mua ha ha ha ….!)

At the recently concluded VMworld, this storage story was unfurled even more to the world. VASA (vStorage API for Storage Awareness) was recently announced, and EMC’s COO Pat Gelsinger spoke about the tighter integration (that word again!) that blurs the administration domains of the VMware admin and the storage admin. Below is a video of Pat Gelsinger talking about VASA (this is a long, 55-minute video; click only if you have the time).

Mind you, the entire vStorage API is still evolving as VMware 5.0 rolls out, but here’s the thing. VMware has come out and said that the storage world of LUNs, RAID groups and mount points is a level below what the VMware admin should be concerned about. VMware admins handle their storage at the VM level or as VMDKs and therefore, anything below that is of little significance to them. Again, you can see that VMware is using its muscle to say “If you guys want to play, you have to play by my rules”.

So, some new announcements for storage came out of VMworld, such as Capacity Pools, I/O Multiplexer and Storage DRS (Storage Distributed Resource Scheduler), and also an enhanced (probably more storage-resilient) version of SRM (Site Recovery Manager). All these are being managed at a level above the traditional storage admin level, and VMware has said that the VMware admin would be able to carve out a VM volume with its own set of default storage properties, defined snapshot retentions, replication and perhaps even compression and deduplication. But all these will be happening at the VM volume or VMDK level, not a level below that.

Details are still sketchy at this point in time and we probably won’t see these reach GA until VMware version 6.0. But the inertia has been quietly broken and the VMware storage momentum will gain strength as time passes. We could see VMware needing just JBOD (just a bunch of disks) because it has its own enterprise storage features through its vStorage APIs or its future storage specifications. We have seen it happening in VSA, with VMware offering its own storage.

From the same news, what surprised me was the quote shown below.

The presenters said VMware developed the APIs with EMC, NetApp, Dell,
IBM and Hewlett-Packard, but they began the session with a disclaimer
that none of those vendors has committed to support the APIs in
their arrays.

Why the hell would EMC, NetApp, Dell, IBM and HP do something like that?!! Don’t they know that this could contribute to their insignificance in the future?

I am still perplexed, but as the whole thing is still evolving, VMware seems to be the only obvious winner here.