Competition in storage networking and data management is only going to get fiercer. And there is always going to be the question of open standards APIs versus proprietary APIs, because storage networking and data management vendors constantly have to balance gaining a competitive advantage with proprietary APIs against winning greater market acceptance with open standards APIs.
The flip side is that proprietary APIs could limit and stunt the growth of a solution, even though they offer much better integration and interoperability with complementary solutions. Open standards APIs could turn the entire market into a plain, vanilla one where there is little difference between technology A or B or C or X and, in the long run, could give less incentive for technology innovation.
I am not an API guy. I do not code or do development work on APIs, but I do like APIs (Application Programming Interfaces). I have worked with my fair share of APIs, which can be considered open or proprietary depending on who you talk to. My understanding is that an API becomes more open when many ISVs, developers and industry supporters endorse it and have a valid (and usually profit-related) agenda to make it open.
I can share my experience with some APIs I have worked with in the past, and give my views on some cool present-day APIs related to storage networking and data management.
One API-related project I worked on involved the EMC Centera. I was working with Schlumberger to create a file-level archiving/lifecycle management solution for the GeoFrame seismic files with the EMC Centera. This was back in 2008.
EMC Centera does not present itself as a NAS box (even though, I believe, IDC lumps Centera sales numbers into its worldwide NAS market figures, unless my information is out of date) but rather through ISVs and application-level integration with the EMC Centera API. Here’s a high-level look at how the EMC Centera talks to applications through the API.
Note: EMC Centera can also present a NAS integration interface through the NFS, CIFS, HTTP and FTP protocols, but the customer must use (and may have to purchase) the EMC Centera Universal Access software appliance. This is for applications that do not have the level of development and integration needed to interface with the EMC Centera API.
The Centera API has 5 basic functions which are pretty self-explanatory:
- Store (Write)
- Retrieve (Read)
- Exists
- Delete
- Query
The Centera API is offered as a DLL file for Windows and as a library for HP-UX, Solaris, AIX, and Linux. At one point, EMC claimed that more than 250 application vendors had adopted the API to integrate their solutions with EMC Centera.
To access an object in Centera, the application usually passes the C-Clip Content Address (CA) to its request method to check out and retrieve the BLOB (binary large object) content from Centera.
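To make the content-addressing idea concrete, here is a minimal sketch in Python of a toy in-memory store with the same five operations. It is not the Centera SDK: the class name `ToyContentStore`, the method signatures, the metadata handling and the choice of SHA-256 as the hash are all my own assumptions for illustration.

```python
import hashlib


class ToyContentStore:
    """Toy in-memory content-addressed store (hypothetical, not the Centera SDK)."""

    def __init__(self):
        self._blobs = {}      # content address -> bytes
        self._metadata = {}   # content address -> dict of attributes

    def store(self, blob, **attributes):
        # The address is derived from the content itself, so identical
        # content always maps to the same address. SHA-256 is an assumption.
        address = hashlib.sha256(blob).hexdigest()
        self._blobs[address] = blob
        self._metadata[address] = dict(attributes)
        return address

    def retrieve(self, address):
        return self._blobs[address]

    def exists(self, address):
        return address in self._blobs

    def delete(self, address):
        self._blobs.pop(address, None)
        self._metadata.pop(address, None)

    def query(self, **criteria):
        # Return addresses whose metadata matches every given key/value pair.
        return [
            address
            for address, meta in self._metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


# The application keeps only the returned address, never a file path.
archive = ToyContentStore()
address = archive.store(b"seismic trace data", project="demo")
print(archive.exists(address))
print(archive.retrieve(address))
```

The point of the sketch is that the application never chooses a name or location for the object. It hands over content, gets an address back, and must keep that address to find the content again.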
Of course, Centera’s technology (acquired from a Belgian company called FilePool in 2001) became a success and created a brand new fixed content market for CAS (Content Addressable Storage). The demand for data immutability eventually drove SNIA to standardize an open API called XAM (eXtensible Access Method).
But things change over time, and eventually the writing was on the wall for SNIA’s XAM. With EMC Centera being the only prominent fixed content storage platform (although there were others, like HP RISS and IBM DR550), the XAM API was effectively set on a course to oblivion. The Register published a send-off article for XAM, marking the end of an open standard API for lack of support.
The next obvious storage-related API would be VMware’s vSphere Storage APIs for Array Integration (VAAI). VAAI, in my opinion, is like a golden sceptre in VMware’s royal court that all loyal subjects have to kiss up to. It is the symbol of VMware’s power and dominance over all the storage vendors (except maybe EMC), and yet it is revered with gratitude as well. As we are beginning to realize with each newer version of vSphere, the power of VAAI is growing and could reduce the storage arrays to insignificant JBODs behind VMFS. I could be wrong in my views, and I hope I am.
VAAI is the API that allows storage array vendors to integrate with the ESXi hypervisor. VAAI offloads some I/O and storage-related activities to the storage array itself. In vSphere 4.1, when VAAI was first announced, there were a few key “features” that mattered. They were:
- Full Copy – the ability to request a data copy operation within the storage array and send the acknowledgment back to ESX once completed. This feature is useful for tasks such as creating storage for the VMs, data migration and LUN expansion.
- Block Zero – the ability to have the storage array write the same data block (typically zeros) repeatedly, which is useful in cases such as creating fault tolerant VMs and thin provisioning.
- Hardware Assisted Locking (also known as Atomic Test & Set) – the ability to lock only the blocks being written to in the storage, rather than the entire shared volume as in ESX 3.5. This was a significant improvement over the T10 SCSI reservation method and gave ESX a leg up in scalability and performance.
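The difference between locking a whole volume and locking a single block is easier to see in code. Below is a rough Python sketch of the atomic test-and-set idea. The `ToyLun` class and its method are hypothetical names of mine, and a `threading.Lock` stands in for the atomicity that the array would provide in hardware; this is not how ESX or any array actually implements it.

```python
import threading


class ToyLun:
    """Toy model of a shared LUN with one lock record per block (hypothetical)."""

    def __init__(self, num_blocks):
        self._owners = [None] * num_blocks   # None means "unlocked"
        self._atomic = threading.Lock()      # stands in for the array's atomicity

    def atomic_test_and_set(self, block, expected, new):
        # Compare the lock record with `expected` and, only if it matches,
        # replace it with `new`, all as one indivisible step.
        with self._atomic:
            if self._owners[block] != expected:
                return False
            self._owners[block] = new
            return True


lun = ToyLun(num_blocks=8)

# Two hosts lock different blocks, so neither one blocks the other.
assert lun.atomic_test_and_set(block=0, expected=None, new="esx-a")
assert lun.atomic_test_and_set(block=1, expected=None, new="esx-b")

# A second host contending for block 0 fails and has to retry later.
assert not lun.atomic_test_and_set(block=0, expected=None, new="esx-b")

# The owner releases its lock the same way.
assert lun.atomic_test_and_set(block=0, expected="esx-a", new=None)
```

With a volume-wide reservation, the second host in this example would have been shut out of the whole LUN even though it wanted a different block.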
In version 5.0 and the latest version, 5.1, these “features” or primitives were further enhanced, along with new “features” associated with storage. These include:
- Thin Provision Stun – the ability to suspend (stun) a VM when the thin provisioned storage is 100% full and resume VM operations after the storage capacity issue has been rectified. This prevents the “no space” error which used to cause VMs and their applications to crash
- Space Reclamation – the ability to unmap the blocks that belonged to data deleted from VMFS and recover them as unused blocks.
- NFS enhancements such as full file clone, native snapshot support, extended statistics and reserve space.
- Space Efficient Sparse Virtual Disks – a new virtual disk type that can reclaim disk space within the guest OS
- Host resiliency in All Paths Down (APD) situations and when there is Permanent Device Loss (PDL)
- SSD Monitoring through smartd daemon. A really cool feature is the media wearout indicator that helps ESX recognize the durability and longevity of operational SSDs.
- Much, much more
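Thin Provision Stun and Space Reclamation in the list above work hand in hand, and a small model shows why. The following is only a sketch under my own assumptions: `ToyThinPool`, `ToyVm` and `OutOfSpace` are made-up names, and the logic is a simplification that does not reflect VMware's actual implementation.

```python
class OutOfSpace(Exception):
    """Raised when the toy thin pool has no free physical blocks left."""


class ToyThinPool:
    """Thin pool: logical blocks consume physical space only once written."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.mapped = set()   # logical block numbers that consume real space

    def write(self, block):
        if block not in self.mapped and len(self.mapped) >= self.physical_blocks:
            raise OutOfSpace(block)
        self.mapped.add(block)

    def unmap(self, blocks):
        # Space reclamation: these blocks no longer hold live data.
        self.mapped -= set(blocks)


class ToyVm:
    def __init__(self, pool):
        self.pool = pool
        self.stunned = False
        self.pending = None

    def write(self, block):
        try:
            self.pool.write(block)
        except OutOfSpace:
            # Pause the VM and remember the write instead of failing the guest.
            self.stunned = True
            self.pending = block

    def resume(self):
        # Retry the pending write once space has been freed.
        if not self.stunned:
            return
        block = self.pending
        self.pending = None
        self.stunned = False
        self.write(block)


pool = ToyThinPool(physical_blocks=2)
vm = ToyVm(pool)
vm.write(0)
vm.write(1)
vm.write(2)        # the pool is full, so the VM is stunned
assert vm.stunned
pool.unmap([0])    # reclaim a block that held deleted data
vm.resume()
assert not vm.stunned and 2 in pool.mapped
```

If the pool is still full when `resume()` is called, the write fails again and the VM simply goes back to being stunned, which is the behaviour you want instead of a crash.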
The SSD monitoring CLI command output is shown below:
There are also other vSphere storage-related APIs such as
- VADP (vStorage API for Data Protection) that replaced VCB (VMware Consolidated Backup)
- VAMP (vStorage API for Multipathing) which is coordinated in the VMkernel Pluggable Storage Architecture (PSA)
- VASRM (vStorage API for Site Recovery Manager)
- VASA (vStorage API for Storage Awareness)
but we are not going to discuss them here today.
WAIT! There’s more coming in future releases of VMware. Here’s a glimpse of some:
- Storage Policy-Based Management (SPBM) – the ability to virtualize any storage based on performance, availability and protection characteristics
- Virtual Volumes – the ability to store .vmdk files natively in the storage array, so that each VM can be snapshotted individually at the storage level
- Virtual Flash – the ability to share server-side flash of several ESX hosts as a pool
- Virtual SAN – the previous vSphere Virtual Storage Appliance (VSA) on steroids. There will be performance and availability enhancements and the ability to scale out to up to 32 ESX hosts, each supporting 36 local, direct-attached disk drives. Wow, 32 x 36 is 1,152 drives, giving the scale-out ESX hosts the freedom to access a petabyte-size storage pool.
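For anyone who wants to check the Virtual SAN arithmetic, here is a quick back-of-the-envelope calculation in Python. The drive size is not part of what was announced; the 1 TB per drive used here is purely my assumption, and the result is raw capacity before any protection overhead.

```python
def raw_pool_capacity(hosts, drives_per_host, drive_tb):
    """Return (total drives, raw capacity in PB) using decimal units."""
    total_drives = hosts * drives_per_host
    raw_pb = total_drives * drive_tb / 1000
    return total_drives, raw_pb


# 32 hosts x 36 drives each; the 1 TB drive size is an assumption.
drives, petabytes = raw_pool_capacity(hosts=32, drives_per_host=36, drive_tb=1.0)
print(drives)      # 1152
print(petabytes)   # 1.152
```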
Virtual SAN is going to be the killer. A year ago, I said just that. VMware is a storage killer. A silent one but one that is slowly coming out into the open.
In the end, of course, the choice between proprietary APIs and open standards APIs does not have to be 1 or 0, black or white. A technology vendor can always choose where to sit in the grey area between the two poles, and how strongly to lean either way. A vendor can adopt open APIs on one end and still encourage adoption of its proprietary APIs if it has enough market leverage to do so.
We have seen one open standards API, the XAM API that grew out of EMC Centera, have its life snuffed out, and another, VAAI, grow in dominance. Like a flame that needs oxygen, APIs need the support of the ecosystem around them, be they open standards or proprietary.
NOTE: I have been meaning to write another entry for the Symantec NetBackup OST (Open STorage) API here, but I guess this entry has become waaaaaay too long. I will postpone that till another day.
Hi James
Absolutely true. The challenge is to remain “perceived” as open while maintaining the proprietary part of the interfaces. Just look at Cisco. We all know EtherChannel trunking is pretty much proprietary, but they have been marketing it as open for years.
Thanks for your insights.
/Chin-Fah