About cfheoh

I am a technology blogger with 30 years of IT experience. I write heavily on technologies related to storage networking and data management because those are my areas of interest and expertise. I introduce technologies with the objective of getting readers to know the facts and use that knowledge to cut through the marketing hype, FUD (fear, uncertainty and doubt) and other fancy stuff. Only then will there be progress. I am involved in SNIA (Storage Networking Industry Association) and between 2013 and 2015, I was the SNIA South Asia & SNIA Malaysia non-voting representative to the SNIA Technical Council. I am currently employed at iXsystems as their General Manager for Asia Pacific Japan.

The future is intelligent objects

We are used to the block-based approach and also the file-based approach to data. The 2 diagrams below show the basics of how we access data on the storage device in both the block-based and the file-based approach.

 

For block-based, the storage of the blocks is merely in arrays of unrelated contiguous blocks. For file-based, as seen below,

 

there is another layer of abstraction, and this is called the file system. But if you look at both diagrams above, there are some seemingly random numbers in light blue, and they represent the “containers” that the storage device, the hard disk drive, exports to the file system or the application that is accessing it. This is usually the LBA (Logical Block Addressing), which is basically a scheme that defines the locations on the hard disk drive. The LBA tells where the data is stored. For more information about LBA, check out this Wikipedia definition. But the whole idea is that LBA is dumb. It is pretty much static and exported to file systems and applications so that these guys can do something with it.
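To make the "LBA is dumb" point concrete, here is a minimal sketch of what an LBA really is: a block number that maps linearly to a byte position, and nothing more. The 512-byte block size and the function names are assumptions for illustration (many drives use 4,096-byte blocks), not something taken from a particular drive or driver.

```python
SECTOR_SIZE = 512  # assumed logical block size in bytes (illustrative only)


def lba_to_byte_offset(lba: int, sector_size: int = SECTOR_SIZE) -> int:
    """Return the byte position where a logical block starts."""
    return lba * sector_size


def byte_range_to_lbas(offset: int, length: int,
                       sector_size: int = SECTOR_SIZE) -> range:
    """Return the logical blocks touched by a byte range (length must be > 0)."""
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return range(first, last + 1)


if __name__ == "__main__":
    print(lba_to_byte_offset(2048))             # start of block 2048
    print(list(byte_range_to_lbas(1000, 100)))  # blocks holding bytes 1000..1099
```

Notice that the address carries no information about what the data is or who owns it. All of that knowledge lives in the file system or application above.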

Something has been brewing in the background since 1994, and it is one of the many efforts to make storage devices intelligent. This new object-based interface came out of a research project at Carnegie Mellon University (CMU). Initially, it was known as Network Attached Secure Disk (NASD) but it eventually made its way to a working group in SNIA, which developed it into an ANSI T10 INCITS standard. ANSI T10 is the guardian of all SCSI standards. The result is called Object Storage Device (OSD). The SCSI architecture diagram below shows the layer where OSD resides.

 

The motivation for this is simple: to make the storage devices of today do more computational work, in particular I/O, relieving the hosts and the local systems to concentrate on other computational processing work. At the same time, there must be some level of interactivity and management between the storage objects and the computational hosts.

In the diagram below which compares both block-based and OSD,

 

you can see that the file system management interface, which sits in the kernel space of the local host/system, is separated out and replaced by the OSD management interface at the storage device.

What does this all mean? This means that using LBA type of addressing that we are familiar with in the block-based and file-based storage is no longer the way to go, because as I mentioned before, LBA is dumb.

OSD, in some way, replaces the LBA with OIDs (Object IDs). The existing local system and/or its file system will interact with the storage devices using OIDs, and each OID links to its respective object in storage. The object carries a lot of metadata that describes it, giving the object its intelligence and management capability.

 

 

The prominence of the metadata in the OSD would mean that we can build much more intelligent systems in the future. The OIDs and the objects can be grouped together in a flat design or can be organized and categorized in a virtual, hierarchical model.
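As a rough illustration of the idea (not the T10 OSD command set), the sketch below models a flat object store in which every object is addressed by an OID and carries its own metadata. The class names, the metadata keys and the `find` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StorageObject:
    oid: int
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """A toy flat namespace of objects keyed by OID."""

    def __init__(self) -> None:
        self._objects = {}
        self._next_oid = 1

    def create(self, data: bytes, **metadata) -> int:
        """Store data with its metadata and hand back the new OID."""
        oid = self._next_oid
        self._next_oid += 1
        self._objects[oid] = StorageObject(oid, data, dict(metadata))
        return oid

    def read(self, oid: int) -> bytes:
        return self._objects[oid].data

    def find(self, **criteria) -> list:
        """Return OIDs whose metadata matches every given key/value pair."""
        return [
            obj.oid
            for obj in self._objects.values()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


if __name__ == "__main__":
    store = ObjectStore()
    a = store.create(b"trace data", owner="alice", kind="log")
    b = store.create(b"more data", owner="bob", kind="log")
    print(store.find(kind="log"), store.find(owner="bob"))
```

The namespace here is flat, but because the metadata travels with the object, a hierarchical or categorized view can be layered on top simply by querying on metadata keys.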

 

Object storage is an intelligent evolution of disk drives that can store and serve objects rather than simply place data on tracks and sectors. And it can bring the following benefits:

  • Intelligent space management in the storage layer
  • Data aware pre-fetching and caching
  • Robust shared access by multiple clients
  • Scalable performance using off-loaded data path
  • Reliable security

Several vendors such as EMC and NetApp are already supporting OSD.

No more Huawei-Symantec

Huawei-Symantec is no more. Last night, Huawei, the Chinese telecom giant, bought Symantec’s remaining 49% stake in the joint venture for USD530 million.

The joint venture was initially set up in 2007 to focus on 3S – Server, Storage and Security. With the consolidation under one owner and one name, Huawei will continue to focus on the 3S. And since Huawei is in the telco business as well, a lot of the 3S solutions will come into play, and it is likely that Huawei will become a cloud player too.

Highroad to Parallel Road

Unless you are working with highly parallelized access to files in a large scale-out NAS environment, you probably don’t get to work much with Parallel NFS (pNFS). pNFS is part of the NFSv4.1 (RFC 5661) standard, which was published in January 2010 to extend the NFS protocol to the clustered, scale-out NAS environment. It provides parallel file access across distributed servers.

pNFS is heavily driven by Panasas, NetApp, EMC, IBM and Sun (now Oracle), among others. And funnily enough, the company that sticks out from the bunch is one that used to tout block storage as the way to go, not files. That’s EMC, the company that is better known for its SAN solutions than its NAS (remember Celerra and IP4700?). And EMC has embraced pNFS in a big, big way. Read the blog of Chuck Hollis, EMC’s CTO for Global Marketing, here and here.

However, unknown to many, a lot of the thinking that goes into pNFS is very similar to an EMC product from some years ago. That product is EMC HighRoad, which in later years was renamed Multi-path File System (MPFS).

Note: If you want to know more about the history of HighRoad/MPFS, read this blog.

The cornerstone of EMC MPFS is its File Mapping Protocol or FMP, which is a robust protocol that defines the mapping of files to their addressable blocks in storage. In a nutshell, when I was made responsible for this product during my time at EMC, I used to pitch to companies that with MPFS, the file request goes through NFS but the response to the requester can be delivered in blocks (iSCSI or Fibre Channel). The beauty of this was that NFSv3 was chatty and heavy, while the delivery of data in blocks via iSCSI or Fibre Channel has less overhead compared to NFSv3.

Hence the delivery is faster, and EMC touted that the performance was 2-4x faster than NFS. Indeed, I have seen some lab test results from EMC’s work with the Schlumberger High Performance Lab in Houston, and the numbers were impressive. I still have them on PowerPoint somewhere.

Circa 2003-2004, EMC donated the FMP code to the IETF and, as they say, the rest is history.

The picture below basically summarizes what pNFS is all about.

 

An NFSv4 client will basically check with a metadata server via the pNFS protocol about the layout of the distributed cluster of servers. The file could be striped across multiple nodes of the cluster, and it is the metadata server that provides a map for the client to access the blocks or files on these nodes. This is exactly what the EMC HighRoad/MPFS File Mapping Protocol (FMP) did, mapping file requests to their corresponding blocks. See diagram below:

 

And one of the powerful features of pNFS is that it is not just about NFS. The green arrow you see in the above diagram is the storage-access protocol. That access protocol can be NFSv4.1, CIFS, iSCSI, Fibre Channel, FCoE, InfiniBand, or Object Storage Device (OSD).
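The metadata-server lookup described above can be sketched roughly as follows. This is a simplified illustration, not the layout format defined in RFC 5661: the round-robin striping, the `Layout` and `MetadataServer` classes, the server names and the file path are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    stripe_size: int     # bytes per stripe unit
    data_servers: tuple  # nodes holding the stripes, in order


class MetadataServer:
    """Hands out layouts; it never sits in the data path."""

    def __init__(self, layouts: dict) -> None:
        self._layouts = layouts

    def get_layout(self, path: str) -> Layout:
        return self._layouts[path]


def locate(layout: Layout, offset: int) -> tuple:
    """Return (data server, stripe index) for a byte offset.

    Assumes simple round-robin striping across the data servers.
    """
    stripe_index = offset // layout.stripe_size
    server = layout.data_servers[stripe_index % len(layout.data_servers)]
    return server, stripe_index


if __name__ == "__main__":
    mds = MetadataServer(
        {"/export/big.dat": Layout(65536, ("ds1", "ds2", "ds3"))}
    )
    layout = mds.get_layout("/export/big.dat")  # one control-path round trip
    for offset in (0, 65536, 4 * 65536 + 10):
        print(offset, locate(layout, offset))   # data path goes direct to nodes
```

Once the client holds the layout, it can talk to the data nodes directly and in parallel, which is where the performance gain over funnelling everything through a single NFS server comes from.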

In order to have pNFS working, the NFS client must be NFSv4.1 ready and that code has been made available in Linux and OpenSolaris. Other Unix vendors, no doubt, will be coming out with their NFSv4.1 implementation soon. Oooooh, there will be a Windows NFSv4.1 client coming as well!

But I want to dispel the notion that EMC is a SAN company. EMC is a very strong NAS company and if you have seen the IDC market share (ok, ok, many of you out there will argue about it), EMC is #1 in NAS. And their contribution to pNFS is immense.

NFSv4 – Your filesystem librarian

I was up at about 1am several nights ago listening to the SNIA update of NFSv4 and it was worth losing sleep. I was both intrigued and interested to see a full scale NFSv4 adoption in the coming few years.

We all know that NFS has been around for more than 20 years already. Version 2 was released way back in 1989, with version 3 being around since 1995. That’s a long time, and it is beginning to show its age. NFS version 4 is not new either. Believe it or not, it was released in 2000, 11 years ago. But there was a significant update in 2010 with NFS version 4.1, which came with parallel NFS (pNFS) support to address new requirements such as scale-out NAS and file striping across clusters of nodes.

I am doing my bit to disseminate NFS version 4.x updates and to move the storage networking and filesystem community towards its imminent wide-scale adoption. This is one of the many entries I intend to share about NFS version 4.

So, why this librarian thingy? First of all, for those folks working on NFSv3, you would probably encounter issues about file locking and also high availability.

Don’t you just hate it when the server reboots, and your NFS mount point hangs? Depending on whether it was a hard mount or a soft mount, the NFS retries could take forever or sometimes, the NFS client just freezes. In addition to that, another frequent complaint is that NFSv3 has lousy file locking.

However, in the early years of NFS, the world was a very different place. Issues such as file locking and HA were addressed well enough by NFSv2/v3 because the demands of the client-server world of that time were lower. As the world progressed into the new millennium, NFSv2/v3 started to sputter.

In this lesson #1, I would like to share 2 key features of NFSv4 that address the 2 issues I brought up, which are NFS HA and file locking. The 2 features are

  • Lease
  • Delegation

As I said, NFS high availability in version 3 was simplistic. In version 3, if an NFS client fails, the NFS server has little knowledge that the client has failed. Remember, NFSv3 is stateless. This can cause complications and issues such as ambiguity about file locking. Likewise, if an NFS server fails, the client could freeze and if recovered, could get stale NFS handles and have all sorts of problems related to file locking. Many locks have to be released before an application can restart properly.

In NFSv4, handling file locking when either the NFS server or the client fails is much simpler. The mechanism is leasing, and both the client and the server will know what is happening to each other. NFSv4 is a stateful protocol. This is how leasing works:

  1. The NFS client leases a file lock from the NFS server for a certain period of time, e.g. N seconds. It renews the lease before the N-second period expires.
  2. If the NFS client fails, the lock is reclaimed by the NFS server and released to other clients after a grace period.
  3. If the NFS server fails and is rebooted, all the files are locked for M seconds so that the incumbent NFS clients can reclaim their locks. If a lock is not reclaimed by its respective client, the file lock is released.
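The three steps above can be sketched as a toy lease table. This is only a model of the idea, not the NFSv4 wire protocol: the `LeaseServer` class, its method names and the explicit `now` timestamps (used instead of a real clock so the behaviour is easy to follow) are all assumptions, and a client failure is modelled simply as a lease that is never renewed.

```python
class LeaseServer:
    """Toy model of lease-based file locking with a reboot grace period."""

    def __init__(self, lease_seconds: float, grace_seconds: float) -> None:
        self.lease_seconds = lease_seconds
        self.grace_seconds = grace_seconds
        self._locks = {}          # path -> (client, expiry time)
        self._grace_until = 0.0   # end of the post-reboot grace period

    def acquire(self, client: str, path: str, now: float,
                reclaim: bool = False) -> bool:
        # During the grace period only reclaims of earlier locks are allowed.
        # (A real server would verify the claim; this sketch trusts it.)
        if now < self._grace_until and not reclaim:
            return False
        holder = self._locks.get(path)
        if holder and holder[0] != client and now < holder[1]:
            return False          # someone else holds an unexpired lease
        self._locks[path] = (client, now + self.lease_seconds)
        return True

    def renew(self, client: str, path: str, now: float) -> bool:
        holder = self._locks.get(path)
        if holder and holder[0] == client and now < holder[1]:
            self._locks[path] = (client, now + self.lease_seconds)
            return True
        return False              # lease already lost or never held

    def reboot(self, now: float) -> None:
        # State is lost on reboot; clients get a window to reclaim locks.
        self._locks.clear()
        self._grace_until = now + self.grace_seconds


if __name__ == "__main__":
    server = LeaseServer(lease_seconds=10, grace_seconds=5)
    print(server.acquire("A", "/f", now=0))    # A gets the lock
    print(server.acquire("B", "/f", now=5))    # refused, A's lease is live
    print(server.acquire("B", "/f", now=12))   # A never renewed, B gets it
```

The key property is that nobody has to detect a dead client explicitly. If the client stops renewing, the lease simply runs out and the lock becomes available again.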

Such an agreement, with stateful communication between the clients and the NFS server, makes file locking in a high availability environment much simpler and more robust.

Another new feature of NFSv4 is delegation. Part of the exported filesystem from the NFS server can be delegated or “loaned out” to the NFS client. The NFS client which had “borrowed” this piece of the filesystem can then work on it in its local cache with little communication to the NFS server (performance gain and reduced chattiness of the previous version).  Here’s a step-by-step guide of this delegation process.

  1. The NFS client requests a piece of the filesystem. The NFS server “lends” this piece to the client as a lease if it is not already locked.
  2. The NFS client works on this piece in its local cache.
  3. Once the NFS client has completed its writes and commits to the piece of filesystem in the local cache, it releases the “borrowing” lease back to the NFS server.
  4. If other NFS clients request the same piece of filesystem while it is out on loan, the NFS server would say “Sorry, I loaned it out”.
  5. If a higher authority requests the piece of filesystem, the NFS server would say to the NFS client, “I need this back” and will send an order to recall it.
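The five steps above can be sketched as a small state table on the server. This mirrors the description in this post rather than the actual NFSv4 delegation rules: the `DelegationServer` class, the `high_priority` flag and the string results are hypothetical, and the recall is just recorded in a list instead of being sent to the client as a callback.

```python
class DelegationServer:
    """Toy model of lending out pieces of a filesystem to clients."""

    def __init__(self) -> None:
        self._delegations = {}  # path -> client currently "borrowing" it
        self.recalls = []       # (client, path) recall notices issued

    def request(self, client: str, path: str,
                high_priority: bool = False) -> str:
        holder = self._delegations.get(path)
        if holder is None:
            self._delegations[path] = client   # step 1: lend it out
            return "granted"
        if holder == client:
            return "granted"                   # already on loan to this client
        if high_priority:
            self.recalls.append((holder, path))  # step 5: "I need this back"
            return "recalled"
        return "denied"                        # step 4: "Sorry, I loaned it out"

    def release(self, client: str, path: str) -> bool:
        # Step 3: the borrower hands the piece back after committing its work.
        if self._delegations.get(path) == client:
            del self._delegations[path]
            return True
        return False


if __name__ == "__main__":
    server = DelegationServer()
    print(server.request("A", "/export/docs"))
    print(server.request("B", "/export/docs"))
    print(server.request("admin", "/export/docs", high_priority=True))
    print(server.recalls)
```

While the loan is active (step 2), the client works entirely from its local cache, which is where the reduced chattiness comes from.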

In many ways, both the file locking lease and delegation work like a librarian. Pieces of the filesystem are loaned to the requester for a lease period, much like books are loaned to the borrower for a period of time.

Enjoy your weekend!

Stop stroking your …

A few days after I wrote about the performance benchmark bag of tricks, EMC fired the first salvo at NetApp’s SPECsfs2008 world record on NFS IOPS.

EMC is obviously using all its ammo to deflate NetApp’s chest-thumping act, starting with Storagezilla’s blog. Mark Twomey, the man behind Storagezilla, posted several observations about NetApp’s apparent use of disk short stroking to artificially boost its performance numbers. This put NetApp against the wall, with Alex McDonald of the office of the CTO (who incidentally is SNIA NFSv4 co-chairman) responding hard to Storagezilla’s observations.

The news of this appeared in The Register. Read all about it.

With no sign of letting up, the article also mentioned EMC Isilon’s CTO, Rob Peglar, adding more fuel to the fire.

I spoke about short stroking as one of the tricks used to gain better numbers in benchmarks. I also mentioned that these numbers are of little use in the real world, and I would like to add that they are just there for marketing reasons. So, for you readers out there, a benchmark is really not that big a deal.

Have a great weekend!

Falconstor – soaring to 7th heaven

I was invited to the media launch of Falconstor version 7.0 this morning at Sunway Resort Hotel.

I must admit that I am a fan of Falconstor from a business perspective because they have nifty solutions. Many big boys OEMed Falconstor’s VTL solution, such as EMC with its CDL (CLARiiON Disk Library) and Sun Microsystems with its virtual tape library solutions. Things have been changing. There are still OEM partnerships with HDS (with Falconstor VTL and FDS solutions), HP (with Falconstor NSS solution) and a few others, but Falconstor has been taking a more aggressive stance with their new business model. They are definitely more direct in their approach and hence, it is high time we in the industry recognize Falconstor’s prowess.

The launch today is for the Falconstor version 7.0 suite of data recovery and storage enhancement solutions. Note that while the topic of their solutions was data protection, I used data recovery, simply because the true objective of their solutions is data recovery, doing what matters most to business – RECOVERY.

Falconstor version 7.0 family of products is divided into 3 pillars

  • Storage Virtualization – with Falconstor Network Storage Server (NSS)
  • Backup & Recovery – with Falconstor Continuous Data Protector (CDP)
  • Deduplication – with Falconstor Virtual Tape Library (VTL) and File-Interface Deduplication System (FDS)

NSS virtualizes heterogeneous storage platforms and sits in between them and the application servers or virtualized servers. It simplifies disparate storage platforms by consolidating volumes and provides features such as thin provisioning and snapshots. In the new version, NSS supports up to 1,000 snapshots per volume, up from the previous 255. That is an almost 4x increase, as the demand for data protection is greater than ever. This allows the protection granularity to be in minutes, comfortably meeting the RPO (Recovery Point Objective) requirements of the most demanding customers.

The NSS also replicates the snapshots to a secondary NSS platform at a DR site to extend the company’s data resiliency and improve business continuity for the organization.

With a revamped algorithm in version 7.0, the MicroScan technology used in replication is now more potent and higher in performance. For the uninformed, MicroScan, as quoted in the datasheet, is:

MicroScan™, a patented FalconStor technology, minimizes the
amount of data transmitted by eliminating redundancies at the
application and file system layers. Rather than arbitrarily
transmitting entire blocks or pages (as is typical of other
replication solutions), MicroScan technology maps, identifies, and
transmits only unique disk drive sectors (512 bytes), reducing
network traffic by as much as 95%, in turn reducing remote
bandwidth requirements.
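To give a feel for why sector-level granularity saves bandwidth, here is a generic sketch of a sector-by-sector delta between two versions of a volume. It is not FalconStor's MicroScan algorithm, which is proprietary. The function name and the in-memory byte strings standing in for volumes are assumptions, and only the 512-byte sector size comes from the datasheet quote.

```python
SECTOR = 512  # bytes, the sector size mentioned in the datasheet quote


def changed_sectors(old: bytes, new: bytes, sector: int = SECTOR) -> dict:
    """Return {sector number: new contents} for sectors that differ."""
    changes = {}
    for offset in range(0, len(new), sector):
        chunk = new[offset:offset + sector]
        if old[offset:offset + sector] != chunk:
            changes[offset // sector] = chunk
    return changes


if __name__ == "__main__":
    old = bytes(4096)        # a 4 KB "page" of zeros
    new = bytearray(old)
    new[1030] = 0xFF         # a single byte changes
    delta = changed_sectors(old, bytes(new))
    sent = sum(len(c) for c in delta.values())
    print(f"sectors to send: {list(delta)}, bytes sent: {sent} of {len(new)}")
```

In this toy case a one-byte change inside a 4 KB page means sending 512 bytes instead of 4,096, which is the kind of saving the datasheet is alluding to.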

Another very strong feature of the NSS is the RecoverTrac, which is an automated DR technology. In business, business continuity and disaster recovery usually go hand-in-hand. Unfortunately, triggering either BC or DR or both is an expensive and resource-consuming exercise. But organizations have to prepare and therefore, a proper DR process must be tested and tested again.

I am a certified Business Continuity Planner, so I am fully aware of the beauty RecoverTrac brings to the organization. The ability to run non-intrusive, simulated DR tests and find the weak points of recovery is crucial, and RecoverTrac brings that confidence in DR testing to the table. Furthermore, well-tested automated DR processes also eliminate human errors in DR recovery. RecoverTrac also has the ability to track the logical relationships between different applications and computing resources, making this technology an invaluable tool in the DR coordinator’s arsenal.

The diagram below shows the NSS solution:

 

And NSS touts itself as the one true any-storage-platform to any-storage-platform, over-any-protocol replication solution. Most vendors support FC, iSCSI or NAS protocols, but I believe that so far, only Falconstor offers all of them in one solution.

Item #2 in the upgrade list is Falconstor’s CDP solution. Continuous Data Protection (CDP) is a very interesting area in data protection. CDP provides a near-zero RTO/RPO solution on disk, and yet not many people are aware of the power of CDP.

About 5-6 years ago, CDP was hot and there were many start-ups in this area. Companies such as Kashya (bought by EMC to become RecoverPoint), Mendocino, Revivio (gobbled up by Symantec) and StoneFly have either gone belly up or been gobbled up by the bigger boys in the industry. Only a few remain, and Falconstor CDP is one of the true survivors in this area.

CDP should be given more credit because there is always demand for very granular data protection. In fact, I sincerely believe that CDP, snapshots and snapshot replication are the real flagships of data protection today and in the future, because data protection using the traditional backup method, in a periodic and less frequent manner, is no longer adequate. And the fact that backup is generating more and more data to keep is truly not helping.

Falconstor CDP has the HyperTrac™ Backup Accelerator (HyperTrac), which works in conjunction with FalconStor Continuous Data Protector (CDP) and FalconStor Network Storage Server (NSS) to increase tape backup speed, eliminate backup windows, and offload processing from application servers. A quick glimpse of HyperTrac technology is shown below:

 

In the Deduplication pillar, there were upgrades to both Falconstor VTL and Falconstor FDS. As I said earlier, CDP, snapshots and replication of the snapshots are already becoming the data protection of this new generation of storage solutions. Coupled with deduplication, data protection becomes even more significant because it makes sense to keep just one copy of the same old files rather than storing them over and over again.

Falconstor File-Interface Deduplication System (FDS) addresses the requirement to store data more effectively, efficiently and economically. Its Single Instance Repository (SIR) technology has now been enhanced as a global deduplication repository, giving it the ability to truly store a single copy of the object. Previously, FDS was not able to recognize duplicated objects on a different controller. FDS also has improved its algorithms, driving performance up to 30TB/hour, and is able to deliver a higher deduplication ratio.
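The single-copy idea can be illustrated with a minimal content-addressed store, where identical objects hash to the same key and are therefore kept only once. This is a generic whole-object deduplication sketch and not FalconStor's SIR implementation. The class name and the choice of SHA-256 are assumptions.

```python
import hashlib


class SingleInstanceStore:
    """Toy whole-object deduplicating store keyed by content hash."""

    def __init__(self) -> None:
        self._chunks = {}  # digest -> object contents, stored once
        self._files = {}   # file name -> digest

    def put(self, name: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        self._chunks.setdefault(digest, data)  # keep only the first copy
        self._files[name] = digest

    def get(self, name: str) -> bytes:
        return self._chunks[self._files[name]]

    def logical_bytes(self) -> int:
        """Size as the clients see it, duplicates included."""
        return sum(len(self._chunks[d]) for d in self._files.values())

    def stored_bytes(self) -> int:
        """Size actually kept in the repository."""
        return sum(len(c) for c in self._chunks.values())


if __name__ == "__main__":
    store = SingleInstanceStore()
    store.put("mon/report.doc", b"x" * 1000)
    store.put("tue/report.doc", b"x" * 1000)   # same content, different name
    store.put("tue/notes.txt", b"y" * 500)
    ratio = store.logical_bytes() / store.stored_bytes()
    print(f"dedupe ratio: {ratio:.2f}:1")
```

A global repository matters for the same reason the single dictionary matters here: if each controller kept its own table, the same object landing on two controllers would be stored twice.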

 

In addition to the NAS interface, the FDS solution now has a tighter integration with the Symantec Open Storage Technology (OST) protocol.

The Falconstor VTL is widely OEMed by many partners and remains one of the most popular VTL solutions in the market. VTL is also enhanced significantly in this upgrade and, not surprisingly, the VTL solution from Falconstor is strengthened by its near-seamless integration with the other solutions in their stable. The VTL solution now supports up to 1 petabyte of usable capacity.

 

Falconstor has always been very focused on the backup and data recovery space and has been viewed favourably by Gartner. In January 2011, Gartner released its Magic Quadrant report for Enterprise Disk-based Backup and Recovery, and Falconstor was positioned as one of the Visionaries in this space. Below is the magic quadrant:

 

As their business model changes to a more direct approach, it won’t be long before you see Falconstor move into the Leaders quadrant. They will be soaring, like a falcon.

Performance benchmarks – the games that we play

First of all, congratulations to NetApp for beating EMC Isilon in the latest SPECSfs2008 benchmark for NFS IOPS. The news is everywhere and here’s one here.

EMC Isilon was blowing its horn several months ago when it hit 1,112,705 IOPS, recorded from a 140-node S200 cluster with 3,360 disk drives and an overall response time of 2.54 msecs. Last week, NetApp became top dog, pounding its chest with 1,512,784 IOPS on 24 FAS6240 nodes with an overall response time of 1.53 msecs. There were 1,728 450GB, 15,000rpm disk drives and the FAS6240s were fitted with Flash Cache.

And with each benchmark that you and I have seen before and after, we will see every storage vendor trying to best the others, and when they do, their horns will be blaring, the fireworks will be out and they will be pounding their chests like Tarzan, saying “Who’s your daddy?” The euphoria usually doesn’t last long as performance records are broken all the time.

However, performance benchmark results are not to be taken verbatim because they are not true representations of real-life production environments. 2 years ago, the now-defunct Byte and Switch magazine (now part of Network Computing) ran an article on a 9-year study of file system and storage benchmarking. Interestingly, it revealed that a lot of the time, benchmark results are reduced to single graphs that carry little information about how the benchmark was conducted, how long it took and so on.

The paper, by Avishay Traeger and Erez Zadok from Stony Brook University and Nikolai Joukov and Charles P. Wright from the IBM T.J. Watson Research Center, is entitled “A Nine Year Study of File System and Storage Benchmarking”. It examined 415 file system and storage benchmarks from 106 published papers, and the article noted:

Based on this examination the paper makes some very interesting observations and 
conclusions that are, in many ways, very critical of the way “research” papers have 
been written about storage and file systems.

 

Therefore, the paper highlighted the way the benchmarks were done and the way their results were reported. Judging by the strong title of the online article that reviewed the study (“Lies, Damn Lies and File Systems Benchmarks”), benchmarks are not pictures that say a thousand words.

Be it TPC-C, SPC1 or SPECSfs benchmarks, I have gone through some interesting experiences myself, and there are certain tricks of the trade, just like in a magic show. Some of the very common ones I come across are

  • Short stroking – a method to format a drive so that only the outer sectors of the disk platter are used to store data. This practice is done in I/O-intensive environments to increase performance.
  • Shortened test – performance tests that run for several minutes to achieve the numbers rather than for prolonged periods (which would mimic real life)
  • Reporting aggregated numbers – Note the number of nodes or controllers used to achieve the numbers. It is not ONE controller that achieves the numbers, but an aggregated performance result multiplied up by the number of controllers
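To illustrate the aggregated-numbers point, the two SPECsfs2008 results quoted earlier in this post can be broken down per node and per disk. The figures are the ones from the post. The `BenchmarkResult` class is just a convenience for the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    name: str
    iops: int
    nodes: int
    disks: int

    def per_node(self) -> float:
        return self.iops / self.nodes

    def per_disk(self) -> float:
        return self.iops / self.disks


# Figures as quoted earlier in this post.
isilon = BenchmarkResult("EMC Isilon S200", 1_112_705, nodes=140, disks=3_360)
netapp = BenchmarkResult("NetApp FAS6240", 1_512_784, nodes=24, disks=1_728)

if __name__ == "__main__":
    for r in (isilon, netapp):
        print(f"{r.name}: {r.per_node():,.0f} IOPS/node, "
              f"{r.per_disk():,.0f} IOPS/disk")
```

The headline numbers differ by about 36%, but per node the two systems are nearly an order of magnitude apart (roughly 7,900 versus 63,000 IOPS), which says more about how differently the two configurations were built than about which one a customer should buy.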

Hence, to get to the published benchmark numbers in real life is usually not practical and very expensive. But unfortunately, customers are less educated about the way benchmarks are performed and published. We, as storage professionals, have to disseminate this information.

Ok, this sounds oxymoronic because if I were working for NetApp, why would I tell the truth that could actually hurt NetApp sales? But I don’t work for NetApp now and I think it is important for me to do my duty and share more information. Either way, many people switch jobs every now and then, so if you want to keep your reputation, be honest up front. It could save you a lot of work.

A cloud economy emerges … somewhat

A few hours ago, Rackspace announced the first “productized” Rackspace Private Cloud solution based on OpenStack. According to Openstack.org,

OpenStack is a global collaboration of developers and cloud computing 
technologists producing the ubiquitous open source cloud computing platform for 
public and private clouds. The project aims to deliver solutions for all types of 
clouds by being simple to implement, massively scalable, and feature rich. 
The technology consists of a series of interrelated projects delivering various 
components for a cloud infrastructure solution.

Founded by Rackspace Hosting and NASA, OpenStack has grown to be a global software 
community of developers collaborating on a standard and massively scalable open 
source cloud operating system. Our mission is to enable any organization to create 
and offer cloud computing services running on standard hardware. 
Corporations, service providers, VARS, SMBs, researchers, and global data centers 
looking to deploy large-scale cloud deployments for private or public clouds 
leveraging the support and resulting technology of a global open source community.
All of the code for OpenStack is freely available under the Apache 2.0 license. 
Anyone can run it, build on it, or submit changes back to the project. We strongly 
believe that an open development model is the only way to foster badly-needed cloud 
standards, remove the fear of proprietary lock-in for cloud customers, and create a 
large ecosystem that spans cloud providers.

And Openstack just turned 1 year old.

So, what’s this Rackspace private cloud about?

In the existing cloud economy, customers subscribe to services from a cloud service provider. The customer pays a (usually monthly) subscription fee in a pay-as-you-use model. In my previous blog, I courageously predicted that the new cloud economy will drive the middle tier (i.e. IT distributors, resellers and system integrators) out of the IT ecosystem. Before I lose the plot, Rackspace is now providing the ability for customers to install an Openstack-ready, Rackspace-approved private cloud architecture in their own datacenter, not in Rackspace Hosting.

This represents a tectonic shift in the cloud economy, putting the control and power back into the customers’ hands. For too long, there were questions about data integrity, security, control, cloud service provider lock-in and so on but with the new Rackspace offering, customers can build their own private cloud ecosystem or they can get professional service from Rackspace cloud systems integrators. Furthermore, once they have built their private cloud, they can either manage it themselves or get Rackspace to manage it for them.

How does Rackspace do it?

From their vast experience in building Openstack clouds, Rackspace Cloud Builders have created a free reference architecture.  Currently OpenStack focuses on two key components: OpenStack Compute, which offers computing power through virtual machine and network management, and OpenStack Object Storage, which is software for redundant, scalable object storage capacity.

In the Openstack architecture, there are 3 major components – Compute, Storage and Images.

More information about the Openstack Architecture here. And with 130 partners in the Openstack alliance (which includes Dell, HP, Cisco, Citrix and EMC), customers have plenty to choose from, which lessens the impact of lock-in.

What does this represent to storage professionals like us?

This Rackspace offering is game changing and could perhaps spark an economy for partners to work with Cloud Service Providers. It is definitely addressing some key concerns of customers related to security and the freedom to choose, and even change, service providers. It seems to be offering the best of both worlds (for now), but Rackspace is not looking at this for immediate gains. We still do not know how this economic pie will grow and how it will affect the cloud economy. And this does not negate the fact that we storage professionals have to dig deeper and learn more, nor does it change the fact that we have to evolve to compete against the best in the world.

Rackspace has come out beating its chest and predicted that the cloud computing API space will boil down to these 3 players – Rackspace Openstack, VMware and Amazon Web Services (AWS). Interestingly, Red Hat Aeolus (previously known as Deltacloud) was not deemed worthy of a mention by Rackspace. Some pooh-poohing going on?

Data Deduplication – Dell is first and last

A very interesting report surfaced in front of me today. It is Information Week’s IT Pro ranking of Data Deduplication vendors, just made available a few weeks ago, and it is the overview of the dedupe market so far.

It surveyed over 400 IT professionals from various industries with companies ranging from less than 50 employees to over 10,000 employees and revenues of less than USD5 million to USD1 billion. Overall, it had a good mix of respondents. But the results were quite interesting.

It surveyed 2 segments

  1. Overall performance – product reliability, product performance, acquisition costs, operations costs etc.
  2. Technical features – replication, VTL, encryption, iSCSI and FCoE support etc.

When I saw the results (shown below), surprise, surprise! Here’s the overall performance survey chart:

Dell/Compellent scored the highest in this survey while EMC/Data Domain ranked the lowest. However, the difference between the first-place and the last-place vendor is only 4%, which suggests that EMC/Data Domain was just about as good as the Dell/Compellent solution, but it scored poorly in the areas that matter most to the customer. In fact, as we drill down into the requirements of the overall performance one-by-one, as shown below,

there is little difference among the 7 vendors.

However, when it comes to Technical Features, Dell/Compellent is ranked last, the complete opposite. As you can see from the survey chart below, IBM ProtecTier, NetApp and HP are all ranked #1.

The details, as per the technical requirements of the customers, are shown below:

These figures show that the competition between the vendors is very, very stiff, with little edge from one to another. But what I was more interested in were the following findings, because these figures tell a story.

In the survey, only 34% of the respondents say they have implemented some data deduplication solution, while the rest are evaluating or plan to evaluate. This means that the overall market is not saturated and there is still a window of opportunity for the vendors. However, the speed at which the data deduplication market has matured, from early adopters perhaps 4-5 years ago to overall market adoption, surprised many, because the storage industry tends to be a bit less trendy than most areas of IT. At the rate data deduplication adoption is going, it will be very much a standard feature from all storage vendors in the very near future.

The second finding, which is probably not so surprising, is that among the customers who have already implemented a data deduplication solution, almost 99% are satisfied or somewhat satisfied with it. Therefore, the likelihood of these customers switching vendors and replacing their gear is very low, perhaps partly because of the reliability of the solutions and because the products perform as they should.

The Information Week’s IT Pro survey probably reflects well where the deduplication market is going, and there isn’t much difference in terms of technical and technology features from vendor to vendor. Customers will have to look beyond the usual technology pitch and consider other (and perhaps more important) subtleties such as customer service, price and how flexible the vendor is to do business with. EMC/Data Domain, being king-of-the-hill, has not been the best of vendors when it comes to price, quality of post-sales support and service innovation. Let’s hope they are not like the EMC sales folks of the past, carrying the “Take it or leave it” tag when they develop their relationship with their future customers. And it will not help if word goes around the industry about EMC being arrogant in its dominance. It may not be true, and let’s hope it is not true because the EMC of today has changed plenty compared to the Symmetrix days. EMC/Data Domain is now part of their Backup Recovery Service (BRS) team, and I have good friends there at EMC Malaysia and Singapore. They are good guys but remember guys, the customer is still king!

Dell, new with their acquisition of Compellent and Ocarina Networks, seems very eager to win the business and kudos to them as well. In fact, I heard from a little birdie that Dell is “giving away” several units of Compellents to selected customers in Malaysia. I did not and cannot ascertain if this is true or not but if it is, that’s what I call thinking-out-of-the-box, given Dell as a late comer into the storage game. Well done!

One thing to note is that the survey took in 17 vendors, including Exagrid, Falconstor, Quantum, Sepaton and so on, but only the top-7 shown in the charts qualified.

In the end, I believe the deduplication vendors had better scramble to grab as much as they can in the coming months, because this market will be going, going, gone pretty soon with nothing much to grab after that, unless there is a disruptive innovation in deduplication technology.

Novell Filr product insight being arranged

Hello reader,

I can see that there is a lot of interest in Novell Filr, and let me assure you that I am already speaking with Novell to introduce this solution soon after it becomes available next year.

I am hoping to get a front row seat and even better, be the first in Malaysia to test this product extensively. I can’t make any promises at this point but Novell Country Manager for Malaysia and South Asia will be in Australia this month to help get my enthusiasm across to their corporate people. (Fingers crossed).

I thank you for your support.

Thank you
/storagegaga 🙂