4TB disks – the end of RAID

Seriously? 4 freaking terabyte disk drives?

Enterprise SATA/SAS disks have just grown larger, up to 4TB now. Just a few days ago, Hitachi announced the shipment of the first 4TB HDD, the 7,200 RPM Ultrastar™ 7K4000 Enterprise-Class Hard Drive.

And just weeks ago, Seagate touted that their Heat-Assisted Magnetic Recording (HAMR) technology will bring forth 6TB hard disk drives in the near future, with 60TB HDDs not far over the horizon. 60TB is a lot of capacity but a big, big nightmare for disk availability and data backup. My NetApp Malaysia friend joked that the RAID reconstruction of a 60TB HDD would probably finish by the time his daughter finishes college, and his daughter is still in primary school!

But the joke reflects something very serious we are facing: HDD capacities keep growing, and they could become unmanageable if the traditional implementation of RAID does not change to cope with such monstrous capacities.

Yes, RAID has changed since 1988, as every vendor approaches RAID differently. NetApp was always about RAID-4 and later RAID-DP, and I remember the days when EMC had RAID-S. There was even a vendor in the past who marketed RAID-7, but it was proprietary and wasn’t an industry standard. Fundamentally, though, RAID did not change in a revolutionary way and continued to withstand the ever-ballooning capacities (and pressures) of the HDDs. RAID-6 was introduced around the time the first 1TB HDDs came out, to address the risk of a possible second disk failure in a parity-based RAID like RAID-4 or RAID-5. But today, the 4TB HDDs could be the last straw that breaks the camel’s back, or in this case, RAID’s back.

RAID-5 obviously is dead. Even RAID-6 might be considered insufficient now. Having a 3rd parity drive (3P) is an option, and the only commercial technology I know of that supports 3 parity drives is ZFS. But 3P adds overhead in both performance and usable capacity. Will the fickle customer ever accept such trade-offs?
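To put the usable-capacity overhead in perspective, here is a back-of-envelope comparison, a rough sketch assuming a hypothetical 10-disk group of 4TB drives and ignoring spares and formatting overheads:

```python
# Usable capacity for single, double and triple parity in an
# illustrative 10-disk group of 4TB drives.
DISKS, SIZE_TB = 10, 4

for parity, name in ((1, "RAID-5 (1P)"),
                     (2, "RAID-6 (2P)"),
                     (3, "3P (e.g. ZFS triple parity)")):
    usable = (DISKS - parity) * SIZE_TB
    print(f"{name}: usable {usable}TB of {DISKS * SIZE_TB}TB raw "
          f"({usable / (DISKS * SIZE_TB):.0%})")

# RAID-5 (1P): usable 36TB of 40TB raw (90%)
# RAID-6 (2P): usable 32TB of 40TB raw (80%)
# 3P (e.g. ZFS triple parity): usable 28TB of 40TB raw (70%)
```

Each extra parity drive costs another 10% of raw capacity in this example, on top of the extra parity computation on every write.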

Note that 3P is not RAID-7. RAID-7 is a trademark of an old company called Storage Computer Corporation, and it is not a standard RAID definition.

One of the biggest concerns is rebuild time. If a 4TB HDD fails, the rebuild could take days. The failure of a second HDD could stretch the rebuild to a week or so, and the data is vulnerable while the disks are being rebuilt.
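A back-of-envelope calculation shows why. The rebuild rates below are illustrative assumptions of mine; a rebuild in a busy array runs far below a drive’s raw sequential speed:

```python
# Minimum time just to rewrite a whole 4TB drive at assumed rebuild rates.
CAPACITY = 4 * 10**12                    # 4TB in bytes

for rate_mb_s in (150, 50, 10):          # idle best case .. heavily throttled
    hours = CAPACITY / (rate_mb_s * 10**6) / 3600
    print(f"{rate_mb_s} MB/s -> {hours:.1f} hours")

# 150 MB/s -> 7.4 hours
# 50 MB/s -> 22.2 hours
# 10 MB/s -> 111.1 hours (4 to 5 days)
```

And that is just the sequential rewrite; a parity rebuild also has to read every surviving disk in the group while serving production I/O.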

There is a lot of talk about declustered RAID, and I think it is about time we learned about this RAID technology. At the same time, we should demand this technology before we even consider buying storage arrays with 4TB hard disk drives!

I have said this before: I am still trying to wrap my head around declustered RAID. So I invite the gurus on this matter to comment on the concept, but here is my understanding of the subject.

Panasas’ founder, Dr. Garth Gibson, is one of the people who proposed RAID declustering way back in 1999. He is a true visionary.

One of the issues of traditional RAID today is that we still treat the hard disk component in a RAID domain as a whole device. Traditional RAID is designed to protect whole disks with block-level redundancy. An array of disks is treated as a RAID group, or protection domain, that can tolerate one or more failures and still recover a failed disk from the redundancy encoded on the other drives. The RAID recovery requires reading all the surviving blocks on the other disks in the RAID group to recompute the blocks lost on the failed disk. In short, recovery in the event of a disk failure operates on the whole device, and therefore an entire 4TB HDD has to be reconstructed. This is not good.

The concept of RAID declustering is to break away from the whole-device idea and apply RAID at a more granular scale. IBM GPFS works with logical tracks, and RAID is applied at the logical track level. Here’s an overview of how it compares to traditional RAID:

The logical tracks are spread algorithmically across all physical HDDs, and the RAID protection layer is applied at the track level, not at the HDD device level. So, when a disk actually fails, the RAID rebuild is applied at the track level. This significantly improves the rebuild time of the failed device, and it does not affect the performance of the entire RAID volume much. The diagram below shows the declustered RAID’s time and performance impact when compared to a traditional RAID:
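To make the comparison concrete, here is a toy simulation of my own, not any vendor’s actual layout. Stripes of width 4 are placed either in one fixed 4-disk RAID group (traditional) or pseudo-randomly across all 20 disks (declustered); the rebuild workload on the busiest surviving disk is what gates the rebuild time:

```python
import random

random.seed(42)
DISKS, WIDTH, STRIPES = 20, 4, 10_000

def busiest_rebuild_disk(declustered: bool) -> int:
    reads = [0] * DISKS                       # rebuild reads per survivor
    for _ in range(STRIPES):
        if declustered:
            members = random.sample(range(DISKS), WIDTH)
        else:
            members = [0, 1, 2, 3]            # one fixed RAID group
        if 0 in members:                      # disk 0 has failed
            for d in members:
                if d != 0:
                    reads[d] += 1             # each survivor supplies a block
    return max(reads)

print("traditional:", busiest_rebuild_disk(False))   # ~10000 reads
print("declustered:", busiest_rebuild_disk(True))    # a few hundred reads
```

In the traditional case, 3 disks shoulder the entire rebuild; in the declustered case, all 19 survivors share a fraction of it, which is why the rebuild finishes so much faster.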

While the IBM GPFS approach to declustered RAID is applied at a semi-device level, the future is leaning towards OSD. OSD, or object storage device, is the next generation of storage, and I blogged about it some time back. Panasas is the leader when it comes to OSD, and their radical approach is to apply RAID at the object level. They call this Object RAID.

With object RAID, data protection occurs at the file-level. The Panasas system integrates the file system and data protection to provide novel, robust data protection for the file system.  Each file is divided into chunks that are stored in different objects on different storage devices (OSD).  File data is written into those container objects using a RAID algorithm to produce redundant data specific to that file.  If any object is damaged for whatever reason, the system can recompute the lost object(s) using redundant information in other objects that store the rest of the file.

The above is a quote from the blog of Brent Welch, Panasas’ Director of Software Architecture. As mentioned, the RAID protection of objects in Panasas’ OSD architecture occurs at the file level, and the file or files constitute the object. Therefore, the recovery domain in Object RAID is the file, confining the risk and damage of data loss to the file level and not the entire device. Consequently, recovery is much, much faster, even for 4TB HDDs.
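Here is a minimal sketch of the per-file RAID idea, my own illustration rather than Panasas’ actual algorithm: a file is striped into fixed-size chunks destined for different OSDs, an XOR parity object is computed for it, and any one lost chunk can be recomputed from the rest:

```python
# Toy per-file object RAID: stripe a file into chunks plus one XOR
# parity object, so a lost chunk is rebuilt from the file's own
# redundancy instead of rebuilding a whole disk.
def stripe_file(data: bytes, chunk_size: int):
    chunks = [data[i:i + chunk_size].ljust(chunk_size, b"\0")
              for i in range(0, len(data), chunk_size)]
    parity = bytes(chunk_size)
    for c in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, c))
    return chunks, parity                 # place each on a different OSD

def recover_chunk(chunks, parity, lost: int) -> bytes:
    out = parity
    for i, c in enumerate(chunks):
        if i != lost:
            out = bytes(a ^ b for a, b in zip(out, c))
    return out

chunks, parity = stripe_file(b"rebuild me at file granularity", 8)
assert recover_chunk(chunks, parity, lost=1) == chunks[1]
```

The recovery domain is one file’s chunks, so a failed device only triggers rebuilds for the files that actually had a chunk on it.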

Reliability is the key objective here. Without reliability, there is no availability. Without availability, there are no performance factors to consider. Therefore, the system’s reliability is paramount when it comes to keeping data protected. RAID has been the guardian all these years. It’s time for a revolutionary approach to safeguard reliability and ensure data availability.

So, how many vendors can claim they have declustered RAID?

Panasas is a big YES, and they apply their intelligence in large HPC (high performance computing) environments. Their technology is tried and tested. IBM GPFS is another. But where are the rest?


Server way of locked-in storage

It is kind of interesting that every vendor out there claims to be as open as can be, but the reality is that the competitive nature of the game forces storage vendors to talk openness while their actions are anything but.

Confused? I am beginning to see a trend, one that forces customers to be locked in with a certain storage vendor. I am beginning to feel that customers are given fewer choices, especially when the brand of server they select for their applications will have implications on the brand of storage they get locked into.

And surprise, surprise, SSDs are the pawns of this new cloak-and-dagger game. How? Well, I have been observing this for quite a while now, and when HP announced their SMART portfolio for their storage, I decided it was time to say something.

In the announcement, it was reported that HP is coming out with its 8th generation ProLiant servers. As quoted:

The eighth generation ProLiant is turbo-charging its storage with a Smart Array containing solid state drives and Smart Caching.

It also includes two Smart storage items: the Smart Array controllers and Smart Caching, which both feature solid state storage to solve the disk I/O bottleneck problem, as well as Smart Data Services software to use this hardware.

From the outside, analysts are claiming this is a reaction to the recent EMC VFCache product (I blogged about it here), and HP was quick to paint the EMC VFCache solution as a first-generation product, lacking the smarts (pun intended) of what the HP products have to offer. You can read about its performance prowess in the HP Connect blog.

Similarly, Dell announced their ExpressFlash solution, which ties its 12th generation PowerEdge servers to their flagship (what else), Dell Compellent storage.

The idea is very obvious. Put a PCIe-based flash caching card in the server, and use a proprietary caching/tiering technology that ties the server to a certain brand of storage. Only with this card, which (incidentally) works only with this brand of servers, will you, Mr. Customer, be able to take advantage of the performance power of this brand of storage. Does that sound open to you?
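Mechanically, there is nothing exotic about the caching part. Here is a minimal sketch of a server-side flash read cache, my own illustration only; the vendor products add write handling, coherency with the array, and the proprietary hooks that create the lock-in:

```python
from collections import OrderedDict

class Array:                                  # stand-in for the storage array
    def read(self, lba: int) -> bytes:
        return b"block-%d" % lba              # pretend slow back-end read

class FlashReadCache:
    """LRU read cache, standing in for the PCIe flash card."""
    def __init__(self, capacity_blocks: int, backend: Array):
        self.cache = OrderedDict()            # lba -> data, in LRU order
        self.capacity = capacity_blocks
        self.backend = backend

    def read(self, lba: int) -> bytes:
        if lba in self.cache:                 # hit: served from local flash
            self.cache.move_to_end(lba)
            return self.cache[lba]
        data = self.backend.read(lba)         # miss: fetch from the array
        self.cache[lba] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data

cache = FlashReadCache(2, Array())
cache.read(7); cache.read(7)                  # second read is a cache hit
```

The caching itself is generic; the lock-in comes from making the card’s management and coherency protocol speak only to one brand of array.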

HP is doing it with its ProLiant servers; Dell is doing it with its ExpressFlash; EMC’s VFCache, while not advocating any brand of servers, is doing it because VFCache works only with EMC storage. We have seen Oracle doing it with Oracle Exadata: Oracle Enterprise database works best with Oracle’s own storage, and the intelligence is in its SmartScan layer, a proprietary technology that works exclusively with the storage layer in the Exadata. Hitachi Japan, with its Hitachi servers (yes, Hitachi servers that we rarely see in Malaysia), has had such a technology for the last 2 years. I wouldn’t be surprised if IBM and Fujitsu already have something in store (or perhaps I missed the announcement).

NetApp has been slow in the game, but we hope to see them coming out with their own server-based caching products soon. More pure-play storage vendors are already singing the tune of SSDs (though not necessarily server-based).

The trend is obvious too, because the messaging is almost always about storage performance.

Yes, I totally agree that storage (any storage) has a performance bottleneck, especially when it comes to IOPS, response time and throughput. And every storage vendor is claiming that SSDs, in one form or another, are the knight in shining armour, ready to rid the world of lousy storage performance. Well, SSDs are not the panacea for storage performance headaches, because while they solve some performance issues, they introduce new ones somewhere else.

But it is becoming an excuse to introduce storage vendor lock-in, and how have customers responded to this new “concept”? Things are fairly new right now, but I would always advise customers to find out and ask questions.

Cloud storage for no vendor lock-in? Going to the cloud has cloud service provider lock-in as well, but that’s another story.


Gartner WW ECB 4Q11

The Gartner Worldwide External Controller Based Disk Storage market numbers were out last night, perennially following the IDC Disk Storage System Tracker.

The numbers held little surprise after a topsy-turvy year for vendors like IBM, HP and especially NetApp. Overall, the positions did not change much, but we can see that the 3 vendors I mentioned are facing very challenging waters ahead. Here’s a look at the overall 2011 numbers:

EMC is unstoppable, gaining 3.6% market share, while IBM lost 0.2% despite strong sales of their XIV and Storwize V7000 solutions. This could be due to lower-than-expected numbers from their jaded DS series. IBM needs to ramp up.

HP stayed stagnant, even though their 3PAR numbers have been growing well. They were hit by poor numbers from the EVA (now renumbered as the P6000s) and, surprisingly, their P4000s as well. Looks like they are short-lefthanded (pun intended), and given the C-level upheavals HP went through in the past year, things are not looking good for them.

Meanwhile, Dell is unable to shake off its EMC divorce alimony, losing 0.8% market share. We know that Dell has been pushing very, very hard with Compellent, EqualLogic and the other technologies they acquired, but somehow things are not working out as well yet.

HDS has been the one to watch, with its revenue growing in double digits like NetApp’s and EMC’s. Their market share gain was 0.6%, which is very good by HDS standards. NetApp gained 0.8% market share but seems vulnerable after 2 poor quarters.

The 4th quarter for 2011 numbers are shown below:

I did not blog about the IDC QView numbers, which report storage software market share, but they give this entry a bit of perspective from a software point of view. From The Register’s charts, EMC has been gaining market share at the expense of competitors like Symantec, IBM and NetApp.

Tabulated differently, here’s another set of data:

On all fronts, EMC is firing on all cylinders. Like a well-oiled V12 engine, EMC is going at it with so much momentum right now. Who is going to stop EMC?

Oracle Bested the Best in Quality

I have been an avid reader of SearchStorage’s Storage magazine for many years now, downloading their free PDF copy every month. Quietly tucked at the end of the January 2012 issue, there it was: the Storage magazine 6th annual Quality Awards for NAS.

I was pleasantly surprised with the results, because the previous annual awards were dominated by NetApp and EMC, but this time around a dark horse has emerged. It is Oracle who took top honours in both the Enterprise and the Mid-range categories.

The awards are the result of Storage Magazine’s survey and below is an excerpt about the survey:


In both categories covering the Enterprise and the Mid-Range, the overall ratings are shown below:


Surprised? You bet, because I was.

The survey does not focus on speeds and feeds, nor does it compare scalability or performance. Rather, it focuses on the qualitative aspects of the NAS products. Many storage vendors were part of the participation list, but most did not qualify or make a dent compared to what the top 6 did. Here’s a list of the vendors surveyed:


The qualitative aspects of the survey focused on 5 main areas:

  • Sales Force Competency
  • Initial Quality
  • Product Features
  • Product Reliability
  • Technical Support

In each of the 5 main areas, customers were asked a series of questions. Here is a breakdown of those questions for each area.

Sales Force Competency

  1. Is the sales force knowledgeable about their products and their customers’ industries?
  2. How flexible is their sales effort?
  3. How good are they at keeping the customer’s interest levels up?

Initial Product Quality

  1. Does the product need little or no vendor intervention?
  2. Ease of installation and ease of use
  3. Good value for money
  4. Reasonable Professional Services requirements, or needing little Professional Services
  5. Installation without defects
  6. Getting it right the first time

Product Features

  1. Storage management features
  2. Mirroring features
  3. Capacity scaling features
  4. Interoperability with other vendors’ products
  5. Remote replication features
  6. Snapshotting features

Product Reliability

  1. Vendor provides comprehensive upgrade procedures
  2. Ability to meet Service Level Agreement (SLA)
  3. Experiences very little downtime
  4. Patches applied non-disruptively

Technical Support

  1. Taking ownership of the customer’s problem
  2. Timely problem resolution and technical advice
  3. Documentation
  4. Vendor supplies support contractually as specified
  5. Vendor’s 3rd party partners are knowledgeable
  6. Vendor provides adequate training

These are some of the intangibles that customers look for when they qualify the NAS solutions from vendors. And the surprise was that Oracle has become a force to be reckoned with, backed by the strong customer-centric legacy of Sun and StorageTek. If this is truly happening in the US, then kudos to Oracle for maximizing the Sun-StorageTek enterprise genes to make their NAS products best-of-breed.

However, on the local front, it seems to me that Oracle isn’t doing much justice to the human potential they inherited from Sun. A little bird has told me that they got rid of some good customer service people in Malaysia and Singapore just last month, and more could be on the way in 2012. All this for the sake of meeting some silly key performance indicators (KPIs) that measure tasks per day.

The Sun people I know here in Malaysia and Singapore are gurus who have gone through the fire and thrived, and there is no substitute for quality. Unfortunately, in Oracle, it’s all about numbers, whether it is sales or tasks per day.

Well, back to the survey. And of course, the final question would be, “Is the product good enough that you would buy it again?” And the results are …


Good for Oracle in the US, but the results do not fully reflect what’s on the ground here in Malaysia, which is more likely dominated by NetApp, HP, EMC and IBM.

Primary Dedupe, where are you?

I am a bit surprised that primary storage deduplication has not taken off in a big way, unlike when the buzz of deduplication first came into being about 4 years ago.

When deduplication solutions first came out, they were aimed particularly at the backup data space. Now more popularly known as secondary data deduplication, the technology reduced the inefficiencies of backup and helped spark the frenzy of adulation around companies like Data Domain, ExaGrid, Sepaton and Quantum a few years ago. The software vendors were not left out either: Symantec, Commvault and everyone else in town had data deduplication for backup and archiving.

It was no surprise that EMC battled NetApp and finally won the right to acquire Data Domain for USD2.4 billion in 2009. Today, in my opinion, the landscape of secondary data deduplication has pretty much settled and matured. Practically everyone has some sort of secondary data deduplication technology or solution in place.

But talk of primary data deduplication hardly causes a ripple compared to a few years ago, especially here in Malaysia. Yeah, the IT crowd is pretty fickle that way, because most tend to follow the trend of the moment. Last year it was Cloud Computing and now the big buzzword is Big Data.

We are here to look at technologies to solve problems, folks, and primary data deduplication solutions should be considered in any IT planning. It is our job as storage networking professionals to continue advising customers about what is relevant to their business and what addresses their pain points.

I get a bit cheesed off that companies like EMC or HDS continue to spend their marketing dollars on hyping the trends of the moment rather than using some of those funds to promote good technologies, such as primary data deduplication, that solve real-life problems. The same goes for most IT magazines, publications and other communication media, which rarely give space to technologies that solve problems on the ground, and just harp on hype, fuzz and buzz. It gets a bit too ordinary (and mundane) when they try too hard to be extraordinary, because everyone is basically talking about the same freaking thing at the same time, over and over again. (Hmmm … I think I am speaking off topic now … I better shut up!)

We are facing an avalanche of data. The other day, the CEO of Nexenta used the term “data tsunami”, but whatever term is used does not matter. There is too much data. Secondary data deduplication solved one part of the problem, and now it’s time to talk about the other part: data in primary storage, hence primary data deduplication.

What is out there? Who’s doing what in terms of primary data deduplication?

NetApp has had their A-SIS (now NetApp Dedupe) for years, and it is good in my books. They talk to customers about the benefits of deduplication on their FAS filers. (Side note: I am seeing more benefits from using data compression in primary storage, but I am not going there in this entry.) EMC put primary data deduplication in their Celerra years ago, but they hardly talk about it. It’s on their VNX as well, yet nobody at EMC ever speaks about their primary deduplication feature.

I have always loved Ocarina Networks’ ECO technology, and Dell hasn’t given much of a hoot about Ocarina since the acquisition in 2010. The technology surfaced a few months ago in the Dell DX6000G Storage Compression Node for its Object Storage Platform, but then again, all Dell talks about is its Fluid Data Architecture from the Compellent division. Hey Dell, you guys are so one-dimensional! Ocarina is a wonderful gem in your jewel case, and yet all your storage guys talk about are Compellent and EqualLogic.

Moving on … I ought to knock Oracle on the head too. ZFS has great data deduplication technology meant for primary data, and a couple of years back, GreenBytes took that and made a solution out of it. I don’t follow what GreenBytes is doing nowadays, but I do hope the big wave of primary data deduplication will rise for companies such as GreenBytes to take off in a big way. No thanks to Oracle for ignoring another gem in ZFS and wasting their resources on pre-sales (in Malaysia) and partners (in Malaysia) that hardly know much about the immense power of ZFS.

But an unexpected source, Microsoft, could help trigger greater interest in primary data deduplication. I have just read that the next version of the Windows Server OS will have primary data deduplication integrated into NTFS. The feature will be available in Windows 8, and the architectural view is shown below:

The primary data deduplication in NTFS will be a feature add-on for Windows Server users. It is implemented as a filter driver on a per-volume basis, with each volume a complete, self-describing unit. It is cluster-aware, and fully crash-consistent on all operations.

The technology is Microsoft’s own, built from scratch, and it will work to position Hyper-V as a strong enterprise choice in its battle with VMware for the server virtualization space. Mind you, VMware already has a big, big lead, and this is a do-or-die move for Microsoft to keep Hyper-V in the catch-up game. Otherwise, the gap between Microsoft and VMware in the server virtualization space will grow even greater.

I don’t have the full details of this, but I read that the NTFS primary deduplication chunk sizes will be between 32KB and 128KB, and that it will be post-process.
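To illustrate the principle, here is a generic sketch of post-process, variable-size chunk deduplication. This is not Microsoft’s actual algorithm; the rolling hash and its parameters are my own toy assumptions, chosen only to land chunks in a 32KB-128KB range:

```python
import hashlib

MIN_CHUNK, MAX_CHUNK = 32 * 1024, 128 * 1024
BOUNDARY_MASK = (1 << 13) - 1             # toy knob for average chunk size

def chunks(data: bytes):
    """Cut data at content-derived boundaries between MIN and MAX size."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF          # toy rolling hash
        if ((i - start >= MIN_CHUNK and (h & BOUNDARY_MASK) == 0)
                or i - start >= MAX_CHUNK):
            yield data[start:i]
            start, h = i, 0
    if start < len(data):
        yield data[start:]

store = {}                                # chunk hash -> unique chunk data

def dedupe(data: bytes) -> list:
    """Post-process a file's data into a list of chunk references."""
    refs = []
    for c in chunks(data):
        key = hashlib.sha256(c).hexdigest()
        store.setdefault(key, c)          # store each unique chunk once
        refs.append(key)                  # the file becomes a list of refs
    return refs
```

Identical chunks across files collapse into one stored copy, and because the scan runs post-process, the write path stays untouched; the savings are harvested later in the background.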

With Microsoft introducing their technology soon, I hope primary data deduplication will get the accolades it deserves, because I think most companies are really not doing justice to the great technologies in their jewel cases. And I hope Microsoft, with all its marketing savviness and adeptness, will do some justice to a technology that solves real-life data problems.

I bid you good luck, Primary Data Deduplication! You deserve better.

IDC Worldwide Storage Software QView 3Q11

I did not miss the IDC report of worldwide storage software for Q3 2011 when it was released a couple of weeks ago. I was just too busy to work on it until now.

The IDC QView report covers 7 functional areas of storage software:

  • Data protection and recovery software
  • Storage replication software
  • Storage infrastructure software
  • Storage management software
  • Device management software
  • Data archiving software
  • File system software

All areas are growing, and Q3 grew 9.7% when compared with the figures of 3Q2010. In the overall software market, EMC holds the top position at 24.5%, followed by Symantec (15.3%) and IBM (14.0%). Here’s a table showing the overall standings of the storage software vendors.


In fact, EMC leads in 3 areas: storage infrastructure, storage management and device management software. But the fastest-growing area is data archiving software, at a pace of 12.2%, followed by storage and device management at 11.3%.

HP is not in the table, but IDC reported that the biggest growth is coming from HP, with a 38.2% jump boosted by its acquisition of 3PAR. Watch out for HP in the coming quarters. Also worthy of note is the rate Symantec has been growing: theirs was only 2.2%, and IBM, at #3, is catching up fast. I wonder what’s happening at Symantec, having seen them lose their lofty heights in recent years.

The storage software market is a USD3.5 billion market, and it is one on which storage vendors are placing more importance. This market will grow.

Gartner 3Q2011 WW ECB Disk Storage Market

Just after IDC released the numbers of their worldwide Disk Storage System Tracker (read my blog) 10 days ago, Gartner released their Worldwide External Controller Based (ECB) Disk Storage Market report for Q3 of 2011.

The storage market remains resilient (for now), growing 10.4% in terms of revenue despite the hard economic conditions. The table below shows the top 7 storage vendors and their numbers relative to Q2.


EMC remained at the top and gained a massive 3.6% jump in market share. Looks like they are firing on all cylinders and chugging like an unstoppable steam train. IBM gained 0.1% in second place as its stable of DS8000, XIV and Storwize V7000 takes shape. Even though IBM has been holding steady, I still think that their present storage lineup is staggered and lacks a seamless upgrade path for their customers.

NetApp, which I have always termed “the little engine that could”, is slowing down. They were badly hit in the last quarter, delivering lower-than-expected revenue numbers according to the analysts. Their stock took a tumble too. As quoted by Gartner, “NetApp’s third-quarter results reflect an overdependence on a few large customers, limited geographic coverage in high-growth countries and increased competition from Dell, EMC, HP and IBM in the midrange modular ECB disk array market segment.”

I wrote in my recent blog that NetApp has to start evolving from a pure-play storage vendor into a total storage and data management solution vendor. The recent rumours of NetApp’s interest in Commvault and Quantum would make a lot of sense if NetApp decided to make that move. Come on, NetApp! What are you waiting for?

HP came back strong in this report. They are in 4th place with 10.4% market share, hot on NetApp’s heels. After many months of nonsensical madness (the Leo Apotheker firing, trying to ditch the PC business, the killing of the webOS tablet, the very public Oracle-HP spat), things are beginning to settle a bit under their new CEO, Meg Whitman. At the recent HP Discover conference in Vienna, it was reported that the HP storage team is gung-ho about what they have in their arsenal right now. They call it “The 4 Jewels of the HP Storage Crown”: 3PAR, Ibrix, StoreOnce and LeftHand. They also leapfrogged HDS and Dell in the recent Gartner Magic Quadrant (see below).

Kudos to HP and team.

HDS seems to be doing well, and so is Dell. But the Gartner numbers tell a different story. HDS lost market share and now shares 7.8% with Dell. Dell, despite its strong marketing of Compellent, could not make up the losses from its break-up with EMC.

Fujitsu and Oracle complete the lineup.

My conclusion: HP and IBM are coming back; EMC is well and far ahead of everyone else; NetApp has to evolve; Dell still lacks enterprise storage savviness despite having good technology; no comment about HDS.

Magic on storage players

It’s that time of the year again, when Gartner releases its Magic Quadrant for the block-access, external controller-based, midrange and high-end modular disk array market. This particular one is very important because it represents the mainstay of the overall storage industry, viewed from a more qualitative angle. Whereas the other charts and reports work with statistics and numbers, this is the chart that everyone in the industry flocks to. The Gartner Magic Quadrant (MQ) is the storage industry’s indicator of who the leaders are, who the visionaries are, who the execution wizards (the challengers) are, and who the niche players are.

So, this time around, who’s in the Leaders’ Quadrant?

The perennial players in the Leaders’ Quadrant are EMC, IBM, NetApp, HP, Dell and HDS. In my previous blog, I shared with you the IDC figures about market shares, but the Gartner MQ shows a more subtle side, one that perhaps carries more weight with organizations.

From the IDC numbers announced previously, we have seen Dell taking a beating. They lost market share, and similarly in this latest Gartner MQ, they have lost some of the significance of their influence as well. Everyone expected their Compellent solution to be robust, and that having EqualLogic, Ocarina and Exanet in their stable would strengthen their presence in the storage industry. Surprisingly, Dell lost on both the statistically charged IDC market numbers and this Gartner MQ as well. Perhaps they were too hasty to dump EMC a few months ago?

Gartner also reported that HP has made a significant leap in the Leaders’ Quadrant. It has leapfrogged HDS and IBM in the Gartner MQ chart. This could be coming from their concerted effort to pitch their Converged Infrastructure, a vision that, in my opinion, simplifies computing. HP Malaysia shared their vision with me a few months ago, and I was impressed. What I was not very impressed with, then and even now, is that their storage solutions story is still staggered, lacking the gel. Perhaps it is a work in progress for HP: the 3PAR, the IBRIX and the EVA. But one thing’s for sure: they are slowly but surely getting the StoreOnce story right, and that’s good news for customers. I did a review of the HP StoreOnce technology a few months ago.

Perhaps it’s time for HP to ditch their VLS deduplication, which, to me, confuses customers. By the way, HP VLS is an OEM from Sepaton. (Sepaton is “no tapes” spelled backwards.)

Here’s a glimpse of last year’s Magic Quadrant.


In the Niche Players’ Quadrant, there are a few players making waves as well. 2 companies to watch out for are Huawei (they dropped Symantec 2 weeks ago) and Nexsan. Nexsan has been beefing up its marketing of late, and I often see them in mailing lists and in ads on some websites I visit.

But the one to watch will be Huawei. This is a company with deep pockets, hiring the best in the storage industry, and it also has a very strong domestic market in China. In the next 2-3 years, Huawei could emerge as a strong contender to the big boys. So watch out!

Gartner Magic Quadrant is indeed weaving its magic and this time around the magic is good to HP.

Data Deduplication – Dell is first and last

A very interesting report surfaced in front of me today. It is Information Week’s IT Pro ranking of data deduplication vendors, made available just a few weeks ago, and it is an overview of the dedupe market so far.

It surveyed over 400 IT professionals from various industries, with companies ranging from fewer than 50 employees to over 10,000 employees and revenues from less than USD5 million to USD1 billion. Overall, it had a good mix of respondents. But the results were quite interesting.

It surveyed 2 segments:

  1. Overall performance – product reliability, product performance, acquisition costs, operations costs etc.
  2. Technical features – replication, VTL, encryption, iSCSI and FCoE support etc.

When I saw the results (shown below), surprise, surprise! Here’s the overall performance survey chart:

Dell/Compellent scored the highest in this survey, while EMC/Data Domain ranked the lowest. However, the difference between the first-place and last-place vendors is only 4%, which suggests that EMC/Data Domain was about as good as the Dell/Compellent solution but scored poorly in the areas that matter most to the customer. In fact, as we drill down into the requirements of overall performance one by one, as shown below,

there is little difference among the 7 vendors.

However, when it comes to Technical Features, Dell/Compellent is ranked last, the complete opposite. As you can see from the survey chart below, IBM ProtecTIER, NetApp and HP are all ranked #1.

The details, as per the technical requirements of the customers, are shown below:

These figures show that the competition between the vendors is very, very stiff, with little difference in edge from one to another. But what I was more interested in were the following findings, because these figures tell a story.

In the survey, only 34% of the respondents said they have implemented a data deduplication solution, while the rest are evaluating or planning to evaluate. This means the overall market is not saturated and there is still a window of opportunity for the vendors. However, the speed at which the data deduplication market has matured, from early adopters perhaps 4-5 years ago to overall market adoption, surprised many, because the storage industry tends to be a bit less trendy than most areas of IT. At the rate data deduplication is going, it will be very much a standard feature of all storage vendors in the very near future.

The second figure, which is probably not so surprising, is that of the customers who have already implemented a data deduplication solution, almost 99% are satisfied or somewhat satisfied with their solutions. Therefore, the likelihood of these customers switching vendors and replacing their gear is very low, perhaps partly because of the reliability of the solutions as well as the products performing as they should.

The Information Week IT Pro survey probably reflects well where the deduplication market is going, and there isn’t much difference in technical and technology features from vendor to vendor. Customers will have to choose beyond the usual technology pitch and look for other (and perhaps more important) subtleties such as customer service, price and flexibility of doing business. EMC/Data Domain, being king of the hill, has not been the best of vendors when it comes to price, quality of post-sales support and service innovation. Let’s hope they are not like the EMC sales folks of the past, carrying the “take it or leave it” tag, when they develop relationships with their future customers. And it will not help if word-of-mouth goes around the industry about EMC’s arrogance in their dominance. It may not be true, and let’s hope it is not, because the EMC of today has changed plenty compared to the Symmetrix days. EMC/Data Domain is now part of their Backup Recovery Service (BRS) team, and I have good friends there at EMC Malaysia and Singapore. They are good guys, but remember, guys: the customer is still king!

Dell, new to the game with their acquisitions of Compellent and Ocarina Networks, seems very eager to win business, and kudos to them as well. In fact, I heard from a little birdie that Dell is “giving away” several units of Compellent to selected customers in Malaysia. I did not and cannot ascertain whether this is true, but if it is, that’s what I call thinking out of the box for a latecomer to the storage game. Well done!

One thing to note is that the survey took in 17 vendors, including ExaGrid, FalconStor, Quantum, Sepaton and so on, but only the top 7 shown in the charts qualified.

In the end, I believe the deduplication vendors had better scramble to grab as much as they can in the coming months, because this market will be going, going, gone pretty soon, with nothing much left to grab after that, unless there is a disruptive innovation in deduplication technology.

A wizer IBM

A couple of nights ago, IBM launched a slew of new storage technology updates and a new cloud service called SmartCloud Enterprise, which incorporates some cloud technology from Nirvanix.

There were updates to IBM XIV, SVC, SONAS and the DS8800, and the announcement reached us with a big bang. One of the notable updates that caught my eye was the IBM Storwize V7000. When IBM first acquired Storwize in 2010, the solution was meant to be a compression engine in front of NAS storage. And it stayed pretty much that for a while, until the new Storwize V7000.

The new Storwize V7000 is now a unified storage array, a multiprotocol box that IBM has positioned to compete with the EMC VNX series. According to the news, the V7000 has the block virtualization code from the IBM SVC, file support, a file distribution policy engine called ActiveCloud, remote replication (Metro & Global Mirror), automatic storage tiering (EasyTier), clustering, and storage virtualization as well. It also sports a new user interface, inherited from the IBM XIV Gen3 GUI, that can manage both files and blocks.
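Automatic storage tiering, by the way, is conceptually simple. Here is a toy sketch of the idea, my own illustration and not IBM’s EasyTier algorithm: count I/Os per extent over a measurement cycle, then keep the hottest extents on the SSD tier:

```python
from collections import Counter

SSD_EXTENTS = 2                     # capacity of the fast tier, in extents
heat = Counter()                    # extent id -> I/O count this cycle

def record_io(extent: int):
    heat[extent] += 1

def rebalance() -> dict:
    hot = {e for e, _ in heat.most_common(SSD_EXTENTS)}
    placement = {e: ("ssd" if e in hot else "hdd") for e in heat}
    heat.clear()                    # start a fresh measurement cycle
    return placement

for e in [1, 1, 1, 2, 3, 3, 3, 3, 4]:
    record_io(e)
print(rebalance())                  # {1: 'ssd', 2: 'hdd', 3: 'ssd', 4: 'hdd'}
```

Real tiering engines work with larger extents, damping to avoid thrashing, and background migration, but the promote-what’s-hot loop is the heart of it.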

The video below introduces the V7000:

While IBM is being courteous to NetApp (the NetApp FAS series is IBM’s N series) by saying that their cannons are pointed towards EMC’s VNX, one cannot help but question the strong possibility of the V7000 hurting N series sales as well. NetApp could see this relationship sailing into choppy waters ahead.

To me, the current IBM storage technology lineup is staggered. It is everything to everyone, and there are things in need of sharpening. HDS has certainly made great leaps in getting their act together, and they have gained strong market share in the past 2 quarters. Dell and HP have not been so good, because their stories just don’t gel well. It’s about time IBM got going with their own technology and, more importantly, consolidated their storage technology lineup into a more focused strategy.

This is a great announcement for IBM and they are getting wizer!