VMware – the silent storage killer

When VMware 5.0 was launched last month, I heard that the feature called Virtual Storage Appliance (VSA) was finally out and is now being offered as an SMB/SME “storage” solution. In my mind, alarm bells were ringing, because in its own stealthy manner, VMware had just become a storage player.

What VMware is offering is essentially: “Hey! If you don’t have the money to buy an enterprise storage array, don’t worry. Make your own shared storage with our very own VMware VSA“. The VSA utilizes the internal disks of the ESX/ESXi hosts as its shared storage.

VSA is nothing new. For years, LeftHand Networks had one for its engineers to do demos and show the functionality of their solution. EMC had one too, and recently I found out that NetApp has its own VSA, but only resells it through its partner, Fujitsu. I am not 100% sure about the NetApp one and I need a NetApp guy to verify this.

Smaller, but not insignificant, players such as Nutanix, Nexenta and Tintri are already offering their own versions and implementations of the VSA to their customers, each with its own uniqueness and differences. With the release of the VMware VSA into the open, we shall see all the big storage players offering their VSAs to VMware, like natives offering sacrifices to the VMware God. Or perhaps it has already begun. It is à la Nexus 1000v all over again.

VMware has become a huge juggernaut and it is merely using its advantage to consolidate the storage component under its control. When VMware version 4.0 came out, the vStorage APIs were introduced, along with VAAI (vStorage API for Array Integration). VAAI was created to enhance the storage experience by offloading specific storage operations to the native features of the supported storage platform. That’s all I know about VAAI at this moment, but with this feature, the storage array is tightly integrating its platform with VMware, or should I say … quietly being ensnared by VMware’s tentacles of doom! (Evil laugh in the background! Mua ha ha ha ….!)

At the recently concluded VMworld, this storage story was unfurled even more to the world. VASA (vStorage API for Storage Awareness) was announced, and EMC’s COO Pat Gelsinger spoke about the tighter integration (that word again!) that blurs the administration domains of the VMware admin and the storage admin. Below is a video of Pat Gelsinger talking about VASA (it is a long, 55-minute video – click only if you have the time).

Mind you, the entire vStorage API is still evolving as VMware 5.0 rolls out, but here’s the thing. VMware has come out and said that the storage world of LUNs, RAID groups and mount points is a level below what the VMware admin should be concerned about. VMware admins handle their storage at the VM or VMDK level, and therefore anything below it is of little significance to them. Again, you can see that VMware is using its muscle to say, “If you guys want to play, you have to play by my rules“.

So, some new storage announcements came out of VMworld, such as Capacity Pools, I/O Multiplexer, and Storage DRS (Storage Distributed Resource Scheduler), and also an enhanced (probably more storage-resilient) version of SRM (Site Recovery Manager). All of these are managed at a level above the traditional storage admin level, and VMware has said that the VMware admin would be able to carve out a VM volume with its own set of default storage properties, defined snapshot retentions, replication and perhaps even compression and deduplication. But all of this will be happening at the VM volume or VMDK level, not a level below that.

Details are still sketchy at this point in time and we probably won’t see these go GA until VMware version 6.0. But the wheels have been set quietly in motion, and the VMware storage momentum will gain strength as time passes. We could see a future where VMware needs just JBOD (just a bunch of disks), because it has its own enterprise storage features through its vStorage APIs or its future storage specifications. We have already seen it happening with the VSA, with VMware offering its own storage.

From the same news, what surprised me was the quote shown below.

The presenters said VMware developed the APIs with EMC, NetApp, Dell,
IBM and Hewlett-Packard, but they began the session with a disclaimer
that none of those vendors has committed to support the APIs in
their arrays.

Why the hell would EMC, NetApp, Dell, IBM and HP do something like that?!! Don’t they know that this could contribute to their insignificance in the future?

I am still perplexed, but as the whole thing is still evolving, VMware seems to be the only obvious winner here.

Gartner figures about the storage market – Half year report

After the IDC report a couple of weeks back, Gartner released their Worldwide External Controller-Based (ECB) Disk Storage Market report last week. The Gartner report mirrors the IDC report, confirming the situation in the storage market, and it’s good news!

Asia Pacific and Latin America are 2 regions experiencing tremendous growth, at 27.9% and 22.4% respectively. This means that the demand for storage networking and data management professionals is greater than ever. I have always maintained that it is important for professionals like us to enhance our technical and technology know-how to ride on the storage growth momentum.

So, from the report, there are no surprises. Below is a table that summarizes the Gartner report.

 

As you can see, HP lost market share, together with Dell, Fujitsu and Oracle. Oracle is focusing its energies on its Exadata platform (and it’s all about driving more database license sales), and hence its 7000-series is suffering. Fujitsu, despite its partnerships with NetApp and EMC, and its own Eternus storage, lost ground as well.

Dell seems to be losing ground too, but that could be the after-effects of divorcing EMC after picking up Compellent early this year. Dell should be able to bounce back, as there are reports stating that Compellent is picking up a good pace for Dell. One of the reports is here.

The biggest loser of the last quarter is HP. Even though it is only a 0.3% market share drop, things do not seem so rosy, as I have been observing their integration of 3PAR since the purchase late last year. No doubt they are firing on all cylinders, but 3PAR does not seem to be helping HP gain market share (yet). The mid-tier has to be addressed as well, and having the old-timer EVA at the helm is beginning to show split ends. Good for the hairdresser; not good for HP. IBRIX and LeftHand complete most of HP’s storage line-up.

HDS is gaining ground as their storage story begins to gel quite well. Coupled with some great moves consolidating their services business and their Deal Operations Center (DOC) in Kuala Lumpur, it has become simpler for customers to do business with them. Every company has its challenges, but I am beginning to see quite a bit of traction from HDS in the local business scene.

IBM also increased market share with a 0.2% jump. Rather tepid overall, but I was informed by an IBMer that their DS8000s and XIVs are doing great in the South East Asia region. Kudos, but again, IBM still has to transform its mid-tier DS4000/DS5000 business, for which IBM OEMs the storage backend from NetApp’s Engenio division.

EMC and NetApp are the 2 juggernauts. EMC has been king of the hill for many quarters, and I have always been surprised at how nimble EMC is, despite being an 800-pound gorilla. NetApp has proven its critics wrong. For many quarters it has been taking market share, and that is reflected in the Gartner half-year report below:

 

There you have it, folks: the Gartner WW ECB Disk Storage report. Again, I just want to mention that this is a wonderful opportunity for those of us doing storage and data management solutions. The demand is there for experienced and skilled professionals, but we have to be good, really good, to compete with the rest.

EMC and NetApp gaining market share with the latest IDC figures

The IDC 2Q11 global disk storage systems report is out. The good news is data is still growing, and at a tremendous pace as well. Both revenue and capacity have raced ahead with double digit growth, with capacity growth reaching almost 50%.

And not surprisingly to me, EMC and NetApp have gained market share at the expense of HP, IBM and Dell. Here are a couple of statistics tables:

Both EMC and NetApp have recorded more than 25% revenue growth, taking 1st and joint-2nd place respectively. I have always been impressed by both companies.

For EMC, the 800-pound gorilla of the storage market, to be able to get 26% revenue growth is a massive, massive endorsement of how well EMC executes. They are like a big oil tanker in rough seas, with the ability to do a 90-degree turn in the blink of an eye. Kudos to Joe Tucci and Pat Gelsinger.

NetApp has always been my “little engine that could”. Their ability to take market share quarter-on-quarter, year-on-year is second to none and once again, they did not disappoint. Even the change of the big man from Dan Warmenhoven to Tom Georgens did not leave a smudge on its armour. And with the purchase of LSI’s Engenio business this year, NetApp will go from strength to strength, gaining market share at others’ expense. I believe NetApp’s culture plays a big role in their ability and their success. The management has always been honest and frank, and there’s a lot of respect for an individual’s ability to contribute. No wonder they are the #5 best company to work for in the US.

The big surprise for me here is Hitachi Data Systems, posting 23.3% growth. That’s tremendous, because HDS has never been known to hit such high growth. Perhaps they have finally got the formula right. Their VSP and AMS ranges must be selling well, but again, for HDS it is a challenge running two different cultural systems within the company. The Japanese team and the US team must be hitting synchronicity at last.

Dell, despite firing on all cylinders with EqualLogic and Compellent, actually lost market share. Their partnership with EMC has come to an end and they have not yet converted their customers to the EqualLogic and Compellent boxes. The Compellent purchase is fairly new (Q1 of 2011) and it will take some time to sink in with their customers. Let’s see how they fare in the next IDC report.

In the table above, HP has always been king of the hill. Bundling their direct-attached or internal storage with their servers, just like IBM, has given them an unfair advantage. But for the first time, EMC has outshipped HP, without the presence of DAS and internal storage (which EMC does not sell). Even with the purchase of 3PAR late last year, HP was not able to milk the best of what 3PAR can offer. Not to mention that HP also has LeftHand Networks, which has now been renumbered as the P4000. On the other hand, this is a fantastic result for EMC.

Where’s IBM in all this? Rather anemic, sad to say, compared to EMC and NetApp. IBM’s figures were half of what EMC and NetApp are posting, and this is not good. They don’t have the right weapons to compete. XIV is slowly taking over the mantle of the DS8000 as their flagship storage, and their DS series is putting up its usual numbers. But that’s not good enough, because if you look at the IBM line-up, their Shark is pretty much gone. XIV and Storwize are the only 2 storage platforms that IBM owns. Mind you, Storwize is not really a primary storage solution; it’s a compression engine. The DS series and N series actually belong to LSI’s Engenio (which NetApp now owns) and NetApp respectively. So, IBM lacks the IP for storage, and in the long run IBM must do something about it. They must either buy or innovate. They should have bought NetApp when they had the chance in 2002, but today NetApp is becoming an impossible meal to swallow.

We shall see how IBM turns out but if they continue to suffer from anemia, there’s going to be trouble down the road.

As for HP, what can I say? Their XP range is from HDS, but with 3PAR in the picture, it looks like the marriage could be ending soon. EVA is an aging platform and they have got to refresh it with stronger mid-tier platforms. As for the low end of the range, the MSA is also unexciting, and I secretly believe that LeftHand should have stepped up. But unfortunately, the HP sales force has to be careful not to push the MSA and LeftHand side by side and have them cannibalize each other. HP definitely has a challenge on its hands, and both 3PAR and LeftHand have been with them for more than 2 quarters. It’s time to execute, because the IDC figures have already proved that they are slipping.

What next HP?

 

An all-SSD storage array? There’s more than meets the eye at Pure Storage

Wow, after an entire week off with the holidays, I am back and excited about the many happenings in the storage world.

One of the more prominent pieces of news was the announcement of Pure Storage launching its enterprise storage array built entirely with flash-based solid state drives. In addition, there were other start-ups also offering SSD storage arrays. The likes of Nimbus Data, Avere and Violin Memory all made the news, as well as the granddaddy of solid state storage arrays, Texas Memory Systems.

The first thing that came to my mind was, “Wow, this is great because this will push down the $/GB of SSDs closer to the range of $/GB for spinning disks”. But then skepticism crept in and I thought, “Do we really need an entire enterprise storage array of SSDs? That’s going to cost the world”.

At the same time, we in the storage industry know that no two pieces of data are alike. They can be large, small, random, sequential, accessed frequently or infrequently and so on. It is obviously better to tier the storage, using SSDs for Tier 0, 10K/15K RPM spinning HDDs for Tier 1, SATA for Tier 2 and perhaps tape for the archive tier. I was already tempted to write about my pessimism on Pure Storage when something interesting caught my attention.

Besides the usual marketing jive of sub-millisecond, predictable latency, green messaging, global inline deduplication and compression, and built-in data integrity in its Purity Operating Environment (POE), I was very surprised by the team behind Pure Storage. Here’s their line-up:

  • Scott Dietzen, CEO – starting from principal technologist of Transarc (sold to IBM), principal architect of WebLogic (sold to BEA Systems), CTO of BEA (sold to Oracle), CTO of Zimbra (sold to Yahoo! and then to VMware)
  • John “Coz” Colgrove, Founder & CTO – Veritas Fellow, CTO of the Symantec Data Management group, principal architect of Veritas Volume Manager (VxVM) and Veritas File System (VxFS), and holder of 70 patents
  • John Hayes, Founder & Chief Architect – formerly of Yahoo!’s office of the Chief Technologist
  • Bob Wood, VP of Engineering – formerly NetApp’s VP of File System Engineering
  • Michael Cornwell, Director of Technology & Strategy – formerly the lead technologist of Sun Microsystems’ Sun Storage F5100 Flash Array and also Quantum’s storage architect for their storage telemetry, VTL and DXi solutions
  • Ko Yamamoto, VP of System Engineering – previously NetApp’s director of platform engineering, Quantum DXi’s director of hardware engineering, and also a key contributor to 4 generations of Tandem NonStop technology

In addition to that, there are 3 key individual investors worth mentioning:

  • Diane Greene – Founder of VMware and former CEO
  • Dr. Mendel Rosenblum – Founder and former Chief Scientist and creator of VMware
  • Frank Slootman – formerly CEO of Data Domain (acquired by EMC)

All these industry big guns are flocking to Pure Storage for a reason, and it looks to me like Pure Storage ain’t your ordinary, run-of-the-mill enterprise storage company. There’s definitely more than meets the eye.

On top of the enterprise storage array platform is Pure Storage’s Purity Operating Environment (POE). POE focuses on 3 key storage services, which are:

  • High Performance Data Reduction
  • Mission Critical Reliability
  • Predictable Sub-millisecond Performance

After going through the deep-dive videos by Pure Storage’s CTO, John Colgrove, it is clear that they are very much banking the success of their solution on SSDs. Everything that they have done is based on SSDs. For example, in order to achieve a larger usable capacity as well as a much cheaper $/GB, data reduction techniques such as global deduplication, high compression and fine-grained thin provisioning at 512-byte granularity are used. By trading off IOPS (which SSDs have in plenty, since they are several times faster than conventional spinning disks), a larger usable capacity is achieved.
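
To illustrate the idea, here is my own toy sketch in Python, not Pure Storage’s actual implementation; the chunk size and hashing scheme are assumptions. Global inline deduplication boils down to fingerprinting each small chunk and storing only the chunks that have not been seen before, with compression applied to what is actually kept:

```python
import hashlib
import zlib

class DedupStore:
    """Toy content-addressed store: dedupe at a fine granularity, then compress unique chunks."""
    def __init__(self, chunk_size=512):              # 512-byte granularity, per the article
        self.chunk_size = chunk_size
        self.chunks = {}                              # fingerprint -> compressed chunk

    def write(self, data: bytes) -> list:
        refs = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()    # fingerprint the chunk
            if fp not in self.chunks:                 # store only chunks never seen before
                self.chunks[fp] = zlib.compress(chunk)
            refs.append(fp)                           # duplicates cost only a reference
        return refs

    def read(self, refs: list) -> bytes:
        return b"".join(zlib.decompress(self.chunks[fp]) for fp in refs)

store = DedupStore()
refs = store.write(b"A" * 2048)                       # four identical 512-byte chunks
print(len(store.chunks), "unique chunk stored for", len(refs), "logical chunks")
```

The extra hashing and lookups cost CPU cycles and IOPS, which is exactly the trade-off described above: spend some of the SSD’s abundant IOPS to multiply the usable capacity.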

In their RAID 3D, they have also incorporated several high-reliability techniques and data integrity algorithms that are specifically for SSDs. One note that was mentioned was that traditional RAID, and especially the parity-based RAID levels, was originally designed to protect against an entire device failing. However, in SSDs, the failure does not necessarily occur across the entire device. Because of the way SSDs are built, the failure hotspots tend to happen at the much more granular bit level of the SSD. The erase-then-write technique that is inherent in NAND flash SSDs causes the bit error rate (BER) of the SSD device to go up as the device ages. Therefore, it is more likely to get a read/write error from within the SSD’s memory itself than to have the entire SSD device fail. Pure Storage’s RAID 3D is meant to address such occurrences of bit errors.

I spoke a bit about storage tiering earlier in this article, because every corporation employs storage tiering to be financially responsible. However, John Colgrove’s argument was: why tier the storage when there are plentiful IOPS and the $/GB is comparable to spinning disks? That is true when the $/GB of SSDs can match the $/GB of spinning disks. Factors we must also take into account are the rack-space savings from the smaller profile of SSDs and the power-saving costs of SSDs versus conventional HDD-based enterprise storage arrays. Taken in its entirety, there are strong indications that the $/GB of SSD-based systems can match, or perhaps even come in below, the $/GB of HDD-based systems. And since present-day applications have not demanded super-high IOPS and multi-core processing is cheap, there’s plenty of headroom for Pure Storage and other similar enterprise storage array companies to grow.
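
Here is a quick back-of-the-envelope calculation of that argument. The prices and the data reduction ratio below are purely hypothetical placeholders, not vendor figures; the point is only to show how data reduction changes the effective $/GB:

```python
# Hypothetical numbers purely for illustration; not actual vendor pricing.
ssd_cost_per_raw_gb = 10.0     # assumed raw $/GB for enterprise flash
hdd_cost_per_raw_gb = 1.5      # assumed raw $/GB for enterprise SATA
data_reduction_ratio = 5.0     # assumed combined dedup + compression ratio

# Data reduction multiplies the usable capacity, so it divides the effective $/GB.
ssd_cost_per_usable_gb = ssd_cost_per_raw_gb / data_reduction_ratio

print(f"Effective SSD $/GB: {ssd_cost_per_usable_gb:.2f}")
print(f"Raw HDD $/GB:       {hdd_cost_per_raw_gb:.2f}")
# Fold in rack-space and power savings and the gap narrows further.
```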

The tides are changing for the storage industry, and it is good to see a start-up like Pure Storage boldly coming forth to announce its backing for SSDs. It’s good for the consumer and good for the industry. But more importantly, they are driving innovation and making us rethink how we build storage arrays. I am looking forward to more things to come.

Copy-on-Write and SSDs – A better match than other file systems?

We have been taught that file systems are like folders, sub-folders and eventually files. The criteria in designing file systems are to ensure a few key features:

  • Ease of storing, retrieving and organizing files (sounds like a fridge, doesn’t it?)
  • Simple naming convention for files
  • Performance in storing and retrieving files – hence our write and read I/Os
  • Resilience in restoring all or part of a file when there are discrepancies

In file system performance design, one of the most important factors is locality. By locality, I mean that the data blocks of a particular file should be as close to one another as possible. Hence, most file system designs that originated from the Berkeley Fast File System (FFS) require the file system to seek to the data block to be modified in order to preserve locality, i.e. you try not to split up the contiguity of the data blocks. Seeking to the required data block takes time, but you are compensated with faster reads, because the read-ahead feature allows you to read extra blocks in anticipation that the neighbouring data blocks are related.

In Copy-on-Write file systems (also known as shadow-paging file systems), the seek portion is usually not present, because the modified block is written somewhere else, not at the present location of the original block. This is the foundation of Copy-on-Write file systems such as NetApp’s WAFL and Oracle Solaris ZFS. Because the new data blocks are written somewhere else, the storing (write operation) portion is faster. It eliminates the seek time and it also skips the read-modify-write action at the original location of the data block. Therefore, writes are likely to be faster.
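
As a rough illustration (a toy sketch of the general copy-on-write idea, not WAFL’s or ZFS’s actual code), a COW block map never overwrites a block in place; it appends the new version somewhere else and simply repoints the logical block:

```python
class CowBlockMap:
    """Toy copy-on-write block map: updates never overwrite data in place."""
    def __init__(self):
        self.storage = []        # append-only "disk": new blocks always land in new locations
        self.block_map = {}      # logical block number -> physical location

    def write(self, lbn, data):
        self.storage.append(data)                     # write somewhere else, no seek to the old block
        self.block_map[lbn] = len(self.storage) - 1   # repoint the logical block to the new location

    def read(self, lbn):
        return self.storage[self.block_map[lbn]]      # may sit far away from its logical neighbours

fs = CowBlockMap()
fs.write(0, b"version 1")
fs.write(0, b"version 2")        # the old block stays where it is; only the pointer moves
print(fs.read(0))                # b'version 2'
```

Notice that the read path has to chase the pointer to wherever the last write landed, which is exactly the locality problem described next.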

However, the reads will be slower, because when you want to read a file, the file system has to go hunting for the data blocks, since it lacks locality. Therefore, as the COW file system ages, it tends to have higher file system fragmentation. I wrote about this in my previous blog. It is a case of ENJOY-FIRST/SUFFER-LATER. I am not writing this to say that COW file systems are bad. Obviously, NetApp and Oracle have done enough homework to make their file systems some of the better storage file systems in the market.

So, that’s Copy-on-Write file systems. But what about SSDs?

Solid State Drives (SSDs) make enemies of file systems that prefer locality. Remember that some file systems prefer their data blocks to be contiguous? Well, SSDs employ “wear-leveling”, which requires writes to be spread out as much as possible across the SSD device to prolong its life and reduce “wear-and-tear”. That’s not good news, because the SSD has just told the file system, “I don’t like locality and I will spread out the data blocks“.

NAND flash SSDs (the common ones we find in the market, as opposed to DRAM-based SSDs) are funny creatures. When you overwrite data on an SSD, you must ERASE first, then WRITE AGAIN. This is the part that creates the wear-and-tear of the device. What I mean by ERASE first, WRITE AGAIN is described below:

  • Writing 1 –> 0 (OK, no problem)
  • Writing 0 –> 1 (not OK, because NAND Flash can’t do that)

So, what does the SSD do? It ERASES the whole erase block, resetting all of its cells to 1s, and then converts some of them back to 0s. Crazy, isn’t it? The firmware in the SSD controller will also spread the erase-then-write operations across the entire SSD device to avoid concentrating them on a small location or dataset. This is the “wear-leveling” we often hear about.
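
Here is a toy model of that constraint (my own sketch for illustration; real SSD controllers are far more sophisticated): within a block, a program operation can only flip bits from 1 to 0, so any write that needs a 0-to-1 transition forces an erase of the whole block first, and every erase adds wear.

```python
class NandBlock:
    """Toy NAND erase block: programming can only flip bits 1 -> 0; erase resets all bits to 1."""
    def __init__(self, size=4):
        self.cells = [1] * size      # erased state is all 1s
        self.erase_count = 0         # wear accumulates with every erase cycle

    def erase(self):
        self.cells = [1] * len(self.cells)
        self.erase_count += 1        # wear-leveling tries to spread these erases across blocks

    def program(self, new_bits):
        if any(n > c for n, c in zip(new_bits, self.cells)):
            self.erase()             # cannot turn a 0 back into a 1 without erasing first
        self.cells = [c & n for c, n in zip(self.cells, new_bits)]

blk = NandBlock()
blk.program([1, 0, 1, 1])            # fine: only 1 -> 0 transitions, no erase needed
blk.program([1, 1, 1, 1])            # needs an erase first, because 0 -> 1 is not allowed
print(blk.cells, "erases so far:", blk.erase_count)
```

The controller’s wear-leveling logic simply chooses which physical block takes the next erase-and-program cycle, so that no single block wears out long before the rest.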

Since SSDs shun locality and avoid keeping data blocks close together, and Copy-on-Write file systems already do this because it is their nature to write new data blocks somewhere else, the combination of a COW file system and SSDs seems like a very good fit. It even looks symbiotic, because it is a case of “I help you; and you help me“.

From this perspective, the benefits of pairing COW file systems with SSDs extend beyond the resiliency of the SSD device to performance as well. Since the data blocks are spread out across different locations in the SSD device, the effect of parallelism will inadvertently help with COW’s performance. Makes sense, doesn’t it?

I have not looked into other file systems and how they behave with SSDs, but it is pretty clear that Copy-on-Write file systems work well with Solid State Drives. Have a good week ahead :-)!

Snapshots? Don’t have a C-O-W about it!

Unfortunately, I am having a COW about it!

Snapshots are the inherent offspring of the copy-on-write technique used in shadow-paging filesystems. NetApp’s WAFL and Oracle Solaris ZFS are commercial implementations of shadow-paging filesystems and they are typically promoted as Copy-on-Write filesystems.

As we may already know, snapshots are point-in-time copies of the active file system in the storage world. They perform a quick backup of the active file system by making a copy of the block addresses (pointers) of the filesystem, and then updating the pointer maps to the inodes in the fsinfo root inode of the WAFL filesystem for new changes after the snapshot has been taken. The equivalent of fsinfo in the ZFS filesystem is the uberblock.

However, contrary to popular belief, the snapshots from WAFL and ZFS are not copy-on-write implementations even though the shadow paging filesystem tree employs the copy-on-write technique.

Consider this for a while when a snapshot is being taken … Copy —- On —- Write. If the definition is (1) Copy then (2) Write, this means that there are several steps to perform a copy-on-write snapshot. The filesystem has to make a copy of the original data block (1 x Read I/O), then write the original data block to a new location (1 x Write I/O), and then write the new data block to the location of the original data block (1 x Write I/O).

This is a 3-step process that can be summarized as

  1. Read location of original data block (1 x Read I/O)
  2. Copy this data block to new unused location (1 x Write I/O)
  3. Write the new and modified data block to the location of original data block (1 x Write I/O)

This implementation IS the copy-on-write technique for snapshots, but NetApp and possibly the Oracle guys have been saying for years that their snapshots are based on copy-on-write. This is pretty much a misnomer that needs to be corrected. EMC, in its SnapSure and SnapView implementations, calls this technique Copy-on-First-Write (COFW), probably to avoid the confusion. The data blocks are copied to a savvol, a separate location that stores the snapshot changes and defaults to 10% of the total capacity of their storage solutions.
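
A minimal sketch of that sequence (my own illustration of the copy-on-first-write idea in Python, not EMC’s actual SnapSure/SnapView code): the first time a block is overwritten after a snapshot, the old contents are read and preserved in the save area before the new data is written in place.

```python
class CofwSnapshot:
    """Toy copy-on-first-write: preserve the old block in a save area, then overwrite in place."""
    def __init__(self, volume):
        self.volume = volume          # active volume: block number -> data
        self.save_area = {}           # "savvol": original blocks preserved for the snapshot
        self.io_count = 0

    def write(self, bn, new_data):
        if bn not in self.save_area:              # only the FIRST overwrite pays the penalty
            original = self.volume[bn]            # 1 x Read I/O of the original block
            self.save_area[bn] = original         # 1 x Write I/O to the save area
            self.io_count += 2
        self.volume[bn] = new_data                # 1 x Write I/O in place, preserving locality
        self.io_count += 1

vol = {0: b"old"}
snap = CofwSnapshot(vol)
snap.write(0, b"new")
print(vol[0], snap.save_area[0], "I/Os:", snap.io_count)   # b'new' b'old' I/Os: 3
```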

As you have seen, this method is a 3 x I/O operation and it is an expensive solution. Therefore, when we compare the speed of NetApp/ZFS snapshots to EMC’s snapshots, the EMC COFW snapshot technique will be a tad slower.

However, this method has one superior advantage over the NetApp/ZFS snapshot technique. The data blocks in the active filesystem are almost always laid out in a more contiguous fashion, resulting in a more consistent read performance throughout the life of the active file system.

Below is a diagram of how copy-on-write snapshots are implemented:

 

What is NetApp/ZFS’s snapshot method then?

It is known as Redirect-on-Write. Using the same breakdown … REDIRECT —- ON —- WRITE. When a data block is about to be modified, the original data block is read (1 x Read I/O) and then the modified data block is written to a new location (1 x Write I/O). The active file system then updates the filesystem tree and its inode addresses to reflect the location of the new data block. The original data block remains unchanged.

In summary,

  1. Read location of original data block (1 x Read I/O)
  2. Write modified data block to new location (1 x Write I/O)

The Redirect-on-Write method results in one less Write I/O, making snapshots faster. This is the NetApp/ZFS method, and it is superior when compared to the Copy-on-Write snapshot technique discussed earlier.
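
Sketched the same way (again a toy illustration of the general idea, not WAFL’s or ZFS’s actual code), redirect-on-write leaves the original block untouched: the modified data goes to a new location and only the active filesystem’s pointer is updated, which is where the saved Write I/O comes from.

```python
class RowSnapshot:
    """Toy redirect-on-write: write the new block elsewhere and repoint; the original stays put."""
    def __init__(self, volume):
        self.blocks = dict(volume)                     # physical blocks; originals never change
        self.active_map = {bn: bn for bn in volume}    # logical -> physical map of the live filesystem
        self.snapshot_map = dict(self.active_map)      # frozen copy of the pointers = the snapshot
        self.next_loc = max(volume) + 1 if volume else 0
        self.write_io = 0

    def write(self, bn, new_data):
        self.blocks[self.next_loc] = new_data          # 1 x Write I/O to a brand new location
        self.active_map[bn] = self.next_loc            # repoint the active filesystem (metadata update)
        self.next_loc += 1
        self.write_io += 1

vol = {0: b"old"}
snap = RowSnapshot(vol)
snap.write(0, b"new")
print(snap.blocks[snap.active_map[0]],                 # the active filesystem sees b'new'
      snap.blocks[snap.snapshot_map[0]],               # the snapshot still sees b'old'
      "Write I/Os:", snap.write_io)
```

The cost shows up later: the active filesystem’s blocks end up scattered across wherever the new locations happened to be, which is the fragmentation problem described next.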

However, as the life of the filesystem progresses, fragmentation and holes will cause the performance of the active filesystem to degrade. The reason is that most related data blocks are no longer contiguous, and the active file system will be busy seeking the scattered data blocks across the volume. A fragmented filesystem has to be “cleaned and reorganized” to regain its performance lustre.

Another unwanted problem with the Redirect-on-Write snapshot technique is that the snapshots reside within the same boundary as the active filesystem. Over time, the capacity consumed by the snapshots could overwhelm the active filesystem if their recycling schedule is left unchecked.

I guess this is a case of “SUFFER NOW/ENJOY LATER” or “ENJOY NOW/SUFFER LATER”. We have to make a conscious effort to understand what snapshots are all about.

Nimbus beats NetApp at eBay – The details and the conspiracy of vendors

In my last entry, I mentioned that Nimbus now has 100TB at eBay and every single TB of it is on SSDs. The full details of how the deal was thrashed out are here; Nimbus beat competition from the incumbents, NetApp and 3PAR.

The significance of the deal was how an all-SSD system was able to out-price storage arrays built on a hybrid of spinning disks and SSDs.

The Nimbus news just obliterated the myth that SSDs are expensive. If you do the math, perhaps the biggest component of the price of an entire storage system is not the SSDs. It could be the way some vendors structure their software licensing schemes, or a combination of licences, support and so on.

Just last week, we were out there discussing hard disks and SSDs. The crux of the discussion was around pricing, and the customer we were speaking to was perplexed that the typical SATA disks from vendors such as HP, NetApp and so on cost a lot more than the enterprise HDDs and SSDs you get from the distributors. Sometimes it is a factor of 3-4x.

I was contributing my side of the story: that one unit of 1TB SATA (mind you, this is an enterprise-grade HDD from Seagate) from a particular vendor would cost about RM4,000 to RM5,000. The usual story we were trained to tell when we worked for vendors was, “Oh, these disks have to be specially provisioned with our own firmware, and we can monitor their health with our software and so on ….”. My partner chipped in and cleared the BS smoke screen: basically, the high-priced disks come with high margins for the vendor to feed the entire backline of the storage product, from sales to engineers to engineering and so on. He hit the nail right on the head, because I believe a big part of the margin of each storage system goes back to feeding the vendor’s army of people behind the product.

In my research, a 2TB enterprise-grade SATA HDD in Malaysia is approximately RM1,000 or less. A similar SAS HDD would be slightly higher, by 10-15%, while an enterprise-grade SSD is about RM3,000 or less. And this is far less than what is quoted by the vendors of storage arrays.

Of course, the question would be, “Can the customer put in their own hard disks, or ask the vendor to purchase hard disks from a cheaper source?” Apparently not! Unless you buy a low-end NAS from the likes of NetGear, Synology, Drobo and many other low-end storage systems. But you can’t bet your business and operations on the reliability of these storage boxes, can you? Otherwise, it’s your head on the chopping block.

Eventually, the customers will demand such a “feature”. They will want to put in their own hard disks (with proper qualification from the storage vendor) because they will want cheaper HDDs or SSDs. It is already happening with some enterprise storage vendors but these vendors are not well known yet. It is happening though. I know of one vendor in Malaysia who could do such a thing …

Solid State Drives … are they reliable?

There have been a lot of questions about Solid State Drives (SSDs), aka Enterprise Flash Drives (EFDs) to some vendors. Are they less reliable than our 10K or 15K RPM hard disk drives (HDDs)? I was asked this question on stage when I was presenting the topic of Green Storage 3 weeks ago.

Well, the usual answer from the typical techie is … “It depends”.

We all fear the unknown, and given the limited knowledge we have about SSDs (they are fairly new in the enterprise storage market), we tend to be drawn more to the negatives than the positives of what SSDs are and what they can be. I, for one, believe that SSDs have more positives and, over time, we will grow to accept that this is all part of the IT evolution. IT has always evolved into something better, stronger, faster, more reliable and so on. As Jeff Goldblum’s character Dr. Ian Malcolm famously said in the movie Jurassic Park, “Life finds a way …”; IT will always find a way to be just that.

SSDs are typically categorized into MLC (multi-level cell) and SLC (single-level cell) types. They typically have a predictable life expectancy ranging from tens of thousands to more than a million write cycles per drive. This, by no means, is a direct measure of the reliability of SSDs versus HDDs. However, SSD controllers and drives employ various techniques to enhance the durability of the drives. A common method is to balance the I/O accesses across the disk blocks to adapt to the I/O usage patterns, which can prolong the lifespan of the disk blocks (and subsequently the drive itself) and also ensure that the performance of the drive does not lag, since the I/O is more “spread out” across the drive. This is known as the “wear-leveling” algorithm.

Most SSDs proposed by enterprise storage vendors are MLCs, to meet the market’s price per IOPS/$/GB demands, because SLCs are definitely more expensive for their higher durability. MLCs also have a higher BER (bit error rate); MLC cells typically wear out after around 10,000 write cycles, while SLC cells last around 100,000 write cycles.

But the advantages of SSDs clearly outweigh those of HDDs. Fast access (much lower latency) is one of the main advantages. Higher IOPS is another. SSDs can provide from several thousand IOPS to more than 1 million IOPS, compared with enterprise HDDs: a typical 7,200 RPM SATA drive delivers fewer than 120 IOPS, while a 15,000 RPM Fibre Channel or SAS drive ranges from 130-200 IOPS. That IOPS advantage is definitely a vast differentiator when comparing SSDs and HDDs.
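
For a rough feel of that gap, here is the simple spindle-count arithmetic using the figures quoted above (the SSD number is a deliberately conservative assumption of my own):

```python
# Rough spindle-count arithmetic using the IOPS ranges quoted above.
ssd_iops = 50_000        # a conservative assumed figure for a single enterprise SSD
sata_7k2_iops = 120      # ~7,200 RPM SATA drive
fc_15k_iops = 180        # ~15,000 RPM FC/SAS drive, mid-point of the 130-200 range

def drives_needed(target_iops, per_drive_iops):
    """How many spinning drives it takes to match the target IOPS."""
    return -(-target_iops // per_drive_iops)          # ceiling division

print("7.2K SATA drives to match one SSD:", drives_needed(ssd_iops, sata_7k2_iops))
print("15K FC/SAS drives to match one SSD:", drives_needed(ssd_iops, fc_15k_iops))
```

Even with a conservative figure for the SSD, it takes hundreds of spinning drives to match a single SSD on random IOPS.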

We are also seeing both drive-format and card-format SSDs in the market. The drive-format type is typically in the 2.5″ and 3.5″ profiles and tends to fit into enterprise storage systems as “disk drives”; these are known for providing capacity. On the other hand, there are also card-format SSDs that come as a PCIe card inserted into host systems. These tend to address the performance requirements of systems and applications. The well-known PCIe vendors are Fusion-io, which is in the high-end performance market, and NetApp, which peddles the PAM (Performance Acceleration Module) card in its filers. The PAM card has since been renamed FlashCache. Rumour has it that EMC will be coming out with a similar solution soon.

Another thing to note is that SSDs can be read-biased or write-biased. Most SSDs in the market tend to be read-biased, published with high read IOPS rather than write IOPS. Therefore, we have to be prudent and know what is out there. This means that some solutions, such as the NetApp FlashCache, are more suitable for read-heavy I/O rather than write-heavy I/O. FlashCache addresses a large segment of the enterprise market because most applications are heavier on reads than writes.

SSDs have been positioned as the Tier 0 layer in the Automated Storage Tiering segment of enterprise storage. Vendors such as Dell Compellent and HP 3PAR, and also EMC with FAST v2, position themselves with enhanced tiering techniques for automated LUN and sub-LUN tiering, and customers have been lapping up this feature like little puppies.

However, an up-and-coming segment for SSD usage is positioning SSDs as an extended read or write cache to the existing memory of the system. NetApp’s FlashCache is a PCIe solution that is basically an extended read cache. An interesting feature of Oracle Solaris ZFS called the Hybrid Storage Pool allows the creation of read and write caches using SSDs. The Sun fellas even came up with cool names – Readzilla and Logzilla – for these Hybrid Storage Pool features.

Basically, I have poured out what I know about SSDs (so far) and I intend to learn more about them. SNIA (Storage Networking Industry Association) has a Technical Working Group for Solid State Storage. I advise readers to check it out.