About cfheoh

I am a technology blogger with 20+ years of IT experience. I write heavily on technologies related to storage networking and data management because that is my area of interest and expertise. I introduce technologies with the objectives to get readers to *know the facts*, and use that knowledge to cut through the marketing hypes, FUD (fear, uncertainty and doubt) and other fancy stuff. Only then, there will be progress. I am involved in SNIA (Storage Networking Industry Association) and as of October 2013, I have been appointed as SNIA South Asia & SNIA Malaysia non-voting representation to SNIA Technical Council. I was previously the Chairman of SNIA Malaysia until Dec 2012. As of August 2015, I am returning to NetApp to be the Country Manager of Malaysia & Brunei. Given my present position, I am not obligated to write about my employer and its technology, but I am indeed subjected to Social Media Guidelines of the company. Therefore, I would like to make a disclaimer that what I write is my personal opinion, and mine alone. Therefore, I am responsible for what I say and write and this statement indemnify my employer from any damages.

HDS acquires BlueArc … no surprise

After my early morning exercise routine, I sat down with my laptop hoping to start a new blog entry when a certain HDS news caught my eye. Here’s one of them.

It is of no surprise to me because all along, HDS hardly had a competitive, high-end NAS to compete of their own. Their first Linux-based NAS sucks, and HNAS wasn’t really successful either. But their 5-year OEM with BlueArc gave HDS an strong option to be in the NAS space.

As usual, HDS is as cautious as ever. While the 800-pound EMC has been on a shopping spree for the past 3-4 years, NetApp acquiring a few (note Engenio, Bycast, Akkori, Onaro) along the way, the only notable acquisition made by HDS was Archivas (news here). That was waaaaaay back in 2007. However, what prompted the HDS reaction was a surprise to me. According to Network Computing, it was IBM who wanted to acquire BlueArc, hence triggering HDS to have the first right to fork out the dough for BlueArc.

Why does IBM want to acquire BlueArc? IBM is sliding and lacking the storage array technology of their own. Only XIV and StorWiz(e) are  worth mentioning because their DS-series and N-series belong to NetApp. Their SONAS is pretty much a patchwork of IBM GPFS servers.  In fact, from the same Network Computing article, IBM has terminated their Data DirectNetworks storage back-end and just initiated the sourcing of the storage back-end from NetApp. It is good money to NetApp, but bad for IBM. Their story don’t gel anymore and their platform portfolio staggers as we speak.

This will definitely prompt IBM competitors to sharpen their knives. HP is renewing their artillery with 3PAR and LeftHand, and also IBRIX while Dell is coming out with guns blazing from Compellent, EqualLogic, a bit of Exanet and pretty soon, Ocarina Networks (this is a primary storage deduplication technology). Though Dell lost market share in the last IDC figures, and most likely because of lost EMC sales, they seem to be looking good with Compellent and EqualLogic. HP, is still renewing, and perhaps when they are done ditching their PC business, they should have more focus on the enterprise. Meanwhile, HDS has been winning market share in the last IDC quarter and doing well with their own VSP and AMS series.

HP and Dell have reloaded, and EMC and NetApp coming into the market as storage juggernauts. IBM cannot afford to sit quietly. How long is IBM prepared to do that as the world passes them by?

As for HDS, they are pitching their story together. AMS on the low and mid-end, VSP on the mid to high end. BlueArc fits into the NAS and scale-out NAS space. Yup, they are getting there.

We do not hear much about BlueArc from HDS Malaysia, but be prepared to know more about them soon. Wonder how HDS would rename BlueArc? H-BLU? H-ARC?

Funny Microsoft Cloud video – has Microsoft seen a mirror lately?

I am no virtualization expert, but any IT guy would be able to tell you how far ahead VMware is in the virtualization space. (note that I am not talking about the cloud space)

Virtualization is the cornerstone of Cloud Computing and everyone is claiming they are Cloud-this or Cloud-that. So is Microsoft but when it comes to the virtualization game, Microsoft Hyper-V has much to catch up to, if it ever catches up with VMware.

As what they have done in the past to other technologies (think Netscape), they cast a industry wide suffocation strategy that snuff the lights out of their competitors. But things have changed and new proponents such as Open Source, Mobile Computing and Cloud Computing are not going to be victims of Microsoft aphyxiation strategy (come on, Ballmer, try something new). Hence when I found this video

I found it really funny. It was likely Microsoft was poking at itself, without knowing it.

From the words of Mahatma Gandhi,

“First they ignore you,

then they laugh at you,

then they fight you,

then you win”

This is what Microsoft do best … throwing dirt at the competitor. Hmmm….

Got invited to HP Malaysia’s workshop … he he!

No, HP probably didn’t read my blogs and this isn’t a knee-jerk reaction from HP about things I have been writing. OK, I didn’t write about HP because I don’t know much about them. But this came as a coincidence as well as an apt title (my bad for the shameless plug for this entry’s title).

In my previous blog entry, I wrote about HP’s future in the latest IDC Q2 market share figures. I was not too enthusiastic about HP’s storage line up. Today, my old friend Mr. CC Chung, who is HP’s Country Manager for Storage, had tea with me at Bangsar Shopping Center. We were there to discuss about HP engagement with SNIA when the topic of HP’s storage came about (obviously). Chung said I lack the understanding of HP storage solutions, which I admit, is very true. And so, my friend with his kind gesture invited me to a series of HP Storage Solutions workshops, which I accept with glee and gratitude. Thank you very much, my friend.

Here’s a screenshot of their upcoming workshops:

 

I am seriously looking forward to the workshop and learn about the vibes of HP Storage Solutions. Too bad there aren’t workshops for HP 3PAR and HP X9000 IBRIX but I am sure this will be the start of my new friendship with HP.

Incidentally, as I was waiting for Chung, I was reading the HWM Magazine August 2011 issue, and lo behold, Chung was in the news announcing the HP X9000 IBRIX and X5000 G2 Network Storage System. I couldn’t find the HWM article online but I found the next best thing. A similar article (online, of course), appeared at CIO Asia. And with a nice picture of Mr. CC Chung too!

EMC and NetApp gaining market share with the latest IDC figures

The IDC 2Q11 global disk storage systems report is out. The good news is data is still growing, and at a tremendous pace as well. Both revenue and capacity have raced ahead with double digit growth, with capacity growth reaching almost 50%.

And not surprisingly to me, EMC and NetApp have gained market share at the expense of HP, IBM and Dell. Here are a couple of statistics tables:

Both EMC and NetApp have recorded more than 25% revenue growth, taking 1st and joint-2nd place respectively. I have always been impressed by both companies.

For EMC, the 800lbs gorilla of the storage market, to be able to get a 26% revenue growth is a massive, massive endorsement of how well EMC execute. They are like a big oil tanker in the rough seas, with the ability to do a 90 degree turn at the blink of an eye. Kudos to Joe Tucci and Pat Gelsinger.

Netapp has always been my “little engine that could”. Their ability to take market share Q-on-Q, Yr-on-Yr is second to none and once again, they did not disappoint. Even with the change of the big man from Dan Warmenhoven to Tom Georgens did not manage a smudge in its armour. And with the purchase of LSI this year, NetApp will go from strength to strength, gaining market share at the other expense. I believe NetApp’s culture plays a big role in their ability and their success. The management has always been honest and frank and there’s a lot of respect of an individual’s ability to contribute. No wonder they are the #5 best company to work for in the US.

The big surprise for me here is Hitachi Data Systems, posting a 23.3% growth. That’s tremendous because HDS has never known to hit such high growth. Perhaps they have finally got the formula right. Their VSP and AMS range must be selling well but again, for HDS, it is a challenge running to 2 different cultural systems within their company. The Japanese team and the US team must be hitting synchronicity at last.

Dell, despite firing all cylinders with EqualLogic and Compellent, actually lost market share. Their partnership with EMC has come to an end and they have not converted their customers to the EqualLogic and Compellent boxes. The Compellent purchase is fairly new (Q1 of 2011) and this will take some time to sink in with their customer. Let’s see how they fare in the next IDC report.

In this table above, HP has always been king of the hill. Bundling their direct attached or internal storage with their servers, just like IBM, has given them an unfair advantage. But for the first time, EMC has outshipped HP, without the presence of DAS and internal storage (which EMC does not sell). Even with the purchase of 3PAR late last year, HP were not able to milk the best of what 3PAR can offer. And not to mention that HP also has LeftHand Networks which now renumbered as the P4000. On the other hand, this is a fantastic result to EMC.

Where’s IBM in all this? Rather anemic, sad to say, compared to EMC and NetApp. IBM’s figures were 1/2 of what EMC and NetApp are posting and this is not good. They don’t have the right weapons to compete. XIV is slowly taking over the mantel of DS8000 as their flagship storage, and their DS series putting up their usual numbers. But that’s not good enough because if you look at the IBM line up, their Shark is pretty much gone. XIV and Storwiz(e) are the only 2 storage platforms that IBM owns. Mind you, Storwiz(e) is not really a primary storage solution. It’s a compression engine. Both the DS-series and N-series actually belongs to LSI (which NetApp owns) and NetApp respectively. So, IBM lacks the IP for storage and in the long run, IBM must do something about it. They must either buy or innovate. They should have bought NetApp when they had the chance in 2002, but today NetApp is becoming an impossible meal to swallow.

We shall see how IBM turns out but if they continue to suffer from anemia, there’s going to be trouble down the road.

As for HP, what can I say? Their XP range is from HDS but with 3PAR in the picture, it looks like the marriage could be ending soon. EVA is an aging platform and they got to refresh it with stronger middle tier platforms. As for the low end of the range, MSA is also something unexciting and I secretly believe that LeftHand should have stepped up. But unfortunately, the HP sales have to be careful not to push MSA and LeftHand side-by-side, and not cannibalizing each other. HP definitely has a challenge in its hands and both 3PAR and LeftHand have been with them for more than 2 quarters. It’s time to execute because the IDC figures have already proved that they are slipping.

What next HP?

 

All SSDs storage array? There’s more than meets the eye at Pure Storage

Wow, after an entire week off with the holidays, I am back and excited about the many happenings in the storage world.

One of the more prominent news was the announcement of Pure Storage launching its enterprise storage array build entirely with flash-based solid state drives. In addition to that, there were other start-ups who were also offering SSDs storage arrays. The likes of Nimbus Data, Avere, Violin Memory Systems all made the news as well as the grand daddy of solid state storage arrays, Texas Memory Systems.

The first thing that came to my mind was, “Wow, this is great because this will push down the $/GB of SSDs closer to the range of $/GB for spinning disks”. But then skepticism crept in and I thought, “Do we really need an entire enterprise storage array of SSDs? That’s going to cost the world”.

At the same time, we in the storage industry knows that no piece of data are alike. They can be large, small, random, sequential, accessed frequently or infrequently and so on. It is obviously better to tier the storage, using SSDs for Tier 0, 10K/15K RPM spinning HDDs for Tier 1, SATA for Tier 2 and perhaps tape for the archive tier. I was already tempted to write my pessimism on Pure Storage when something interesting caught my attention.

Besides the usual marketing jive of sub-milliseconds, predictable latency, green messaging, global inline deduplication and compression and built-in data integrity into its Purity Operating Environment (POE), I was very surprised to find the team behind Pure Storage. Here’s their line-up

  • Scott Dietzen, CEO – starting from principal technologist of Transarc (sold to IBM), principal architect of Web Logic (sold to BEA Systems), CTO of BEA (sold to Oracle), CTO of Zimbra (sold to Yahoo! and then to VMware)
  • John “Coz” Colgrove, Founder & CTO – Veritas Fellow, CTO of Symantec Data Management group, principal architect of Veritas Volume Manager (VxVM) and Veritas File System (VxFS) and holder of 70 patents
  • John Hayes, Founder & Chief Architect – formerly of  Yahoo! office of Chief Technologist
  • Bob Wood, VP of Engineering – Formerly NetApp’s VP of File System Engineering,
  • Michael Cornwell, Director of Technology & Strategy – formerly the lead technologist of Sun Microsystems’ Sun Storage F5100 Flash Array and also Quantum’s storage architect for their storage telemetry, VTL and DXi solutions
  • Ko Yamamoto, VP of System Engineering – previously NetApp’s director of platform engineering, Quantum DXi director of hardware engineering, and also key contributor to 4-generations of Tandem NonStop technology

In addition to that, there are 3 key individual investors worth mentioning

  • Diane Green – Founder of VMware and former CEO
  • Dr. Mendel Rosenblum – Founder and former Chief Scientist and creator of VMware
  • Frank Slootman – formerly CEO of Data Domain (acquired by EMC)

All these industry big guns are flocking to Pure Storage for a reason and it looks to me that Pure Storage ain’t your ordinary, run-of-the-mill enterprise storage company. There’s definitely more than meet the eye.

On top of the enterprise storage array platform is Pure Storage’s Purity Operating Environment (POE). POE focuses on 3 key storage services which are

  • High Performance Data Reduction
  • Mission Critical Reliability
  • Predictable Sub-millisecond Performance

After going through the deep-dive videos by Pure Storage’s CTO, John Colgrove, they are very much banking the success of their solution around SSDs. Everything that they have done is based on SSDs.  For example, in order to achieve a larger capacity as well as a much cheaper $/GB, the data reduction techniques in global deduplication, high compression and also fine grained thin provision of 512 bytes are used. By trading off IOPS (which SSDs have plenty since they are several times faster than conventional spinning disks), a larger usable capacity is achieved.

In their RAID 3D, they also incorporated several high reliability techniques and data integrity algorithm that are specifically for SSDs. One note that was mentioned was that traditional RAID and especially the parity-based RAID levels were designed in the beginning to protect against an entire device failure. However, in SSDs, the failure does not necessarily occur in the entire device. Because of the way SSDs are built, the failure hotspots tend to happen at the much more granular bit level of the SSDs. The erase-then-write techniques that are inherent in NAND Flash SSDs causes the bit error rate (BER) of the SSD device to go up as the device ages. Therefore, it is more likely to get a read/write error from within the SSDs memory itself rather than having the entire SSD device failing. Pure Storage RAID 3D is meant to address such occurrences of bit errors.

I spoke a bit of storage tiering earlier in this article because every corporation employs storage tiering to be financially responsible. However, John Colgrove’s argument was why tier the storage when there’s plentiful of IOPS and the $/GB is comparable to spinning disks. That is true is when the $/GB of SSDs can match the $/GB of spinning disks. Factors we must also taken into account is the rack-space savings using the smaller profile disks of SSDs, the power-savings costs of SSDs versus conventional HDD-based enterprise storage arrays. In its entirety, there are strong indications that the $/GB of SSD-based systems to match or perhaps lower the $/GB of HDD-based systems. And since the IOPS requirement levels of present-day applications have not demanded super-high IOPS and multi-core processing is cheap, there’s plenty of head-room for Pure Storage and other similar enterprise storage array companies to grow.

The tides are changing for the storage industry and it is good to see a start-up like Pure Storage boldly coming forth to announce their backing for SSDs. It’s good for the consumer and good for the industry. But more importantly, they are driving innovations to rethink of how we build storage arrays. I am looking forward to more things to come.

Having fun with your storage vendor and get the information to fit your data center

I was on my way to Singapore yesterday. At the departure lounge, I just started reading “Data Center Storage” by Hubbert Smith (ISBN#: 978-1439834879) yesterday and I learned something very interesting immediately. Then my thoughts started stirring and I thought I have a bit of fun with what I have learned from the book.

The single, most significant piece of the storage solution is the hard disk drive (HDD). Regardless of SAN or NAS protocols, the data is stored and served from the hard disk drives. And there are 4 key metrics of a HDD, which are

  • Price
  • Performance
  • Capacity
  • Power

As storage professionals, we are often challenged to deliver the best storage solution to meet the customer’s requirements. Therefore, it is not about providing the fastest IOPS or the best availability or the lowest price. It is about providing the best balance of the 4 key metrics above.

The 4 metrics are of little help when they are standalone but if they are combined in relation to each other, you as a customer, can obtain some measurable ratios that will be useful to size for a requirements, keeping the balance of the 4 key metrics better defined rather than getting fluff and BS from the storage vendor.

In the book, the following table was displayed and I found it to be extremely useful:

Key Ratios for HDDs
Ratio
Performance/Price IOPS/$
Performance/Power IOPS/watt
Capacity/Price GB/$
Capacity/Power GB/watt

The relational ratios in red are going to be useful in determining the right type of storage for the requirement. And we will come back to this later. We begin our quest to obtain the information that we want – Performance, Capacity, Price, Power.

Capacity is the easy one because it is a given fact the size of the HDDs.

IOPS for each type of HDDs is also easy to obtain. See table below:

Disk Type RPM IOPS Range
SATA 5,400 50-75
SATA 7,200 75-100
SAS/FC 10,000 100-125
SAS/FC 15,000 175-200
SSD N/A 5,000-10,000

The watt of each HDDs is also quite easy. Just ask the vendor to give the specification of the HDDs.

The pricing part would be part where we can have a bit of fun with the storage vendor. Usually, storage vendors do not release the price of a single HDD in the quotation. The total price is lumped together with everything else, making it harder to decipher the price. So, what can the customer do?

Easy. Get 4-5 quotations from the storage vendor, each with different type of HDDs. This is the customer’s rights. For example, I have created several fictitious quotations, each with a different type of HDDs/SSD and pricing.

Quote #1 (SATA 7200 RPM)

 

Quote #2 (SAS 10,000 RPM)

 

Quote #3 (SAS 15,000 RPM)

 

Quote #4 (SSD)

 

From the 4 quotations, we cannot ascertain the true price of a single disk, but we can assume that the 12 units HDDs/SSDs take up 50% of the entire quotation. With all things being equal, especially the quantity of 12, we can establish the very rough estimate of the price. Having fun asking the storage vendor to run around with the quotations is the added bonus.

But we can derive the following figures (rough estimates but useful when we apply them to the key ratios above)

1TB SATA = 3333.33; 300GB 10,000 RPM SAS = 5000.00; 300GB 15,000 RPM SAS = 6250.00; 100GB SSD = 10416.66

When we juxtapose the information that we have collected i.e. price, performance and capacity (ok, I am skipping power/watt because I am lazy to find out), we come up with a table below:

In the boxed area, we can now easily determine which HDDs/SSDs that give the best value for money either Performance/$ or Capacity/$. The higher the key ratio, the better the value.

From this aspect, the customer can now determine methodically which type of disk he should invest into, in order to get the best value.

This is just a very simplistic method to find the value of the storage solution to be purchased. Bear in mind that there are many other factors to consider as well, such as rack unit height, total power consumption, storage efficiency, data protection and many more.

I am not taking credit for what Hufferd Smith has proposed. All kudos to him but I am using his method to apply to what is relevant to us on the field.

In conclusion, the customer won’t be baffled and confused thinking that they got the best deal at lowest price or fastest performance. This crude method can help turn perception into something that is more concrete and analytical. It’s time we, as customer, know our rights, and know what we are buying into and have a bit of fun too with the storage vendor.

Copy-on-Write and SSDs – A better match than other file systems?

We have been taught that file systems are like folders, sub-folders and eventually files. The criteria in designing file systems is to ensure that there are few key features

  • Ease of storing, retrieving and organizing files (sounds like a fridge, doesn’t it?)
  • Simple naming convention for files
  • Performance in storing and retrieving files – hence our write and read I/Os
  • Resilience in restoring full or part of a file when there are discrepancies

In file systems performance design, one of the most important factors is locality. By locality, I mean that data blocks of a particular file should be as nearby as possible. Hence, in most file systems designs originated from the Berkeley Fast File System (BFFS), requires the file system to seek the data block to be modified to ensure locality, i.e. you try not to split up the contiguity of the data blocks. The seek time to find the require data block takes time, but you are compensate with faster reads because the read-ahead feature allows you to read extra blocks ahead in anticipation that the data blocks are related.

In Copy-on-Write file systems (also known as shadow-paging file systems), the seek portion is usually not present because the new modified block is written somewhere else, not the present location of the original block. This is the foundation of Copy-on-Write file systems such as NetApp’s WAFL and Oracle Solaris ZFS. Because the new data blocks are written somewhere else, the storing (write operation) portion is faster. It eliminated the seek time and it also skipped the read-modify-write action to the original location of the data block. Therefore, write is likely to be faster.

However, the read portion will be slower because if you want to read a file, the file system has to go around looking for the data blocks because it lacks the locality. Therefore, as the COW file system ages, it tends to have higher file system fragmentation. I wrote about this in my previous blog. It is a case of ENJOY-FIRST/SUFFER-LATER. I am not writing this to say that COW file systems are bad. Obviously, NetApp and Oracle have done enough homework to make the file systems one of the better storage file systems in the market.

So, that’s Copy-on-Write file systems. But what about SSDs?

Solid State Drives (SSDs) will make enemies with file systems that tend prefer locality. Remember that some file systems prefer its data blocks to be contiguous? Well, SSDs employ “wear-leveling” and required writes to be spread out as much as possible across the SSDs device to prolong the life of the SSD device to reduce “wear-and-tear”. That’s not good news because SSDs just told the file systems, “I don’t like locality and I will spread out the data blocks“.

NAND Flash SSDs (the common ones we find in the market and not DRAM-based SSDs) are funny creatures. When you write to SSDs, you must ERASE first, WRITE AGAIN to the SSDs. This is the part that is creating the wear-and tear of the device. When I mean ERASE first, WRITE AGAIN, I describe it below

  • Writing 1 –> 0 (OK, no problem)
  • Writing 0 –> 1 (not OK, because NAND Flash can’t do that)

So, what does the SSD do? It ERASES everything, writing the entire data blocks on the device to 1s, and then converting some of them to 0s. Crazy, isn’t it? The firmware in the SSDs controller will also spread out the erase-and-then write operations across the entire SSD device to avoid concentrating the operations on a small location or dataset. This is the “wear-leveling” we often hear about.

Since SSDs shun locality and avoid the data blocks to be nearby, and Copy-on-Write file systems are already doing this because its nature to write new data blocks somewhere else, the combination of both COW file system and SSDs seems like a very good fit. It even looks symbiotic because it is a case of “I help you; and you help me“.

From this perspective, the benefits of COW file systems and SSDs extends beyond resiliency of the SSD device but also in performance. Since the data blocks are spread out at different locations in the SSD device, the effect of parallelism will inadvertently help with COW’s performance. Make sense, doesn’t it?

I have not learned about other file systems and how they behave with SSDs, but it is pretty clear that Copy-on-Write file systems works well with Solid State Devices. Have a good week ahead :-)!

Will SAN or NAS matter if your customer’s storage is in the Cloud?

An interesting question popped into my head yesterday. With all this push into the Cloud, the customer does not own most of the computer equipment. They are just getting services and when they want storage, do you think they care whether their storage is on a SAN or NAS?

I have mentioned this before, Cloud makes a lot of IT stuff irrelevant. Read my previous blog. This means that the demand for IT techies, sysadmins, consultant will suddenly be squeezed into who’s very good, good, not-so-good and the downright bad ones. Let’s the survival-of-the-fittest games begin!

Yes, the SAN and NAS, or even unified storage story doesn’t hold much weight anymore. However, to the cloud service provider, they will be out there looking for what is best for their bottom line, whether it will be a branded box or just a white box if they are willing to build the storage on their own. For those providers who have strong financials, obviously investing in premium brands like EMC, IBM, NetApp, and so on, makes sense because they need someone to blame and penalize when the shit hits the fan. For those who doesn’t have the financial prowess, this presents a whole new economy that resellers, partners, distributors can tap on to – build for these cloud providers at a cheaper price (hint, hint).

However, storage relies on a strong storage operating system to do just that. They are plenty of open source ones. Hey, you can practically build a simple iSCSI or NAS box with Linux. Consumer grade NAS such as NetGear, Synology and DLink have been using open-source Linux to penetrate the low-end, home storage market for years. The cloud providers will be a different ballgame, but the storage piece is fundamentally the same.

Things are changing folks, and for those consultants, product pre-sales, post-sales, sysadmins, operators of storage, you have to evolve to meet this new market. SAN and NAS do not matter anymore when customers are using the cloud services.

p/s: I have been spending time looking at some very, very cool cloud-ready storage operating systems. If you have the time, leave me a comment and we’ll talk. 😀