Boosting Solid States beyond SATA

Lately, I have been getting deeper and deeper into low-level implementation related to storage technologies. In my previous blog entry, I wrote about my learning adventure with Priority Flow Control (PFC), and I intend to cover more Data Center Bridging concepts in future entries.

Before I left for Sydney for a holiday last week, I got sidetracked by the exciting stuff that’s happening in my daily encounters with friends, old and new. Two significant storage-related technologies fell into my lap. One is NVMe (Non-Volatile Memory Express) and the other is FPGA (Field Programmable Gate Array).

While this blog is going to be about NVMe, I actually found FPGA much more exciting. Through conversations, I found that there are 2 “biggies” in the FPGA world, and they are designed and manufactured by Xilinx and Altera. I admit that I have not done my homework on FPGA yet, having just returned from Sydney last night. I will blog about FPGA in future blogs.

But NVMe is also an important technology direction for the storage world.

I think most of us are probably already mesmerized by solid state drives. The bombardment of marketing, presentations, advertising and whatever else the vendors do to promote (and self-promote) solid state drives is inundating the intellectual senses of consumers and enterprises alike. And yet, many vendors do not explain both the pros and cons of integrating solid state storage into an IT environment. Even worse, many don’t even know the strengths and weaknesses of solid state storage, and the resulting exaggeration feeds a spiralling vortex of inaccuracies. Like a self-feeding frenzy, the industry seems to have cast solid state storage as the saviour of the enterprise storage world. Go figure!


Novell Filr about to be revealed

My training engagement landed me in Manila this week. At the back of my mind is Novell Filr, first revealed to me a week ago by my buddy at Novell Malaysia. Almost 18 months after I first wrote about it, Novell Filr is about to be revealed in my blog within this month. And it has come at an opportune time, because the enterprise BYOD/file synchronization market is about to take off.

Gartner defines this market as Enterprise File Synchronization and Sharing (EFSS) and it is already a very crowded market given the popularity of Dropbox, Box.net, Sugarsync and many, many others. It is definitely a market that is coveted by many but mastered by a few. There are just too many pretenders and too few real players.

The proliferation of smart phones, tablets and other mobile devices has opened up a burgeoning need to have data everywhere. The wonderfulness of having data right at our fingertips whenever we want it gives rise to the need for business and corporate data to be available as well. The power of having data instantly at the swipe of our fingers on the touchscreen is akin to feeling like God, giving life to our communication and making opportunities come alive at that very moment.

It’s all about executing the story

I have been in hibernation mode, with a bit of “writer’s block”.

I woke up in Bangalore in India at 3am, not having adjusted myself to the local timezone. Plenty of things were on my mind but I couldn’t help thinking about what’s happening in the enterprise storage market after the Gartner Worldwide External Controller-Based report for 4Q12 came out last night. Below is the consolidated table from Gartner:

Just a few weeks ago, it was IDC with its Worldwide Disk Storage Tracker and below is their table as well:


Is there no one to challenge EMC?

It’s been a busy, busy month for me.

And when the IDC Worldwide Quarterly Disk Storage Systems Tracker for 3Q12 came out last week, I read in awe how impressive EMC’s figures were. But most impressive of all is how the storage market continues to grow despite very challenging and uncertain business conditions. With the Eurozone crisis, China experiencing lower economic growth numbers and the uncertainty in the US economic sectors, it is unbelievable that the storage market grew 24.4% y-o-y. And for the first time, 7,104PB was shipped! Yes folks, more than 7 exabytes was shipped during that period!

In the Top 5 external disk storage market based on revenue, only EMC and HDS recorded respectable growth, at 8.7% and 13.8% respectively. NetApp, my “little engine that could”, seems to be running out of steam, earning only 0.9% growth. The rest of the field, IBM and HP, recorded negative growth. Here’s a look at the Top 5 and the rest of the pack:

HP’s 11% decline is shocking to me, and given the woes after woes that HP has been experiencing, HP has not seen the bottom yet. Let’s hope that the new slew of HP storage products and technologies announced at HP Discover 2012 will lift them up. It also looked like a total rebranding of the HP storage products, with a big play on the word “Store”. They have names like StoreOnce, StoreServ, StoreAll, StoreVirtual, StoreEasy and perhaps more coming.

The Open SAN market, which includes iSCSI, has EMC again at Number 1 with 29.8%, followed by IBM (14%), HDS (12.2%) and HP (11.8%). In the combined NAS + Open SAN market, EMC has 33.5% while NetApp has 13.7%.

Of course, it is not just about external storage because the direct-attached storage numbers count too. With that, the server vendors of IBM, HP and Dell are still placed behind EMC. Here’s a look at that table from IDC:

There’s a highlight of Dell in the table above. Dell actually grew by 4.0%, gaining 0.1%, while HP and IBM declined. However, their numbers seemed too tepid and led to the exit of Darren Thomas, Dell’s storage group head honcho. News of Darren’s exit was on TheRegister.

I also want to note that NAS growth numbers actually outpaced Open SAN numbers including iSCSI.

This leads me to say that there is a dire need for NAS technical and technology expertise in the local storage market. With the adoption of NFSv4 under way and SMB 2.0 and 3.0 coming into the picture, I urge all storage networking professionals who are more pro-SAN to step out of their comfort zone and look into NAS as well. The world is changing and it is no longer SAN vs NAS. And NFSv4.1 is blurring the lines even more with its concept of layouts.

But back to the subject of the storage market: is there no one out there challenging EMC in a big way? NetApp, some years ago, was recording double-digit growth and challenging EMC neck-and-neck, but that mantle seems to have been taken over by HDS. Both still have a long way to go to get close to EMC.

Kudos to the EMC team for damn good execution!

“Cloud” hosting hacked – customer data lost

Yes, Yes, I have been inactive for almost 2 months. There were many things I had to do to put my business back into shape again, hence the lack of activity on my blog.

Yes, Yes, I have a lot of catching up to do, but first I would like to report that one of the more prominent web hosting companies in Malaysia (many of whom frequently brand themselves as “Cloud” companies) has been hacked.

I got the news at about 8.00am on the morning of September 28th, while I was in Bangalore, India. A friend of mine buzzed me on Facebook Messenger, and shared with me the following:

Thursday, September 27, 2012 1:46 AM
Date: 27th Sep 2012
Time: 6.01PM GMT +0800

We have an intrusion incident that happened early this morning around 12midnight of 27th September 2012. About 50 customers’ Virtual Machines hosted on our CLOUD were deleted from the cloud server. When we spotted the abnormal behavior, we managed to stop the intruder from causing more damages to our system.

From our initial investigation, we suspect one of our employees who will leave the company at this month end logged into one of our control panels and deleted some Virtual Machines. The backup was terminated at the same time when the Virtual Machines were deleted.

At this point of time, our team is working relentlessly on restoring the affected virtual machines and customer data.

In the mean time, my COO is lodging a police report and my manager is lodging a report to MyCERT while I am writing this email.

We are truly sorry about the whole incident as it has caused a great deal of inconvenience to our customers and their end customers as well.

Please also be rest assured that our CLOUD is truly secured; this incident was not a successful hacking attempt but rather sabotage via an ordinary login.

Detailed investigation reports will be compiled and sent to our customers.

Sincerely,

Chan Kee Siak
Founder and CEO

===================================
Summary / History of issues:
===================================
27th Sep 2012,

1.00am:
- We detected several virtual machines on the cloud were throwing warning signals.
- Technical Managers were immediately informed.

01.30am:
- We found out that an intruder was attempting to delete some of the virtual machines on our CLOUD cluster.
- The intruder was using a valid login to access our CLOUD control panel.
- COO was informed, signed in to co-ordinate.
- The access of the intruder has been disabled to prevent further damage.
- We posted an announcement at: https://support.exabytes.com.my/News/2248/c...aintenance.aspx

02.00am:
- CEO was informed.
- We found out that the intruder was using the login ID and password which belonged to one of the staff members whom we had recently sent out termination notice. The last working day of this staff was end of this month.
- Around 50++ Virtual Machines / VPS were affected.
- We started to inform affected customers.

02.30am:
- Rebuild and restoration of virtual machines began.

10.00am:
- Some Virtual Machines were Restored. The rest were still pending, on going.
- For Virtual machines without extra R1Soft Backup, we have recreated blank virtual machines with Operating System.

12:30pm:
- Attempted to recover the deleted backup on the CLOUD Backup server via data recovery tool. No guarantee and no ETA yet, we were doing our very best.

5.39pm:
- 80% of virtual machines were recreated. However, some were without the latest backup of data.
- Our engineers were attempting to recover the Cloud Backup Hard Drive with the use of recovery tool. However, as the size was huge, it might take few more hours.

Damage:
- The CLOUD Accounts, Virtual Machines and CLOUD Backup of affected clients were deleted. Only client with additional R1Soft backup still has the recent backup.

=================================

Date: 27th September 2012
Time: 1:55 AM GMT+8

Maintenance Details:
We have been alert by our monitoring system that certain Cloud VM has been found to be inaccessible. Our senior admin engineers are now working to resolve the issues.

Maintenance effect:
VMs affected isolated under MY-CLOUD-02 Zone.

We regret for any inconveniences caused.

Best regards,

Support team
------------------
Technical Support Department.


Expensive hard disk is good

No, I don’t mean to be bad, but spinning HDD prices will remain high even though post-Thailand flood production has returned to normal.

According to IHS iSuppli, a market research intelligence firm, the prices will continue to hold steady and will not fall to pre-flood levels until 2014. The reason is simple. The prices of hard disk drives are pretty much dictated by the only 2 real remaining hard disk companies in the world – Seagate and Western Digital. These guys control more than 85% of the hard disk market and as demand for HDDs outstrips supply, the current hard disk prices are hitting the bottom line hard for just about everyone.

But the bad news is turning into good news for solid state storage devices. NAND-Flash based devices are driving a new clan of storage start-ups in the likes of Violin Memory, Kaminario, Pure Storage and Virident. The EMC acquisition of XtremIO was a strong endorsement that cements the cornerstone of all enterprise storage arrays to come. Even the Register predicted that the EMC VMAX will be the last primary storage array before the flash tsunami.

The NAND-Flash solid state of multi-level cells (MLCs), single level cells (SLCs) and even triple level cells (TLCs) is going through birth, puberty and adolescence extremely fast because the demand for more IOPS, more throughput and lower latency is hitting at full speed. And it is likely that all the xLCs (SLCs, MLCs and TLCs) could go through their whole life cycle in an extremely short span, because there is a new class of solid state that is pushing the performance-price envelope closer and closer to the speed of DRAM but at the price of Flash. This new type of solid state is Storage Class Memory (SCM).

4TB disks – the end of RAID

Seriously? 4 freaking terabyte disk drives?

The enterprise SATA/SAS disks have just grown larger, up to 4TB now. Just a few days ago, Hitachi boasted the shipment of the first 4TB HDD, the 7,200 RPM Ultrastar™ 7K4000 Enterprise-Class Hard Drive.

And just weeks ago, Seagate touted that their Heat-Assisted Magnetic Recording (HAMR) technology will bring forth 6TB hard disk drives in the near future, with 60TB HDDs not far off on the horizon. 60TB is a lot of capacity but a big, big nightmare for disk availability and data backup. My NetApp Malaysia friend joked that the RAID reconstruction of 60TB HDDs would probably finish by the time his daughter finishes college, and his daughter is still in primary school!

But the joke reflects something very serious we are facing as the capacity of the HDDs is forever growing into something that could be unmanageable if the traditional implementation of RAID does not change to meet such monstrous capacity.

Yes, RAID has changed since 1988 as every vendor approaches RAID differently. NetApp was always about RAID-4 and later RAID-DP, and I remember the days when EMC had a RAID-S. There was even a vendor in the past who marketed RAID-7 but it was proprietary and wasn’t an industry standard. But fundamentally, RAID did not change in a revolutionary way and continued to withstand the ever ballooning capacities (and pressures) of the HDDs. RAID-6 was introduced when the first 1TB HDDs came out, to address the risk of a possible second disk failure in a parity-based RAID like RAID-4 or RAID-5. But today, the 4TB HDDs could be the last straw that will break the camel’s back, or in this case, RAID’s back.

RAID-5 obviously is dead. Even RAID-6 might be considered insufficient now. Having a 3rd parity drive (3P) is an option and the only commercial technology that I know of which supports 3 parity drives is ZFS. But having 3P will cause additional overhead in performance and usable capacity. Will the fickle customer ever accept such trade-offs?

Note that 3P is not RAID-7. RAID-7 is a trademark of an old company called Storage Computer Corporation and RAID-7 is not a standard definition of RAID.
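The usable capacity cost of each extra parity drive is simple arithmetic, and a small sketch makes it visible. The 12-disk group size below is my own assumption for illustration only; real group sizes vary by vendor and configuration, and this says nothing about the performance overhead.

```python
def usable_fraction(disks_in_group, parity_disks):
    """Fraction of raw capacity left for data in one parity RAID group."""
    if not 0 <= parity_disks < disks_in_group:
        raise ValueError("parity_disks must be smaller than disks_in_group")
    return (disks_in_group - parity_disks) / disks_in_group


GROUP_SIZE = 12  # hypothetical group size, chosen only for illustration
DISK_TB = 4      # the 4TB drives discussed in this post

for name, parity in (("RAID-5", 1), ("RAID-6", 2), ("3P", 3)):
    fraction = usable_fraction(GROUP_SIZE, parity)
    raw_tb = GROUP_SIZE * DISK_TB
    print(f"{name}: {fraction:.1%} usable, about {fraction * raw_tb:.0f}TB of {raw_tb}TB raw")
```

In this assumed 12-disk group, the third parity drive leaves 36TB usable out of 48TB raw, so a quarter of the raw capacity goes to protection.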

One of the biggest concerns is rebuild times. If a 4TB HDD fails, the rebuild could take days. The failure of a second HDD could push the rebuild time up to a week or so … and the data is vulnerable while the disks are being rebuilt.

There is a lot of talk about declustered RAID, and I think it is about time we learn about this RAID technology. At the same time, we should demand this technology before we even consider buying storage arrays with 4TB hard disk drives!
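A back-of-envelope way to see why rebuilds stretch into days is to divide the drive capacity by the sustained rebuild rate. The rates below are assumptions I picked for illustration (a busy array often rebuilds far slower than the drive's raw speed); they are not measurements from any product.

```python
def rebuild_hours(capacity_tb, rebuild_mb_per_s):
    """Hours to rewrite one whole drive at a sustained rebuild rate.

    Uses decimal units (1TB = 10**12 bytes, 1MB = 10**6 bytes),
    the way drive vendors quote capacity.
    """
    capacity_bytes = capacity_tb * 10**12
    seconds = capacity_bytes / (rebuild_mb_per_s * 10**6)
    return seconds / 3600


# Hypothetical sustained rebuild rates in MB/s, for illustration only.
for capacity_tb in (4, 60):
    for rate in (20, 50, 100):
        hours = rebuild_hours(capacity_tb, rate)
        print(f"{capacity_tb}TB at {rate} MB/s: {hours:.1f} hours ({hours / 24:.1f} days)")
```

At an assumed 20 MB/s, a single 4TB drive already needs more than two days, and a 60TB drive at 50 MB/s would take about two weeks.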

I have said this before. I am still trying to wrap my head around declustered RAID. So I invite the gurus on this matter to comment on this concept, but I am giving my understanding on the subject of declustered RAID.

Panasas’ founder, Dr. Garth Gibson, is one of the people who proposed RAID declustering way back in 1999. He is a true visionary.

One of the issues of traditional RAID today is that we still treat the hard disk component in a RAID domain as a whole device. Traditional RAID is designed to protect whole disks with block-level redundancy. An array of disks is treated as a RAID group, or protection domain, that can tolerate one or more failures and still recover a failed disk by the redundancy encoded on other drives. The RAID recovery requires reading all the surviving blocks on the other disks in the RAID group to recompute blocks lost on the failed disk. In short, the recovery, in the event of a disk failure, is on the whole object and therefore, an entire 4TB HDD has to be recovered. This is not good.

The concept of RAID declustering is to break away from the whole device idea. Apply RAID at a more granular scale. IBM GPFS works with logical tracks and RAID is applied at the logical track level. Here’s an overview of how it compares to the traditional RAID:

The logical tracks are spread out algorithmically across all physical HDDs and the RAID protection layer is applied at the track level, not at the HDD device level. So, when a disk actually fails, the RAID rebuild is applied at the track level. This significantly improves the rebuild times of the failed device, and does not affect the performance of the entire RAID volume much. The diagram below shows the declustered RAID’s time and performance impact when compared to a traditional RAID:
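To make the track-level idea concrete, here is a toy model of my own. It is not IBM GPFS's actual placement algorithm: I simply assume every stripe lands on a different subset of disks, then count how many stripe units each surviving disk must read to rebuild one failed disk. The disk counts and stripe width are hypothetical.

```python
from collections import Counter
from itertools import combinations


def declustered_layout(num_disks, stripe_width):
    """Toy placement: one stripe for every possible subset of disks."""
    return list(combinations(range(num_disks), stripe_width))


def rebuild_reads(layout, failed_disk):
    """Stripe units each surviving disk must read to rebuild failed_disk."""
    reads = Counter()
    for stripe in layout:
        if failed_disk in stripe:
            for disk in stripe:
                if disk != failed_disk:
                    reads[disk] += 1
    return reads


# Declustered: stripes of width 4 spread over 8 disks.
declustered = declustered_layout(num_disks=8, stripe_width=4)
lost_units = sum(1 for stripe in declustered if 0 in stripe)

# Traditional: the same number of lost units, but all in one 4-disk group.
traditional = [(0, 1, 2, 3)] * lost_units

print("units to rebuild:", lost_units)
print("declustered reads per survivor:", dict(rebuild_reads(declustered, 0)))
print("traditional reads per survivor:", dict(rebuild_reads(traditional, 0)))
```

In this toy model the same amount of lost data is rebuilt in both cases, but the declustered layout spreads the reads over seven survivors instead of three, so each surviving disk does less than half the work.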

While the IBM GPFS approach to declustered RAID is applied at a semi-device level, the future is leaning towards OSD. OSD or object storage device is the next generation of storage and I blogged about it some time back. Panasas is the leader when it comes to OSD and their radical approach to this is applying RAID at the object level. They call this Object RAID.

With object RAID, data protection occurs at the file-level. The Panasas system integrates the file system and data protection to provide novel, robust data protection for the file system.  Each file is divided into chunks that are stored in different objects on different storage devices (OSD).  File data is written into those container objects using a RAID algorithm to produce redundant data specific to that file.  If any object is damaged for whatever reason, the system can recompute the lost object(s) using redundant information in other objects that store the rest of the file.

The above was a quote from the blog of Brent Welch, Panasas’ Director of Software Architecture. As mentioned, the RAID protection of the objects in the OSD architecture in Panasas occurs at file-level, and the file or files constitute the object. Therefore, the recovery domain in Object RAID is at the file level, confining the risk and damage of data loss within the file level and not at the entire device level. Consequently, the speed of recovery is much, much faster, even for 4TB HDDs.
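As a rough illustration of per-file protection, here is a minimal sketch that uses plain single-parity XOR across the chunks of one file. This is my simplification and not Panasas' actual algorithm or object format; the chunk count and the sample data are made up.

```python
def split_into_chunks(data, num_chunks):
    """Split one file's bytes into equal-sized chunks, zero-padding the tail."""
    size = -(-len(data) // num_chunks)  # ceiling division
    padded = data.ljust(size * num_chunks, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(num_chunks)]


def xor_parity(chunks):
    """XOR all chunks together, byte by byte."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(chunks, parity):
    """Rebuild the single chunk marked None from the survivors and the parity."""
    survivors = [chunk for chunk in chunks if chunk is not None]
    return xor_parity(survivors + [parity])


file_data = b"object raid protects each file"  # made-up sample file
chunks = split_into_chunks(file_data, 4)       # each chunk would live in a different object
parity = xor_parity(chunks)                    # redundancy specific to this one file

damaged = list(chunks)
damaged[2] = None                              # pretend one object was lost
assert recover_missing(damaged, parity) == chunks[2]
print("recovered chunk:", recover_missing(damaged, parity))
```

The point of the sketch is the recovery domain: only the chunks and parity of the affected file are read, not every block of a 4TB device.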

Reliability is the key objective here. Without reliability, there is no availability. Without availability, there are no performance factors to consider. Therefore, the system’s reliability is paramount when it comes to having the data protected. RAID has been the guardian all these years. It’s time to have a revolutionary approach to safeguard the reliability and ensure data availability.

So, how many vendors can claim they have declustered RAID?

Panasas is a big YES, and they apply their intelligence in large HPC (high performance computing) environments. Their technology is tried and tested. IBM GPFS is another. But where are the rest?

 

Server way of locked-in storage

It is kind of interesting that every vendor out there claims to be as open as they can be, but the reality is that the competitive nature of the game is forcing storage vendors to speak of openness while their actions are certainly not open.

Confused? I am beginning to see a trend … a trend that is forcing customers to be locked in with a certain storage vendor. I am beginning to feel that customers are given fewer choices, especially when the brand of the server they select for their applications has implications for the brand of storage they will be locked into.

And surprise, surprise, SSDs are the pawns of this new cloak-and-dagger game. How? Well, I have been observing this for quite a while now, and when HP announced their SMART portfolio for their storage, it’s time for me to say something.

In the announcement, it was reported that HP is coming out with its 8th generation ProLiant servers. As quoted:

The eighth generation ProLiant is turbo-charging its storage with a Smart Array containing solid state drives and Smart Caching.

It also includes two Smart storage items: the Smart Array controllers and Smart Caching, which both feature solid state storage to solve the disk I/O bottleneck problem, as well as Smart Data Services software to use this hardware

From the outside, analysts are claiming this is a reaction to the recent EMC VFCache product (I blogged about it here), and HP was there to position the EMC VFCache solution as a first generation product, lacking the smarts (pun intended) of what the HP products have to offer. You can read about its performance prowess in the HP Connect blog.

Similarly, Dell announced their ExpressFlash solution that ties up its 12th generation PowerEdge servers with their flagship (what else), Dell Compellent storage.

The idea is very obvious. Put in a PCIe-based flash caching card in the server, and use a condescending caching/tiering technology that ties the server to a certain brand of storage. Only with this card, that (incidentally) works only with this brand of servers, will you, Mr. Customer, be able to take advantage of the performance power of this brand of storage. Does that sound open to you?

HP is doing it with its ProLiant servers; Dell is doing it with its ExpressFlash; EMC’s VFCache, while not advocating any brand of servers, is doing it because VFCache works only with EMC storage. We have seen Oracle doing it with Oracle Exadata. Oracle Enterprise database works best with Oracle’s own storage and the intelligence is in its SmartScan layer, a proprietary technology that works exclusively with the storage layer in the Exadata. Hitachi Japan, with its Hitachi servers (yes, Hitachi servers that we rarely see in Malaysia), has had such a technology for the last 2 years. I wouldn’t be surprised if IBM and Fujitsu already have something in store (or probably I missed the announcement).

NetApp has been slow in the game, but we hope to see them coming out with their own server-based caching products soon. More pure-play storage vendors are already singing the tune of SSDs (though not necessarily server-based).

The trend is obvious too, because the messaging is almost always about storage performance.

Yes, I totally agree that storage (any storage) has a performance bottleneck, especially when it comes to IOPS, response time and throughput. And every storage vendor is claiming that SSDs, in one form or another, are the knight in shining armour, ready to rid the world of lousy storage performance. Well, SSDs are not the panacea for storage performance headaches because while they solve some performance issues, they introduce new ones somewhere else.

But it is becoming an excuse to introduce storage vendor lock-in, and how have the customers responded to this new “concept”? Things are fairly new right now, but I would always advise customers to find out and ask questions.

Cloud storage for no vendor lock-in? Going to the cloud also has cloud service provider lock-in as well, but that’s another story.

 

The marriage in the cloud

Admit it! You are a terabyte junkie! I am sure many of us have one terabyte or more of our personal “stuff” at home. Heck, I even heard from a friend that he has almost 20TB of high definition movies (thank you Torrent!) at home! That’s crazy!

And what would the typical Malaysian consumer do after he or she runs out of hard disk space? In KL (our beloved capital city, Kuala Lumpur), they would throng the Low Yat IT mall or extensions of it, like Digital Mall in PJ Section 14. In other towns and cities in Malaysia, PC fairs are popular, as consumers try to get the best price possible (we Malaysians are good at squeezing the max out of a deal).

It is difficult for the not-so-IT-literate consumer to differentiate which brand is the best. Buffalo, Iomega, DLink, Western Digital, etc, etc. But the tides are changing, because these vendors want to tie you down for the rest of your digital life. You see, buying a small NAS for the home now comes with a big carrot, an incentive to keep you wanting for more, and yet you can’t unbind yourself from the tether once you are hooked.

Cloud storage didn’t take off in a big way last year. Many cloud storage vendors know there are plenty of opportunities out there, but how do they get the consumers to upload their files, photos and whatever stuff they might have to cloud storage? Ingeniously, they work together with smaller NAS storage players and use these vendors’ product offerings as bait. They bundle a significantly large FREE capacity or data protection offering in the Cloud Storage as the carrot, and once the consumer decides to put their files in the cloud storage, boom, they are ensnared to become a long term ATM machine for the Cloud Storage Provider.

Sneaky? No? I call this good, smart marketing. You have a market of opportunities out there, but cloud storage isn’t catching on. You have small NAS vendors that are reaching out to the consumer market, but it’s a brutal, competitive arena and margins are razor thin. It’s a win-win situation for both sides.

And this trend is catching on. When I first read about Drobo (a high-end consumer NAS storage) partnering Carbonite (a remote backup vendor now repackaged as a Cloud storage backup provider), I thought it was a pretty darn good idea. It was a marriage that happened in the cloud. Late last year, another consumer NAS company, QNAP paired up with Symform, a cloud storage and backup vendor.

This was moving towards a market that scratches the itch. The consumers wanted reliable backup too, but consumer-grade disk drives fail every so often. Laptops get stolen, and files could be infected by viruses. The list goes on, but the point is that the Cloud Storage Providers may have found a silver lining in getting the consumers to leap into the cloud. And the whole idea of the small NAS vendor-big Cloud Backup dynamic duo just got a big endorsement last night. Guess who has decided to dip its grubby hands into the pie?

EMC, the 800-pound gorilla of the information and storage world, through its Iomega subsidiary, wants your money! They had just married Iomega with EMC Atmos. It was quoted:

“EMC subsidiary and data protection specialist Iomega announced the integration between Iomega network storage solutions and EMC Atmos, extending Atmos cloud-based data protection and sharing to Iomega’s network storage product offerings. The new integration gives small and midsize businesses (SMBs), remote offices and distributed enterprises access to any Atmos powered cloud around the world.”

Surprised? Not really, but I guess EMC needs to breathe new life into Atmos and this marriage just extended Atmos’ life support system.

SSDs rising in the flood crisis

The Thailand flood last year spelled disaster for the storage industry. We have already seen several big boys in the likes of HP, EMC and NetApp announcing price increases because of the flood.

NetApp’s announcement is here; EMC is here; and HP is here, if you want to read about it. Below is a nice and courteous EMC letter to their customers.

But the Chinese character of “crisis” (below) also spells opportunities; opportunities for Solid State Drives (SSDs) that is.

For those of us close to the ground, the market for spinning hard disk drives (HDDs) has certainly been challenging for the past few months, especially for smaller system providers like us. Without the leveraging powers of the bigger boys, we practically had to beg to buy HDDs, not to mention the fact that the price has practically doubled.

Before the Thailand flood crisis, the price per GB of a 2TB HDD was 0.325 Malaysian ringgit. That’s about 33 cents. Today, the price is about 55 cents per GB. In comparison, at least from my experience, the price per GB of SSDs has gone down from $5.83 to $4.99.
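The movement in those per-GB figures can be worked out in a couple of lines. The sketch below only uses the four prices quoted above; the one assumption I am making is that the HDD and SSD figures are in the same currency, which matters only for the last two lines that compare SSD against HDD.

```python
def pct_change(before, after):
    """Percentage change from the earlier price to the later one."""
    return (after - before) / before * 100


HDD_BEFORE, HDD_AFTER = 0.325, 0.55  # price per GB of a 2TB HDD, as quoted
SSD_BEFORE, SSD_AFTER = 5.83, 4.99   # price per GB of SSD, as quoted

print(f"HDD price per GB: {pct_change(HDD_BEFORE, HDD_AFTER):+.1f}%")
print(f"SSD price per GB: {pct_change(SSD_BEFORE, SSD_AFTER):+.1f}%")

# Only meaningful if both sets of prices are in the same currency (an assumption).
print(f"SSD/HDD price ratio before the flood: {SSD_BEFORE / HDD_BEFORE:.1f}x")
print(f"SSD/HDD price ratio today: {SSD_AFTER / HDD_AFTER:.1f}x")
```

Under that same-currency assumption, the per-GB premium of SSD over HDD roughly halves, from about 18x to about 9x.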

I know some of you might pooh-pooh the price difference between a 2TB SATA/SAS drive and a 120GB SSD, partly because the SSD seems so expensive. But when you do the math, the SSD is likely to be 50x faster (at worst average) and 200x faster (at best average) for applications requiring IOPS. This could mean that transactional applications are likely to be completed an average of 100x faster, with better response time and lower latency. This will have a domino effect on other related applications, making the entire service request perform and complete faster. When we put a price on the transactional hours, for example $10/hour of work, then we can see the cost savings coming from using SSDs in the storage.

Interestingly, a friend of mine asked me about the prominence of all-SSD storage systems. I have written about all-SSD systems in the past, and also did a high-level overview of Pure Storage some time back. And a very interesting fact I recalled was that these systems have a massive amount of IOPS. Having plenty of IOPS helps because you can do away with Automated Storage Tiering (AST): you don’t have to tier your data, and you don’t have to pay for such a feature.

Yes, all-SSD pure-play storage systems are gaining prominence and it’s time to take notice. Nimbus beat NetApp and HP 3PAR last year to win eBay with an all-SSD storage solution, and there are other players such as Violin Memory Systems, Pure Storage, SolidFire and of course, Texas Memory Systems (aka RAMSAN). And they are attracting big names into their management portfolios and getting VC dollars of course.

HDD production will probably take 6 months or more after the Thailand flood to return to its pre-crisis capacity, and SSDs can take this window of opportunity in the crisis to surge ahead. And if this flood is going to be an annual thing for Thailand (God bless Thailand), the HDD market is going to have a perennial problem. And SSDs are going to rise even faster.
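To put numbers on that last point, here is a small sketch. It assumes, very optimistically, that the whole transactional workload speeds up by the storage speedup factor, and the 100 baseline hours are a figure I made up; only the $10/hour rate and the 50x, 100x and 200x factors come from the paragraph above.

```python
def cost_saved(baseline_hours, speedup, rate_per_hour):
    """Money saved when work that took baseline_hours runs `speedup` times faster."""
    hours_after = baseline_hours / speedup
    return (baseline_hours - hours_after) * rate_per_hour


BASELINE_HOURS = 100  # hypothetical transactional hours on spinning disks
RATE_PER_HOUR = 10    # the $10/hour example from the text

for speedup in (50, 100, 200):
    saved = cost_saved(BASELINE_HOURS, speedup, RATE_PER_HOUR)
    print(f"{speedup}x faster: saves ${saved:.2f} of ${BASELINE_HOURS * RATE_PER_HOUR}")
```

Notice that most of the saving is already captured at 50x; going from 50x to 200x adds very little in absolute terms, so the exact multiplier matters less than getting off the slow tier at all.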