Hail Hydra!

The last of the Storage Field Day 6 on November 7th took me and the other delegates to NEC. There was an obvious, yet eerie silence among everyone about this visit. NEC? Are you kidding me?

NEC isn’t exactly THE exciting storage company in the Silicon Valley, yet I was pleasantly surprised with their HydraStorprowess. It is indeed quite a beast, with published numbers of backup throughput of 4PB/hour, and scales to 100PB of capacity. Most impressive indeed, and HydraStor deserves this blogger’s honourable architectural dissection.

HydraStor is NEC’s grid-based, scale-out storage platform with an object storage backend. The technology, powered by the DynamicStor ™ software, a distributed file system laid over the HydraStor grid architecture. At the same time, it has the DataRedux™ technology that provides the global in-line deduplication as the HydraStor ingests data for data protection, replication, archiving and WORM purposes. It is a massive data consolidation platform, storing gazillion loads of data (100PB you say?) for short-term and long-term retention and recovery.

The architecture is indeed solid, and its data availability goes beyond traditional RAID-level resiliency. HydraStor employs their proprietary erasure coding, called Distributed Resilient Data™. The resiliency knob can be configured to withstand 6 concurrent disks or nodes failure, but by default configured with a resiliency level of 3.

We can quickly deduce that DynamicStor™, DataRedux™ and Distributed Resilient Data™ are the technology pillars of HydraStor. How do they work, and how do they work together?

Let’s look a bit deeper into the HydraStor architecture.

HydraStor is made up of 2 types of nodes:

  • Accelerator Nodes
  • Storage Nodes

The Accelerator Nodes (AN) are the access nodes. They interface with the HydraStor front end, which could be CIFS, NFS or OST (Open Storage Technology). The AN nodes chunks the in-coming data and performs in-line deduplication at a very high speed. It can reach speed of 300TB/hour, which is blazingly fast!

The AN nodes also runs DynamicStor™, handling the performance heavy-lifting portion of HydraStor. The chunked data from the AN nodes are then passed on to the Storage Nodes (SN), where they are further “deduped in-line” to determined if the chunks are unique or not. It is a two-step inline deduplication process. Below is a diagram showing the ANs built above the SNs in the HydraStor grid architecture.

NEC AN & SN grid architecture

 

The HydraStor grid architecture is also a very scalable architecture, allow the dynamic scale-in and scale-out of both ANs and SNs. AN nodes and SN nodes can be added or removed into the system, auto-configuring and auto-optimizing while everything stays online. This capability further strengthens the reliability and the resiliency of the HydraStor.

NEC Hydrastor dynamic topology

Moving on to DataRedux™. DataRedux™ is HydraStor’s global in-line data deduplication technology. It performs dedupe at the sub-file level, with variable length window. This is performed at the AN nodes and the SN nodes level,chunking and creating unique hash values. All unique chunks are further compressed with a modified LZ compression algorithm, shrinking the data to its optimized footprint on the disk storage. To maintain the global in-line deduplication, the hash table is available across the HydraStor cluster.

NEC Deduplication & Compression

The unique data chunk resulting from deduplication and compression are then written to disks using the configured Distributed Resilient Data™ (DRD) algorithm, at its set resiliency level.

At the junction of DRD, with erasure coding parity, the data is broken up into multiples of fragments and assigned a parity to a grouping of fragments. If the resiliency level is set to 3 (the default), the data is broken into 12 pieces, 9 data fragments + 3 parity fragments. The 3 parity fragments corresponds to the resiliency level of 3. See diagram below of the 12 fragments spread across a group of selected disks in the storage pool of the Storage Nodes.

NEC DRD erasure coding on Storage Nodes

 

If the HydraStor experiences a failure in the disks or nodes, and has resulted in the loss of a fragment or fragments, the DRD self-healing function will auto-rebuild and auto-reconfigure the recovered fragments in another set of disks, maintaining the level of 3 parities.

The resiliency level, as mentioned earlier, can be set up to 6, boosting the HydraStor survival factor of 6 disks or nodes failure in the grid. See below of how the autonomous DRD recovery works:

NEC Autonomous Data recovery

Despite lacking the razzle dazzle of most Silicon Valley storage startups and upstarts, credit be given where credit is due. NEC HydraStor is indeed a strong show stopper.

However, in a market that is as fickle as storage, deduplication solutions such as HydraStor, EMC Data Domain, and HP StoreOnce, are being superceded by Copy Data Management technology, touted by Actifio. It was rumoured that EMC restructured their entire BURA (Backup Recovery Archive) division to DPAD (Data Protection and Availability Division) to go after the burgeoning copy data management market.

It would be good if NEC can take notice and turn their HydraStor “supertanker” towards the Copy Data Management market. That would be something special to savour.

P/S: NEC. Sorry about the title. I just couldn’t resist it ;-)

Praying to the hypervisor God

I was reading a great article by Frank Denneman about storage intelligence moving up the stack. It was pretty much in line with what I have been observing in the past 18 months or so, about the storage pendulum having swung back to DAS (direct attached storage). To be more precise, the DAS form factor I am referring to are physical server hardware that houses many disk drives.

Like it or not, the hypervisor has become the center of the universe in the IT space. VMware has become the indomitable force in the hypervisor technology, with Microsoft Hyper-V playing catch-up. The seismic shift of these 2 hypervisor technologies are leading storage vendors to place them on to the altar and revering them as deities. The others, with the likes of Xen and KVM, and to lesser extent Solaris Containers aren’t really worth mentioning.

This shift, as the pendulum swings from networked storage back to internal “direct-attached” storage are dictated by 4 main technology factors:

  • The x86 server architecture
  • Software-defined
  • Scale-out architecture
  • Flash-based storage technology

Anyone remember Thumper? Not the Disney character from the Bambi movie!

thumper-bambi-cartoon-character

When the SunFire X4500 (aka Thumper) was first released in (intermission: checking Wiki for the right year) in 2006, I felt that significant wound inflicted in the networked storage industry. Instead of the usual 4-8 hard disk drives in the all the industry servers at the time, the X4500 4U chassis housed 48 hard disk drives. The design and architecture were so astounding to me, I even went and bought a 1U SunFire X4150 for my personal server collection. Such was my adoration for Sun’s technology at the time.

Continue reading

Technology prowess of Riverbed SteelFusion

The Riverbed SteelFusion (aka Granite) impressed me the moment it was introduced to me 2 years ago. I remembered that genius light bulb moment well, in December 2012 to be exact, and it had left its mark on me. Like I said last week in my previous blog, the SteelFusion technology is unique in the industry so far and has differentiated itself from its WAN optimization competitors.

To further understand the ability of Riverbed SteelFusion, a deeper inspection of the technology is essential. I am fortunate to be given the opportunity to learn more about SteelFusion’s technology and here I am, sharing what I have learned.

What does the technology of SteelFusion do?

Riverbed SteelFusion takes SAN volumes from supported storage vendors in the central datacenter and projects the storage volumes (aka LUNs)to applications and hosts at the remote branches. The technology requires a paired relationship between SteelFusion Core (in the centralized datacenter) and SteelFusion Edge (at the branch). Both SteelFusion Core and Edge are fronted respectively by the Riverbed SteelHead WAN optimization device, to deliver the performance required.

The diagram below gives an overview of how the entire SteelFusion network architecture is like:

Riverbed SteelFusion Overall Solution 2 Continue reading

Convergence data strategy should not forget the branches

The word “CONVERGENCE” is boiling over as the IT industry goes gaga over darlings like Simplivity and Nutanix, and the hyper-convergence market. Yet, if we take a step back and remove our emotional attachment from the frenzy, we realize that the application and implementation of hyper-convergence technologies forgot one crucial elementThe other people and the other offices!

ROBOs (remote offices branch offices) are part of the organization, and often they are given the shorter end of the straw. ROBOs are like the family’s black sheeps. You know they are there but there is little mention of them most of the time.

Of course, through the decades, there are efforts to consolidate the organization’s circle to include ROBOs but somehow, technology was lacking. FTP used to be a popular but crude technology that binds the branch offices and the headquarter’s operations and data services. FTP is still used today, in countries where network bandwidth costs a premium. Data cloud services are beginning to appear of part of the organization’s outreaching strategy to include ROBOs but the fear of security weaknesses, data breaches and misuses is always there. Often, concerns of the weaknesses of the cloud overcome whatever bold strategies concocted and designed.

For those organizations in between, WAN acceleration/optimization techonolgy is another option. Companies like Riverbed, Silverpeak, F5 and Ipanema have addressed the ROBOs data strategy market well several years ago, but the demand for greater data consolidation and centralization, tighter and more effective data management and data control to meet the data compliance and data governance requirements, has grown much more sophisticated and advanced. Continue reading

The Prophet has arrived

Early last week, I had a catch up with my friend. He was excited to share with me the new company he just joined. It was ProphetStor. It was a catchy name and after our conversation, I have decided to spend a bit of my weekend afternoon finding out more about the company and its technology.

From another friend at FalconStor, I knew of this company several months ago. Ex-FalconStor executives have ventured to found ProphetStor as the next generation of storage resource orchestration engine. And it has found a very interesting tack to differentiate from the many would-bes of so-called “software-defined storage” leaders. ProphetStor made their early appearance at the OpenStack Summit in Hong Kong back in November last year, positioning several key technologies including OpenStack Cinder, SNIA CDMI (Cloud Data Management Interface) and SMI-S (Storage Management Initiative Specification) to provide federation of storage resources discovery, provisioning and automation. 

The federation of storage resources and services solution is aptly called ProphetStor Federator. The diagram I picked up from the El Reg article presents the Federator working with different OpenStack initiatives quite nicely below:  There are 3 things that attracted me to the uniqueness of ProphetStor.

1. The underlying storage resources, be it files, objects, or blocks, can be presented and exposed as Cinder-style volumes.

2. The ability to define the different performance capabilities and SLAs (IOPS, throughput and latency) from the underlying storage resources and matching them to the right application requirements.

3. The use of SNIA of SMI-S and CDMI Needless to say that the Federator software will abstract the physical and logical structures of any storage brands or storage architectures, giving it a very strong validation of the “software-defined storage (SDS)” concept.

While the SDS definition is still being moulded in the marketplace (and I know that SNIA already has a draft SDS paper out), the ProphetStor SDS concept does indeed look similar to the route taken by EMC ViPR. The use of the control plane (ProphetStor Federator) and the data plane (underlying physical and logical storage resource) is obvious.

I wrote about ViPR many moons ago in my blog and I see ProphetStor as another hat in the SDS ring. I grabbed the screenshot (below) from the ProphetStor website which I thought did beautifully explained what ProphetStor is from 10,000 feet view.

ProphetStor How it works

The Cinder-style volume is a class move. It preserves the sanctity of many enterprise applications which still need block storage volumes but now it comes with a twist. These block storage volumes now will have different capability and performance profiles, tagged with the relevant classifications and SLAs.

And this is where SNIA SMI-S discovery component is critical because SMI-S mines these storage characteristics and presents them to the ProphetStor Federator for storage resource classification. For storage vendors that do not have SMI-S support, ProphetStor can customize the relevant interfaces to the proprietary API to discover the storage characteristics.

On the north-end, SNIA CDMI works with the ProphetStor Federator’s Offer & Provisioning functions to bundle wrap various storage resources for the cloud and other traditional storage network architectures.

I have asked my friend for more technology deep-dive materials (he has yet to reply me) of ProphetStor to ascertain what I have just wrote. (Simon, you have to respond to me!)

This is indeed very exciting times knowing ProphetStor as one of the early leaders in the SDS space. And I like to see ProphetStor go far with this.

Now let us pray … because the prophet has arrived.

HDS HNAS kicks ass

I am dusting off the cobwebs of my blog. After almost 3 months of inactivity, (and trying to avoid the Social Guidelines Media of my present company), I have bolstered enough energy to start writing again. I am tired, and I am finishing off the previous engagements prior to joining HDS. But I am glad those are coming to an end, with the last job in Beijing next week.

So officially, I will be in HDS as of November 4, 2013 . And to get into my employer’s good books, I think I should start with something that HDS has proved many critics wrong. The notion that HDS is poor with NAS solutions has been dispelled with a recent benchmark report from SPECSfs, especially when it comes to NFS file performance. HDS has never been much of a big shouter about their HNAS, even back in the days of OEM with BlueArc. The gap period after the BlueArc acquisition was also, in my opinion, quiet unless it was the gestation period for this Kick-Ass announcement a couple of weeks ago. Here is one of the news circling in the web, from the ever trusty El-Reg.

HDS has never been big shouting like the guys, like EMC and NetApp, who have plenty of marketing dollars to spend. EMC Isilon and NetApp C-Mode have always touted their mighty SPECSfs numbers, usually with a high number of controllers or nodes behind the benchmarks. More often than not, many readers would probably focus more on the NFSops/sec figures rather than the number of heads required to generate the figures.

Unaware of this HDS announcement, I was already asking myself that question about NFSops/sec per SINGLE controller head. So, on September 26 2013, I did a table comparing some key participants of the SPECSfs2008_nfs.v3 and here is the table:

SPECSfs2008_nfs.v3-26-Sept-2013In the last columns of the 2 halves (which I have highlighted in Red), the NFSops/sec/single controller head numbers are shown. I hope that readers would view the performance numbers more objectively after reading this. Therefore, I let you make your own decisions but ultimately, they are what they are. One should not be over-mesmerized by the super million NFSops/sec until one looks under the hood. Secondly, one should also look at things more holistically such as $/NFSops/sec, $/ORT (overall response time), and $/GB/NFSops/managed and other relevant indicators of the systems sold.

But I do not want to take the thunder away from HDS’ HNAS platforms in this recent benchmark. In summary,

HDS SPECbench summaryTo reach a respectable number of 607,647 NFSops/sec with a sub-second response time is quite incredible. The ORT of 0.59 msecs should not be taken lightly because to eke just about a 0.1 msec is not easy. Therefore, reaching 0.5 millisecond is pretty awesome.

This is my first blog after 3 months. I am glad to be back and hopefully with the monkey off my back (I am referring to my outstanding engagements), I can concentrating on writing good stuff again. I know, I know … I still owe some people some entries. It’s great to be back :-)

The openness of Novell Filr technology

In the previous blog entry, I spoke about finally getting the opportunity look deeper into Novell Filr technology. As I continue my journey of exploration, I am already consolidating information about the other EFSS (Enterprise File Synchronization and Sharing) solutions out there.

Many corporate IT users are moving away from pedantic corporate IT control toward the seemingly easy to synchronize, easy to share, cloud-based services such as Dropbox and Box.net. This practice exposes a big hole in the corporate network, leaking data and files, and yet most corporate IT users are completely ignorant about such a irresponsible act.

Corporate IT users cannot blame IT for being a big A-hole because they keep tight controls of the network and security. It is their job to safeguard the company’s data and files for security, compliance and privacy reasons.

In the past 9-12 months, IT has certainly relaxed (probably “relented” is a better word) their uptight demeanour because they know they couldn’t stop the onslaught of BYOD (bring your own devices). The C-level and the senior management have practically demanded it and had forced their way to bring in their own smart devices and tablets to increase their productivity (Yeah, right!).

To alleviate data security concerns, MDM (Mobile Device Management) solutions are now hot items on the IT shopping list. Since we are talking about Novell, I also got to know that Novell also has an MDM solution called ZenWorks Mobile Management. Novell Zenworks is already well integrated with the proven Novell track record of user and identity management as well as integration with LDAP authentication systems such as Active Directory and eDirectory.

The collision of the BYOD phenomena and the need to securely share corporate data and files security conceives the Enterprise File Synchronization and Sharing market. Continue reading

Washing too much software defined

There’s been practically a firestorm when EMC announced ViPR, its own version of “software-defined storage” at EMC World last week. Whether you want to call it Virtualization Platform Re-defined or Re-imagined, competitors such as NetApp, HDS, Nexenta have taken pot-shots at EMC, and touting their own version of software-defined storage.

In the release announcement, EMC claimed the following (a cut-&-paste from the announcement):

  • The EMC ViPR Software-Defined Storage Platform uniquely provides the ability to both manage storage infrastructure (Control Plane) and the data residing within that infrastructure (Data Plane).
  • The EMC ViPR Controller leverages existing storage infrastructures for traditional workloads, but provisions new ViPR Object Data Services (with access via Amazon S3 or HDFS APIs) for next-generation workloads. ViPR Object Data Services integrate with OpenStack via Swift and can be run against enterprise or commodity storage.
  • EMC ViPR integrates tightly with VMware’s Software Defined Data Center through industry standard APIs and interoperates with Microsoft and OpenStack.

The separation of the Control Plane and the Data Plane of the ViPR allows the abstraction of 2 main layers.

Layer 1 is the abstraction of the underlying storage hardware infrastructure. Although I don’t have the full details (EMC guys please enlighten me, please!), I believe storage administrator no longer need to carve out LUNs from RAID groups or Storage Pools, striped and sliced them and further provision them into meta file systems before they are exported or shared through NAS protocols. I am , of course, quoting the underlying provisioning architecture of Celerra, which can be quite complex. Anyone who has done manual provisioning with Celerra Manager should know what I mean.

Here’s the provisioning architecture of Celerra:

Continue reading

The big boys better be flash friendly

An interesting article came up in the news this week. The article, from the ever popular The Register, mentioned 3 up and rising storage stars, Nimble Storage, Tintri and Tegile, and their assault on a flash strategy “blind spot” of the big boys, notably EMC and NetApp.

I have known about Nimble Storage and Tintri for a couple of years now, and I did take some time to read up on their storage technology offering. Tegile is new to me when it appeared on my radar after SearchStorage.com announced as the Gold Winner of the enterprise storage category for 2012.

The Register article intriqued me because it implied that these traditional storage vendors such as EMC and NetApp are probably doing a “band-aid” when putting together their flash storage strategy. And typically, I see these strategic concepts introduced by these 2 vendors:

  1. Have a server-side cache strategy by putting a PCIe card on the hosting server
  2. Have a network-based all-flash caching area
  3. Have a PCIe-based flash card on the storage system
  4. Have solid state drives (SSDs) in its disk shelves enclosures

In (1), EMC has VFCache (the server side caching software has been renamed to XtremSW Cache and under repackaging with the Xtrem brand name) and NetApp has it FlashAccel solution. Previously, as I was informed, FlashAccel was using the FusionIO ioTurbine solution but just days ago, NetApp expanded the LSI Nytro WarpDrive into its FlashAccel solution as well. The main objective of a server-side caching strategy using flash is to accelerate mostly read-based I/O operations for specific application workloads at the server side.

Continue reading

It’s all about executing the story

I have been in hibernation mode, with a bit of “writer’s block”.

I woke up in Bangalore in India at 3am, not having adjusted myself to the local timezone. Plenty of things were on my mind but I can’t help thinking about what’s happening in the enterprise storage market after the Gartner Worldwide External Controller-Based report for 4Q12 came out  last night. Below is the consolidated table from Gartner:

Just a few weeks ago, it was IDC with its Worldwide Disk Storage Tracker and below is their table as well:

Continue reading