Can NetApp do it a bit better?

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

In Day 2 of Storage Field Day 12, I and the other delegates were hustled to NetApp’s Sunnyvale campus headquarters. That was a homecoming for me, and it was a bit ironic too.

Just 8 months ago, I was NetApp Malaysia Country Manager. That country sales lead role was my second stint with NetApp. I lasted almost 1 year.

17 years ago, my first stint with NetApp was the employee #2 in Malaysia as an SE. That SE stint went by quickly for 5 1/2 years, and I loved that time. Those Fall Classics NetApp used to have at the Batcave and the Fortress of Solitude left a mark with me, and the experiences still are as vivid as ever.

Despite what has happened in both stints and even outside the circle, I am still one of NetApp’s active cheerleaders in the Asia Pacific region. I even got accused by being biased as a community leader in the SNIA Malaysia Facebook page (unofficial but recognized by SNIA), because I was supposed to be neutral. I have put in 10 years to promote the storage technology community with SNIA Malaysia. [To the guy named Stanley, my response was be “Too bad, pick a religion“.]

The highlight of the SFD12 NetApp visit was of course, having lunch with Dave Hitz, one of the co-founders and the one still remaining. But throughout the presentations, I was unimpressed.

For me, the only one which stood out was CloudSync. I have read about CloudSync since NetApp Insight 2016 and yes, it’s a nice little piece of data shipping service between on-premise and AWS cloud.

Here’s how CloudSync looks like:

Continue reading

The engineering of Elastifile

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

When it comes to large scale storage capacity requirements with distributed cloud and on-premise capability, object storage is all the rage. Amazon Web Services started the object-based S3 storage service more than a decade ago, and the romance with object storage started.

Today, there are hundreds of object-based storage vendors out there, touting features after features of invincibility. But after researching and reading through many design and architecture papers, I found that many object-based storage technology vendors began to sound the same.

At the back of my mind, object storage is not easy when it comes to most applications integration. Yes, there is a new breed of cloud-based applications with RESTful CRUD API operations to access object storage, but most applications still rely on file systems to access storage for capacity, performance and protection.

These CRUD and CRUD-like APIs are the common semantics of interfacing object storage platforms. But many, many real-world applications do not have the object semantics to interface with storage. They are mostly designed to interface and interact with file systems, and secretly, I believe many application developers and users want a file system interface to storage. It does not matter if the storage is on-premise or in the cloud.

Let’s not kid ourselves. We are most natural when we work with files and folders.

Implementing object storage also denies us the ability to optimally utilize Flash and solid state storage on-premise when the compute is in the cloud. Similarly, when the compute is on-premise and the flash-based object storage is in the cloud, you get a mismatch of performance and availability requirements as well. In the end, there has to be a compromise.

Another “feature” of object storage is its poor ability to handle transactional data. Most of the object storage do not allow modification of data once the object has been created. Putting a NAS front (aka a NAS gateway) does not take away the fact that it is still object-based storage at the very core of the infrastructure, regardless if it is on-premise or in the cloud.

Resiliency, latency and scalability are the greatest challenges when we want to build a true globally distributed storage or data services platform. Object storage can be resilient and it can scale, but it has to compromise performance and latency to be so. And managing object storage will not be as natural as to managing a file system with folders and files.

Enter Elastifile.

Continue reading

Let’s smoke the storage peace pipe

NVMe (Non-Volatile Memory Express) is upon us. And in the next 2-3 years, we will see a slew of new storage solutions and technology based on NVMe.

Just a few days ago, The Register released an article “Seventeen hopefuls fight for the NVMe Fabric array crown“, and it was timely. I, for one, cannot be more excited about the development and advancement of NVMe and the upcoming NVMeF (NVMe over Fabrics).

This is it. This is the one that will end the wars of DAS, NAS and SAN and unite the warring factions between server-based SAN (the sexy name differentiating old DAS and new DAS) and the networked storage of SAN and NAS. There will be PEACE.

Remember this?

nutanix-nosan-buntingNutanix popularized the “No SAN” movement which later led to VMware VSAN and other server-based SAN solutions, hyperconverged techs such as PernixData (acquired by Nutanix), DataCore, EMC ScaleIO and also operated in hyperscalers – the likes of Facebook and Google. The hyperconverged solutions and the server-based SAN lines blurred of storage but still, they are not the usual networked storage architectures of SAN and NAS. I blogged about this, mentioning about how the pendulum has swung back to favour DAS, or to put it more appropriately, server-based SAN. There was always a “Great Divide” between the 2 modes of storage architectures. Continue reading

Can CDMI emancipate an interoperable medical records cloud ecosystem?

PREFACE: This is just a thought, an idea. I am by no means an expert in this area. I have researched this to inspire a thought process of how we can bring together 2 disparate worlds of medical records and imaging with the emerging cloud services for healthcare.

Healthcare has been moving out of its archaic shell in the past few years, and digital healthcare technology and services are booming. And this movement is part of the digital transformation which could eventually lead to a secure and compliant distribution and collaboration of health data, medical imaging and electronic medical records (EMR).

It is a blessing that today’s medical imaging industry has been consolidated with the DICOM (Digital Imaging and Communications in Medicine) standard. DICOM dictates the how medical imaging information and pictures are used, stored, printed, transmitted and exchanged. It is also a communication protocol which runs over TCP/IP, and links up different service class providers (SCPs) and service class users (SCUs), and the backend systems such as PACS (Picture Archiving & Communications Systems) and RIS (Radiology Information Systems).

Another well accepted standard is HL7 (Health Level 7), a similar Layer 7, application-level communication protocol for transferring and exchanging clinical and administrative data.

The diagram below shows a self-contained ecosystem involving the front-end HIS (Hospital Information Systems), and the integration of healthcare, medical systems and other DICOM modalities.

Hospital Enterprise

(Picture courtesy of Meddiff Technologies)

Continue reading

Oops, excuse me but your silo is showing

It is the morning that the SNIA Global Steering Committee reporting session is starting soon. I am in the office extremely early waiting for my turn to share the happenings in SNIA Malaysia.

And of late, I have been getting a lot of calls to catch up on hot technologies, notably All Flash Storage arrays and hyper-converged infrastructure. Even though I am now working for Interica, a company that focuses on Oil & Gas exploration and production software, my free coffee sessions with folks from the IT side have not diminished. And I recalled a week back in mid-March where I had coffee overdose!

Flash storage and hyperconvergence are HOT! Despite the hypes and frenzies of both flash storage and hyperconvergence, I still believe that integrating either or, or both, still have an effect that many IT managers overlook. The effect is a data silo.

Continue reading

The reverse wars – DAS vs NAS vs SAN

It has been quite an interesting 2 decades.

In the beginning (starting in the early to mid-90s), SAN (Storage Area Network) was the dominant architecture. DAS (Direct Attached Storage) was on the wane as the channel-like throughput of Fibre Channel protocol coupled by the million-device addressing of FC obliterated parallel SCSI, which was only able to handle 16 devices and throughput up to 80 (later on 160 and 320) MB/sec.

NAS, defined by CIFS/SMB and NFS protocols – was happily chugging along the 100 Mbit/sec network, and occasionally getting sucked into the arguments about why SAN was better than NAS. I was already heavily dipped into NFS, because I was pretty much a SunOS/Solaris bigot back then.

When I joined NetApp in Malaysia in 2000, that NAS-SAN wars were going on, waiting for me. NetApp (or Network Appliance as it was known then) was trying to grow beyond its dot-com roots, into the enterprise space and guys like EMC and HDS were frequently trying to put NetApp down.

It’s a toy”  was the most common jibe I got in regular engagements until EMC suddenly decided to attack Network Appliance directly with their EMC CLARiiON IP4700. EMC guys would fondly remember this as the “NetApp killer“. Continue reading

Hail Hydra!

The last of the Storage Field Day 6 on November 7th took me and the other delegates to NEC. There was an obvious, yet eerie silence among everyone about this visit. NEC? Are you kidding me?

NEC isn’t exactly THE exciting storage company in the Silicon Valley, yet I was pleasantly surprised with their HydraStorprowess. It is indeed quite a beast, with published numbers of backup throughput of 4PB/hour, and scales to 100PB of capacity. Most impressive indeed, and HydraStor deserves this blogger’s honourable architectural dissection.

HydraStor is NEC’s grid-based, scale-out storage platform with an object storage backend. The technology, powered by the DynamicStor ™ software, a distributed file system laid over the HydraStor grid architecture. At the same time, it has the DataRedux™ technology that provides the global in-line deduplication as the HydraStor ingests data for data protection, replication, archiving and WORM purposes. It is a massive data consolidation platform, storing gazillion loads of data (100PB you say?) for short-term and long-term retention and recovery.

The architecture is indeed solid, and its data availability goes beyond traditional RAID-level resiliency. HydraStor employs their proprietary erasure coding, called Distributed Resilient Data™. The resiliency knob can be configured to withstand 6 concurrent disks or nodes failure, but by default configured with a resiliency level of 3.

We can quickly deduce that DynamicStor™, DataRedux™ and Distributed Resilient Data™ are the technology pillars of HydraStor. How do they work, and how do they work together?

Let’s look a bit deeper into the HydraStor architecture.

HydraStor is made up of 2 types of nodes:

  • Accelerator Nodes
  • Storage Nodes

The Accelerator Nodes (AN) are the access nodes. They interface with the HydraStor front end, which could be CIFS, NFS or OST (Open Storage Technology). The AN nodes chunks the in-coming data and performs in-line deduplication at a very high speed. It can reach speed of 300TB/hour, which is blazingly fast!

The AN nodes also runs DynamicStor™, handling the performance heavy-lifting portion of HydraStor. The chunked data from the AN nodes are then passed on to the Storage Nodes (SN), where they are further “deduped in-line” to determined if the chunks are unique or not. It is a two-step inline deduplication process. Below is a diagram showing the ANs built above the SNs in the HydraStor grid architecture.

NEC AN & SN grid architecture


The HydraStor grid architecture is also a very scalable architecture, allow the dynamic scale-in and scale-out of both ANs and SNs. AN nodes and SN nodes can be added or removed into the system, auto-configuring and auto-optimizing while everything stays online. This capability further strengthens the reliability and the resiliency of the HydraStor.

NEC Hydrastor dynamic topology

Moving on to DataRedux™. DataRedux™ is HydraStor’s global in-line data deduplication technology. It performs dedupe at the sub-file level, with variable length window. This is performed at the AN nodes and the SN nodes level,chunking and creating unique hash values. All unique chunks are further compressed with a modified LZ compression algorithm, shrinking the data to its optimized footprint on the disk storage. To maintain the global in-line deduplication, the hash table is available across the HydraStor cluster.

NEC Deduplication & Compression

The unique data chunk resulting from deduplication and compression are then written to disks using the configured Distributed Resilient Data™ (DRD) algorithm, at its set resiliency level.

At the junction of DRD, with erasure coding parity, the data is broken up into multiples of fragments and assigned a parity to a grouping of fragments. If the resiliency level is set to 3 (the default), the data is broken into 12 pieces, 9 data fragments + 3 parity fragments. The 3 parity fragments corresponds to the resiliency level of 3. See diagram below of the 12 fragments spread across a group of selected disks in the storage pool of the Storage Nodes.

NEC DRD erasure coding on Storage Nodes


If the HydraStor experiences a failure in the disks or nodes, and has resulted in the loss of a fragment or fragments, the DRD self-healing function will auto-rebuild and auto-reconfigure the recovered fragments in another set of disks, maintaining the level of 3 parities.

The resiliency level, as mentioned earlier, can be set up to 6, boosting the HydraStor survival factor of 6 disks or nodes failure in the grid. See below of how the autonomous DRD recovery works:

NEC Autonomous Data recovery

Despite lacking the razzle dazzle of most Silicon Valley storage startups and upstarts, credit be given where credit is due. NEC HydraStor is indeed a strong show stopper.

However, in a market that is as fickle as storage, deduplication solutions such as HydraStor, EMC Data Domain, and HP StoreOnce, are being superceded by Copy Data Management technology, touted by Actifio. It was rumoured that EMC restructured their entire BURA (Backup Recovery Archive) division to DPAD (Data Protection and Availability Division) to go after the burgeoning copy data management market.

It would be good if NEC can take notice and turn their HydraStor “supertanker” towards the Copy Data Management market. That would be something special to savour.

P/S: NEC. Sorry about the title. I just couldn’t resist it 😉

The Prophet has arrived

Early last week, I had a catch up with my friend. He was excited to share with me the new company he just joined. It was ProphetStor. It was a catchy name and after our conversation, I have decided to spend a bit of my weekend afternoon finding out more about the company and its technology.

From another friend at FalconStor, I knew of this company several months ago. Ex-FalconStor executives have ventured to found ProphetStor as the next generation of storage resource orchestration engine. And it has found a very interesting tack to differentiate from the many would-bes of so-called “software-defined storage” leaders. ProphetStor made their early appearance at the OpenStack Summit in Hong Kong back in November last year, positioning several key technologies including OpenStack Cinder, SNIA CDMI (Cloud Data Management Interface) and SMI-S (Storage Management Initiative Specification) to provide federation of storage resources discovery, provisioning and automation. 

The federation of storage resources and services solution is aptly called ProphetStor Federator. The diagram I picked up from the El Reg article presents the Federator working with different OpenStack initiatives quite nicely below:  There are 3 things that attracted me to the uniqueness of ProphetStor.

1. The underlying storage resources, be it files, objects, or blocks, can be presented and exposed as Cinder-style volumes.

2. The ability to define the different performance capabilities and SLAs (IOPS, throughput and latency) from the underlying storage resources and matching them to the right application requirements.

3. The use of SNIA of SMI-S and CDMI Needless to say that the Federator software will abstract the physical and logical structures of any storage brands or storage architectures, giving it a very strong validation of the “software-defined storage (SDS)” concept.

While the SDS definition is still being moulded in the marketplace (and I know that SNIA already has a draft SDS paper out), the ProphetStor SDS concept does indeed look similar to the route taken by EMC ViPR. The use of the control plane (ProphetStor Federator) and the data plane (underlying physical and logical storage resource) is obvious.

I wrote about ViPR many moons ago in my blog and I see ProphetStor as another hat in the SDS ring. I grabbed the screenshot (below) from the ProphetStor website which I thought did beautifully explained what ProphetStor is from 10,000 feet view.

ProphetStor How it works

The Cinder-style volume is a class move. It preserves the sanctity of many enterprise applications which still need block storage volumes but now it comes with a twist. These block storage volumes now will have different capability and performance profiles, tagged with the relevant classifications and SLAs.

And this is where SNIA SMI-S discovery component is critical because SMI-S mines these storage characteristics and presents them to the ProphetStor Federator for storage resource classification. For storage vendors that do not have SMI-S support, ProphetStor can customize the relevant interfaces to the proprietary API to discover the storage characteristics.

On the north-end, SNIA CDMI works with the ProphetStor Federator’s Offer & Provisioning functions to bundle wrap various storage resources for the cloud and other traditional storage network architectures.

I have asked my friend for more technology deep-dive materials (he has yet to reply me) of ProphetStor to ascertain what I have just wrote. (Simon, you have to respond to me!)

This is indeed very exciting times knowing ProphetStor as one of the early leaders in the SDS space. And I like to see ProphetStor go far with this.

Now let us pray … because the prophet has arrived.

Has Object Storage become the everything store?

I picked up a copy of latest Brad Stone’s book, “The Everything Store: Jeff Bezos and the Age of Amazon at the airport on my way to Beijing last Saturday. I have been reading it my whole time I have been in Beijing, reading in awe about the turbulent ups and downs of

The Everything Store cover

In its own serendipitous ways, Object-based Storage Devices (OSDs) have been floating in my universe in the past few weeks. Seems like OSDs have been getting a lot of coverage lately and suddenly, while in the shower, I just had an epiphany!

Are storage vendors now positioning Object-based Storage Devices (OSDs) as Everything Store?

Continue reading

HDS HNAS kicks ass

I am dusting off the cobwebs of my blog. After almost 3 months of inactivity, (and trying to avoid the Social Guidelines Media of my present company), I have bolstered enough energy to start writing again. I am tired, and I am finishing off the previous engagements prior to joining HDS. But I am glad those are coming to an end, with the last job in Beijing next week.

So officially, I will be in HDS as of November 4, 2013 . And to get into my employer’s good books, I think I should start with something that HDS has proved many critics wrong. The notion that HDS is poor with NAS solutions has been dispelled with a recent benchmark report from SPECSfs, especially when it comes to NFS file performance. HDS has never been much of a big shouter about their HNAS, even back in the days of OEM with BlueArc. The gap period after the BlueArc acquisition was also, in my opinion, quiet unless it was the gestation period for this Kick-Ass announcement a couple of weeks ago. Here is one of the news circling in the web, from the ever trusty El-Reg.

HDS has never been big shouting like the guys, like EMC and NetApp, who have plenty of marketing dollars to spend. EMC Isilon and NetApp C-Mode have always touted their mighty SPECSfs numbers, usually with a high number of controllers or nodes behind the benchmarks. More often than not, many readers would probably focus more on the NFSops/sec figures rather than the number of heads required to generate the figures.

Unaware of this HDS announcement, I was already asking myself that question about NFSops/sec per SINGLE controller head. So, on September 26 2013, I did a table comparing some key participants of the SPECSfs2008_nfs.v3 and here is the table:

SPECSfs2008_nfs.v3-26-Sept-2013In the last columns of the 2 halves (which I have highlighted in Red), the NFSops/sec/single controller head numbers are shown. I hope that readers would view the performance numbers more objectively after reading this. Therefore, I let you make your own decisions but ultimately, they are what they are. One should not be over-mesmerized by the super million NFSops/sec until one looks under the hood. Secondly, one should also look at things more holistically such as $/NFSops/sec, $/ORT (overall response time), and $/GB/NFSops/managed and other relevant indicators of the systems sold.

But I do not want to take the thunder away from HDS’ HNAS platforms in this recent benchmark. In summary,

HDS SPECbench summaryTo reach a respectable number of 607,647 NFSops/sec with a sub-second response time is quite incredible. The ORT of 0.59 msecs should not be taken lightly because to eke just about a 0.1 msec is not easy. Therefore, reaching 0.5 millisecond is pretty awesome.

This is my first blog after 3 months. I am glad to be back and hopefully with the monkey off my back (I am referring to my outstanding engagements), I can concentrating on writing good stuff again. I know, I know … I still owe some people some entries. It’s great to be back 🙂