Cohesity SpanFS – a foundational shift

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Cohesity SpanFS impressed me. Their filesystem was designed from ground up to meet the demands of the voluminous cloud-scale data, and yes, the sheer magnitude of data everywhere needs to be managed.

We all know that primary data is always the more important piece of data landscape but there is a growing need to address the secondary data segment as well.

Like a floating iceberg, the piece that is sticking out is the more important primary data but the larger piece beneath the surface of the water, which is the secondary data, is becoming more valuable. Applications such as file shares, archiving, backup, test and development, and analytics and insights are maturing as the foundational data management frameworks and fast becoming the bedrock of businesses.

The ability of businesses to bounce back after a disaster; the relentless testing of large data sets to develop new competitive advantage for businesses; the affirmations and the insights of analyzing data to reduce risks in decision making; all these are the powerful back engine applicability that thrust businesses forward. Even the ability to search for the right information in a sea of data for regulatory and compliance reasons is part of the organization’s data management application.

Continue reading

Magic happening

[Preamble: I am a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

The magic is happening.

Dropbox, the magical disruptor, is going IPO.

When Dropbox first entered into the market which eventually termed as BYOD (Bring your Own Device), it was a phenomenon. There was nothing else that matched its simplicity and ease-of-use. A file uploaded into the cloud was instantaneously available on the tablets and smart phones. It was on every storage vendor’s presentation slides, using Dropbox as the perennial name dropping tactic to get end users buy-in.

Dropbox was more than that, and it went on to define a whole new market segment known as Enterprise File Synchronization and Sharing (EFSS), together with everybody else such as Box, Easishare (they are here in South East Asia), and just about everybody else. And the executive team at Dropbox knew they were special too, so much so that they rejected a buyout attempt by Apple in 2011.

Today, Dropbox is beyond BYOD and EFSS. They are a full fledged collaboration platform that includes project management, project workflow, file versioning, secure file transfer, smart file synchronization and Dropbox Paper. And they offer comprehensive plans from Basic, Plus and Professional to Business and Enterprise. Their upcoming IPO, I am sure, will give them far greater capital to expand, and realize their full potential as the foremost content-based collaboration platform in the world.

Dropbox began their exodus from AWS a couple of years ago. They wanted to control their destiny and have moved more than 500PB into their own private data center for their customer data. That was half-an-exabyte, people! And two years later, they saved $75million of operating costs after they exited AWS. Today, they have more than 1 Exabyte of customer data! That is just incredible.

And Dropbox’s storage architecture started with a simple foundational design called “Magic Pocket“. Magic Pocket is a “fixed-length, immutable” block storage layer.

The block size is fixed at 4MB chunks (for parallel performance and service resumption reasons), compressed and deduped (for capacity savings reasons), encrypted (for security reasons) and replicated (for high availability reasons).

Continue reading

Storage dinosaurs evolving too

[Preamble: I am a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I have been called a dinosaur. We storage networking professionals and storage technologists have been called dinosaurs. It wasn’t offensive or anything like that and I knew it was coming because the writing was on the wall, … or is it?

The cloud and the breakneck pace of all the technologies that came along have made us, the storage networking professionals, look like relics. The storage guys have been pigeonholed into a sunset segment of the IT industry. SAN and NAS, according to the non-practitioners, were no longer relevant. And cloud has clout (pun intended) us out of the park.

I don’t see us that way. I see that the Storage Dinosaurs are evolving as well, and our storage foundational knowledge and experience are more relevant that ever. And the greatest assets that we, the storage networking professionals, have is our deep understanding of data.

A little over a year ago, I changed the term Storage in my universe to Data Services Platform, and here was the blog I wrote. I blogged again just before the year 2018 began.


Continue reading

My dilemma of stateful storage marriage

I should be a love match maker.

I have been spending much hours in the past few months, thinking of stateful data in stateful storage containers and how they would consummate with distributed applications containers and functions-as-a-service (aka serverless, aka Lambda). It still hasn’t made much sense, and I have not solved this problem yet. Although there were bits and pieces that coming together and the jigsaw looked well enough to give a cackled reply, what I have now is still not good enough for me. I am still searching for answers, better than the ones I have now.

The CAP theorem is in center of my mind. Distributed data, distributed states of data are on my mind. And by the looks of things, the computing world is heading towards containers and serverless computing too. Both distributed applications containers and serverless computing make a lot of sense. If we were to engage a whole new world of fog computing, edge computing, IoT, autonomous systems, AI, and other real-time computing, I would say that the future belongs to decentralization. Cloud Computing and having edge systems and devices getting back to the cloud for data is too slow. The latency of micro- or even nano-seconds is just not good enough. If we rely on the present methods to access the most relevant data, we are too late.

Continue reading

Data services platforms – 2018 and beyond

2017 is drawing to a close. Sadly, I was greeted with the news of Oracle laying off their storage hardware sales team yesterday. And I couldn’t help to see where all this is going, the Oracle Cloud. The cloud has become the data services platform of choice, not on-premise storage infrastructure anymore.

Years ago, when I first started blogging, I wrote that Cloud Computing could make you lose your job unless… I don’t usually make predictions but that prediction in 2011 is becoming true and more prevalent.

What is the future looking like?

For one, it is not bleak and plenty to look forward to. Forrester predicts that in 2018 (or at the end of it), Amazon AWS, Microsoft Azure and Google will capture 76% of all cloud platform revenue and 80% by 2020. More data will be generated at the edge than it is being created centrally on public clouds. The demand for high performance data services platforms will be beyond your usual object-based storage and having a data singularity where data can transcend across premises will become crucial in maintaining, extending and improving the services from core-to-edge. Multi-Cloud or Cross-Cloud services platforms are maturing because the cloud platform space, while dominated by AWS, Azure and Google, includes IBM Cloud, Oracle Cloud, Rackspace, Alibaba Cloud, and is also about localized and regionalized players like Markely, Virtustream and ReScale serving unique and niche markets.

To address this shift, data services platforms are reinventing itself to be different. Flash-based, NVMe storage (err, I mean data services platform) is the foundation of building and drive self-service analytics, whether it is file-aware or content-aware, or infrastructure-aware. This new found “awareness” would inculcate platform intelligence and data intelligence, driving automation towards predictive and preemptive actions.

From a security point of view, data privacy and data governance take precedence of form and shape. As Europe enforces the General Data Protection Regulation in May of 2018, the proliferation of multi-clouds and cross-clouds will be questions. How safe is my organization’s data? How will it be regulated as the data crosses cloud boundaries? How to ensure that data workflows and pipelines move freely to shared and unencumbered? These questions are surely be eyeballed in any data regulated segments of the businesses and the individuals who have dealings in those markets.

What about people like us who have been in the storage technology industry for a lot time? I have reverberated that a technology person doing technical work has stand out. Going back to my old 2011 blog, you have to be better than better, knowing the technologies deeper than deeper, and be more connected than you are connected right now. Be EXTRAORDINARY than the typical run-of-the-mill engineer or consultant or architect. Stand Out!

This is not a prediction for the future. I am not a futurist but the signs of change upon the data services platforms (storage for you dinosaurs, yours truly included) are shaping up to tangible forms. And we are going to see lots of more disruptive stuff in 2018 and beyond.

Just my once-in-a-while ranting and we will have a fantastic 2018!


Of Object Storage, Filesystems and Multi-Cloud

Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.

Object Storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. Highly available, globally distributed, simple to access, and fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged. OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM SpectrumScale), Inktank Ceph (which became RedHat Ceph), Bycast (acquired by NetApp to be StorageGrid), Quantum Lattus, Amplidata, and many more. For a period of a few years prior, it looked to me that the popularity of object storage with an S3 compatible front has overtaken distributed file systems.

What’s not to like? Object storage are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim self-healing (depending on data protection policies). But one thing that object storage rarely touted dominance was high performance I/O. There were some cases, but they were either fronted by a file system (eg. NFSv4.1 with pNFS extensions), or using some host-based, SAN-client agent (eg. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high performance I/O storage.

A few weeks ago, I read an article from Storage Soup, Dave Raffo. When I read it, it felt oxymoronic. SwiftStack was just nominated as a visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, Swiftstack did not want to be “associated” with object storage that much, even though Swiftstack’s technology underpinning was all object storage. Strange.

Continue reading

Commvault UDI – a new CPUU

[Preamble: I am a delegate of Storage Field Day 14. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I am here at the Commvault GO 2017. Bob Hammer, Commvault’s CEO is on stage right now. He shares his wisdom and the message is clear. IT to DT. IT to DT? Yes, Information Technology to Data Technology. It is all about the DATA.

The data landscape has changed. The cloud has changed everything. And data is everywhere. This omnipresence of data presents new complexity and new challenges. It is great to get Commvault acknowledging and accepting this change and the challenges that come along with it, and introducing their HyperScale technology and their secret sauce – Universal Dynamic Index.

Continue reading

DellEMC Forum 2017 – A contrast of worlds

The DellEMC Forum 2017 in Kuala Lumpur concluded yesterday. I was there to catch up with old friends, pick up gossips in the grapevine but most of all, to really see how the new DellEMC was doing.

From the news and sources for the past 1 year, everything looked fine and dandy. In fact, I was super impressed in the way the whole merger thing was going, because DellEMC was firing at all fronts.

And the DellEMC Forum started with some good energy. They brought out many of the storage, hyper-converged, software-defined solutions. On show, they had XtremIO, Scale IO, Isilon, and even a full hands-on lab booth at the corner end of the floor. DellEMC was also inclusive the SMB/SME segment, dedicating a separate pavilion for the SMB/SME solutions.

Continue reading

Solid in the Fire

December 22 2015: I kept this blog in draft for 6 months. Now I am releasing it as NetApp acquires Solidfire.


The above is an old Chinese adage which means “True Gold fears no Fire“. That is how I would describe my revisited view and assessment of SolidFire, a high performance All-Flash array vendor which is starting to make its presence felt in South Asia.

I first blogged about SolidFire 3 years ago, and I have been following the company closely as more and more All-Flash array players entered the market over the 3 years. Many rode on the hype and momentum of flash storage, and as a result, muddied and convoluted the storage infrastructure market understanding. It seems to me spin marketing ruled the day and users could not make a difference between vendor A and vendor B, and C and D, and so on….

I have been often asked, which is the best All-Flash array today. I have always hesitated to say which is the best because there aren’t much to say, except for 2-3 well entrenched vendors. Pure Storage and EMC XtremIO come to mind but the one that had stayed under the enterprise storage radar was SolidFire, until now.

SolidFire Logo

Continue reading

HDS HNAS kicks ass

I am dusting off the cobwebs of my blog. After almost 3 months of inactivity, (and trying to avoid the Social Guidelines Media of my present company), I have bolstered enough energy to start writing again. I am tired, and I am finishing off the previous engagements prior to joining HDS. But I am glad those are coming to an end, with the last job in Beijing next week.

So officially, I will be in HDS as of November 4, 2013 . And to get into my employer’s good books, I think I should start with something that HDS has proved many critics wrong. The notion that HDS is poor with NAS solutions has been dispelled with a recent benchmark report from SPECSfs, especially when it comes to NFS file performance. HDS has never been much of a big shouter about their HNAS, even back in the days of OEM with BlueArc. The gap period after the BlueArc acquisition was also, in my opinion, quiet unless it was the gestation period for this Kick-Ass announcement a couple of weeks ago. Here is one of the news circling in the web, from the ever trusty El-Reg.

HDS has never been big shouting like the guys, like EMC and NetApp, who have plenty of marketing dollars to spend. EMC Isilon and NetApp C-Mode have always touted their mighty SPECSfs numbers, usually with a high number of controllers or nodes behind the benchmarks. More often than not, many readers would probably focus more on the NFSops/sec figures rather than the number of heads required to generate the figures.

Unaware of this HDS announcement, I was already asking myself that question about NFSops/sec per SINGLE controller head. So, on September 26 2013, I did a table comparing some key participants of the SPECSfs2008_nfs.v3 and here is the table:

SPECSfs2008_nfs.v3-26-Sept-2013In the last columns of the 2 halves (which I have highlighted in Red), the NFSops/sec/single controller head numbers are shown. I hope that readers would view the performance numbers more objectively after reading this. Therefore, I let you make your own decisions but ultimately, they are what they are. One should not be over-mesmerized by the super million NFSops/sec until one looks under the hood. Secondly, one should also look at things more holistically such as $/NFSops/sec, $/ORT (overall response time), and $/GB/NFSops/managed and other relevant indicators of the systems sold.

But I do not want to take the thunder away from HDS’ HNAS platforms in this recent benchmark. In summary,

HDS SPECbench summaryTo reach a respectable number of 607,647 NFSops/sec with a sub-second response time is quite incredible. The ORT of 0.59 msecs should not be taken lightly because to eke just about a 0.1 msec is not easy. Therefore, reaching 0.5 millisecond is pretty awesome.

This is my first blog after 3 months. I am glad to be back and hopefully with the monkey off my back (I am referring to my outstanding engagements), I can concentrating on writing good stuff again. I know, I know … I still owe some people some entries. It’s great to be back 🙂