Of Object Storage, Filesystems and Multi-Cloud

Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.

Object Storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. Highly available, globally distributed, simple to access, and fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged. OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM SpectrumScale), Inktank Ceph (which became RedHat Ceph), Bycast (acquired by NetApp to be StorageGrid), Quantum Lattus, Amplidata, and many more. For a period of a few years prior, it looked to me that the popularity of object storage with an S3 compatible front has overtaken distributed file systems.

What’s not to like? Object storage are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim self-healing (depending on data protection policies). But one thing that object storage rarely touted dominance was high performance I/O. There were some cases, but they were either fronted by a file system (eg. NFSv4.1 with pNFS extensions), or using some host-based, SAN-client agent (eg. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high performance I/O storage.

A few weeks ago, I read an article from Storage Soup, Dave Raffo. When I read it, it felt oxymoronic. SwiftStack was just nominated as a visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, Swiftstack did not want to be “associated” with object storage that much, even though Swiftstack’s technology underpinning was all object storage. Strange.

Continue reading

Cloud silos after eliminating silos

I love cloud computing. I love the economics and the agility of the cloud and how it changed IT forever. The cloud has solved some of the headaches of IT, notably the silos in operations, the silos in development and the silos in infrastructure.

The virtualization and abstraction of rigid infrastructures and on-premise operations have given birth to X-as-a-Service and Cloud Services. Along with this, comes cloud orchestration, cloud automation, policies, DevOps and plenty more. IT responds well to this and thus, public clouds services like Amazon Web Services, Microsoft Azure, and Google Cloud Platforms are dominating the landscape. Other cloud vendors like Rackspace, SoftLayer, Alibaba Cloud are following the leaders pack offering public, private, hybrid and specialized services as well.

In this pile, we can now see the certain “camps” emerging. Many love Azure Stack and many adore AWS Lambda. Google just had their summit here in Malaysia yesterday, appealing to a green field and looking for new adopters. What we are seeing is we have customers and end users adopting various public cloud services providers, their services, their ecosystem, their tools, their libraries and so on. We also know that many customers and end users having several applications on AWS, and some on Azure and perhaps looking for better deals with another cloud vendor. Multi-cloud is becoming flavour of the season, and that word keeps appearing in presentations and conversations.

Yes, multi-cloud is a good thing. Customers and end users would love it because they can get the most bang for their buck, if only … it wasn’t so complicated. There aren’t many “multi-cloud” platforms out there yet. Continue reading

Can NetApp do it a bit better?

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

In Day 2 of Storage Field Day 12, I and the other delegates were hustled to NetApp’s Sunnyvale campus headquarters. That was a homecoming for me, and it was a bit ironic too.

Just 8 months ago, I was NetApp Malaysia Country Manager. That country sales lead role was my second stint with NetApp. I lasted almost 1 year.

17 years ago, my first stint with NetApp was the employee #2 in Malaysia as an SE. That SE stint went by quickly for 5 1/2 years, and I loved that time. Those Fall Classics NetApp used to have at the Batcave and the Fortress of Solitude left a mark with me, and the experiences still are as vivid as ever.

Despite what has happened in both stints and even outside the circle, I am still one of NetApp’s active cheerleaders in the Asia Pacific region. I even got accused by being biased as a community leader in the SNIA Malaysia Facebook page (unofficial but recognized by SNIA), because I was supposed to be neutral. I have put in 10 years to promote the storage technology community with SNIA Malaysia. [To the guy named Stanley, my response was be “Too bad, pick a religion“.]

The highlight of the SFD12 NetApp visit was of course, having lunch with Dave Hitz, one of the co-founders and the one still remaining. But throughout the presentations, I was unimpressed.

For me, the only one which stood out was CloudSync. I have read about CloudSync since NetApp Insight 2016 and yes, it’s a nice little piece of data shipping service between on-premise and AWS cloud.

Here’s how CloudSync looks like:

Continue reading

The engineering of Elastifile

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

When it comes to large scale storage capacity requirements with distributed cloud and on-premise capability, object storage is all the rage. Amazon Web Services started the object-based S3 storage service more than a decade ago, and the romance with object storage started.

Today, there are hundreds of object-based storage vendors out there, touting features after features of invincibility. But after researching and reading through many design and architecture papers, I found that many object-based storage technology vendors began to sound the same.

At the back of my mind, object storage is not easy when it comes to most applications integration. Yes, there is a new breed of cloud-based applications with RESTful CRUD API operations to access object storage, but most applications still rely on file systems to access storage for capacity, performance and protection.

These CRUD and CRUD-like APIs are the common semantics of interfacing object storage platforms. But many, many real-world applications do not have the object semantics to interface with storage. They are mostly designed to interface and interact with file systems, and secretly, I believe many application developers and users want a file system interface to storage. It does not matter if the storage is on-premise or in the cloud.

Let’s not kid ourselves. We are most natural when we work with files and folders.

Implementing object storage also denies us the ability to optimally utilize Flash and solid state storage on-premise when the compute is in the cloud. Similarly, when the compute is on-premise and the flash-based object storage is in the cloud, you get a mismatch of performance and availability requirements as well. In the end, there has to be a compromise.

Another “feature” of object storage is its poor ability to handle transactional data. Most of the object storage do not allow modification of data once the object has been created. Putting a NAS front (aka a NAS gateway) does not take away the fact that it is still object-based storage at the very core of the infrastructure, regardless if it is on-premise or in the cloud.

Resiliency, latency and scalability are the greatest challenges when we want to build a true globally distributed storage or data services platform. Object storage can be resilient and it can scale, but it has to compromise performance and latency to be so. And managing object storage will not be as natural as to managing a file system with folders and files.

Enter Elastifile.

Continue reading