Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.
Object storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. It was highly available, globally distributed, simple to access, and fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged: OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM Cloud Object Storage), Inktank Ceph (which became Red Hat Ceph Storage), Bycast (acquired by NetApp to become StorageGRID), Quantum Lattus, Amplidata, and many more. For a few years, it looked to me as if the popularity of object storage with an S3-compatible front end had overtaken distributed file systems.
What’s not to like? Object stores are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim to be self-healing (depending on data protection policies). But one thing that object storage rarely touted was high performance I/O. There were some cases, but they were either fronted by a file system (e.g. NFSv4.1 with pNFS extensions), or used some host-based, SAN-client agent (e.g. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high performance I/O storage.
A few weeks ago, I read an article by Dave Raffo on Storage Soup. When I read it, it felt oxymoronic. SwiftStack had just been named a Visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, SwiftStack did not want to be “associated” with object storage that much, even though SwiftStack’s underlying technology was all object storage. Strange.
Why the disassociation? Then came a bunch of announcements within the space of a month. Cloudian has been bugging me with frequent email blasts about HyperStore; SwiftStack touted its multi-cloud controllers more than its object storage; and on my recent Storage Field Day 14 trip, Scality shared their Zenko.io multi-cloud data controller. Strangely, Zenko’s web presence had a completely different look and feel from Scality’s, with only a small reciprocal mention of each other on their respective web sites. Why the distinction?
What has changed? For a moment, object storage took a slight turn with these 3 companies: Cloudian, Scality and SwiftStack. Coming from behind, their present multi-cloud messaging has more of a “file system” feel, tagged with a performance-enhanced capability not previously sold with object storage.
I see this “pivot” by these 3 companies as a sign of object storage maturing in the clouds. It is no longer just Azure Blob, AWS S3 or GCP Storage, but a new “file system”/“data fabric” layer covering these early cloud storage offerings.
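To make the “layer covering over” idea a little more concrete, here is a toy sketch of a single namespace sitting on top of several object stores. Everything in it is an assumption made for illustration: `InMemoryBucket` and `MultiCloudNamespace` are hypothetical names, the backends are plain in-memory dictionaries rather than real cloud SDK clients, and the write-to-all, read-from-first policy is only one of many possible designs. It does not describe how Zenko, SwiftStack or Cloudian actually work.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class InMemoryBucket:
    """Hypothetical stand-in for one cloud object store (e.g. S3, Azure Blob, GCP Storage)."""
    name: str
    objects: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is not present in this backend.
        return self.objects.get(key)


class MultiCloudNamespace:
    """Toy "data fabric": one namespace over several backends.

    Assumed policy (not taken from any vendor): every write goes to all
    backends, and a read is served by the first backend that holds the key.
    """

    def __init__(self, backends: Iterable[InMemoryBucket]) -> None:
        self.backends: List[InMemoryBucket] = list(backends)

    def put(self, key: str, data: bytes) -> None:
        for backend in self.backends:
            backend.put(key, data)

    def get(self, key: str) -> Optional[bytes]:
        for backend in self.backends:
            data = backend.get(key)
            if data is not None:
                return data
        return None

    def locations(self, key: str) -> List[str]:
        # Names of the backends that currently hold a copy of the key.
        return [b.name for b in self.backends if b.get(key) is not None]
```

An application would talk only to `MultiCloudNamespace`, calling `put("reports/q3.csv", data)` and `get("reports/q3.csv")` without knowing which cloud answers. A real controller would also have to handle partial failures, consistency between copies, and placement policy, none of which this sketch attempts.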
Perhaps the next generation of cloud storage is here. One in which the eventual consistency of distributed data stores is no longer an issue. One where always-relevant, always-consistent data brings the high performance I/O game to cloud storage. That is something I would love to see, and I think it is already happening through this new generation of multi-cloud data fabric technologies.
[Note: Non-HPC data fabric companies took part in the SC17 supercomputing conference last week in Denver. On my radar, Elastifile and OpenIO were at the conference, further evidence that these multi-cloud technologies are putting high performance I/O in their game plan.]