Project COSI

The S3 (Simple Storage Service) has become a de facto standard for accessing object storage. Many vendors claim 100% compatibility to S3, but from what I know, several file storage services integration and validation with the S3 have revealed otherwise. There are certain nuances that have derailed some of the more advanced integrations. I shall not reveal the ones that I know of, but let us use this thought as a basis of our discussion for Project COSI in this blog.

Project COSI high level architecture

What is Project COSI?

COSI stands for Container Object Storage Interface. It is still an alpha stage project in Kubernetes version 1.25 as of September 2022 whilst the latest version of Kubernetes today is version 1.26. To understand the objectives COSI, one must understand the journey and the challenges of persistent storage for containers and Kubernetes.

For me at least, there have been arduous arguments of provisioning a storage repository that keeps the data persistent (and permanent) after containers in a Kubernetes pod have stopped, or replicated to another cluster. And for now, many storage vendors in the industry have settled with the CSI (container storage interface) framework when it comes to data persistence using file-based and block-based storage. You can find a long list of CSI drivers here.

However, you would think that since object storage is the most native storage to containers and Kubernetes pods, there is already a consistent way to accessing object storage services. From the objectives set out by Project COSI, turns out that there isn’t a standard way to provision and accessing object storage as compared to the CSI framework for file-based and block-based storage. So the COSI objectives were set to:

  • Kubernetes Native – Use the Kubernetes API to provision, configure and manage buckets
  • Self Service – A clear delineation between administration and operations (DevOps) to enable self-service capability for DevOps personnel
  • Portability – Vendor neutrality enabled through portability across Kubernetes Clusters and across Object Storage vendors

Further details describing Project COSI can be found here at the Kubernetes site titled “Introducing COSI: Object Storage Management using Kubernetes API“.

Standardization equals technology adoption

Standardization means consistency, control, confidence. The higher the standardization across the storage and containerized apps industry, the higher the adoption of the technology. And given what I have heard from the industry over these few years, Kubernetes, to me, even till this day, is a platform and a framework that are filled and riddled with so many moving parts. Many of the components looks the same, feels the same, and sounds the same, but might not work out the same when deployed.

Therefore, the COSI standardization work is important and critical to grow this burgeoning segment, especially when we are rocketing towards disaggregation of computing service units, resources that be orchestrated to scale up or down at the execution of codes. Infrastructure-as-Code (IAC) is becoming a reality more and more with each passing day, and object storage is at the heart of this transformation for Kubernetes and containers.

Continue reading

Object Storage becoming storage lingua franca of Edge-Core-Cloud

Data Fabric was a big buzzword going back several years. I wrote a piece talking about Data Fabric, mostly NetApp®’s,  almost 7 years ago, which I titled “The Transcendence of Data Fabric“. Regardless of storage brands and technology platforms, and each has its own version and interpretations, one thing holds true. There must be a one layer of Data Singularity. But this is easier said than done.

Fast forward to present. The latest buzzword is Edge-to-Core-Cloud or Cloud-to-Core-Edge. The proliferation of Cloud Computing services, has spawned beyond to multiclouds, superclouds and of course, to Edge Computing. Data is reaching to so many premises everywhere, and like water, data has found its way.

Edge-to-Core-to-Cloud (Gratitude thanks to https://www.techtalkthai.com/dell-technologies-opens-iot-solutions-division-and-introduces-distributed-core-architecture/)

The question on my mind is can we have a single storage platform to serve the Edge-to-Core-to-Cloud paradigm? Is there a storage technology which can be the seamless singularity of data? 7+ years onwards since my Data Fabric blog, The answer is obvious. Object Storage.

The ubiquitous object storage and the S3 access protocol

For a storage technology that was initially labeled “cheap and deep”, object storage has become immensely popular with developers, cloud storage providers and is fast becoming storage repositories for data connectors. I wrote a piece called “All the Sources and Sinks going to Object Storage” over a month back, which aptly articulate how far this technology has come.

But unknown to many (Google NASD and little is found), object storage started its presence in SNIA (it was developed in Carnegie-Mellon University prior to that) in the early 90s, then known as NASD (network attached secure disk). As it is made its way into the ANSI T10 INCITS standards development, it became known as Object-based Storage Device or OSD.

The introduction of object storage services 16+ years ago by Amazon Web Services (AWS) via their Simple Storage Services (S3) further strengthened the march of object storage, solidified its status as a top tier storage platform. It was to AWS’ genius to put the REST API over HTTP/HTTPS with its game changing approach to use CRUD (create, retrieve, update, delete) operations to work with object storage. Hence the S3 protocol, which has become the de facto access protocol to object storage.

Yes, I wrote those 2 blogs 11 and 9 years ago respectively because I saw that object storage technology was a natural fit to the burgeoning new world of storage computing. It has since come true many times over.

Continue reading

Fueling the Flywheel of AWS Storage

It was bound to happen. It happened. AWS Storage is the Number 1 Storage Company.

The tell tale signs were there when Silicon Angle reported that AWS Storage revenue was around USD$6.5-7.0 billion last year and will reach USD$10 billion at the end of 2021. That news was just a month ago. Last week, IT Brand Pulse went a step further declaring AWS Storage the Number 1 in terms of revenue. Both have the numbers to back it up.

AWS Logo

How did it become that way? How did AWS Storage became numero uno?

Flywheel juggernaut

I became interested in the Flywheel concept some years back. It was conceived in Jim Collins’ book, “Good to Great” almost 20 years ago, and since then, Amazon.com has become the real life enactment of the Flywheel concept.

Amazon.com Flywheel – How each turn becomes sturdier, brawnier.

Every turn of the flywheel requires the same amount of effort although in the beginning, the noticeable effect is minuscule. But as every turn gains momentum, the returns of each turn scales greater and greater to the fixed efforts of operating a single turn.

Continue reading