I should be a love matchmaker.
I have spent many hours in the past few months thinking about stateful data in stateful storage containers, and how they would consummate with distributed application containers and functions-as-a-service (aka serverless, aka Lambda). It still hasn’t made much sense, and I have not solved this problem yet. Although bits and pieces were coming together, and the jigsaw looked good enough to give a cobbled-together reply, what I have now is still not good enough for me. I am still searching for answers, better than the ones I have now.
The CAP theorem is at the center of my mind. Distributed data and distributed states of data are on my mind. And by the looks of things, the computing world is heading towards containers and serverless computing too. Both distributed application containers and serverless computing make a lot of sense. If we were to engage a whole new world of fog computing, edge computing, IoT, autonomous systems, AI, and other real-time computing, I would say that the future belongs to decentralization. Cloud computing, with edge systems and devices going back to the cloud for data, is too slow. When these systems need responses in microseconds, or even nanoseconds, a round trip to the cloud is just not good enough. If we rely on the present methods to access the most relevant data, we will be too late.
If we were to build the next generation of distributed data systems, what would it be? It won’t be a data center for sure. Even the word data “CENTER” means centralization, the antimatter of distributed systems and distributed data systems.
I have seen and read many technology vendors touting object storage as the right fit for the stateful data requirements of containers and serverless computing. Application containers are distributed. So are serverless computing functions. Objects are distributed too, so I’d guess that object storage is the best answer to this. Maybe I am too naive; maybe I am too uninformed. But it seems to me that most generation 1 object storage platforms use eventual data consistency (the CAP theorem applies), and that has become the crux of my dilemma.
Eventual data consistency is latency. Eventual data consistency is lack of relevancy. Both are, to me, complete opposites of the real-time statefulness of data, and a mismatch with the requirements of distributed application containers and serverless computing.
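To make that worry concrete, here is a minimal Python sketch of the stale-read window in an eventually consistent store. Everything in it is an assumption made up for illustration: the `Replica` and `EventuallyConsistentStore` classes, the `sensor/42` key, and the explicit `sync()` call standing in for background replication. It is a toy model, not any vendor's API or how a real object store is built.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage node holding its own copy of the data (hypothetical)."""
    data: dict = field(default_factory=dict)


@dataclass
class EventuallyConsistentStore:
    """Toy store: a write lands on one replica and propagates later."""
    replicas: list
    pending: list = field(default_factory=list)  # queued (replica_index, key, value)

    def write(self, key, value, at=0):
        # Acknowledge as soon as one replica has the value.
        self.replicas[at].data[key] = value
        # Queue the update for every other replica instead of waiting for them.
        for i in range(len(self.replicas)):
            if i != at:
                self.pending.append((i, key, value))

    def read(self, key, at):
        # Reads are served from the local replica, with no coordination.
        return self.replicas[at].data.get(key)

    def sync(self):
        # Stand-in for background replication finally catching up.
        for i, key, value in self.pending:
            self.replicas[i].data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore([Replica(), Replica()])
store.write("sensor/42", "v1")
store.sync()                            # both replicas now agree on v1
store.write("sensor/42", "v2", at=0)    # the new state exists only on replica 0

stale = store.read("sensor/42", at=1)   # replica 1 has not seen v2 yet
store.sync()
fresh = store.read("sensor/42", at=1)   # converged after replication
print(stale, fresh)                     # prints "v1 v2"
```

A function or container that happens to read from replica 1 inside that window acts on `v1`, even though `v2` has already been acknowledged elsewhere. That gap is the latency and the lack of relevancy I am talking about.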
There is hope. Generation 2 multi-cloud storage systems are promising. They are bringing more intelligence and layering a new fabric over generation 1 object storage. They are touting microsecond I/O access, and I have seen several of these Gen2 vendors at SuperComputing conferences in the US and in Europe in the past 6-8 months. I love to keep learning and to gain a deeper understanding of this wonderful dilemma.
I am still out there searching for answers, better answers. My previous trips to Storage Field Days were good. In SFD12, there were Datera and Elastifile in my books. In SFD14, it was Scality. And now, in the upcoming Storage Field Day 15, I am expecting Hedvig and WekaIO to hand out more jigsaw pieces to the puzzle. I have always enjoyed the serendipity of Storage Field Days, and I am looking for more.
Happy Lunar New Year to all. It’s the Year of the Dog!