The multi-cloud era for infrastructure-as-a-service (IaaS) is not here (yet), whatever the technology marketers want you to think. The hype, the vapourware, the frenzy: it is what they do. The same goes for technology analysts, who describe visions and futures, and the high-level constructs and strategies to get there. The multi-cloud hype is often framed as running applications and infrastructure services seamlessly across several public clouds, such as Amazon AWS, Microsoft® Azure and Google Cloud Platform, and linking them to on-premises data centers and private clouds. Hybrid is the new black.
And the aspiration of multi-cloud is the right one, once it is truly ready. Gartner® wrote a high-level article titled “Why Organizations Choose a Multicloud Strategy”. Taking advantage of each individual cloud’s strengths and resiliency in its respective geographies makes good business sense, but there are many other considerations that cannot be an afterthought. In this blog, we look at a few of them from a data storage perspective.
In the beginning there was …
For this storage dinosaur, data storage and compute have always been coupled as one. In the mainframe DASD days, the two were together. Even with the rise of networking architectures and protocols, from IBM SNA, DECnet, Ethernet & TCP/IP, and Token Ring FC-SAN (sorry, this is just a joke), the SANs and the filers stayed close to the servers, albeit with a network-buffered layer in between.
A decade ago, when the public clouds started appearing, data storage and compute were mostly inseparable. There was a clear demarcation between public clouds and private clouds. The notion of hybrid clouds meant public clouds and private clouds could intermix with on-premises computing and data storage, but in almost all cases this was confined to a single public cloud provider. Then these public cloud providers realized they were not able to convincingly entice the larger enterprises to move their IT out of their on-premises data centers to the cloud. So they reversed their strategy and peddled their cloud services back to on-prem. Today, Amazon AWS has Outposts; Microsoft® Azure has Arc; and Google Cloud Platform launched Anthos.
The Elephant and the Birds
There is an interesting analogy I adored from some NetApp® presentation slides a while back. The huge elephant, heavy and hard to move, is the Data Storage. It is encumbered by many things, including backup copies, archived copies, compliance regulations and so on. It is tethered to several data point requirements, which I formulated as A.P.P.A.R.M.S.C. [ Read more here ]
The birds are the Compute: lightweight, mobile, agile and non-persistent. They can be spun up or down easily.
The opposing polarities of this dichotomy (the heavy versus the light) meant that data storage became the center of gravity of IT. It attracted more applications and services, and with them, more copies, and copies of those copies. Data storage is hard to move. Thus, compute must stay close to data storage to reduce latency and improve response times, leading to better outcomes and experiences.
The Conceptual-Reality Conundrum
In a multi-cloud space, how will this pan out? There are a few “models” to consider.
- We can put the data storage infrastructure in each of the clouds, close to that cloud’s compute, to take advantage of each public cloud’s unique application capabilities: for example, Google Cloud® for analytics, Microsoft Azure for the Office productivity suite, and AWS for DocumentDB.
- We can put the data storage at a cloud exchange like Equinix® and use AWS Direct Connect, Azure ExpressRoute or Google Dedicated Interconnect to link up with the respective public cloud infrastructure for each type of application. The 10 Gigabit links would still be considered low latency to the compute in the public clouds.
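To put the 10 Gigabit claim in perspective, here is a back-of-envelope sketch of how long bulk data movement takes over such a link. The dataset size, link efficiency and function name are illustrative assumptions for this sketch, not figures from any provider.

```python
# Rough transfer-time estimate over a dedicated cloud-exchange link
# (Direct Connect / ExpressRoute / Dedicated Interconnect class).
# All numbers below are assumptions for illustration only.

def transfer_hours(dataset_gb: float, link_gbps: float = 10.0,
                   efficiency: float = 0.8) -> float:
    """Hours to move dataset_gb over a link_gbps pipe, assuming the
    given protocol efficiency (TCP/encryption overhead, etc.)."""
    gigabits = dataset_gb * 8          # convert GB to gigabits
    seconds = gigabits / (link_gbps * efficiency)
    return seconds / 3600

# e.g. a 50 TB dataset over a 10 Gbit/s link at 80% efficiency
print(f"{transfer_hours(50_000):.1f} h")  # -> 13.9 h
```

Low latency for request/response traffic, in other words, but bulk repatriation or re-seeding of storage across the exchange is still measured in hours.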
Here is a depiction of Model 2, from Faction® Multi-Cloud services.
In Model 1, here are the possible disadvantages:
- Data silos; single source of truth may be harder to achieve
- Lock-in to respective public cloud services
- Lift and shift is required for on-premises enterprise applications that were not made for the cloud
- Different data strategies, e.g. 3 different backups for 3 different clouds, unless you subscribe to a third-party cloud backup solution like Druva or Clumio. Egress fees still apply.
- Maintaining different management consoles, security and compliance policies, processes and procedures for each cloud
- Data transfer costs between clouds
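The egress and cross-cloud transfer points above can be made concrete with a small sketch. The per-GB rates and sync volumes below are assumptions for illustration, not published prices from any provider.

```python
# Illustrative Model 1 cost sketch: each cloud holds a storage copy,
# so keeping them in sync pays egress on every outbound hop.
# Per-GB rates here are ASSUMED figures, not real price sheets.

EGRESS_PER_GB = {"aws": 0.09, "azure": 0.087, "gcp": 0.12}  # assumed USD/GB

def monthly_sync_cost(gb_synced_out: dict[str, float]) -> float:
    """Sum the egress charges for data each cloud ships to the others."""
    return sum(EGRESS_PER_GB[cloud] * gb
               for cloud, gb in gb_synced_out.items())

# e.g. each cloud pushes 2 TB of changed data out per month
cost = monthly_sync_cost({"aws": 2000, "azure": 2000, "gcp": 2000})
print(f"${cost:,.0f}/month")  # -> $594/month
```

Even modest change rates compound quickly once three clouds are cross-replicating, which is why the egress line item deserves scrutiny before committing to Model 1.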
In Model 2:
- Data is centralized in the cloud exchange data center, but this service might not exist in every geographical location. Data compliance and sovereignty requirements may restrict it to specific locations
- Data transfer between the cloud exchange and each of the respective clouds may incur heavier costs, as the data moves in and out regularly
- May limit some storage network, security and enterprise application architecture designs
There are pros to both models as well, but at this point, the disadvantages in reality may outweigh the lofty concepts of multi-cloud.
Shiny things don’t always last
The Gartner® Hype Cycle is right. Version 1.0 is for the innovators and the early adopters, way before reality sets in. In many enterprises there are test labs and skunkworks that are ready to fail, and to fail fast, to disrupt business as usual. Never settle.
But the push to digital transformation, especially one that involves enterprise data on enterprise storage, must be taken with a dose of reality. The attraction to shiny things, and being enamoured by the promise that multi-cloud storage is the panacea, must be questioned at length before betting the enterprise applications on them.
The only issue I see today is that many storage technology vendors continue to tout multi-cloud with very little reality dust sprinkled on the messaging. It is time for enterprises to ask the hard questions. Stop the hype.