[Preamble: I was invited by GestaltIT as a delegate to their Tech Field Day from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog represents my own opinions and views]
Sun Microsystems coined the phrase “The Network is the Computer“. It became one of the most powerful ideologies in the computing world, and over the years many technology companies have tried to emulate and practise the mantra, only to fall short.
I had never heard of Drivescale. It wasn’t on my radar until the legendary NFS guru, Brian Pawlowski, joined them in April this year. Beepy, as he is known, was CTO of NetApp and later of Pure Storage, and has held many technology leadership roles, including leading the development of NFSv3 and v4.
Prior to Tech Field Day 17, I was given some “homework”. Stephen Foskett, Chief Cat Herder (as he is known) of Tech Field Days and Storage Field Days, highly recommended Drivescale and asked the delegates to pick up some notes on their technology. Going through a couple of the videos, Drivescale’s message and philosophy resonated well with me. Perhaps it was their Sun Microsystems DNA? Many of the Drivescale team members came from Sun, and I was previously from Sun as well. I was drinking Sun’s Kool-Aid by the bucketload even before I graduated in 1991, so what Drivescale preached made a lot of sense to me.

Drivescale is all about Scale-Out Architecture at the webscale level, to address the massive scale of data processing. To understand it more deeply, we must think about “Data Locality” and “Data Mobility“. I frequently use these 2 points of discussion in my consulting practice when architecting and designing data center infrastructure. The gist of data locality is simple – the closer the data is to the processing, the cheaper, lighter and more efficient it gets. Moving data – the data mobility part – is expensive.
Furthermore, scale-out application frameworks already have some degree of resiliency built into their design. For example, the Hadoop Distributed File System (HDFS) replicates each data block 3 times by default – a replication factor of 3. I wrote about this Hadoop design consideration when I was running NetApp sales in Malaysia, trying to change IT mindsets with a focus on the NetApp E-Series. A replication factor of 3, even with 10 Gigabit Ethernet or 8 Gigabit Fibre Channel back in 2016, would have generated so much network traffic in a scale-out design that it choked. Therefore, shared storage architectures such as SAN and NAS do not always bode well with scale-out architectures, and scale-out applications much prefer a DAS or DAS-like design.
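To put rough numbers on that amplification, here is a back-of-the-envelope sketch (my own illustration, not something presented in the Drivescale session): with a replication factor of 3, the first replica of an HDFS block is typically written to the local DataNode, while the other two copies traverse the network.

```python
# Back-of-the-envelope: network traffic generated by an HDFS write
# pipeline with replication factor 3. The first replica usually lands
# on the local DataNode; the remaining replicas cross the network.

def write_traffic_bytes(data_bytes: int, replication: int = 3,
                        first_replica_local: bool = True) -> int:
    """Bytes that traverse the network to fully replicate a write."""
    remote_copies = replication - 1 if first_replica_local else replication
    return data_bytes * remote_copies

TB = 10**12
ingest = 10 * TB  # a modest 10 TB daily ingest

traffic = write_traffic_bytes(ingest)        # 20 TB of east-west traffic
ten_gbe_bytes_per_sec = 10 * 10**9 / 8       # ~1.25 GB/s on one 10GbE link

hours = traffic / ten_gbe_bytes_per_sec / 3600
print(f"{traffic / TB:.0f} TB over the network, "
      f"~{hours:.1f} hours to drain one 10GbE link")
```

Even this modest ingest keeps a 10GbE link busy for over 4 hours just replicating data – which is exactly why pushing those replicas through a shared SAN or NAS head becomes the choke point.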
Scale-out architecture is all about economics. That is why global giants like AWS, Facebook and Google don’t buy branded SANs or NAS boxes. Even Oracle Cloud Infrastructure, which we also met at this Tech Field Day, does not run its SPARCs or Exadata systems for scale-out workloads, as we found out. Drivescale wants to bring that scale-out economics to large enterprises with their technology.
In the diagram below, Drivescale creates Virtual Clusters for different webscale applications according to compute resources and storage resources. A “pink cluster” would have a rightsized resource configuration for its requirements, as would a “beige cluster” and a “yellow cluster”. Underutilized cluster configurations can be rebalanced with other virtual clusters, allowing the right fit for each requirement. The provisioning of Drivescale Virtual Clusters is executed through the Drivescale Composer, a GUI as well as the management and orchestration platform for the Drivescale Composable Platform.
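The composition model can be pictured in code. The sketch below is entirely my own illustration – the class and method names are hypothetical, not Drivescale’s actual API – but it captures the idea of carving rightsized virtual clusters from shared pools of compute nodes and drives, and rebalancing resources between clusters as requirements change.

```python
# Hypothetical model of composable virtual clusters drawn from shared
# compute and drive pools. All names are illustrative, not Drivescale's API.

class ResourcePool:
    def __init__(self, items):
        self.free = list(items)

    def take(self, n):
        """Claim n free resources from the pool."""
        if n > len(self.free):
            raise RuntimeError("pool exhausted")
        taken, self.free = self.free[:n], self.free[n:]
        return taken

    def give_back(self, items):
        """Return resources so other clusters can claim them."""
        self.free.extend(items)

class VirtualCluster:
    def __init__(self, name, nodes, drives):
        self.name, self.nodes, self.drives = name, nodes, drives

compute = ResourcePool([f"node-{i}" for i in range(12)])
drives = ResourcePool([f"hdd-{i}" for i in range(48)])

# Rightsize each cluster for its workload.
pink = VirtualCluster("pink", compute.take(4), drives.take(24))
beige = VirtualCluster("beige", compute.take(2), drives.take(8))

# Rebalance: shrink the underutilized pink cluster, grow beige.
returned, pink.drives = pink.drives[-8:], pink.drives[:-8]
drives.give_back(returned)
beige.drives += drives.take(8)

print(len(pink.drives), len(beige.drives), len(drives.free))  # 16 16 16
```

The point of the model is that no drive is physically cabled to any one server; membership in a cluster is just metadata the composer can rewrite.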
Underneath the Drivescale Composer is where the magic happens. Each compute node runs a user-space Server Agent (as shown in the diagram above), and on the storage side, an Adapter Agent has a full view and understanding of the pools of hard disks and SSDs. The Adapter Agent acts as a bridge between the storage and the Ethernet switching “fabric”, with zero-copy data movement and extremely low latency. To marry the compute and storage sides together, network- and protocol-level (iSCSI or NVMe) switching, together with LLDP (Link Layer Discovery Protocol), establishes non-blocking data paths between compute and storage.
Drivescale did not go deep into the network switching part, but that is definitely where the secret sauce is. Hence my view of their technology as “The Network is Still the Computer“.
And all this tech is deployed across multiple racks (hence the Rackscale level), across different Recovery Domains, and across different Availability Zones. The Drivescale Composer itself is a simple piece of software that sits on a Linux server with just 128GB of memory, yet it can manage up to 3,000 compute nodes with 100,000 disks or SSDs.
Drivescale also introduced a hardware composable flash adapter that allows NVMe-based SSDs to be sliced into partitions – unlike the HDDs, which are presented as single drives – providing further optimization for SSDs. This adapter is embedded into a JBOF (just a bunch of flash) array, as shown below.
I left the Drivescale session impressed, because this is the type of deep engineering design needed to solve webscale, scale-out applications. The deluge of data is already unprecedented and will continue to inundate data centers. The drive for data analytics, machine learning, deep learning and AI-based applications will grow exponentially. Not everyone is a Google, an Amazon or a Facebook, but enterprises need these levels of infrastructure – both compute and storage – to compete.
Releasing the power of infrastructure with the right economics can only be achieved by Software Composable Infrastructure such as Drivescale’s. Thus I believe they are on the right path to carry the torch of “The Network is the Computer” again.