Figuring out storage for Kubernetes and containers

Oops! I forgot about you!

To me, containers and container orchestration (CO) engines such as Kubernetes, Mesos and Docker Swarm are fantastic. They scale effortlessly and are truly designed for cloud native applications (CNA).

But one thing irks me: storage management for containers and COs. It is as if, when containers and the CO engines were designed and built, the considerations of storage and storage management were forgotten. At least the persistent part of storage.

Over a year ago, I was in two minds about persistent storage, especially given the transient nature of the microservices that were so prevalent and were inundating the cloud native applications landscape. I was searching for answers in my blog. The decentralization of microservices in containers means mass deployment at the edge, but getting the pre-processed and post-processed data to stick to the persistent storage at the edge device is a challenge. The operative word here is “STICK”.

Two different worlds

Containers were initially designed and built for lightweight applications such as microservices. The runtime, libraries, configuration files and dependencies are all in one package. They were meant to do simple tasks quickly and scale to thousands easily. They could be brought up and brought down in little time and did not have to bother about the persistent data stored by the host. The state of the containers was also not important to the application tasks at hand.

Today, container platforms like Docker have matured to run enterprise applications, and the state of the container is important. The applications must know the state and the health of the container. The container could be in online mode, online but not accepting data mode, suspended mode, paused mode, interrupted mode, quiesced mode or halted mode. Each mode or state of the container is important to the running applications, and the container can easily be brought up or down in an instant with a single command. The stateful nature of the containers and applications is critical for the business. The same situation applies to container orchestration engines such as Kubernetes.

Container and Kubernetes Storage

Docker provides 3 methods for local storage, as described in the diagram below:


WekaIO controls their performance destiny

[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

I was first introduced to WekaIO back in Storage Field Day 15. I did not blog about them back then, but I have followed their progress quite attentively throughout 2018. 2 Storage Field Days and a year later, they were back for Storage Field Day 18 with a new CTO, Andy Watson, and several performance benchmark records.

Blowout year

2018 was a blowout year for WekaIO. They experienced over 400% growth, placed #1 in the Virtual Institute IO-500 10-node performance challenge, and also became #1 in the SPEC SFS 2014 performance and latency benchmark. (Note: This record was broken by NetApp a few days later but at a higher cost per client)

The Virtual Institute for I/O IO-500 10-node performance challenge was particularly interesting, because it pitted WekaIO against the Oak Ridge National Lab (ORNL) Summit supercomputer, and WekaIO won. Details of the challenge were listed in Blocks and Files, and the WekaIO Matrix Filesystem became the fastest parallel file system in the world to date.

Control, control and control

I studied WekaIO’s architecture prior to this Field Day. And I spent quite a bit of time digesting and understanding their data paths, I/O paths and control paths, in particular, the diagram below:

Starting from the top right corner of the diagram, applications run on the Linux client (running Weka Client software), which sees WekaIO as a POSIX-compliant file system. Through the network, the Linux client interacts with the WekaIO kernel-based VFS (virtual file system) driver which coordinates the Front End (grey box in upper right corner) to the Linux client. Other client-based protocols such as NFS, SMB, S3 and HDFS are also supported. The Front End then interacts with the NIC (which can be 10/100G Ethernet, Infiniband, and NVMeoF) through SR-IOV (single root IO virtualization), bypassing the Linux kernel for maximum throughput. This is done with WekaIO’s own networking stack in user space.

StorPool – Block storage managed well

Storage technology is complex. Storage infrastructure and data management operations are not trivial, despite what the hyperscalers like Amazon Web Services and Microsoft Azure would like you to think. As the adoption of cloud infrastructure services grows, the small and medium businesses/enterprises (SMB/SME) are usually left to their own devices to manage the virtual storage infrastructure. Cloud Service Providers (CSPs) addressing the SMB/SME market are looking for easier, worry-free, software-defined storage to elevate their value to their customers.

Managed high performance block storage

Enter StorPool.

StorPool is a scale-out block storage technology, capable of delivering 1 million+ IOPS with sub-millisecond response times. As described by fellow delegate Ray Lucchesi in his recent blog, they were able to achieve these impressive performance numbers in their demo without a high throughput RDMA network or the storage class memory of Intel Optane.

Clever Cohesity

This is clever. This is very smart.

The moment the Cohesity App Marketplace pitch was shared at the Storage Field Day 18 session, somewhere in my mind, enlightenment came to me.

The hyperconverged platform for secondary data, or is it?

When Cohesity came onto the scene, they were branded the latest unicorn alongside Rubrik. Both were gunning to be the top hyperconverged platform for secondary data. Crazy money was pouring into that segment (Cohesity got USD250 million in June 2018; Rubrik received USD261 million in Jan 2019), making the market for hyperconverged platforms for secondary data red-hot.

VAST Data must be something special

Vast Data coming out bash!

The delegates of Storage Field Days have always been the lucky bunch. We have witnessed several storage technology companies coming out of stealth at these Tech Field Days. The recent ones in memory for me were Excelero and Hammerspace. But to have the venerable storage doyen, Mr. Howard Marks, Vast Data's new tech evangelist, introduce the deep dive of Vast Data technology was something special.

For those who know Howard, he is fiercely independent, very smart about storage technology, opinionated and not easily impressed. As a storage technology connoisseur myself, I believe Howard must have seen something special in Vast Data. They must be doing something so unique and impressive that someone like Howard could not resist, and it made him jump to the vendor side. This sets the tone of my blog.


Quantum Corp should spin off Stornext

What’s happening at Quantum Corporation?

I picked up the latest development news about Quantum Corporation. Last month, in December 2018, they secured a USD210 million financial lifeline to support their deflating business and their debts. And if you follow their development, they are on their 3rd CEO in the past 12 months, which is quite extraordinary. What is happening at Quantum Corp?

Quantum Logo (PRNewsFoto/Quantum Corp.)

Stornext – The Swiss Army knife of Data Management

I have known Quantum since 2000, when they were very focused on the DLT tape library business. At that time, prior to the coming of LTO, DLT and its successor, SuperDLT, dominated the tape market together with IBM. In 2006, they acquired ADIC, another tape vendor, and became one of the largest tape library vendors in the world. From the ADIC acquisition, Quantum also acquired the rights to Stornext, a high performance scale-out file system. I was deeply impressed with Stornext, and I once called it the Swiss Army knife of Data Management. The versatility of Stornext addressed many of the required functions within the data management lifecycle and workflows, and thus it has made its name in the Media and Entertainment space.

Jack of all trades, master of none

However, Quantum has never reached great heights in my opinion. They are everything to everybody, like a Jack of all trades, master of none. They do backup with their tape libraries and DXi series, archive and tiering with the Lattus, hybrid storage with QXS, and file system and scale-out with Stornext. If they have good business run rates and a healthy pipeline, having a broad product line is fine and dandy. But Quantum has been changing CEOs like a turning turnstile, and amid “a few” accounting missteps and a 2018 CEO who only lasted 5 months, they had better steady their rocking boat quickly.

Storage and Data Management Planning crucial for Malaysian SMBs

Hybrid IT for 2019 and beyond

2019 is here.

I am especially buoyed by the strong network storage industry footing in 2018, reported by The Register last week. 2018 was certainly a blowout year for storage infrastructure and storage software, both for on-premises and the cloud computing platforms. The AWS Outposts announcement over a month ago also just affirmed that the new world is Hybrid IT. And there is plenty to look forward to in 2019.

Malaysian Economic Doldrums

Things are not as rosy for the Malaysian economy in 2019. It will be a challenging 2019, as reported by the Edge, a local business publication. GDP (gross domestic product) growth slowed from 5.9% in 2017 to 4.65% in the first half of 2018, and it is estimated to be 4.9% in 2019. With an inexperienced new government, a weak currency, and more competitive economies emerging in ASEAN, Malaysian small and medium businesses (SMBs) could be challenged.

The knee-jerk reaction would be to cut IT spending and revert to buying on price. This has happened too often, because there are always other operating costs that may be more pressing. Furthermore, many of the SMBs are still aimless when it comes to transforming their businesses for the digital data era, groping in the dark and struggling to get their money's worth from their IT investments. Often, many are misinformed and stumble, resulting in much higher wastage and costs.

There is a local saying here:

Good thing No Cheap; Cheap thing No Good

The saying aptly describes the value of investing well: price should not always be the main deciding criterion when buying IT infrastructure, software and services.

Many of these SMBs also lack experienced IT staff to manage their IT environment. There is also an urgency to modernize IT, because a well-planned and well-executed IT strategy and operations would definitely increase their Competitive Advantage.

From the past to the future

2019 beckons. The year 2018 is coming to a close, and I look back at what I blogged in past years to reflect on what the future holds.

The evolution of the Data Services Platform

In late 2017, I blogged about the Data Services Platform. Storage is no longer the storage infrastructure we know but has evolved into a platform where a plethora of data services are served. The face of storage is continually evolving as the IT industry changes. I take this opportunity to reflect on what I have written since I started blogging years ago, and look at the articles that are shaping up the landscape today, and also some duds.

Some good ones …

One of the most memorable ones is about memory cloud. I wrote the article when Dell acquired a small company by the name of RNA Networks. I vividly recall what was going through my mind when I wrote the blog. With the SAN, NAS and DAS, and even FAN (File Area Network) happening during that period, the first thing that came to mind was the System Area Network, the original objective of Infiniband and RDMA. I believed the final pool of where storage will be is the memory, hence I called it “The Last Bastion – Memory”. RNA’s technology became part of Dell Fluid Architecture.

True enough, the present technologies of Storage Class Memory and SNIA’s NVDIMM are along the lines of the memory cloud I espoused years ago.

What about Fibre Channel over Ethernet (FCoE)? It wasn’t a compelling enough technology for me when it came into the game. Reduced port and cable counts and reduced power consumption were what the FCoE folks were pitching, but the cost of putting in the FC switches and the HBAs was just too great as an investment. In the end, we could see the cracks in the FCoE story, and I wrote the premature eulogy of FCoE in my 2012 blog. I got some unsavoury comments for writing that blog back then, but fast forward to the present, and FCoE isn’t a force anymore.

Weeks ago, Amazon Web Services (AWS) became a hybrid cloud service provider/vendor with the Outposts announcement. It didn’t surprise me but it may have shaken the traditional systems integrators. I took this stance 2 years ago when AWS partnered with VMware, and juxtaposed it with the philosophical quote in the 1993 Jurassic Park movie: “Life will not be contained, … Life finds a way”.


Is Pure Play Storage good?

I post storage and cloud related articles to my unofficial SNIA Malaysia Facebook community (you are welcome to join) every day. It is a community I started over 9 years ago, and there is lively banter about the posts of the day. Keeping it casual and personal was the original reason why I started the community on Facebook rather than on LinkedIn, and I have been curating it religiously for the longest time.

The Big 5 of Storage (it was Big 6 before this)

Looking back 8-9 years, the storage vendor landscape of today has not changed much. The Big 5 hegemony is still there, still dominating the Gartner Magic Quadrant for Enterprise and Mid-end Arrays, and is still there in the All-Flash quadrant as well, despite the presence of Pure Storage in that market.

The Big 5 of today – Dell EMC, NetApp, HPE, IBM and Hitachi Vantara – were the Big 6 of 2009-2010, consisting of EMC, NetApp, Dell, HP, IBM and Hitachi Data Systems. The All-Flash market, or Solid State Arrays (SSA) as Gartner calls it, was still an afterthought, and Pure Storage had just been founded. Pure Storage did not appear on my radar until 2 years later, when I blogged about Pure Storage’s presence in the market.

Here’s a look at the Gartner Magic Quadrant for 2010:

We see Pure Play Storage vendors such as EMC, NetApp, Hitachi Data Systems (before they adopted the UCP into their foray), 3PAR, Compellent, Pillar Data Systems, BlueArc, Xiotech, Nexsan, DDN and Infortrend. And when we compare that to the 2017 Magic Quadrant (I have not seen the 2018 one yet) below:


Disaggregation or hyperconvergence?

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in the Silicon Valley USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

There is an argument about NetApp‘s HCI (hyperconverged infrastructure). It is not really a hyperconverged product at all, according to one school of thought. Maybe NetApp is just riding on the hyperconvergence marketing coat tails, and just wanted to be associated with the HCI hot streak. In the same spectrum of argument, Datrium decided to call their technology open convergence, clearly trying not to be related to hyperconvergence.

Hyperconvergence has been enjoying a period of renaissance for a few years now. Leaders like Nutanix, VMware vSAN, Cisco Hyperflex and HPE Simplivity have been dominating the scene, touting great IT benefits and the elimination of IT inefficiencies. But in these technologies, performance and capacity are tightly intertwined. That means that in each of the individual hyperconverged nodes, typically starting with a trio of nodes, the processing power and the storage capacity come together. You have to accept both resources as a node. If you want more processing power, you get the additional storage capacity that comes with that node. If you want more storage capacity, you get more processing power whether you like it or not. This means you end up with underutilized resources over time, and an infrastructure that is definitely not rightsized for the job.
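
To put rough numbers on that coupling, here is a small back-of-the-envelope sketch in Python. Everything in it is my own assumption for illustration: the function names, the 3-node minimum, and the node sizes (32 cores and 20 TB per node) do not come from any vendor's specifications. It simply counts how many nodes a workload needs when compute and storage are bundled, versus when they can be bought separately.

```python
import math


def hci_nodes(cores_needed, tb_needed, cores_per_node, tb_per_node, min_nodes=3):
    """Nodes required when every node bundles compute and storage.

    The larger of the two requirements dictates the node count, and the
    other resource simply comes along with each node. min_nodes=3 is an
    assumed starting cluster size.
    """
    by_compute = math.ceil(cores_needed / cores_per_node)
    by_storage = math.ceil(tb_needed / tb_per_node)
    return max(by_compute, by_storage, min_nodes)


def hci_utilization(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Fraction of the purchased cores and capacity actually needed."""
    nodes = hci_nodes(cores_needed, tb_needed, cores_per_node, tb_per_node)
    return {
        "nodes": nodes,
        "cpu_util": cores_needed / (nodes * cores_per_node),
        "storage_util": tb_needed / (nodes * tb_per_node),
    }


def disaggregated_nodes(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Node counts when compute and storage nodes are sized independently."""
    return {
        "compute_nodes": math.ceil(cores_needed / cores_per_node),
        "storage_nodes": math.ceil(tb_needed / tb_per_node),
    }


if __name__ == "__main__":
    # Hypothetical capacity-heavy workload: 64 cores, 400 TB.
    print(hci_utilization(64, 400, cores_per_node=32, tb_per_node=20))
    print(disaggregated_nodes(64, 400, cores_per_node=32, tb_per_node=20))
```

With these made-up figures, the storage requirement forces 20 bundled nodes even though 2 nodes' worth of cores would do, so most of the purchased processing power sits idle. Sized separately, the same workload needs 2 compute nodes and 20 storage nodes.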

And here in Malaysia, we have seen vendors throw in hyperconverged infrastructure solutions for every single requirement. That was why I wrote a piece about some zealots of hyperconverged solutions 3+ years ago. When you think you have a magical hammer, every problem is a nail. 😉

On my radar, NetApp and Datrium are the only 2 vendors that offer separate nodes for compute processing and storage capacity and still fall within the hyperconverged space. This approach obviously benefits the IT planners and the IT architects, and the customers too, because they get what they want for their business. However, the disaggregation of compute processing and storage leads to the argument of whether these 2 companies belong in the hyperconverged infrastructure category.
