Celebrating MinIO

Essentially MinIO is a web server …

I vaguely recalled Anand Babu Periasamy (AB as he is known), the CEO of MinIO saying that when I first met him in 2017. I was fresh “playing around” with MinIO and instantly I fell in love with software technology. Wait a minute. Object storage wasn’t supposed to be so easy. It was not supposed to be that simple to set up and use, but MinIO burst into my storage universe like the birth of the Infinity Stones. There was a eureka moment. And I was attending one of the Storage Field Days in the US shortly after my MinIO discovery in late 2017. What an opportunity!

I could not recall how I made the appointment to meeting MinIO, but I recalled myself taking an Uber to their cosy office on University Avenue in Palo Alto to meet. Through Andy Watson (one of the CTOs then), I was introduced to AB, Garima Kapoor, MinIO’s COO and his wife, Frank Wessels, Zamin (one of the business people who is no longer there) and Ugur Tigli (East Coast CTO) who was on the Polycom. I was awe struck.

Last week, MinIO scored a major Series B round funding of USD103 million. It was delayed by the pandemic because I recalled Garima telling me that the funding was happening in 2020. But I think the delay made it better, because the world now is even more ready for MinIO than ever before.

Continue reading

Rethinking data processing frameworks systems in real time

“Row, row, row your boat, gently down the stream…”

Except the stream isn’t gentle at all in the data processing’s new context.

For many of us in the storage infrastructure and data management world, the well known framework is storing and retrieve data from a storage media. That media could be a disk-based storage array, a tape, or some cloud storage where the storage media is abstracted from the users and the applications. The model of post processing the data after the data has safely and persistently stored on that media is a well understood and a mature one. Users, applications and workloads (A&W) process this data in its resting phase, retrieve it, work on it, and write it back to the resting phase again.

There is another model of data processing that has been bubbling over the years and now reaching a boiling point. Still it has not reached its apex yet. This is processing the data in flight, while it is still flowing as it passes through processing engine. The nature of this kind of data is described in one 2018 conference I chanced upon a year ago.

letgo marketplace processing numbers in 2018

  • * NRT = near real time

From a storage technology infrastructure perspective, this kind of data processing piqued my curiosity immensely. And I have been studying this burgeoning new data processing model in my spare time, and where it fits, bringing the understanding back into the storage infrastructure and data management side.

Continue reading

Storage Elephant Compute Birds

Data movement is expensive. Not just costs, but also latency and resources as well. Thus there were many narratives to move compute closer to where the data is stored because moving compute is definitely more economical than moving data. I borrowed the analogy of the 2 animals from some old NetApp® slides which depicted storage as the elephant, and compute as birds. It was the perfect analogy, because the storage is heavy and compute is light.

“Close up of a white Great Egret perching on top of an African Elephant aa Amboseli national park, Kenya”

Before the animals representation came about I used to use the term “Data locality, Data Mobility“, because of past work on storage technology in the Oil & Gas subsurface data management pipeline.

Take stock of your data movement

I had recent conversations with an end user who has been paying a lot of dollars keeping their “backup” and “archive” in AWS Glacier. The S3 storage is cheap enough to hold several petabytes of data for years, because the IT folks said that the data in AWS Glacier are for “backup” and “archive”. I put both words in quotes because they were termed as “backup” and “archive” because of their enterprise practice. However, the face of their business is changing. They are in manufacturing, oil and gas downstream, and the definitions of “backup” and “archive” data has changed.

For one, there is a strong demand for reusing the past data for various reasons and these datasets have to be recalled from their cloud storage. Secondly, their data movement activities still mimicked what they did in the past during their enterprise storage days. It was a classic lift-and-shift when they moved to the cloud, and not taking stock of  their data movements and the operations they ran on these datasets. Still ongoing, their monthly AWS cost a bomb.

Continue reading

Open Source Storage Technology Crafters

The conversation often starts with a challenge. “What’s so great about open source storage technology?

For the casual end users of storage systems, regardless of SAN (definitely not Fibre Channel) or NAS on-premises, or getting “files” from the personal cloud storage like Dropbox, OneDrive et al., there is a strong presumption that open source storage technology is cheap and flaky. This is not helped with the diet of consumer brands of NAS in the market, where the price is cheap, but the storage offering with capabilities, reliability and performance are found to be wanting. Thus this notion floats its way to the business and enterprise users, and often ended up with a negative perception of open source storage technology.

Highway Signpost with Open Source wording

Storage Assemblers

Anybody can “build” a storage system with open source storage software. Put the software together with any commodity x86 server, and it can function with the basic storage services. Most open source storage software can do the job pretty well. However, once the completed storage technology is put together, can it do the job well enough to serve a business critical end user? I have plenty of sob stories from end users I have spoken to in these many years in the industry related to so-called “enterprise” storage vendors. I wrote a few blogs in the past that related to these sad situations:

We have such storage offerings rigged with cybersecurity risks and holes too. In a recent Unit 42 report, 250,000 NAS devices are vulnerable and exposed to the public Internet. The brands in question are mentioned in the report.

I would categorize these as storage assemblers.

Continue reading

What the heck is Storage Modernization?

We often hear the word “modernization” thrown around these days. The push is to get the end user to refresh their infrastructure, and the storage infrastructure market is rife with modernization word. Is your storage ripe for “modernization“?

Many possibilities to modernize storage

To modernize, it has to be relative to legacy storage hardware, and the operating environment that came with it. But if the so-called “legacy” still does the job, should you modernize?

Big Data is right

When the word “Big Data” came into prominence a while back, it stirred the IT industry into a frenzy. At one point, Apache Hadoop became the poster elephant (pun intended) for this exciting new segment. So many Vs came out, but I settled with 4 Vs as the framework of my IT conversations. The 4Vs we often hear are:

  • Volume
  • Velocity
  • Variety
  • Veracity

Continue reading

Give back or no give

[ Disclosure: I work for iXsystems™ Inc. Views and opinions are my own. ]

If my memory served me right, I recalled the illustrious leader of the Illumos project, Garrett D’Amore ranting about companies, big and small, taking OpenZFS open source codes and projects to incorporate into their own technology but hardly ever giving back to the open source community. That was almost 6 years ago.

My thoughts immediately go back to the days when open source was starting to take off back in the early 2000s. Oracle 9i database had just embraced Linux in a big way, and the book by Eric S. Raymond, “The Cathedral and The Bazaar” was a big hit.

The Cathedral & The Bazaar by Eric S. Raymond

Since then, the blooming days of proprietary software world began to wilt, and over the next twenty plus year, open source software has pretty much taken over the world. Even Microsoft®, the ruthless ruler of the Evil Empire caved in to some of the open source calls. The Microsoft® “I Love Linux” embrace definitely gave the victory feeling of the Rebellion win over the Empire. Open Source won.

Open Source bag of worms

Even with the concerted efforts of the open source communities and projects, there were many situations which have caused frictions and inadvertently, major issues as well. There are several open source projects licenses, and they are not always compatible when different open source projects mesh together for the greater good.

On the storage side of things, 2 “incidents” caught the attention of the masses. For instance, Linus Torvalds, Linux BDFL (Benevolent Dictator for Life) and emperor supremo said “Don’t use ZFS” partly due to the ignorance and incompatibility of Linux GPL (General Public License) and ZFS CDDL (Common Development and Distribution License). That ruffled some feathers amongst the OpenZFS community that Matt Ahrens, the co-creator of the ZFS file system and OpenZFS community leader had to defend OpenZFS from Linus’ comments.

Continue reading

Hadoop is truly dead – LOTR version

[Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

This blog was not intended because it was not in my plans to write it. But a string of events happened in the Storage Field Day 19 week and I have the fodder to share my thoughts. Hadoop is indeed dead.

Warning: There are Lord of the Rings references in this blog. You might want to do some research. 😉

Storage metrics never happened

The fellowship of Arjan Timmerman, Keiran Shelden, Brian Gold (Pure Storage) and myself started at the office of Pure Storage in downtown Mountain View, much like Frodo Baggins, Samwise Gamgee, Peregrine Took and Meriadoc Brandybuck forging their journey vows at Rivendell. The podcast was supposed to be on the topic of storage metrics but was unanimously swung to talk about Hadoop under the stewardship of Mr. Stephen Foskett, our host of Tech Field Day. I saw Stephen as Elrond Half-elven, the Lord of Rivendell, moderating the podcast as he would have in the plans of decimating the One Ring in Mount Doom.

So there we were talking about Hadoop, or maybe Sauron, or both.

The photo of the Oliphaunt below seemed apt to describe the industry attacks on Hadoop.

Continue reading

Is General Purpose Object Storage disenfranchised?

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

This is NOT an advertisement for coloured balls.

This is the license to brag for the vendors in the next 2 weeks or so, as we approach the 2020 new year. This, of course, is the latest 2019 IDC Marketscape for Object-based Storage, released last week.

My object storage mentions

I have written extensively about Object Storage since 2011. With different angles and perspectives, here are some of them:

Continue reading

Commvault big bet

I woke up at 2.59am in the morning of Sept 5th morning, a bit discombobulated and quickly jumped into the Commvault call. The damn alarm rang and I slept through it, but I got up just in time for the 3am call.

As I was going through the motion of getting onto UberConference, organized by GestaltIT, I was already sensing something big. In the call, Commvault was acquiring Hedvig and it hit me. My drowsy self centered to the big news. And I saw a few guys from Veritas and Cohesity on my social media group making gestures about the acquisition.

I spent the rest of the week thinking about the acquisition. What is good? What is bad? How is Commvault going to move forward? This is at pressing against the stark background from the rumour mill here in South Asia, just a week before this acquisition news, where I heard that the entire Commvault teams in Malaysia and Asia Pacific were released. I couldn’t confirm the news in Asia Pacific, but the source of the news coming from Malaysia was strong and a reliable one.

What is good?

It is a big win for Hedvig. Nestled among several scale-out primary storage vendors and little competitive differentiation, this Commvault acquisition is Hedvig’s pay day.

Continue reading