[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]
For the past couple of months, I have been speaking with a few parties in Malaysia about object storage technology. And I was fairly surprised with the responses.
The 2 reports
For a start, I did not set out to talk about object storage. It kind of fell onto my lap. 2 recent Hitachi Vantara reports revealed that countries like Australia, Hong Kong and even South East Asian countries were behind in their understanding of what object storage was, and the benefits it brought to the new generation of web scale and enterprise applications.
In the first report, an IDC survey sponsored by Hitachi Vantara, mentioned that 41% of the enterprises in Australia are not aware of object storage technology. In a similar survey, this one pointing towards Hong Kong and China, the percentages were 38% and 35% respectively. I would presume that the percentages for countries in South East Asia would not fall too far from the apple tree.
How is Malaysia doing?
However, I worry that the percentage number could be far more dire in Malaysia. In the past 2 months, responses from several conversations painted a darker hue about object storage technology with the companies in Malaysia. These included a reasonable sized hosting company, a well-established systems integrator, a software development company, several storage practitioners in Openstack and a DellEMC’s regional consultant for unstructured data. The collective conclusion was object storage technology was relatively unknown (probably similar to the percentages to the IDC/Hitachi Vantara reports), but it appeared to be shunned at this juncture. In web scale applications, Redhat Ceph block and files appeared popular in contrast to Openstack Swift. In enterprise applications, it was a toss of iSCSI and NFS.
Image from https://zdnet4.cbsistatic.com/hub/i/r/2018/04/24/c79e9dfb-b4a9-46bb-b831-f2c57fdf8a1d/resize/470xauto/5e4846d1bc7a034c382baf6dcbb612ed/cloud-storage.jpg
And this is driven by the unrelentless demand for higher and higher performance of computing, and along with it, the demands for faster and faster storage performance. Commercialization of Artificial Intelligence (AI), Deep Learning (DL) and newer applications and workloads together with the traditional HPC workloads are driving these ever increasing requirements. However, most enterprise storage platforms were not designed to meet the demands of these new generation of applications and workloads, as many have been led to believe. Why so?
I had a couple of conversations with a few well known vendors around the topic of HPC Storage. And several responses thrown back were to put Flash and NVMe to solve the high demands of HPC storage performance. In my mind, these responses were too trivial, too irresponsible. So I wanted to write this blog to share my views on HPC storage, and not just about its performance.
The HPC lines are blurring
I picked up this video (below) a few days ago. It was insideHPCRich Brueckner interview with Dr. Goh Eng Lim, HPE CTO and renowned HPC expert about the convergence of both traditional and commercial HPC applications and workloads.
I liked the conversation in the video because it addressed the 2 different approaches. And I welcomed Dr. Goh’s invitation to the Commercial HPC community to work with the Traditional HPC vendors to help push the envelope towards Exascale SuperComputing.
[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in the Silicon Valley USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]
Sun Microsystems coined the phrase “The Network is the Computer“. It became one of the most powerful ideologies in the computing world, but over the years, many technology companies have tried to emulate and practise the mantra, but fell short.
Prior to Tech Field Day 17, I was given some “homework”. Stephen Foskett, Chief Cat Herder (as he is known) of Tech Field Days and Storage Field Days, highly recommended Drivescale and asked the delegates to pick up some notes on their technology. Going through a couple of the videos, Drivescale’s message and philosophy resonated well with me. Perhaps it was their Sun Microsystems DNA? Many of the Drivescale team members were from Sun, and I was previously from Sun as well. I was drinking Sun’s Kool Aid by the bucket loads even before I graduated in 1991, and so what Drivescale preached made a lot of sense to me.Drivescale is all about Scale-Out Architecture at the webscale level, to address the massive scale of data processing. To understand deeper, we must think about “Data Locality” and “Data Mobility“. I frequently use these 2 “points of discussion” in my consulting practice in architecting and designing data center infrastructure. The gist of data locality is simple – the closer the data is to the processing, the cheaper/lightweight/efficient it gets. Moving data – the data mobility part – is expensive.
[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]
I am a big proponent of Go-to-Market (GTM) solutions. Technology does not stand alone. It must be in an ecosystem, and in each industry, in each segment of each respective industry, every ecosystem is unique. And when we amalgamate data, the storage infrastructure technologies and the data management into the ecosystem, we reap the benefits in that ecosystem.
Data moves in the ecosystem, from system to system, north to south, east to west and vice versa, random, sequential, ad-hoc. Data acquires different statuses, different roles, different relevances in its lifecycle through the ecosystem. From it, we derive the flow, a workflow of data creating a data pipeline. The Data Pipeline concept has been around since the inception of data.
To illustrate my point, I created one for the Oil & Gas – Exploration & Production (EP) upstream some years ago.
I am guilty. I have not been tendering this blog for quite a while now, but it feels good to be back. What have I been doing? Since leaving NetApp 2 months or so ago, I have been active in the scenes again. This time I am more aligned towards data analytics and its burgeoning impact on the storage networking segment.
I was intrigued by an article posted by a friend of mine in Facebook. The article (circa 2013) was titled “Never, ever do this to Hadoop”. It described the author’s gripe with the SAN bigots. I have encountered storage professionals who throw in the SAN solution every time, because that was all they know. NAS, to them, was like that old relative smelled of camphor oil and they avoid NAS like a plague. Similar DAS was frowned upon but how things have changed. The pendulum has swung back to DAS and new market segments such as VSANs and Hyper Converged platforms have been dominating the scene in the past 2 years. I highlighted this in my blog, “Praying to the Hypervisor God” almost 2 years ago.
I agree with the author, Andrew C. Oliver. The “locality” of resources is central to Hadoop’s performance.
Consider these 2 models:
In the model on your left (Moving Data to Compute), the delivery process from Storage to Compute is HEAVY. That is because data has dependencies; data has gravity. However, if you consider the model on your right (Moving Compute to Data), delivering data processing to the storage layer is much lighter. Compute or data processing is transient, and the data in the compute layer is volatile. Once compute’s power is turned off, everything starts again from a clean slate, hence the volatile stage.