Sexy HPC storage is all the rage

HPC is sexy

There is no denying it. HPC is sexy. HPC Storage is just as sexy.

Looking at the latest buzz from the Supercomputing Conference 2018 (SC18), which happened in Dallas 2 weeks ago, the number of storage related vendors participating was staggering. Panasas, Weka.io, Excelero and BeeGFS are the ones I know of because friends of mine posted their highlights. Then there are the perennial vendors like IBM, Dell, HPE, NetApp, Huawei, Supermicro, and so many more. A quick check on the SC18 website showed that there were 391 exhibitors on the floor.

And this is driven by the unrelenting demand for higher and higher computing performance, and along with it, the demand for faster and faster storage performance. The commercialization of Artificial Intelligence (AI), Deep Learning (DL) and newer applications and workloads, together with the traditional HPC workloads, is driving these ever increasing requirements. However, contrary to what many have been led to believe, most enterprise storage platforms were not designed to meet the demands of this new generation of applications and workloads. Why so?

I had a couple of conversations with a few well known vendors around the topic of HPC Storage. Several of the responses thrown back were to throw Flash and NVMe at the high demands of HPC storage performance. In my mind, these responses were too trivial, too irresponsible. So I wanted to write this blog to share my views on HPC storage, and not just about its performance.

The HPC lines are blurring

I picked up this video (below) a few days ago. It was insideHPC's Rich Brueckner interviewing Dr. Goh Eng Lim, HPE CTO and renowned HPC expert, about the convergence of traditional and commercial HPC applications and workloads.

I liked the conversation in the video because it addressed the 2 different approaches. And I welcomed Dr. Goh’s invitation to the Commercial HPC community to work with the Traditional HPC vendors to help push the envelope towards Exascale SuperComputing.

Continue reading

Is Pure Play Storage good?

I post storage and cloud related articles to my unofficial SNIA Malaysia Facebook community (you are welcome to join) every day. It is a community I started over 9 years ago, and there is active, lively banter about the posts of the day. Casual and personal were the original reasons why I started the community on Facebook rather than on LinkedIn, and I have been curating it religiously for the longest time.

The Big 5 of Storage (it was Big 6 before this)

Looking back 8-9 years, the storage vendor landscape has not changed much. The Big 5 hegemony is still there, still dominating the Gartner Magic Quadrant for Enterprise and Mid-range Arrays, and still there in the All-Flash quadrant as well, despite the presence of Pure Storage in that market.

The Big 5 of today – Dell EMC, NetApp, HPE, IBM and Hitachi Vantara – were the Big 6 of 2009-2010, consisting of EMC, NetApp, Dell, HP, IBM and Hitachi Data Systems. The All-Flash market, or Solid State Arrays (SSA) as Gartner calls it, was still an afterthought, and Pure Storage was just founded. Pure Storage did not appear on my radar until 2 years later, when I blogged about its presence in the market.

Here’s a look at the Gartner Magic Quadrant for 2010:

We see Pure Play Storage vendors in the likes of EMC, NetApp, Hitachi Data Systems (before they folded UCP into their portfolio), 3PAR, Compellent, Pillar Data Systems, BlueArc, Xiotech, Nexsan, DDN and Infortrend. Compare that to the 2017 Magic Quadrant (I have not seen the 2018 one yet) below:

Continue reading

The Big Elephant in IoT Storage

It has been on my mind for a long time and I have been avoiding it too. But it is time to face the inevitable and just talk about it. After all, the more open the discussions, the more answers (and questions) will arise, and that is a good thing.

Yes, it is the big elephant in the room called Data Security. And the concern is going to get much worse as the proliferation of edge devices, fog computing and IoT technobabble goes nuclear.

I have been involved in numerous discussions on IoT (Internet of Things) and Industrial Revolution 4.0. For the past 10 months I have been in a consortium, discussing with several experts in their fields how to face the future with IR4.0. Malaysia just announced its National Policy on Industry 4.0 last week, known as Industry4WRD. Whilst a policy is just a policy, there are many thoughts on the implementation of IoT devices, edge and fog computing. And the thing that has been bugging me is related to, of course, storage, most notably storage and data security.

Storage on edge devices is likely to be ephemeral, and the data in that storage, transient. We can discuss persistence in storage at the edge another day, because what I would like to address is the data security of these storage components. That is the Big Elephant in the room I was referring to.

The more I work with IoT devices and the different frameworks (there are so many of them), the more enlightened I become about the need to address data security. The proliferation and exponential multiplication of IoT devices, now and in the coming future, have increased the attack vectors manyfold. Many IoT devices are simplified components lacking the guards of data security, and are easily exposed. These components are designed with simplicity and efficiency in mind. Things such as I/O performance, storage management and data security are probably the least important factors, because every single manufacturer and vendor is slogging to make their mark and presence in this wild, wild west of a world.

Picture from https://fcw.com/articles/2018/08/07/comment-iot-physical-risk.aspx

Continue reading

Disaggregation or hyperconvergence?

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

There is an argument about NetApp's HCI (hyperconverged infrastructure): according to one school of thought, it is not really a hyperconverged product at all. Maybe NetApp is just riding on the hyperconvergence marketing coattails, wanting to be associated with the HCI hot streak. On the same spectrum of the argument, Datrium decided to call their technology open convergence, clearly trying not to be lumped in with hyperconvergence.

Hyperconvergence has been enjoying a renaissance for a few years now. Leaders like Nutanix, VMware vSAN, Cisco Hyperflex and HPE Simplivity have been dominating the scene, touting great IT benefits and the elimination of IT inefficiencies. But in these technologies, performance and capacity are tightly intertwined. In each of the hyperconverged nodes, typically starting with a trio of nodes, the processing power and the storage capacity come together. You have to accept both resources as a node. If you want more processing power, you get the additional storage capacity that comes with that node. If you want more storage capacity, you get more processing power whether you like it or not. Over time, this means underutilized resources, definitely not rightsized for the job.
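That coupling of compute and capacity can be shown with a back-of-the-envelope sketch. The node sizes below are made-up assumptions, not any vendor's actual configuration; the point is that a node count must satisfy both dimensions, so one resource always ends up stranded:

```python
import math

# Hypothetical hyperconverged node: compute and capacity are sold together.
NODE_CORES = 32   # assumed cores per node
NODE_TB = 20      # assumed usable storage per node (TB)

def nodes_needed(cores_required, tb_required):
    """Nodes must satisfy BOTH dimensions, so take the larger node count."""
    by_cores = math.ceil(cores_required / NODE_CORES)
    by_capacity = math.ceil(tb_required / NODE_TB)
    return max(by_cores, by_capacity)

# A storage-heavy workload: modest compute, lots of capacity.
n = nodes_needed(cores_required=64, tb_required=400)
stranded_cores = n * NODE_CORES - 64
print(n, stranded_cores)   # 20 nodes, 576 cores bought but never needed
```

Two nodes would have covered the compute, but capacity forces twenty, and the eighteen extra nodes' processing power sits idle. A compute-heavy workload strands storage the same way in reverse.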

And here in Malaysia, we have seen vendors throw in hyperconverged infrastructure solutions for every single requirement. That was why I wrote a piece about some zealots of hyperconverged solutions 3+ years ago. When you think you have a magical hammer, every problem is a nail. 😉

On my radar, NetApp and Datrium are the only 2 vendors that offer separate nodes for compute processing and storage capacity and still fall within the hyperconverged space. This approach obviously benefits the IT planners, the IT architects, and the customers too, because they get what they want for their business. However, the disaggregation of compute processing and storage leads to the argument of whether these 2 companies belong in the hyperconverged infrastructure category.

Continue reading

Pondering Redhat’s future with IBM

I woke up yesterday morning to a shocker of a news item. IBM announced that it was buying Redhat for USD34 billion. Never in my mind did I think Redhat would sell, but I guess that USD190.00 per share was too tempting. Redhat (RHT) was trading at USD116.68 at the previous Friday's close.

Redhat is one of my favourite technology companies. I love their Linux development and progress, and I use a lot of Fedora and CentOS in my hobbies. I started with Redhat back in 2000, when I became obsessed with getting my RHCE (Redhat Certified Engineer). I recall spending almost every weekend (Saturday and Sunday) in the office back in 2002, learning Redhat and hacking scripts to be really good at it. I got certified with RHCE 4 with a 96% passing mark, and I was very proud of my certification.

One of my regrets was not joining Redhat in 2006. I was offered a job as an SE by Josep Garcia, the very first such position in Malaysia. Instead, I took up the Hitachi Data Systems job to helm the project implementation and delivery for the Shell GUSto project. Things might have turned out differently if I had joined.

The IBM acquisition of Redhat left a poignant feeling in me. In many ways, Redhat has been the shining star of Linux. They are the only significant player left leading the charge of open source. They are the largest contributor to the OpenStack projects and continue to support them strongly, whilst early protagonists like HPE, Cisco and Intel have reduced their support. They have, of course, been a perennial top 3 contributor to the Linux kernel since the very early days. And Redhat continues to contribute to projects such as containers and Kubernetes, and made that commitment deeper with their acquisition of CoreOS a few months back.

Continue reading

Oracle Cloud Infrastructure to prove skeptics wrong

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

The much maligned Oracle Cloud is getting a fresh reboot, starting with Oracle Cloud Infrastructure (OCI), and significant enhancements and technology updates were announced at Oracle OpenWorld this week. I had the privilege of hearing about Oracle Cloud's new attack plan when they presented at Tech Field Day 17 last week.

Oracle Cloud has not had the best of days in recent months. Thomas Kurian's resignation as President of Product Development was highly publicized, stemming from a disagreement with CTO and founder Larry Ellison over cloud software strategy. Then there was an ongoing lawsuit alleging that Oracle misrepresented its cloud revenue growth, which put Oracle in a bad light.

On the local front here in Malaysia, I have heard from the grapevine of the aggressive nature of Oracle personnel pushing partners and customers to adopt their cloud services using legal scare tactics on their database licensing. A buddy of mine, who was previously the cloud business development manager at CTC Global, also shared Oracle’s cloud shortcomings compared to Amazon Web Service and Microsoft Azure a year ago.

The Oracle Cloud Infrastructure team aimed to overturn the bad perceptions, starting with the delegates of Tech Field Day 17, including yours truly. Their strategy was clear: Oracle Cloud Infrastructure runs the highest performance and the most enterprise grade Infrastructure-as-a-Service (IaaS), bar none. Unlike the IBM Cloud, which in my opinion is a wishy-washy cloud service platform, Oracle Cloud's ambition is solid.

They did a demo with the JD Edwards EnterpriseOne application, and they continue to demonstrate their prowess in running the highest performance computing experience ever, for all enterprise-grade workloads. That enterprise pedigree is clear.

Just this week, Amazon Prime Day had an outage. Amazon is in the process of weaning the Oracle database off its entire ecosystem by 2020, and this outage clearly showed that the Oracle database and enterprise applications run best on Oracle Cloud Infrastructure.

Continue reading

The Network is Still the Computer

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Sun Microsystems coined the phrase "The Network is the Computer". It became one of the most powerful ideologies in the computing world, and over the years many technology companies have tried to emulate and practise the mantra, but fallen short.

I had never heard of Drivescale. It wasn't on my radar until the legendary NFS guru, Brian Pawlowski, joined them in April this year. Beepy, as he is known, was CTO of NetApp and later of Pure Storage, and has held many technology leadership roles, including leading the development of NFSv3 and v4.

Prior to Tech Field Day 17, I was given some "homework". Stephen Foskett, Chief Cat Herder (as he is known) of Tech Field Days and Storage Field Days, highly recommended Drivescale and asked the delegates to pick up some notes on their technology. Going through a couple of the videos, Drivescale's message and philosophy resonated well with me. Perhaps it was their Sun Microsystems DNA? Many of the Drivescale team members were from Sun, and I was previously from Sun as well. I was drinking Sun's Kool-Aid by the bucket load even before I graduated in 1991, so what Drivescale preached made a lot of sense to me.

Drivescale is all about Scale-Out Architecture at the webscale level, to address the massive scale of data processing. To understand it more deeply, we must think about "Data Locality" and "Data Mobility". I frequently use these 2 points of discussion in my consulting practice when architecting and designing data center infrastructure. The gist of data locality is simple – the closer the data is to the processing, the cheaper, more lightweight and more efficient it gets. Moving data – the data mobility part – is expensive.
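The data locality argument can be illustrated with a toy cost model. All the throughput figures below are illustrative assumptions, not benchmarks of any product:

```python
# Toy cost model: the transfer term is what data mobility adds.
# process_gbps and network_gbps are assumed, illustrative rates.

def job_time_seconds(data_gb, process_gbps, network_gbps, data_is_local):
    """Total job time = (optional network transfer) + processing."""
    transfer = 0.0 if data_is_local else data_gb / network_gbps
    return transfer + data_gb / process_gbps

DATA_GB = 1000  # a 1 TB dataset
local = job_time_seconds(DATA_GB, process_gbps=5, network_gbps=1.25, data_is_local=True)
remote = job_time_seconds(DATA_GB, process_gbps=5, network_gbps=1.25, data_is_local=False)
print(local, remote)   # 200.0 vs 1000.0 seconds - moving the data dominates
```

With these assumed rates, shipping the terabyte across a 10GbE-class link (roughly 1.25 GB/s) costs four times as long as the processing itself, which is exactly why processing close to the data is cheaper and lighter weight.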

Continue reading

The Dell EMC Data Bunker

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Another new announcement graced the Tech Field Day 17 delegates this week. The Dell EMC Data Protection group announced their Cyber Recovery solution. The Cyber Recovery Vault solution and services are touted as "The Last Line of Data Protection Defense against Cyber-Attacks" for the enterprise.

Security breaches and ransomware attacks have been rampant, and they are wreaking havoc on organizations everywhere. These breaches and attacks cost businesses tens of millions, or even hundreds of millions, and are capable of bringing these businesses to their knees. One known practice is to corrupt backup metadata or catalogs, rendering operational recovery helpless, before the perpetrators attack the primary data source. And there are times when the malicious and harmful agent dwells in the organization's network or servers for long periods of time, launching attacks and infecting primary images or gold copies of corporate data at the opportune moment.

The Cyber Recovery (CR) solution from Dell EMC focuses on the recovery of an isolated copy of the data. The solution isolates strategic and mission critical secondary data and preserves the integrity and sanctity of the secondary data copy. Think of the CR solution as the data bunker after doomsday has descended.

The CR solution is based on the Data Domain platforms. As the diagram below describes, data backup occurs in the corporate network to a Data Domain appliance platform as the backup repository. This is just the usual daily backup, for operational recovery.

Diagram from Storage Review. URL Link: https://www.storagereview.com/dell_emc_releases_cyber_recovery_software
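To make the "integrity and sanctity of the secondary data copy" idea concrete, here is a generic sketch of the underlying technique of sealing a vaulted copy with a content hash and verifying it before restore. This is my own illustration, not Dell EMC's implementation, and the function names are hypothetical:

```python
import hashlib
import json
import pathlib

def sha256_of(path):
    """Hash a file in chunks so large gold copies don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def seal(copy_path, ledger_path):
    """Record the hash of a vaulted copy in a ledger kept inside the vault."""
    ledger = {"file": str(copy_path), "sha256": sha256_of(copy_path)}
    pathlib.Path(ledger_path).write_text(json.dumps(ledger))

def verify(copy_path, ledger_path):
    """Before any restore, confirm the copy still matches its sealed hash."""
    ledger = json.loads(pathlib.Path(ledger_path).read_text())
    return sha256_of(copy_path) == ledger["sha256"]
```

If a dwelling attacker silently corrupts the vaulted copy, `verify` fails and the organization knows that copy cannot be trusted for recovery, which is precisely the check an operational backup catalog cannot provide once it has itself been tampered with.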

Continue reading

My first TechFieldDay

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation are paid by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I have attended a bunch of Storage Field Days over the years, but I have never attended a Tech Field Day. This coming week, I will be attending their 17th edition, TechFieldDay 17, my first. I have always enjoyed Storage Field Days. Every time I joined as a delegate, there were new things to discover, and almost always, serendipity happened.

Continue reading