The Network is Still the Computer

[Preamble: I have been invited by  GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in the Silicon Valley USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

Sun Microsystems coined the phrase “The Network is the Computer“. It became one of the most powerful ideologies in the computing world, but over the years, many technology companies have tried to emulate and practise the mantra, but fell short.

I have never heard of Drivescale. It wasn’t in my radar until the legendary NFS guru, Brian Pawlowski joined them in April this year. Beepy, as he is known, was CTO of NetApp and later at Pure Storage, and held many technology leadership roles, including leading the development of NFSv3 and v4.

Prior to Tech Field Day 17, I was given some “homework”. Stephen Foskett, Chief Cat Herder (as he is known) of Tech Field Days and Storage Field Days, highly recommended Drivescale and asked the delegates to pick up some notes on their technology. Going through a couple of the videos, Drivescale’s message and philosophy resonated well with me. Perhaps it was their Sun Microsystems DNA? Many of the Drivescale team members were from Sun, and I was previously from Sun as well. I was drinking Sun’s Kool Aid by the bucket loads even before I graduated in 1991, and so what Drivescale preached made a lot of sense to me.Drivescale is all about Scale-Out Architecture at the webscale level, to address the massive scale of data processing. To understand deeper, we must think about “Data Locality” and “Data Mobility“. I frequently use these 2 “points of discussion” in my consulting practice in architecting and designing data center infrastructure. The gist of data locality is simple – the closer the data is to the processing, the cheaper/lightweight/efficient it gets. Moving data – the data mobility part – is expensive.

Continue reading

The Malaysian Openstack storage conundrum

The Openstack blippings on my radar have ratcheted up this year. I have been asked to put together the IaaS design several times, either with the flavours of RedHat or Ubuntu, and it’s a good thing to see the Openstack interest level going up in the Malaysian IT scene. Coming into its 8th year, Openstack has become a mature platform but in the storage projects of Openstack, my observations tell me that these storage-related projects are not as well known as we speak.

I was one of the speakers at the Openstack Malaysia 8th Summit over a month ago. I started my talk with question – “Can anyone name the 4 Openstack storage projects?“. The response from the floor was “Swift, Cinder, Ceph and … (nobody knew the 4th one)” It took me by surprise when the floor almost univocally agreed that Ceph is one of the Openstack projects but we know that Ceph isn’t one. Ceph? An Openstack storage project?

Besides Swift, Cinder, there is Glance (depending on how you look at it) and the least known .. Manila.

I have also been following on many Openstack Malaysia discussions and discussion groups for a while. That Ceph response showed the lack of awareness and knowledge of the Openstack storage projects among the Malaysian IT crowd, and it was a difficult issue to tackle. The storage conundrum continues to perplex me because many whom I have spoken to seemed to avoid talking about storage and viewing it like a dark art or some voodoo thingy.

I view storage as the cornerstone of the 3 infrastructure pillars  – compute, network and storage – of Openstack or any software-defined infrastructure stack for that matter. So it is important to get an understanding the Openstack storage projects, especially Cinder.

Cinder is the abstraction layer that gives management and control to block storage beneath it. In a nutshell, it allows Openstack VMs and applications consume block storage in a consistent and secure way, regardless of the storage infrastructure or technology beneath it. This is achieved through the cinder-volume service which is a driver most storage vendors integrate with (as shown in the diagram below).

Diagram in slides is from Mirantis found at https://www.slideshare.net/mirantis/openstack-architecture-43160012

Diagram in slides is from Mirantis found at https://www.slideshare.net/mirantis/openstack-architecture-43160012

Cinder-volume together with cinder-api, and cinder-scheduler, form the Block Storage Services for Openstack. There is another service, cinder-backup which integrates with Openstack Swift but in my last check, this service is not as popular as cinder-volume, which is widely supported by many storage vendors with both Fibre Channel and iSCSi implementations, and in a few vendors, with NFS and SMB as well. Continue reading

Own the Data Pipeline

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I am a big proponent of Go-to-Market (GTM) solutions. Technology does not stand alone. It must be in an ecosystem, and in each industry, in each segment of each respective industry, every ecosystem is unique. And when we amalgamate data, the storage infrastructure technologies and the data management into the ecosystem, we reap the benefits in that ecosystem.

Data moves in the ecosystem, from system to system, north to south, east to west and vice versa, random, sequential, ad-hoc. Data acquires different statuses, different roles, different relevances in its lifecycle through the ecosystem. From it, we derive the flow, a workflow of data creating a data pipeline. The Data Pipeline concept has been around since the inception of data.

To illustrate my point, I created one for the Oil & Gas – Exploration & Production (EP) upstream some years ago.

 

Continue reading

Cohesity SpanFS – a foundational shift

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Cohesity SpanFS impressed me. Their filesystem was designed from ground up to meet the demands of the voluminous cloud-scale data, and yes, the sheer magnitude of data everywhere needs to be managed.

We all know that primary data is always the more important piece of data landscape but there is a growing need to address the secondary data segment as well.

Like a floating iceberg, the piece that is sticking out is the more important primary data but the larger piece beneath the surface of the water, which is the secondary data, is becoming more valuable. Applications such as file shares, archiving, backup, test and development, and analytics and insights are maturing as the foundational data management frameworks and fast becoming the bedrock of businesses.

The ability of businesses to bounce back after a disaster; the relentless testing of large data sets to develop new competitive advantage for businesses; the affirmations and the insights of analyzing data to reduce risks in decision making; all these are the powerful back engine applicability that thrust businesses forward. Even the ability to search for the right information in a sea of data for regulatory and compliance reasons is part of the organization’s data management application.

Continue reading

Storage dinosaurs evolving too

[Preamble: I am a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I have been called a dinosaur. We storage networking professionals and storage technologists have been called dinosaurs. It wasn’t offensive or anything like that and I knew it was coming because the writing was on the wall, … or is it?

The cloud and the breakneck pace of all the technologies that came along have made us, the storage networking professionals, look like relics. The storage guys have been pigeonholed into a sunset segment of the IT industry. SAN and NAS, according to the non-practitioners, were no longer relevant. And cloud has clout (pun intended) us out of the park.

I don’t see us that way. I see that the Storage Dinosaurs are evolving as well, and our storage foundational knowledge and experience are more relevant that ever. And the greatest assets that we, the storage networking professionals, have is our deep understanding of data.

A little over a year ago, I changed the term Storage in my universe to Data Services Platform, and here was the blog I wrote. I blogged again just before the year 2018 began.

 

Continue reading

My dilemma of stateful storage marriage

I should be a love match maker.

I have been spending much hours in the past few months, thinking of stateful data in stateful storage containers and how they would consummate with distributed applications containers and functions-as-a-service (aka serverless, aka Lambda). It still hasn’t made much sense, and I have not solved this problem yet. Although there were bits and pieces that coming together and the jigsaw looked well enough to give a cackled reply, what I have now is still not good enough for me. I am still searching for answers, better than the ones I have now.

The CAP theorem is in center of my mind. Distributed data, distributed states of data are on my mind. And by the looks of things, the computing world is heading towards containers and serverless computing too. Both distributed applications containers and serverless computing make a lot of sense. If we were to engage a whole new world of fog computing, edge computing, IoT, autonomous systems, AI, and other real-time computing, I would say that the future belongs to decentralization. Cloud Computing and having edge systems and devices getting back to the cloud for data is too slow. The latency of micro- or even nano-seconds is just not good enough. If we rely on the present methods to access the most relevant data, we are too late.

Continue reading

Of Object Storage, Filesystems and Multi-Cloud

Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.

Object Storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. Highly available, globally distributed, simple to access, and fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged. OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM SpectrumScale), Inktank Ceph (which became RedHat Ceph), Bycast (acquired by NetApp to be StorageGrid), Quantum Lattus, Amplidata, and many more. For a period of a few years prior, it looked to me that the popularity of object storage with an S3 compatible front has overtaken distributed file systems.

What’s not to like? Object storage are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim self-healing (depending on data protection policies). But one thing that object storage rarely touted dominance was high performance I/O. There were some cases, but they were either fronted by a file system (eg. NFSv4.1 with pNFS extensions), or using some host-based, SAN-client agent (eg. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high performance I/O storage.

A few weeks ago, I read an article from Storage Soup, Dave Raffo. When I read it, it felt oxymoronic. SwiftStack was just nominated as a visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, Swiftstack did not want to be “associated” with object storage that much, even though Swiftstack’s technology underpinning was all object storage. Strange.

Continue reading

Commvault UDI – a new CPUU

[Preamble: I am a delegate of Storage Field Day 14. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I am here at the Commvault GO 2017. Bob Hammer, Commvault’s CEO is on stage right now. He shares his wisdom and the message is clear. IT to DT. IT to DT? Yes, Information Technology to Data Technology. It is all about the DATA.

The data landscape has changed. The cloud has changed everything. And data is everywhere. This omnipresence of data presents new complexity and new challenges. It is great to get Commvault acknowledging and accepting this change and the challenges that come along with it, and introducing their HyperScale technology and their secret sauce – Universal Dynamic Index.

Continue reading