The Dell EMC Data Bunker

[Preamble: I have been invited by  GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in the Silicon Valley USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

Another new announcement graced the Tech Field Day 17 delegates this week. Dell EMC Data Protection group announced their Cyber Recovery solution. The Cyber Recovery Vault solution and services is touted as the “The Last Line of Data Protection Defense against Cyber-Attacks” for the enterprise.

Security breaches and ransomware attacks have been rampant, and they are reeking havoc to organizations everywhere. These breaches and attacks cost businesses tens of millions, or even hundreds, and are capable of bring these businesses to their knees. One of the known practices is to corrupt backup metadata or catalogs, rendering operational recovery helpless before these perpetrators attack the primary data source. And there are times where the malicious and harmful agent could be dwelling in the organization’s network or servers for long period of times, launching and infecting primary images or gold copies of corporate data at the opportune time.

The Cyber Recovery (CR) solution from Dell EM focuses on Recovery of an Isolated Copy of the Data. The solution isolates strategic and mission critical secondary data and preserves the integrity and sanctity of the secondary data copy. Think of the CR solution as the data bunker, after doomsday has descended.

The CR solution is based on the Data Domain platforms. Describing from the diagram below, data backup occurs in the corporate network to a Data Domain appliance platform as the backup repository. This is just the usual daily backup, and is for operational recovery.

Diagram from Storage Review. URL Link: https://www.storagereview.com/dell_emc_releases_cyber_recovery_software

Continue reading

Let there be light with Commvault Activate

[Preamble: I have been invited by Commvault via GestaltIT as a delegate to their Commvault GO conference from Oct 9-11, 2018 in Nashville, TN, USA. My expenses, travel and accommodation are paid by Commvault, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

Nobody sees well in the dark.

I am piqued and I want to know more about Commvault Activate. The conversation started after lunch yesterday as the delegates were walking back to the Gaylord Opryland Convention Center. I was walking next to Patrick McGrath, one of Commvault marketing folks, and we struck up a conversation in the warm breeze. Patrick started sharing a bit of Commvault Activate and what it could do and the possibilities of many relevant business cases for the solution.

There was a dejà vu moment, bringing my thoughts back to mid-2009. I was just invited by a friend to join him to restructure his company, Real Data Matrix (RDM). They were a NetApp distributor, then Platinum reseller in the early and mid-2000s and they had fell into hard times. Most of their technical team had left them, putting them in a spot to retain one of the largest NetApp support contract in Malaysia at the time.

I wanted to expand on their NetApp DNA and I started to seek out complementary solutions to build on that DNA. Coming out of my gig at EMC, there was an interesting solution which tickled my fancy – VisualSRM. So, I went about seeking the most comprehensive SRM (storage resource management) solution for RDM, one which has the widest storage platforms support. I found Tek-Tools Software and I moved that RDM sign up as their reseller. We got their SE/Developer, Aravind Kurapati, from India to train the RDM engineers. We were ready to hit the market late-2009/early-2010 but a few weeks later, Tek-Tools was acquired by Solarwinds.

Long story short, my mindset about SRM was “If you can’t see your storage resource, you can’t manage your storage“.  Resource visibility is so important in SRM, and the same philosophy applies to Data as well. That’s where Commvault Activate comes in. More than ever, Data Insights is already the biggest differentiator in the Data-Driven transformation in any modern business today. Commvault Activate is the Data Insights that shines the light to all the data in every organization.

After that casual chat with Patrick, more details came up in the early access to Commvault embargoed announcements later that afternoon. Commvault Activate announcement came up in my Twitter feed.

Commvault Activate has a powerful dynamic Index Engine called the Commvault 4D Index and it is responsible to search, discover and learn about different types of data, data context and relationships within the organization. I picked up more information as the conference progressed and found out that the technology behind the Commvault Activate is based on the Apache Lucene Solr enterprise search and indexing platform, courtesy of Lucidworks‘ technology. Suddenly I had a recall moment. I had posted the Commvault and Lucidworks partnership a few months back in my SNIA Malaysia Facebook community. The pictures connected. You can read about the news of the partnership here at Forbes.

Continue reading

Own the Data Pipeline

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I am a big proponent of Go-to-Market (GTM) solutions. Technology does not stand alone. It must be in an ecosystem, and in each industry, in each segment of each respective industry, every ecosystem is unique. And when we amalgamate data, the storage infrastructure technologies and the data management into the ecosystem, we reap the benefits in that ecosystem.

Data moves in the ecosystem, from system to system, north to south, east to west and vice versa, random, sequential, ad-hoc. Data acquires different statuses, different roles, different relevances in its lifecycle through the ecosystem. From it, we derive the flow, a workflow of data creating a data pipeline. The Data Pipeline concept has been around since the inception of data.

To illustrate my point, I created one for the Oil & Gas – Exploration & Production (EP) upstream some years ago.

 

Continue reading

Cohesity SpanFS – a foundational shift

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Cohesity SpanFS impressed me. Their filesystem was designed from ground up to meet the demands of the voluminous cloud-scale data, and yes, the sheer magnitude of data everywhere needs to be managed.

We all know that primary data is always the more important piece of data landscape but there is a growing need to address the secondary data segment as well.

Like a floating iceberg, the piece that is sticking out is the more important primary data but the larger piece beneath the surface of the water, which is the secondary data, is becoming more valuable. Applications such as file shares, archiving, backup, test and development, and analytics and insights are maturing as the foundational data management frameworks and fast becoming the bedrock of businesses.

The ability of businesses to bounce back after a disaster; the relentless testing of large data sets to develop new competitive advantage for businesses; the affirmations and the insights of analyzing data to reduce risks in decision making; all these are the powerful back engine applicability that thrust businesses forward. Even the ability to search for the right information in a sea of data for regulatory and compliance reasons is part of the organization’s data management application.

Continue reading

Storage dinosaurs evolving too

[Preamble: I am a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I have been called a dinosaur. We storage networking professionals and storage technologists have been called dinosaurs. It wasn’t offensive or anything like that and I knew it was coming because the writing was on the wall, … or is it?

The cloud and the breakneck pace of all the technologies that came along have made us, the storage networking professionals, look like relics. The storage guys have been pigeonholed into a sunset segment of the IT industry. SAN and NAS, according to the non-practitioners, were no longer relevant. And cloud has clout (pun intended) us out of the park.

I don’t see us that way. I see that the Storage Dinosaurs are evolving as well, and our storage foundational knowledge and experience are more relevant that ever. And the greatest assets that we, the storage networking professionals, have is our deep understanding of data.

A little over a year ago, I changed the term Storage in my universe to Data Services Platform, and here was the blog I wrote. I blogged again just before the year 2018 began.

 

Continue reading

My dilemma of stateful storage marriage

I should be a love match maker.

I have been spending much hours in the past few months, thinking of stateful data in stateful storage containers and how they would consummate with distributed applications containers and functions-as-a-service (aka serverless, aka Lambda). It still hasn’t made much sense, and I have not solved this problem yet. Although there were bits and pieces that coming together and the jigsaw looked well enough to give a cackled reply, what I have now is still not good enough for me. I am still searching for answers, better than the ones I have now.

The CAP theorem is in center of my mind. Distributed data, distributed states of data are on my mind. And by the looks of things, the computing world is heading towards containers and serverless computing too. Both distributed applications containers and serverless computing make a lot of sense. If we were to engage a whole new world of fog computing, edge computing, IoT, autonomous systems, AI, and other real-time computing, I would say that the future belongs to decentralization. Cloud Computing and having edge systems and devices getting back to the cloud for data is too slow. The latency of micro- or even nano-seconds is just not good enough. If we rely on the present methods to access the most relevant data, we are too late.

Continue reading

Pure Electric!

I didn’t get a chance to attend Pure Accelerate event last month. From the blogs and tweets of my friends, Pure Accelerate was an awesome event. When I got the email invitation for the localized Pure Live! event in Kuala Lumpur, I told myself that I have to attend the event.

The event was yesterday, and I was not disappointed. Coming off a strong fiscal Q1 2018, it has appeared that Pure Storage has gotten many things together, chugging full steam at all fronts.

When Pure Storage first come out, I was one of the early bloggers who took a fancy of them. My 2011 blog mentioned the storage luminaries in their team. Since then, they have come a long way. And it was apt that on the same morning yesterday, the latest Gartner Magic Quadrant for Solid State Arrays 2017 was released.

Continue reading

The rise of RDMA

I have known of RDMA (Remote Direct Memory Access) for quite some time, but never in depth. But since my contract work ended last week, and I have some time off to do some personal development, I decided to look deeper into RDMA. Why RDMA?

In the past 1 year or so, RDMA has been appearing in my radar very frequently, and rightly so. The speedy development and adoption of NVMe (Non-Volatile Memory Express) have pushed All Flash Arrays into the next level. This pushes the I/O and the throughput performance bottlenecks away from the NVMe storage medium into the legacy world of SCSI.

Most network storage interfaces and protocols like SAS, SATA, iSCSI, Fibre Channel today still carry SCSI loads and would have to translate between NVMe and SCSI. NVMe-to-SCSI bridges have to be present to facilitate the translation.

In the slide below, shared at the Flash Memory Summit, there were numerous red boxes which laid out the SCSI connections and interfaces where SCSI-to-NVMe translation (and vice versa) would be required.

Continue reading

The engineering of Elastifile

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

When it comes to large scale storage capacity requirements with distributed cloud and on-premise capability, object storage is all the rage. Amazon Web Services started the object-based S3 storage service more than a decade ago, and the romance with object storage started.

Today, there are hundreds of object-based storage vendors out there, touting features after features of invincibility. But after researching and reading through many design and architecture papers, I found that many object-based storage technology vendors began to sound the same.

At the back of my mind, object storage is not easy when it comes to most applications integration. Yes, there is a new breed of cloud-based applications with RESTful CRUD API operations to access object storage, but most applications still rely on file systems to access storage for capacity, performance and protection.

These CRUD and CRUD-like APIs are the common semantics of interfacing object storage platforms. But many, many real-world applications do not have the object semantics to interface with storage. They are mostly designed to interface and interact with file systems, and secretly, I believe many application developers and users want a file system interface to storage. It does not matter if the storage is on-premise or in the cloud.

Let’s not kid ourselves. We are most natural when we work with files and folders.

Implementing object storage also denies us the ability to optimally utilize Flash and solid state storage on-premise when the compute is in the cloud. Similarly, when the compute is on-premise and the flash-based object storage is in the cloud, you get a mismatch of performance and availability requirements as well. In the end, there has to be a compromise.

Another “feature” of object storage is its poor ability to handle transactional data. Most of the object storage do not allow modification of data once the object has been created. Putting a NAS front (aka a NAS gateway) does not take away the fact that it is still object-based storage at the very core of the infrastructure, regardless if it is on-premise or in the cloud.

Resiliency, latency and scalability are the greatest challenges when we want to build a true globally distributed storage or data services platform. Object storage can be resilient and it can scale, but it has to compromise performance and latency to be so. And managing object storage will not be as natural as to managing a file system with folders and files.

Enter Elastifile.

Continue reading