[ This is Part One of a longer conversation ]
EMC2 (before the Dell® acquisition) had a tagline in the 2000s: “Where Information Lives™”**. This was before the time of cloud storage. The tagline was an adage of enterprise data storage, fitting and contemporaneous with the persistent narrative at the time: Data Consolidation. Within the data consolidation stories, thousands of files and folders moved about the networks of the organizations, from servers to clients and clients to servers. NAS (Network Attached Storage) was, and still is, the workhorse of many, many organizations.
[ **Side story ] There was an internal anti-EMC joke within NetApp® called “Information has a new address”.
This was a time when there were almost no concerns about Shadow IT; ransomware was less well known; and most importantly, almost everyone knew where their files and folders were, more or less (except in Oil & Gas upstream, a story told later in this blog). That was because there were concerted attempts to consolidate data, and with it, inadvertently, files and folders, in the organization.
Even when these organizations were spread across the world, there were distributed file technologies at the time that could deliver files and folders in an acceptable manner. Definitely not as good as what we have today in a cloudy world, but acceptable. I personally worked on a project setting up Andrew File System (AFS) for Intel® in Penang in the mid-90s, almost joined Tacit Networks in the mid-2000s, and dabbled in Microsoft® Distributed File System with NetApp® and Windows File Servers while fixing the mountains of issues in deploying the worldwide GUSto (Global Unified Storage) Project at Shell in 2006. Somewhere in that chronology were also Acopia Networks (acquired by F5) and, of course, EMC2 Rainfinity and the NetApp® OEM of NuView, Virtual File Manager.
The point I am trying to make here is that most IT organizations had a good grip on where their files and folders were. I do not think this is very true anymore. Do you know where your files and folders are living today?
Oil & Gas Upstream file challenges
File and folder management in Oil & Gas upstream, in my experience, is a mess. Missing files, lost files, ghost files, zombie files, obsolete files, “where is the gold copy?” files, archived-but-not-there-anymore files, etc. These stories could fill a tiny booklet of what not to do in subsurface data management and, if tabulated financially, would have cost billions of dollars. I am sure there are other stories in other industries, but these were my past experiences.
One of my favourite pieces of data management software was (and probably still is) Interica Project Resource Manager™ (PRM). It has little to do with the bread-and-butter work I do in the storage technology industry, but there was a period in my career when I was the Business Development Director for Interica in Asia Pacific. Interica provides subsurface data management software solutions in Oil & Gas upstream.
Who is Interica? For one, they now belong to Petrosys®. But they started out as Enigma Data Systems over 30 years ago, and went through mergers, acquisitions and buyouts to become Interica before the Petrosys acquisition last year. And I have a special place for them in my heart because I have good friends there.
When done right, Interica PRM was able to consolidate the metadata of subsurface files in various subsurface projects linked to Petrel® (now part of Schlumberger), Kingdom™, R5000, EPOS® 4, Hampson Russell® and many more. Interica had storage connectors to these applications, and to a few more geo-mapping ones like ArcGIS© and Petrosys®.
The beauty is that PRM was able to give a global view of the subsurface files and folders, their locations, the metadata associated with them (long-lat, CRS, faults, 2D/3D surveys, lines, well logs and more), their ages, versions, sizes, growth etc. Here are a few screenshots of Interica PRM and how it was able to provide a single view of the subsurface project files and folders.
You can drill into the map and look at the project blocks in more detail, showing the “donut” where different subsurface applications, whether Unix-based (R5000, Geoframe®) or Windows-based (Petrel®, Kingdom™), were broken down into file-based details such as % of project storage capacities, project owners, size, age, creation date, last modified and so on.
Drilling down, Interica PRM showed the growth of files and folders in the projects, as shown in the trending diagram below:
The most fun part of PRM was showing the scatter plot as you can see on the right (the browser that looked like a bed of colourful spots in a field of white) in the screenshot below:
There was always a revelation moment when PRM had done its discovery job and was able to show all the data files and folders in all locations, with their associated details, in an oil & gas organization spanning decades.
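To make the discovery idea a little more concrete, here is a minimal Python sketch of the kind of inventory such a pass collects. It is not how Interica PRM works internally (I am only illustrating the concept), and it knows nothing about subsurface metadata. The names `ProjectSummary`, `summarize_projects` and `capacity_shares` are hypothetical, and I am assuming each first-level folder under a share is one "project".

```python
import os
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ProjectSummary:
    """Basic facts about one project folder (hypothetical structure)."""
    files: int = 0
    total_bytes: int = 0
    oldest_mtime: float = float("inf")  # earliest last-modified timestamp seen
    newest_mtime: float = 0.0           # latest last-modified timestamp seen


def summarize_projects(root):
    """Walk `root` and summarize files per first-level subfolder.

    Assumption: each first-level folder is one project. Files sitting
    directly in `root` are grouped under "(root)".
    """
    root = Path(root)
    summaries = defaultdict(ProjectSummary)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = Path(dirpath).relative_to(root)
        project = rel.parts[0] if rel.parts else "(root)"
        for name in filenames:
            try:
                st = (Path(dirpath) / name).stat()
            except OSError:
                continue  # unreadable file or broken link: skip it
            s = summaries[project]
            s.files += 1
            s.total_bytes += st.st_size
            s.oldest_mtime = min(s.oldest_mtime, st.st_mtime)
            s.newest_mtime = max(s.newest_mtime, st.st_mtime)
    return dict(summaries)


def capacity_shares(summaries):
    """Return each project's share of total capacity, in percent."""
    total = sum(s.total_bytes for s in summaries.values())
    if total == 0:
        return {name: 0.0 for name in summaries}
    return {name: 100.0 * s.total_bytes / total
            for name, s in summaries.items()}
```

Pointing `summarize_projects` at a NAS mount (for example a hypothetical `/mnt/projects`) and passing the result to `capacity_shares` gives the raw numbers behind a "donut" style breakdown. A real tool would add owners, application types and history over time, which this sketch leaves out.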
Cloud computing and smart devices
The data consolidation story of a decade ago disintegrated with the burgeoning adoption of cloud computing and, more recently, edge computing as well. Files and folders of every kind are now spread across multiple clouds and flung as far as IoT devices, smart phones, tablets, handhelds and everything in between. There are many benefits and advantages to both cloud and edge computing, but on the flip side, there is a massive data file sprawl. The proliferation of these services continues to stretch the challenge of keeping relevant files and folders in one (or fewer) locations, against the grain of enterprise data consolidation and centralization.
Cybersecurity risks are much bigger now and harder to contain because the demarcation of data consolidation borders is broken. Rampant ransomware attacks are, in part, the result of having files everywhere, and in many of those places, securing and protecting these files has become more and more difficult.
What is next?
I intend to use this first part as the springboard to talk about some things we can do to consolidate NAS services in the local area network of the organization, and to extend them to the “cloud”. Hybridity of file services (but not multi-cloud) is the way, and the pandemic has stamped and confirmed the vital need for “work from anywhere”, as much of the global workforce is now remote.
What I have shared from my experience in Oil & Gas subsurface data management with Interica PRM exemplifies that, at some point, we need to return to that level of control, understanding and detail about where our files are living today. Albeit difficult, the consolidation of data and files is indispensable for control, security and governance, and for maintaining one true copy for analytics, machine learning and eventually the intelligent automation of future IT.
This week, I leave you with 3 contrasting articles (6 years apart) about data consolidation and having a relevant single source of truth.
- [ November 2015 ] The dark ages of data is coming
- [ May 2020 ] A dialogue between 2 drives
- [ July 2021 ] OpenAI shuts down its robotics team due to lack of data
Digital Transformation can only progress with a vibrant and active grip on where the best data files are, and where the files are living across the private, public and edge data locations of the organization.
It is time to find out where your files are living right now.