[Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in Silicon Valley, USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the vendors’ technologies presented at this event. The content of this blog reflects my own opinions and views.]
I, for one, have perhaps seen far too many “file lifecycle and data management” software solutions that involved tiering, hierarchical storage management, ILM or whatever you call them these days. If I do a count, I have managed or implemented at least 5 to 6 products, including a home-grown one.
The whole thing is a very crowded market and I have seen many which have come and gone, and so when the opportunity to have a session with Komprise came at Storage Field Day 19, I did not carry a lot of enthusiasm.
Counting the old wine in new bottles
In my career, I have been involved with many NAS solutions. Naturally the file lifecycle and data management solutions are an extension to the NAS file systems. I try to recall the past solutions I was involved in:
- A retrofitted Sun Microsystems SAMFS “cache” with a Sony Petasite tape library crafted by the Malaysian Sun/Schlumberger outfit for an Oil & Gas major in the late 90s
- NetApp Virtual File Manager (VFM), which under the covers was NuView StorageX, in the mid-2000s. NuView was acquired by Brocade.
- EMC DiskXtender integrated with EMC Centera, its Content Addressable Storage, in 2007 for Schlumberger. Two months after I completed the project, DiskXtender was killed off. This was also my first foray into Object Storage, learning about HTTP and the infamous PEA (Pool Entry Authorization) file.
- EMC Rainfinity File Management Appliance in 2008, where I was the product manager, focusing mostly on the Oil & Gas market.
- A home grown Python script written by my engineer at Real Data Matrix for Petrofac Malaysia, circa 2009-2011.
- Interica SmartMove when I was the Asia Business Development Director in 2013-2014
In addition, some competitors I remember researching in that space were Veritas Enterprise Vault and Moonwalk Universal. Yeah, there were too many out there.
Komprise's powerful technologies
As listed on their website, Komprise has 2 products.
- Intelligent Data Management
- Analytics Driven Data Migration
At its core, the Intelligent Data Management has 2 key components – Observers and Directors.
The Observer VMs are placed on-premises to “observe” all NFS and SMB/CIFS traffic and activities in the network. The non-intrusive Observer VM(s) is out-of-band and can be set up easily by pointing the Observer VM(s) at the NAS source(s). Within 15 minutes, analytics about how much data is on the NAS boxes, how it is being used, the types of data and the data owners can be revealed in the Komprise Director Dashboard, as shown below:
The Observer VMs form a distributed grid of persistent data stores, gathering and amassing metadata, activities data and usage patterns, and communicating with Komprise Director. The Director is the central admin, the User Interface and Dashboard to the Komprise universe, and also the administration and control of the File Lifecycle & Data Management Policy Manager. A high level overview of the architecture is shown below:
User defined policies configured at the Director UI will replicate, move, tier and archive the files from the NAS sources to secondary NAS and object storage targets. Komprise is storage agnostic and works with many NAS, Object Storage and Cloud partners. Here is a list from their website and I am sure the list will continue to grow.
For security, all communications and data transfers are encrypted with AES-256 symmetric keys, with the data sent as chunks or compressed chunks.
Komprise won fans, and 2 things stood out for me
After the opening by CEO Kumar Goswami, the chalk talk session helmed by Mike Peercy, CTO, and Mohit Dhawan, VP of Engineering, was a series of rapid-fire questions from the delegates. It was this part of the Q&A which won Komprise big fans among the Field Day delegates. In my book, they were the best of the week among all the sponsors. 2 pieces stood out for me.
First, the Observer VMs’ grid-like architecture provides the scalability that most other solutions lack. This scale-out, agentless and database-less approach positions Komprise as highly performant for the era of petascale data management and trillion-file lifecycles. At the same time, the Grid load balances the work and is also fault tolerant.
Second is Komprise’s patented Transparent Move Technology (TMT™). Komprise does not use the stub files often implemented by other solutions. Instead, once the source file has been moved, a dynamic symlink to the file is generated. This dynamic symlink is associated with the Komprise Access Address (KAA), a DNS-like lookup service in the Observers’ Grid. This link persists throughout the life of the file regardless of its present and future locations. When the users or applications access the moved file, the dynamic link is presented and the users or applications can access the file at source or at target. The lines between the files on the NAS platforms and the objects, on-premises or in the clouds, become completely transparent, and that is why TMT™ is special. Here’s a look at TMT™ in action.
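To make the idea of a link that survives later moves more concrete, here is a toy sketch in Python. It is not Komprise's implementation, which I have not seen. It assumes a POSIX file system, and the names `AddressDirectory`, `move_and_link` and `relocate` are hypothetical. The link left at the source points to a stable address entry, and only that entry is repointed when the file moves again.

```python
import os
import shutil
import tempfile
import uuid
from pathlib import Path


class AddressDirectory:
    """Hypothetical stand-in for a lookup service: stable address -> current location."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def register(self, location: Path) -> Path:
        """Create a stable address entry that points at the file's location."""
        address = self.root / uuid.uuid4().hex
        os.symlink(location, address)
        return address

    def update(self, address: Path, new_location: Path) -> None:
        """Repoint an existing address entry at the file's new location."""
        tmp = address.with_suffix(".tmp")
        os.symlink(new_location, tmp)
        os.replace(tmp, address)  # rename swaps the entry itself, not its target


def move_and_link(source: Path, target_dir: Path, directory: AddressDirectory) -> Path:
    """Move a file to target_dir and leave a link at the original path."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.move(str(source), str(target))
    address = directory.register(target)
    os.symlink(address, source)  # this link never has to change again
    return address


def relocate(address: Path, new_dir: Path, directory: AddressDirectory) -> None:
    """Move the file behind an address to new_dir and update only the address entry."""
    new_dir.mkdir(parents=True, exist_ok=True)
    current = Path(os.path.realpath(address))
    new_location = new_dir / current.name
    shutil.move(str(current), str(new_location))
    directory.update(address, new_location)


def demo() -> tuple:
    """Read the same source path after a first move and after a second move."""
    base = Path(tempfile.mkdtemp())
    primary = base / "primary_nas"
    primary.mkdir()
    report = primary / "report.txt"
    report.write_text("seismic survey notes")

    directory = AddressDirectory(base / "addresses")
    address = move_and_link(report, base / "secondary_nas", directory)
    first_read = report.read_text()

    relocate(address, base / "cloud_tier", directory)
    second_read = report.read_text()

    shutil.rmtree(base)
    return first_read, second_read
```

Calling `demo()` reads `primary_nas/report.txt` twice, once after each move, and the original path keeps working both times. The extra level of indirection is the point of the sketch: the users' path stays fixed while the lookup entry tracks where the data currently lives.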
In a nutshell, as quoted by Komprise, “TMT™ provides a redundant, transparent, hierarchical file system that bridges file and object data without superimposing a single namespace or metadata layer”.
What is next?
Data Management and File Lifecycle is a crowded market and data growth has been relentless. In the old ways of doing things, voluminous data movements hit bottlenecks everywhere. This has strained workflows and conversions between disparate file and cloud platforms. The Komprise approach is solid, addressing the skepticism of many (including me), but more importantly the need of the market.
Competitors are beginning to show up and one on my radar is Igneous, which appears to be in the same arena. StorNext may be an alternative too, if Quantum decides to align their venerable Swiss Army data management software to this market. The 3 rounds of funding have brought Komprise a total of USD42 million.
Their partnerships with HPE and IBM are a good start for market presence and branding, but the challenge is to get the kind of air time they deserve within IBM’s and HPE’s whole stable of “data management” solutions. They also have to build their portfolio of use cases quickly in this competitive space, in the geographies where they currently operate. Healthcare & Life Sciences, Oil & Gas, and Media & Entertainment are heavy users of data and files, and are the perennial markets in constant need of solid data management and file lifecycle, but again there are many, many players in that space.
I did not delve much into Komprise Analytics Driven Data Migration. It could be a game changer but I do not have the deep knowledge to share more in this blog. But in the end, the question remains: what is their strategy and what are their plans in a very crowded market? I hope to know more in the future.
I was asking myself how to compare file-based tiering (as Komprise does) with block-based tiering solutions like, for instance, NetApp’s FabricPool. Of course FabricPool is available for all-flash FAS systems. Komprise was present at NetApp Insight, but how will it fit with NetApp’s tiering techniques?
IMHO, it depends on what you want to achieve. Block-based tiering is really for operational efficiency, mostly IOPS performance based on the “hotness” of the data. There are algorithms in the storage OS to decide this.
For file-based archiving, there are other criteria, such as users, age, size, file types, keyword search or some simplistic form of file analytics, which can go into user-defined policies and lead into concepts like ILM, HSM, archiving, legal hold and so on. NetApp works with StorageX for file-level tiering, if I am not wrong, and file-based tiering is easier to integrate into IT and application workflows.
Hope this helps. Thanks for supporting my blog.
Chin-Fah, to add to your comments, another key benefit of file-based tiering is that the entire file is archived, including its metadata, so the data can be directly accessed from the secondary storage without rehydration, and without lock-in. And Komprise makes this transparent, so the file looks like it is still on the primary storage even when it's on object/cloud. The transparent approach from Komprise is different from legacy migration-based solutions like StorageX, which require the user to look for archived data in a different place and can create user resistance. Also, Komprise is a NetApp partner and we have joint customer references.
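As a rough illustration of such user-defined criteria, here is a minimal Python sketch of a file-level archive policy based on age, size and file type. The `FileRecord` and `ArchivePolicy` names, the thresholds and the sample paths are all made up for the example and are not taken from Komprise or any other product.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class FileRecord:
    """Metadata about one file, as a scanner might collect it (hypothetical)."""
    path: str
    size_bytes: int
    last_accessed: datetime
    owner: str


@dataclass(frozen=True)
class ArchivePolicy:
    """A user-defined rule: archive files that are old, large and of given types."""
    min_age_days: int
    min_size_bytes: int = 0
    extensions: Optional[FrozenSet[str]] = None  # None means any file type

    def matches(self, record: FileRecord, now: datetime) -> bool:
        old_enough = now - record.last_accessed >= timedelta(days=self.min_age_days)
        big_enough = record.size_bytes >= self.min_size_bytes
        ext = os.path.splitext(record.path)[1].lstrip(".").lower()
        type_ok = self.extensions is None or ext in self.extensions
        return old_enough and big_enough and type_ok


def select_for_archive(
    records: Iterable[FileRecord], policy: ArchivePolicy, now: datetime
) -> List[str]:
    """Return the paths of all files the policy would archive."""
    return [r.path for r in records if policy.matches(r, now)]


# Sample data: only the first file is old, large and of a listed type.
NOW = datetime(2020, 2, 1)
RECORDS = [
    FileRecord("/proj/a.segy", 5 * 1024**3, datetime(2018, 1, 1), "alice"),
    FileRecord("/proj/b.segy", 5 * 1024**3, datetime(2020, 1, 15), "alice"),
    FileRecord("/proj/c.txt", 2048, datetime(2017, 6, 1), "bob"),
    FileRecord("/proj/d.las", 10, datetime(2017, 6, 1), "bob"),
]
POLICY = ArchivePolicy(
    min_age_days=365,
    min_size_bytes=1024**2,
    extensions=frozenset({"segy", "las"}),
)
ARCHIVE_LIST = select_for_archive(RECORDS, POLICY, NOW)
```

Here `ARCHIVE_LIST` contains only `/proj/a.segy`. Unlike block-level "hotness", every condition in the policy is something an administrator or data owner can read and reason about, which is why this style fits more naturally into workflows.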
Thank you for your kind comment. It really helps to put Komprise in a better perspective for me and the reader.
Looking forward to working with Komprise in the near future. Brewing an opportunity in Malaysia …