Falconstor Software Defined Data Preservation for the Next Generation

Falconstor® Software is gaining momentum. After an arduous climb back to the fore, it is beginning to soar again.

Tape technology and Digital Data Preservation

I mentioned that long-term digital data preservation is a segment of the data lifecycle which has merit and prominence. SNIA® has shown that this is a strongly growing market segment through its 2007 and 2017 “100 Year Archive” surveys. The 3 critical challenges of this long, long-term digital data preservation are to keep the archives

  • Accessible
  • Undamaged
  • Usable

For the longest time, tape technology has been the king of the hill for digital data preservation. The technology is cheap and mature, and many enterprises have built their long-term strategy around it. And the pulse of the tape technology market is still very healthy.

The challenges of tape remain. Every 5 years or so, companies have to consider moving the data on their existing tape technology to the next generation. It is widely known that an LTO drive can read tapes from the previous 2 generations and write to tapes from 1 generation before. The tape transcription process of migrating digital data for the sake of data preservation is problematic because it can affect the structural integrity and quality of the content of the data.
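To put numbers on that migration pressure, here is a minimal sketch of the compatibility rule as stated above (read 2 generations back, write 1 generation back). The function names and the `read_back` and `write_back` parameters are my own illustration, not anything from the LTO specification, and the window is left as a parameter because newer drive generations are reported to narrow it.

```python
# Illustrative only: models the "read 2 back, write 1 back" rule from the text.
# All names are hypothetical; the windows are parameters, not a specification.

def readable_generations(drive_gen, read_back=2):
    """LTO generations a drive of generation drive_gen can read."""
    lowest = max(1, drive_gen - read_back)
    return list(range(lowest, drive_gen + 1))


def writable_generations(drive_gen, write_back=1):
    """LTO generations a drive of generation drive_gen can write."""
    lowest = max(1, drive_gen - write_back)
    return list(range(lowest, drive_gen + 1))


def stranded(archive_gens, drive_gen, read_back=2):
    """Generations in the archive that the new drive can no longer read."""
    readable = set(readable_generations(drive_gen, read_back))
    return sorted(g for g in set(archive_gens) if g not in readable)


# An archive holding LTO-3 to LTO-6 tapes, with drives being refreshed to LTO-7.
archive = [3, 4, 5, 6]
at_risk = stranded(archive, drive_gen=7)
print("Generations to transcribe before the refresh:", at_risk)
```

Every generation that lands in `at_risk` has to be transcribed before the old drives are retired, which is exactly the recurring cost described here.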

In my time covering Oil & Gas subsurface data management, I have seen NOCs (national oil companies) with 500,000 tapes of all generations, from 1/2″ to DDS, DAT to SDLT, 3590 to LTO 1-7. Millions are spent to transcribe these tapes every few years, and we have folks like Katalyst DM, Troika and more hovering over this landscape for their fill.

Yet many compromise the data integrity of their digital assets by ignoring tape’s shortcomings because it is the cheapest medium. Once the integrity of the content is broken, the data, although accessible, is unusable. This defeats the purpose and the objectives of digital data preservation.

Why Software Defined Data Preservation (SDDP)?

Software transcends the physical and often technological obsolescence of the infrastructure. Virtualizing the underlying hardware alleviates the challenges of the storage technology and medium that store the digital data assets, which Falconstor® calls the “Data Centric” approach. Inherent to a software-defined approach are features such as

  • Scalability in terms of performance and capacity
  • Portability and storage medium agnosticism
  • Security, strengthening access, audit and governance
  • Agility in the reinstatement of data and the modernization of technologies
  • eDiscovery and Analytics

Thus SDDP makes a strong statement to overcome the challenges brought up against tapes. The Falconstor® VTL technology has been an anchoring technology of the company as they redefine the SDDP market segment. They introduced their VTL (virtual tape library) technology in the early 2000s and have continued to ply the market where many have dropped off. IBM® had ProtecTIER; Hitachi Data Systems had Sepaton; NetApp® had Alacritus. All gone! The only software VTL vendors which remain on my radar are Dell™ PowerProtect DD (Data Domain), Starwind VTL, and Falconstor® VTL.

Falconstor® StorSafe™

Building on the shoulders of their venerable VTL technology is the new StorSafe™. StorSafe™ Persistent Virtual Storage Container (VSC) technology performs application-level data mobility and brokerage virtualization, which disaggregates the data from the underlying storage repositories and storage types. The data ingested from various data sources through its VTLs is deduped, compressed and encrypted at its Single Instance Repository (SIR) cluster gateways to optimize the data capacity footprint. Variable capacity optimization can be achieved through Falconstor® advanced archive optimization technology, based on data types.

The optimized data is placed into a unique Portable Storage Format (PSF), indexed, checksummed, and placed into StorSafe™ VSC persistent storage. PSF removes the shackles of the underlying technology, formats and mediums, all of which are subject to obsolescence.
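The ingest path described above (dedupe, compress, index, checksum) can be sketched in a few lines. This is a toy model under my own assumptions, not the StorSafe™ or PSF implementation: the `ChunkStore` class is hypothetical, it uses fixed-size chunks and SHA-256 digests for simplicity, and it leaves out encryption because the Python standard library has no cipher for it.

```python
import hashlib
import zlib

# Illustrative only: a toy single-instance store. ChunkStore and its methods
# are hypothetical names, not Falconstor's SIR or PSF design.


class ChunkStore:
    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> compressed chunk, stored once
        self.index = {}   # object name -> ordered list of chunk digests

    def ingest(self, name, data):
        """Split data into fixed-size chunks, keeping each unique chunk once."""
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # dedupe: skip chunks already held
                self.chunks[digest] = zlib.compress(chunk)
            digests.append(digest)
        self.index[name] = digests

    def restore(self, name):
        """Rebuild an object, verifying every chunk against its checksum."""
        out = bytearray()
        for digest in self.index[name]:
            chunk = zlib.decompress(self.chunks[digest])
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise ValueError(f"checksum mismatch in chunk {digest}")
            out += chunk
        return bytes(out)

    def stored_bytes(self):
        """Bytes actually held after dedupe and compression."""
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore(chunk_size=1024)
backup_monday = b"A" * 4096 + b"monday delta"
backup_tuesday = b"A" * 4096 + b"tuesday delta"
store.ingest("monday", backup_monday)
store.ingest("tuesday", backup_tuesday)

print("logical bytes:", len(backup_monday) + len(backup_tuesday))
print("stored bytes: ", store.stored_bytes())
```

Two backups that share most of their content end up as a handful of unique chunks, and because the chunk digest doubles as the checksum, restore can detect a damaged chunk without any extra metadata.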

The persistent data in the VSC is then sharded and partitioned into “mini containers” protected by an optimized erasure coding scheme. The partitions or shards can be moved and laid across multiple locations, both on-premises and in the clouds. This multi-cloud-ready approach enhances data portability, scalability, recoverability, and agility. Digital data assets can be placed according to the most economical cost of the repositories, cloud or local storage, via the de facto S3 API. Egress fees can be significantly reduced or removed as well, because the capacity-optimized data does not need to be rehydrated before egress or migration. This gives organizations a smorgasbord of ways to move, archive and preserve digital data assets, with a flexibility that tape technology does not inherently possess.
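To make the idea of shards protected by erasure coding concrete, here is a minimal sketch. Falconstor® does not spell out its scheme, so I have swapped in the classic RAID-6 style P+Q parity over GF(2^8), which survives the loss of any 2 data shards. The function names, the 4-shard layout and the sample payload are all my own assumptions, and the simpler cases where a parity shard itself is lost are left out.

```python
# Illustrative only: RAID-6 style P+Q parity over GF(2^8). All names here are
# hypothetical and this is not StorSafe's actual erasure coding scheme.

# Log/antilog tables for GF(2^8) with the polynomial 0x11D and generator 2.
EXP = [0] * 510
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 510):
    EXP[power] = EXP[power - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide a by a non-zero field element b."""
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % 255]


def split(data, k):
    """Cut data into k equal shards, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def parity(shards):
    """Compute P (plain XOR) and Q (XOR weighted by 2**i) over the shards."""
    p = bytearray(len(shards[0]))
    q = bytearray(len(shards[0]))
    for i, shard in enumerate(shards):
        for j, byte in enumerate(shard):
            p[j] ^= byte
            q[j] ^= gf_mul(EXP[i], byte)
    return bytes(p), bytes(q)


def recover_two(shards, p, q, x, y):
    """Rebuild missing data shards x and y (None in shards) from P and Q."""
    pxy, qxy = bytearray(p), bytearray(q)
    for i, shard in enumerate(shards):
        if shard is None:
            continue
        for j, byte in enumerate(shard):
            pxy[j] ^= byte
            qxy[j] ^= gf_mul(EXP[i], byte)
    # Now pxy = Dx ^ Dy and qxy = g^x*Dx ^ g^y*Dy; solve for Dx, then Dy.
    gx, gy = EXP[x], EXP[y]
    dx = bytes(gf_div(qb ^ gf_mul(gy, pb), gx ^ gy) for pb, qb in zip(pxy, qxy))
    dy = bytes(pb ^ db for pb, db in zip(pxy, dx))
    return dx, dy


payload = b"seismic survey, block 42, kept for the next hundred years"
shards = split(payload, 4)
p, q = parity(shards)

survivors = list(shards)
survivors[1] = survivors[3] = None  # two mini containers are lost
survivors[1], survivors[3] = recover_two(survivors, p, q, 1, 3)
restored = b"".join(survivors)[:len(payload)]
print(restored == payload)
```

The 4 data shards plus P and Q could each sit in a different repository, on-premises or cloud, and any 2 of the data shards can disappear without losing the payload. Production erasure codes generalize this to other shard and parity counts.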

Adding to the big list of features, StorSafe™ can tag data for various legal and compliance reasons and provide the capability to tier data protection levels as well. This is absolutely vital under data privacy and data sovereignty regulations such as the EU’s GDPR (General Data Protection Regulation), California’s new CCPA (California Consumer Privacy Act) and the many other PDPAs (personal data protection acts) around the world.

StorSafe™ is patent-pending and is built upon the Linux Containers (LXC) open source technology. Falconstor® calls this the Redundant Array of Independent Clouds (RAIC) method, and we shall see if that marketing moniker will stick over time.

Recoverable to be usable

In data preservation, recovering the digital assets decades later may be easy, but ensuring the integrity of the data is paramount for it to be usable. Erasure coding across multiple clouds enhances the reliability of accessing the data, with protection against the loss of 2 “mini containers”: the data set is rebuilt from parity-like forward error correction, just like RAID-6, albeit a more efficient one depending on who you talk to and their use cases. The recovered data can be reinstated in the StorSafe™ library, ready for data access. The recovery process is shown below:

Zooming in, the content of the mini containers is checksummed and validated on a regular basis, ensuring that the content has not been modified or bit-flipped during its lifespan. This is confirmed against a Secure Journal Log for data integrity, as shown below:

This is another level of preservation protection for the content of the long-term archive.
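The periodic validation described above can be sketched as a scrub that compares each mini container against a journal of expected checksums. The details of the Secure Journal Log are not public as far as I know, so the hash-chained `Journal` class below is my own assumption of how such a log could be made tamper-evident, and the container names are made up.

```python
import hashlib
import json

# Illustrative only: Journal and scrub are hypothetical names, not the
# StorSafe Secure Journal Log.


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


class Journal:
    """Append-only log where each entry commits to the one before it."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _hash_body(container, checksum, prev):
        body = {"container": container, "checksum": checksum, "prev": prev}
        return sha256_hex(json.dumps(body, sort_keys=True).encode())

    def record(self, container, content):
        """Log the checksum of a container, chained to the previous entry."""
        prev = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        checksum = sha256_hex(content)
        self.entries.append({
            "container": container,
            "checksum": checksum,
            "prev": prev,
            "entry_hash": self._hash_body(container, checksum, prev),
        })

    def is_intact(self):
        """Walk the chain and confirm that no entry has been altered."""
        prev = "0" * 64
        for e in self.entries:
            expected = self._hash_body(e["container"], e["checksum"], e["prev"])
            if e["prev"] != prev or e["entry_hash"] != expected:
                return False
            prev = e["entry_hash"]
        return True

    def expected(self):
        return {e["container"]: e["checksum"] for e in self.entries}


def scrub(containers, journal):
    """Return the ids of containers whose content no longer matches the journal."""
    if not journal.is_intact():
        raise RuntimeError("journal chain is broken; checksums cannot be trusted")
    expected = journal.expected()
    return sorted(cid for cid, content in containers.items()
                  if sha256_hex(content) != expected.get(cid))


containers = {"mc-0": b"well logs 1998", "mc-1": b"seismic traces 2004"}
journal = Journal()
for cid, content in containers.items():
    journal.record(cid, content)

clean = scrub(containers, journal)

# Simulate a single flipped bit in one mini container.
flipped = bytearray(containers["mc-1"])
flipped[0] ^= 0x01
containers["mc-1"] = bytes(flipped)
damaged = scrub(containers, journal)
print("damaged containers:", damaged)
```

A container flagged by the scrub would then be a candidate for rebuilding from the surviving shards and parity.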

Meaningful data preservation

We are hoarding too much data. We do not know well enough if the data is of value at present or 100 years later. Thus I talked about self-describing data personalities, where the data is tagged at source with unique data characteristics, including long-term archive and preservation. That is why the way to ensure that data stays valuable for all industries, and for our present and future lives, is to preserve the knowledge and wisdom in our digital data assets across lifespans and generations.

Falconstor® holds a novel and unique technology approach as the gatekeeper for time-capsuling our digital assets for the future, and the significance of data preservation is growing rapidly. Thus, with its sound strategy, it is likely that they will prevail and soar again as the mighty Falcon.


About cfheoh

I am a technology blogger with 25+ years of IT experience. I write heavily on technologies related to storage networking and data management because that is my area of interest and expertise. I introduce technologies with the objective of getting readers to *know the facts*, and to use that knowledge to cut through the marketing hype, FUD (fear, uncertainty and doubt) and other fancy stuff. Only then will there be progress. I am involved in SNIA (Storage Networking Industry Association) and as of October 2013, I have been appointed as the SNIA South Asia & SNIA Malaysia non-voting representative to the SNIA Technical Council. I currently run a small system integration and consulting company focusing on storage and cloud solutions, with occasional consulting work on high performance computing (HPC).
