Magic happening

[Preamble: I am a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

The magic is happening.

Dropbox, the magical disruptor, is going IPO.

When Dropbox first entered into the market which eventually termed as BYOD (Bring your Own Device), it was a phenomenon. There was nothing else that matched its simplicity and ease-of-use. A file uploaded into the cloud was instantaneously available on the tablets and smart phones. It was on every storage vendor’s presentation slides, using Dropbox as the perennial name dropping tactic to get end users buy-in.

Dropbox was more than that, and it went on to define a whole new market segment known as Enterprise File Synchronization and Sharing (EFSS), together with everybody else such as Box, Easishare (they are here in South East Asia), and just about everybody else. And the executive team at Dropbox knew they were special too, so much so that they rejected a buyout attempt by Apple in 2011.

Today, Dropbox is beyond BYOD and EFSS. They are a full fledged collaboration platform that includes project management, project workflow, file versioning, secure file transfer, smart file synchronization and Dropbox Paper. And they offer comprehensive plans from Basic, Plus and Professional to Business and Enterprise. Their upcoming IPO, I am sure, will give them far greater capital to expand, and realize their full potential as the foremost content-based collaboration platform in the world.

Dropbox began their exodus from AWS a couple of years ago. They wanted to control their destiny and have moved more than 500PB into their own private data center for their customer data. That was half-an-exabyte, people! And two years later, they saved $75million of operating costs after they exited AWS. Today, they have more than 1 Exabyte of customer data! That is just incredible.

And Dropbox’s storage architecture started with a simple foundational design called “Magic Pocket“. Magic Pocket is a “fixed-length, immutable” block storage layer.

The block size is fixed at 4MB chunks (for parallel performance and service resumption reasons), compressed and deduped (for capacity savings reasons), encrypted (for security reasons) and replicated (for high availability reasons).

Continue reading

The future is intelligent objects

We are used to block-based approach and also the file-based approach to data. The 2 diagrams below shows the basics of how we access data in both block-based and file-based data on the storage device.

 

For block-based , the storage of the blocks is merely in arrays of unrelated contiguous blocks. For file-based, as seen below,

 

there is another layer of abstraction, and this is called the file system. But if you seen both diagrams above, there are some random numbers in light blue and that is to represent the storage device, the hard disk drive’s export of “containers” to the file system or the application that is accessing the storage device. This is usually the LBA (Logical Block Addressing), which is basically set of schematics that defines the locations on the hard disk drives. LBA tells the location of where the data is stored. For more information about LBA, check out this Wikipedia definition. But the whole idea is LBA is dumb. It is pretty much static and exported to file systems and applications so that these guys can do something with it.

There’s something brewing in the background since 1994 and it is one of the many efforts to make intelligent storage devices. This new object-based interface was part of the research project done by Carnegie-Mellon University (CMU). Initially, it was known as Network Attached Secure Disk (NASD) but eventually made its way to the working group in SNIA, and developing it for ANSI T10 INCITS standard. ANSI T10 is the guardian of all SCSI standards. This is called Object Storage Device (OSD). The SCSI architecture diagram below shows the layer where OSD resides.

 

The motivation for this simple: To make storage devices of today to do more computational work, in particularly I/O, relieving the hosts and the local systems to concentrate other computational processing work. And the same time, the local systems must have some level of interactivity and management between the storage object and the computational hosts.

In the diagram below which compares both block-based and OSD,

 

you can see the separation of file system management interface that is at the kernel-space of the local host/system and this is replaced by the OSD Management interface at the storage device.

What does this all mean? This means that using LBA type of addressing that we are familiar with in the block-based and file-based storage is no longer the way to go, because as I mentioned before, LBA is dumb.

OSD, in some way, replaces the LBA with OIDs (Object IDs). The existing local system and/or its file system will interact with the storage devices with OIDs and the OIDs links to its respective objects storage. And the object will carry a lot of metadata, that represents the object, giving it the intelligent and management capability of the object.

 

 

The prominence of the metadata in the OSD would mean that we can build much more intelligent systems in the future. The OIDs and the objects can be grouped together in a flat design or can be organized and categorized in a virtual, hierarchical model.

 

Object storage is an intelligent evolution of disk drives that can store and serve objects rather than simply place data on tracks and sectors. And it can bring the following benefits:

  • Intelligent space management in the storage layer
  • Data aware pre-fetching and caching
  • Robust shared access by multiple clients
  • Scalable performance using off-loaded data path
  • Reliability security

Several vendors such as EMC and NetApp are already supporting OSD.