I picked up a copy of Brad Stone’s latest book, “The Everything Store: Jeff Bezos and the Age of Amazon”, at the airport on my way to Beijing last Saturday. I have been reading it the whole time I have been in Beijing, in awe of the turbulent ups and downs of Amazon.com.
In its own serendipitous ways, Object-based Storage Devices (OSDs) have been floating in my universe in the past few weeks. Seems like OSDs have been getting a lot of coverage lately and suddenly, while in the shower, I just had an epiphany!
Are storage vendors now positioning Object-based Storage Devices (OSDs) as the Everything Store?
Two significant events led me to write about OSDs and what they could become. For one, I had been preparing for a SNIA Malaysia Tech@Break event for weeks. At the event last week, I gave a talk based on the SNIA Tutorial “Object Storage Technology” by Dr. Brent Welch of Panasas.
A day after that, a friend of mine alerted me about something new from Seagate called Seagate Kinetic Open Storage, based on the idea of OSDs.
I mean, I have worked with OSDs all along, even the proprietary precursor to the present OSDs, in the form of EMC Centera. After all, I was the EMC IP Storage product manager from 2007 – 2009 for South Asia. I even had the opportunity to have a sniff at the newly minted EMC Atmos just before I left EMC. I even blogged that the future is in intelligent objects. Read it in my previous blog from November 2011.
OSDs have certainly evolved and are slowly taking a more prominent position in many storage architects’ views. They have become a suitable archiving alternative to tape, sitting in between the primary storage platform and the tape platform, and they are gaining wider acceptance as well. Thanks to the many marketing efforts, I am seeing other players besides the Big 6 showing off their Object Storage.
SGI (yes SGI), has LiveArc AE layered over their InfiniteStorage platform; Quantum just announced their Lattus Object Storage some months ago. And of course, Seagate has gotten into the picture too with Kinetic. And I am sure there are plenty more out there and more coming soon.
As OSDs gain more prominence, our legacy thinking and mindsets have to change as well. The DAS, SAN and NAS architectures have been very storage location-centric. Applications have to work with file systems, block storage and storage managers to determine the addressable location at the destination storage. We work with the logical block address (LBA), its byte or block offset, or in NAS, the network file system’s file semantics to locate the data.
These location-centric architectures have served us well over the years, but they continue to introduce layers upon layers of communication and buffering in the name of abstraction and virtualization. In Malaysia, we have a nice savoury happy food called “Kuih Lapis”, which literally means Layered Cake (picture below).
In IT speak, from the Seagate Kinetic website, they have a nice little diagram that explains these layers.
While the layers of abstraction have been good to how DAS, SAN and NAS have worked for the overall IT landscape, the stack is putting too much fat in between the compute layer on top and the storage layer at the bottom. It has introduced complexity as the size of the data footprint becomes larger and larger.
Hence OSDs are becoming a solution-in-waiting and looking more and more like a potential saviour. Much of the unstructured and semi-structured data kept on the Internet and in the Cloud is becoming permanent content, and the operations performed on this content are now mostly search, create, delete or move in its entirety. It has become less common to edit a web content object and make changes to the unstructured content. That practice bodes well with the basic operations on persistent storage, CRUD (Create, Retrieve, Update, Delete). As seen in the table below, CRUD is easily understood:
That simplicity lies in how OSDs are presented to the applications (and their APIs) compared to legacy block devices. The SNIA presentation from Dr. Brent Welch (the one I presented last week at SNIA Tech@Break) has a wonderful, self-explanatory slide comparing both:
In the diagram above, the separation of the Filesystem Namespace Component and the Filesystem Storage Component makes OSD possible. The Storage Device is responsible for how the data content is stored in the location, while at the compute layer, the application and its respective APIs are only concerned with locating the data content. The location of the data content is simplified to an Object ID (OID), and possibly the Object Partition ID, while the nitty-gritty block-, byte- or even array-based addressing is left to the storage devices that own them.
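To make the idea concrete, here is a minimal sketch in Python of what that separation could look like from the application's side. It is only an in-memory toy: the class name `ToyObjectStore`, its method names and the use of a random hex string as the OID are my own assumptions for illustration, not the interface of any real OSD, Kinetic or Centera.

```python
import uuid


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object storage device."""

    def __init__(self):
        # Maps (partition_id, object_id) -> bytes. How the bytes are laid
        # out is private to the "device"; callers never see blocks or LBAs.
        self._objects = {}

    def create(self, data, partition_id=0):
        """Store a new object and return its Object ID (OID)."""
        oid = uuid.uuid4().hex  # assumption: the device hands out the OID
        self._objects[(partition_id, oid)] = bytes(data)
        return oid

    def retrieve(self, oid, partition_id=0):
        """Return the whole object identified by partition ID and OID."""
        return self._objects[(partition_id, oid)]

    def update(self, oid, data, partition_id=0):
        """Replace the content of an existing object in its entirety."""
        key = (partition_id, oid)
        if key not in self._objects:
            raise KeyError(oid)
        self._objects[key] = bytes(data)

    def delete(self, oid, partition_id=0):
        """Remove the object; raises KeyError if it does not exist."""
        del self._objects[(partition_id, oid)]
```

The application only ever holds the OID (and optionally a partition ID). For example, `oid = store.create(b"photo bytes")` followed by `store.retrieve(oid)` gives back the same bytes, with no offset, block size or path involved.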
There is a lot more to OSDs than meets the eye, but they are not ready to be an everything store yet. A lot of enterprise businesses and operations still rely heavily on applications which are transactional and relational. These applications are still developed around how the code interacts with block-based storage and the file semantics of networked file systems.
Things are changing and are certainly moving towards OSDs. It is my wish and hope that eventually, OSD will become the Everything Store.
Do you think OpenStack Swift is leading, being Open Source and already in production at HP/Rackspace?
The Kinetic announcement you mentioned is available for deployment with OpenStack Swift. Swift also gives you a geo-replication feature to build your own datacentres across multiple geographical locations and share objects seamlessly.
Hi Andy
At this point, from what I see, OpenStack Swift is jostling with many other competitors to become the widely accepted standard. Seagate’s Kinetic is one of them as well.
AWS S3 is the de facto standard everyone wants to emulate in the cloud, even though it is not open. Many vendors provide “hooks” to S3. Even the SNIA CDMI specification has integration with S3.
But on a private scale and on-premises, the integration of object-based storage (OSD) with present applications such as DBs, email and file systems is still nascent. Many vendors still provide NFS/CIFS gateways to interface with the applications whilst converting those network file system requests to storage objects. Hence, I believe we are nearing the disruption point of OSDs but clearly we are not 100% there yet. The way applications are developed is still NOT for OSDs.
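As a rough illustration of what such a gateway does, here is a toy sketch in Python. Everything in it is hypothetical (the class `ToyFileGateway`, its method names and the example path); a real NFS/CIFS gateway also has to handle locking, partial writes, permissions and much more.

```python
import uuid


class ToyFileGateway:
    """Hypothetical file-style front end over a flat object store."""

    def __init__(self):
        self._objects = {}    # object ID -> bytes (the flat object store)
        self._namespace = {}  # file path -> object ID (kept by the gateway)

    def write_file(self, path, data):
        """Translate a whole-file write into an object write."""
        oid = self._namespace.get(path)
        if oid is None:
            # First write to this path: allocate a new object ID.
            oid = uuid.uuid4().hex
            self._namespace[path] = oid
        self._objects[oid] = bytes(data)
        return oid

    def read_file(self, path):
        """Resolve the path to an object ID, then fetch the object."""
        return self._objects[self._namespace[path]]

    def remove_file(self, path):
        """Drop the path from the namespace and delete its object."""
        oid = self._namespace.pop(path)
        del self._objects[oid]
```

The application keeps using a path such as `/exports/mail/inbox.db`, while behind the gateway the data lives as an object with an ID. The path-to-ID table is exactly the extra layer that goes away once applications talk to objects natively.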
Back to your question. It is clear that as an Open Source implementation, Swift is in the lead. 😉
I see a good future for Swift.
Thank you
/Chin-Fah
Things about object storage are not clear. What about object storage devices? For example, operating systems like Linux are not ready to support them completely. Am I right?
Hello Diama
There isn’t really a single accepted specification of what object storage is. Perhaps the best way to think about object storage is in terms of how an application stores its bytes into a storage “container”.
Today, most interfaces rely on a fixed and structured model involving the block sizes of the file system or database, which writes to a LUN or a folder, because the contiguous bytes are laid out according to block sizes and sector sizes on the disks. This means almost every element in the I/O access path is pre-formatted, and data is sent and received within the confines of a formatted chunk (be it a block, a sector or a track).
In object storage, we can just view the storage container as lacking that formatting. Therefore, an application can write a string of bytes (without being “chopped” to fit the structure) directly to the storage medium.
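A small Python sketch of the difference, purely for illustration. The 512-byte block size and the function names `to_blocks` and `to_object` are my assumptions, and real devices are of course far more involved.

```python
BLOCK_SIZE = 512  # assumed sector/block size, for illustration only


def to_blocks(payload, block_size=BLOCK_SIZE):
    """Block-style write: chop the payload into fixed-size blocks.

    The last block is zero-padded so every block has the same length.
    """
    blocks = []
    for offset in range(0, len(payload), block_size):
        chunk = payload[offset:offset + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


def to_object(payload):
    """Object-style write: the byte string is kept whole, at its true length."""
    return bytes(payload)


payload = b"x" * 1300
blocks = to_blocks(payload)       # 3 blocks, 1536 bytes on "disk"
whole = to_object(payload)        # 1 object, 1300 bytes
```

With a 1300-byte payload, the block path produces three 512-byte chunks (1536 bytes in total, the last one padded), while the object path keeps a single 1300-byte string. The caller of `to_object` never needs to know the block size.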
Furthermore, the data in an object storage can be anywhere.
Compare the way you access a file in Windows –> C:\folder name\sub directory\file name
Compare the way you access a file in Dropbox –> Click on the icon and you get the data that you want. There is no path name, but there is indeed an object ID. The ID is the unique identifier used to access the data from anywhere. From a torrent file perspective, it allows a “file” to be retrieved from multiple sources asynchronously to build the composite file.
That is how I view Object Storage right now. There are other elements such as data protection level of an object and other requirements such as compliance properties, access control and so on.
Hope this helps. And thank you
/Chin-Fah