We are used to block-based approach and also the file-based approach to data. The 2 diagrams below shows the basics of how we access data in both block-based and file-based data on the storage device.
For block-based , the storage of the blocks is merely in arrays of unrelated contiguous blocks. For file-based, as seen below,
there is another layer of abstraction, and this is called the file system. But if you seen both diagrams above, there are some random numbers in light blue and that is to represent the storage device, the hard disk drive’s export of “containers” to the file system or the application that is accessing the storage device. This is usually the LBA (Logical Block Addressing), which is basically set of schematics that defines the locations on the hard disk drives. LBA tells the location of where the data is stored. For more information about LBA, check out this Wikipedia definition. But the whole idea is LBA is dumb. It is pretty much static and exported to file systems and applications so that these guys can do something with it.
There’s something brewing in the background since 1994 and it is one of the many efforts to make intelligent storage devices. This new object-based interface was part of the research project done by Carnegie-Mellon University (CMU). Initially, it was known as Network Attached Secure Disk (NASD) but eventually made its way to the working group in SNIA, and developing it for ANSI T10 INCITS standard. ANSI T10 is the guardian of all SCSI standards. This is called Object Storage Device (OSD). The SCSI architecture diagram below shows the layer where OSD resides.
The motivation for this simple: To make storage devices of today to do more computational work, in particularly I/O, relieving the hosts and the local systems to concentrate other computational processing work. And the same time, the local systems must have some level of interactivity and management between the storage object and the computational hosts.
In the diagram below which compares both block-based and OSD,
you can see the separation of file system management interface that is at the kernel-space of the local host/system and this is replaced by the OSD Management interface at the storage device.
What does this all mean? This means that using LBA type of addressing that we are familiar with in the block-based and file-based storage is no longer the way to go, because as I mentioned before, LBA is dumb.
OSD, in some way, replaces the LBA with OIDs (Object IDs). The existing local system and/or its file system will interact with the storage devices with OIDs and the OIDs links to its respective objects storage. And the object will carry a lot of metadata, that represents the object, giving it the intelligent and management capability of the object.
The prominence of the metadata in the OSD would mean that we can build much more intelligent systems in the future. The OIDs and the objects can be grouped together in a flat design or can be organized and categorized in a virtual, hierarchical model.
Object storage is an intelligent evolution of disk drives that can store and serve objects rather than simply place data on tracks and sectors. And it can bring the following benefits:
- Intelligent space management in the storage layer
- Data aware pre-fetching and caching
- Robust shared access by multiple clients
- Scalable performance using off-loaded data path
- Reliability security
Several vendors such as EMC and NetApp are already supporting OSD.