Storage Gaga
Going Ga-ga over storage networking technologies ….

Tag Archives: metadata

Is there no end to the threat of ransomware?

By cfheoh | June 20, 2022 - 8:00 am | Appliance, Artificial Intelligence, Backup, Business Continuity, Cloud, Cohesity, Data, Data Archiving, Data Availability, Data Management, Data Privacy, Data Protection, Data Security, Digital Transformation, Disaster Recovery, Druva, Filesystems, HDS, Hitachi Vantara, ILM, iRODS, Racktop Systems, Reliability, Rubrik, SASE, Security, Sophos, Storage Field Day, Tape, Tech Field Day

I find it blasphemous that, with all the rhetoric about data protection and cybersecurity technologies and solutions in the market today, ransomware threats and damages have grown proportionately larger each year. In a recent Kaspersky report released for Anti-Ransomware Day on May 12th, 9 out of 10 organizations previously attacked by ransomware said they would be willing to pay if attacked again. And a day before my scheduled talk in Surabaya, East Java, two weeks back, the chatter through the grapevine was that a bank in Indonesia had been hit by ransomware that very day. This news proves how virulent and dangerous the ransomware scourge has become.

And the question everyone wants answered is … why are ransomware threats getting bigger and more harmful, and why is there still no solution to them?

Digital transformation and its data are very attractive targets

Today, all we hear from the data protection and storage vendors is recovery, restore the data, blah, blah, blah and more blah, blah, blah. The endpoint EDR (endpoint detection and response) solutions say they can stop it; the cybersecurity experts preach defense in depth; the network security guys say to use perimeter fencing; and the anti-phishing chaps say more awareness and education is required. Individually or together, none of these have worked effectively these past few years. Ransomware's threats and damages keep getting worse. Why?

Continue reading →

Tagged Acronis, arcserve, Asigra, Barracuda Networks, cybersecurity, data lifecycle, data protection, file security, Kaspersky, metadata, metadata injection, metadata tagging, NIST Cybersecurity Framework, ransomware

The other pandemic – Datanemic

By cfheoh | April 5, 2021 - 9:00 am | Algorithm, Analytics, Artificial Intelligence, Business Continuity, Cloud, Data, Data Corruption, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Disaster Recovery, IoT, Machine Learning, Uncategorized

It is a disaster. No matter what we do, the leaks and the cracks are appearing faster than we can fix them. It is a global pandemic.

I am not talking about COVID-19, the pandemic that has affected our lives and livelihood for over a year. I am talking about the other pandemic – the compromise of security of data.

In the past 6 months, the data leaks, the security hacks and the ransomware scourge have been more devastating than ever. Here are a few of the big ones that happened on a global scale:

  • [ Thru 2020 ] Solarwinds Supply Chain Hack (aka Sunburst)
  • [ March 2021 ] Microsoft® Exchange Hack
  • [ March 2021 ] Acer® Ransomware Attack
  • [ April 2021 ] Asteelflash Electronics Ransomware Attack

Data Security Breach, Cyber Attack, Ransomware

Closer to home, here in South East Asia, we have

  • [ March 2021 ] Malaysia Airlines Data Breach
  • [ March 2021 ] Singapore Airlines Data Security Breach

Continue reading →

Tagged cloud computing, cybersecurity threats, data compliance, Data Encryption, data federation, data modeling, data personalization, data virtualization, deep fakes, hack, hacked, iRODS, metadata, ransomware, supply chain attack

Trusting your storage – It’s not about performance

By cfheoh | December 28, 2020 - 9:00 am | Acquisition, Appliance, Backup, Business Continuity, Cloud, Data, Data Corruption, Data Protection, Elastifile, EMC, Filesystems, Flash, NetApp, OpenZFS, RAID

I have taken some downtime from my blog since late October. Part of my “hiatus” was an illness that affected my right kidney, but I am happy to announce that I am well again. During this period, I spent a lot of time reading the loads of storage technology announcements and marketing calls, and almost every single one of them touts Performance as if it were the vendor's single “sellable” feature. None ever positions data integrity, and the technology behind it, as what I believe is the most important and most fundamental feature of any storage technology: reading back the data exactly as it was written into the storage array.

[ Note: Data integrity is even more critical in cloud storage, and data corruption, especially the silent kind, is even more acute in the clouds ]

Sure, this fundamental feature sounds like a given in any storage array, but believe me, there are enterprise storage arrays that have failed to deliver this simple feature properly. Throughout my storage career, end users have come to me with database corruption or file corruption, unable to access their data in an acceptable manner. Data corruption is real, folks!

Data corruption.
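
Conceptually, the defense is simple: end-to-end checksum validation, the approach ZFS made famous. Here is a minimal sketch (an illustration only, not any vendor's implementation) that stores a SHA-256 digest alongside the data at write time and verifies it on every read, so silent corruption cannot slip past:

```python
import hashlib
import json

def write_with_checksum(data_path: str, sum_path: str, data: bytes) -> None:
    # Persist the data and a digest of it side by side.
    with open(data_path, "wb") as f:
        f.write(data)
    with open(sum_path, "w") as f:
        json.dump({"sha256": hashlib.sha256(data).hexdigest()}, f)

def read_with_checksum(data_path: str, sum_path: str) -> bytes:
    # Re-compute the digest on every read and compare with what was stored.
    with open(data_path, "rb") as f:
        data = f.read()
    with open(sum_path) as f:
        expected = json.load(f)["sha256"]
    if hashlib.sha256(data).hexdigest() != expected:
        # The data no longer matches what was written: silent corruption detected.
        raise IOError(f"checksum mismatch for {data_path}")
    return data

write_with_checksum("block.dat", "block.sum", b"the right data")
assert read_with_checksum("block.dat", "block.sum") == b"the right data"
```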

After several weeks of reading this stuff, I got jaded with so many storage vendors playing leapfrog with their millions-of-IOPS boasts.

The 3 legged stool

Rewind to circa 2012, just about the time EMC® acquired XtremIO™. XtremIO™ was a nascent All-Flash startup, and many, including yours truly, saw the EMC® acquisition as being all about a high-performance storage array. I was having an email conversation with Shahar Frank, one of the co-founders of XtremIO™, expressing my views about their performance. What Shahar replied surprised me.

The strength of a storage array, he said, was like a 3-legged stool. 2 of the legs would be Performance and Protection, but with only 2 legs, the person sitting on the stool would fall. The 3rd leg stabilizes the stool, and that 3rd leg is Reliability. This stumped me, because XtremIO™'s most sellable feature was Performance, yet Shahar's wisdom pointed to Reliability, the least exciting and dullest of the 3. He was brilliant, of course, and went on to found Elastifile (acquired by Google™), but that's another story for another day.

Continue reading →

Tagged checksum validation, data integrity, data reliability, file system tree, IOPS, metadata, self healing, Silent Data Corruption, XtremIO, ZFS

The Edge is coming! The Edge is coming!

By cfheoh | October 12, 2020 - 9:15 am | 100Gigabit Ethernet, Analytics, Big Data, Containers, Data, Deep Learning, Edge Computing, Flash, Industry 4.0, InfluxDB, Linux, Machine Learning, Mellanox, Mellanox Technologies, Minio, nVidia, NVMe, Pravega, SNIA, Solid State Devices

Actually, Edge Computing is already here. It has been on everyone's lips for quite some time, but for me and for many others, Edge Computing is still a hodgepodge of many things. The proliferation of devices, IoT, sensors and endpoints being pulled into the ubiquitous term Edge Computing has made its scope ever-changing and difficult to pin down. And it is this proliferation of edge devices that will generate voluminous amounts of data. Obvious questions emerge:

  • How do you store all the data?
  • How do you process all the data?
  • How do you derive competitive value from the data from these edge devices?
  • How do you securely transfer and share the data?

From the storage technology perspective, it might be easier to observe the traits of the data generated on the edge devices. In this blog, we also look at some of the new storage technologies out there that could be part of the Edge Computing present and future.

Edge Computing overview – Cloud to Edge to Endpoint

Storage at the Edge

The mantra of putting compute as close to the data as possible and processing it where it is stored is the main crux right now, at least where storage of the data is concerned. The round-trip latency to computing resources in the cloud and back to the edge devices is not conducive, and in many older settings, the edge devices on the factory floor may not even be network-enabled. In my last encounter several years ago, there were more than 40 interfaces, specifications and protocols for edge devices, most of them proprietary. And there is still no industry-wide standard for these edge devices either.

Continue reading →

Tagged 451 report, 5G, AI, ASIC, computational storage, Computational Storage TWG, CPU, data classification, data personality, data processing unit, DPU, edge devices, FPGA, GPU, Industry 4.0, metadata, Moore's Law, self encrypted drives, smart city, SmartNIC, SNIA CMSI, SOC, system-on-chip, Xilinx

Time to Advocate Common Data Personality

By cfheoh | November 25, 2019 - 12:33 pm | Amazon Web Services, Analytics, API, Artificial Intelligence, Big Data, Commvault, Containers, Data Archiving, Data Availability, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, eDiscovery, Hadoop, Hadoop Clusters, HDS, Hedvig, Hitachi Vantara, InfluxDB, Machine Learning, MapReduce, Object Storage

The thought has been on my mind since Commvault GO 2019. It was sparked when Don Foster, VP of Storage Solutions at Commvault, answered a question posed by one of the analysts. What he said made a connection, as I had been searching for better insight into how Commvault and Hedvig would end up together.

Data Deluge is a swamp thing now

Several years ago, I heard Stephen Brobst, CTO of Teradata, bring up the term “Data Swamp“. It was the antithesis of the Data Lake, back when Data Lakes and Hadoop were all the rage. His comments were raw, honest, and pointed at the truth out there.

Source: https://www.deviantart.com/rhineville/art/God-that-Crawls-Detail-2-291228644

I was enamoured by his thoughts at the time, and today, his comments about the Data Swamp have manifested themselves. Continue reading →

Tagged data enrichment, data foundations, data handling, data ingestion, data lakes, data personality, data preparation, data swamp, Hitachi Content Platform, intent-based data, JSON, metadata, XML

Sleepless in Malaysia with Object Storage

By cfheoh | January 22, 2019 - 1:42 pm | Amazon, Amazon Web Services, Analytics, API, Big Data, Cloud, Cloudian, Clusters, Data Management, Deep Learning, DellEMC, Dropbox, Filesystems, Google, Hadoop, Hadoop Clusters, HDS, IDC, IoT, iSCSI, Linux, Machine Learning, Microsoft, Minio, NAS, NFS, Object Storage, OpenIO, Redhat, Security, swiftstack

Object Storage? What’s that?

For the past couple of months, I have been speaking with a few parties in Malaysia about object storage technology. And I was fairly surprised with the responses.

The 2 reports

For a start, I did not set out to talk about object storage. It kind of fell into my lap. Two recent Hitachi Vantara reports revealed that countries like Australia, Hong Kong and even the South East Asian countries were behind in their understanding of what object storage is, and the benefits it brings to the new generation of web-scale and enterprise applications.

The first report, an IDC survey sponsored by Hitachi Vantara, mentioned that 41% of enterprises in Australia are not aware of object storage technology. In a similar survey, this one covering Hong Kong and China, the percentages were 38% and 35% respectively. I would presume that the numbers for the countries of South East Asia would not fall far from the apple tree.

How is Malaysia doing?

However, I worry that the number could be far more dire in Malaysia. In the past 2 months, responses from several conversations painted a darker picture of object storage technology among companies in Malaysia. These included a reasonably sized hosting company, a well-established systems integrator, a software development company, several OpenStack storage practitioners and a DellEMC regional consultant for unstructured data. The collective conclusion was that object storage technology is relatively unknown (probably in line with the percentages in the IDC/Hitachi Vantara reports), and it appeared to be shunned at this juncture. In web-scale applications, Red Hat Ceph block and file appeared popular in contrast to OpenStack Swift. In enterprise applications, it was a toss-up between iSCSI and NFS.


Continue reading →

Tagged Anand Babu, Astro TV, Box, Dropbox, EFSS, GlusterFS, Google Drive, Hadoop Ozone, metadata, object storage, OneDrive, OpenStack Swift, private cloud storage, public cloud storage, Redhat Ceph

Hammering Next Gen Hybrid Clouds

By cfheoh | October 18, 2018 - 8:51 pm | Acquisition, Analytics, Appliance, Artificial Intelligence, CIFS, Cloud, Data, Data Fabric, Data Management, Deduplication, Disaster Recovery, Filesystems, Hammerspace, High Performance Computing, Hyperconvergence, Machine Learning, MapReduce, NAS, NetApp, NFS, Object Storage, Performance Caching, Reliability, Software-defined Datacenter, Storage Field Day, Storage Tiering, Tech Field Day, Virtualization

[Preamble: I was invited by GestaltIT as a delegate to their Tech Field Day from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation were paid for by GestaltIT, the organizer, and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Hammerspace came out of stealth 2 days ago. Their objective? To rule the world of data for hybrid clouds and multi-clouds, and provide “unstructured data anywhere on-demand“. That is a bold statement, for a company that is relatively unknown, except for its deep ties with the now defunct Primary Data. Primary Data’s Chairman, David Flynn, is the head honcho at Hammerspace.

The Hammerspace technology has come at the right time, in my opinion, because the entire cloud, multi-cloud and hybrid cloud story has become fractured and siloed. The very things cloud computing touted it would fix have brought back the same set of problems. At the same time, not every application was developed for the cloud. Applications rely on block storage services, NAS protocols, or the de facto S3 protocol for their storage repositories. However, the integration and communication between applications break down when on-premises applications move to the cloud, when applications residing in the cloud move back on-premises for throughput delivery, or when applications reside at the edge.

Continue reading →

Tagged CIFS, data control, Data Management, data virtualization, Hammerspace, Hybrid Cloud, metadata, multicloud, NAS, NFS, object storage, pNFS, S3, SMB

Huawei Dorado – All about Speed

By cfheoh | April 1, 2018 - 11:18 am | Analytics, Appliance, Big Data, Data, Data Fabric, Data Management, Deduplication, Disks, Filesystems, Flash, High Performance Computing, Hitachi Vantara, Huawei, Hyperconvergence, Machine Learning, NetApp, Performance Benchmark, Performance Caching, Reliability, Scale-out architecture, Snapshots, Solid State Devices, Storage Field Day, Storage Optimization, Virtualization

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Three weeks after Storage Field Day 15, thoughts of the session with Huawei still lingered. And one word came to describe the Huawei Dorado V3, their flagship All-Flash storage platform: SPEED.

My conversation with Huawei actually started the night before our planned session at their Santa Clara facility. We had an evening get-together at Bourbon Levi's Stadium. I was with my buddy, Ammar Zolkipli, who coincidentally was in Silicon Valley for work. Ammar is from Hitachi Vantara Japan, and has been a good friend of mine for over 17 years now.

Shortly after, the Huawei team arrived to join the camaraderie, and we introduced ourselves to Chun Liu, not knowing that he is the Chief Architect at Huawei. A big part of that evening was our conversation with him. Ammar and I had both been immersed in Oil & Gas EP (Exploration & Production) data management and petrotechnical applications, when he was at Schlumberger and, after that, at a NetApp reseller. I was a Consulting Engineer with NetApp back then. So the 2 of us started blabbering (yeah, that would be us when we get together to talk technology).

I observed that Chun was very interested to learn about real-world application use cases that would push storage performance to its limits. I guessed that the most demanding I/O characteristics would be small-block, random I/O, billions of them, with near-real-time latency. After that evening I did some research and could only think of a few, such as deep analytics or applications that need Monte Carlo simulations. Oh well, maybe I would share that with Chun the following day.
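
To put some shape on that, here is a rough sketch of the workload profile I had in mind: 4 KiB reads at random offsets across a large file, with per-read latency recorded. It is only an illustration (the file path is made up), and a real benchmark would use a proper tool like fio:

```python
import os
import random
import time

BLOCK = 4096                                   # small-block I/O: 4 KiB reads
PATH = "bigfile.bin"                           # hypothetical, pre-created large file
READS = 100_000

size = os.path.getsize(PATH)
fd = os.open(PATH, os.O_RDONLY)

latencies = []
for _ in range(READS):
    offset = random.randrange(size // BLOCK) * BLOCK   # fully random, block-aligned offsets
    start = time.perf_counter()
    os.pread(fd, BLOCK, offset)                        # one small random read
    latencies.append(time.perf_counter() - start)
os.close(fd)

latencies.sort()
print(f"median: {latencies[len(latencies) // 2] * 1e6:.1f} µs, "
      f"p99: {latencies[int(len(latencies) * 0.99)] * 1e6:.1f} µs")
```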

The moment the session started, it was already about the speed prowess of Huawei Storage. It was like the greyhounds unleashed going after the rabbit. In the lineup, the Dorado series stood out.

Continue reading →

Tagged dorado, Flash, garbage collection, global deduplication, high performance, Inline Compression, Inline Deduplication, metadata, optimized, simd, SSD

MASSive, Impressive, Agile, TEGILE

By cfheoh | November 20, 2014 - 3:03 pm | Analytics, Appliance, CIFS, Cloud, Data, Deduplication, Fibre Channel, Filesystems, iSCSI, NetApp, NFS, NVMe, PCIe, Performance Benchmark, Performance Caching, RAID, Scale-out architecture, SMB, Snapshots, Software Defined Storage, Storage Optimization, Tegile, Unified Storage, Virtualization, VMware

Ah, my first blog after Storage Field Day 6!

It was a fantastic week, and I only got to fathom the sensations and effects of the trip after my return from San Jose, California last week. Many thanks to Stephen Foskett (@sfoskett), Tom Hollingsworth (@networkingnerd) and Claire Chaplais (@cchaplais) of Gestalt IT for inviting me over for that wonderful trip 2 weeks ago. Tegile was one of the companies I had the privilege to visit and savour.

In a world of utterly confusing messaging about Flash Storage, I was eager to find out what makes Tegile tick at the Storage Field Day session. Yes, I loved Tegile and the campus visit was very nice. I was also very impressed that they have more than 700 customers and over a thousand systems shipped, all within 2 years since they came out of stealth in 2012. However, I was more interested in the essence of Tegile and what makes them stand out.

I have been a long-time admirer of ZFS (Zettabyte File System). I have been a practitioner myself, and I studied the file system's architecture and data structures some years back, when NetApp and Sun were embroiled in a lawsuit. A lot has changed since then, and I am very pleased to see Tegile doing great things with ZFS.

Tegile’s architecture is called IntelliFlash. Here’s a look at the overview of the IntelliFlash architecture:

Tegile IntelliFlash Architecture

So, what stands out for Tegile? I deduce that there are 3 important technology components that define the Tegile IntelliFlash™ Operating System:

  • MASS (Metadata Accelerator Storage System)
  • Media Management
  • Inline Compression and Inline Deduplication

What is MASS? Tegile has patented MASS as an architecture that provides an optimized data path to the file system metadata.

In a typical file system, the metadata is often stored together with the data. This results in less optimized data access, because both the data and the metadata are given the same priority. Tegile's MASS, however, writes and stores the filesystem metadata in very high-speed, low-latency DRAM and Flash SSD. The filesystem metadata probably includes very fine-grained and intimate details about the mapping of blocks and pages to the respective capacity Flash SSDs and mechanical HDDs. (Note: I made an educated guess here and would be happy if someone corrected me.)

Going a bit deeper, the DRAM in the Tegile hybrid storage array is used as an L1 Read Cache, while Flash SSDs are used as an L2 Read and Write Cache. Tegile takes further care that the Flash SSDs used for caching are different from the denser, higher-capacity Flash SSDs used for storing data. The caching SSDs are obviously the faster, lower-latency eMLC type, and in the future might be replaced by PCIe Flash optimized with NVMe.

Tegile DRAM-Flash Caching

This approach gives absolute priority, and near-instant access to the filesystem’s metadata, making the Tegile data access incredibly fast and efficient.
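
A rough way to picture this DRAM-plus-flash arrangement is a two-tier LRU cache sitting in front of the capacity tier. The sketch below is my own reading of the description, not Tegile's code, and the tier sizes are arbitrary:

```python
from collections import OrderedDict

class TieredMetadataCache:
    """Two-tier metadata cache: a small, fast L1 (DRAM stand-in) in front of a
    larger L2 (flash stand-in), both in front of the slow capacity tier."""

    def __init__(self, l1_entries=1_024, l2_entries=65_536):
        self.l1, self.l2 = OrderedDict(), OrderedDict()
        self.l1_entries, self.l2_entries = l1_entries, l2_entries

    def get(self, key, capacity_tier):
        if key in self.l1:                       # L1 hit: near-instant metadata access
            self.l1.move_to_end(key)
            return self.l1[key]
        if key in self.l2:                       # L2 hit: promote to L1
            value = self.l2.pop(key)
        else:                                    # miss: fall back to the capacity tier
            value = capacity_tier[key]
        self._insert_l1(key, value)
        return value

    def _insert_l1(self, key, value):
        self.l1[key] = value
        if len(self.l1) > self.l1_entries:       # evict LRU metadata from DRAM...
            old_key, old_value = self.l1.popitem(last=False)
            self.l2[old_key] = old_value         # ...and demote it to the flash tier
            if len(self.l2) > self.l2_entries:
                self.l2.popitem(last=False)      # flash eviction falls back to capacity

cache = TieredMetadataCache()
backing = {"/vol0/file1": {"block_map": [42, 43, 44]}}
print(cache.get("/vol0/file1", backing))         # miss, then cached in L1
print(cache.get("/vol0/file1", backing))         # L1 hit
```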

Tegile's Media Management capabilities excite me. This is because it organizes every single Flash SSD in the storage array very precisely around 3 types of data patterns:

  1. Write caching, which is high-I/O, is focused on a small segment of the drive
  2. Metadata caching, which has both Read and Write I/O, is targeted at a slightly larger segment of the drive
  3. Data is laid out on the rest of the capacity of the drive

Drilling deeper, the high-I/O writes of the write caching (item 1 above) are targeted at a drive segment range that is over-provisioned for greater efficiency and endurance. At the same time, the garbage collection (GC) of this segment is handled by the respective drive's controller. This is important because the controller performs the GC function without inducing unnecessary latency in the storage array's processing cycles, giving a further boost to Tegile's already impressive prowess.

In addition, IntelliFlash™ aligns every block and every page exactly to the segment and page boundaries of the drives. This reduces block and page fragmentation, and thereby reduces issues with file locality and free-block locality. It also automatically adjusts its block and page alignments to different drive types and models. Therefore, I believe, it would know how to align itself to 512-byte or 520-byte sector drives.
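
As a simple illustration of what such alignment means (a sketch of my own, not IntelliFlash code), an I/O can be rounded out to whole page boundaries, where the page size is a multiple of the drive's native sector size, whether that sector is 512 or 520 bytes:

```python
def align_down(offset: int, boundary: int) -> int:
    return (offset // boundary) * boundary

def align_up(offset: int, boundary: int) -> int:
    return -(-offset // boundary) * boundary

def aligned_range(offset: int, length: int, page_size: int):
    """Round an I/O out to whole pages; page_size is assumed to be a multiple
    of the drive's native sector size (512 B, 520 B, 4 KiB and so on)."""
    return align_down(offset, page_size), align_up(offset + length, page_size)

# An 8 KiB page on a 512-byte-sector drive vs. the equivalent on a 520-byte one.
print(aligned_range(5000, 3000, page_size=16 * 512))   # -> (0, 8192)
print(aligned_range(5000, 3000, page_size=16 * 520))   # -> (0, 8320)
```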

The Media Management function also has advanced cell care. Wear-leveling takes on a new level of sophistication, where the efficient organization of blocks and pages on the drives reduces additional, and often unnecessary, erases and rewrites. Furthermore, the use of Inline Compression and Inline Deduplication also reduces the number of writes to the drive media, increasing their longevity.

Tegile Inline Compression and Deduplication

Compression and deduplication are 2 very important technology features in almost all flash arrays. Likewise, these 2 technologies are crucial to the performance of Tegile storage systems. They are both inline, i.e. Inline Compression and Inline Deduplication, and therefore both are boosted by the multi-core CPUs as well as the fast DRAM.

I don't have the secret-sauce formula for how Tegile designed their inline compression and deduplication, but there's a very good article on how Tegile views their method of data reduction. Check out their blog here.
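
In the absence of that secret sauce, here is a generic sketch of how inline deduplication and compression are commonly combined: fingerprint each incoming block, store each unique block only once, and compress only what is actually new. This is an illustration of the general technique, not Tegile's design:

```python
import hashlib
import zlib

class InlineDedupStore:
    """Generic inline dedup + compression sketch, not any vendor's design."""

    def __init__(self):
        self.blocks = {}      # fingerprint -> compressed block
        self.refcount = {}    # fingerprint -> number of logical references

    def write(self, block: bytes) -> str:
        fingerprint = hashlib.sha256(block).hexdigest()   # computed inline, before the write lands
        if fingerprint not in self.blocks:                # unseen data: compress and store once
            self.blocks[fingerprint] = zlib.compress(block)
            self.refcount[fingerprint] = 0
        self.refcount[fingerprint] += 1                   # duplicates only add a reference
        return fingerprint

    def read(self, fingerprint: str) -> bytes:
        return zlib.decompress(self.blocks[fingerprint])

store = InlineDedupStore()
a = store.write(b"A" * 4096)
b = store.write(b"A" * 4096)        # identical block: deduplicated, stored once
assert a == b and len(store.blocks) == 1
```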

The data access metadata of each and every customer is probably fed into Intellicare, their cloud-based customer care program. Intellicare is another strong differentiator in Tegile's offering.

Oh, did I mention that they do unified storage as well, with both SAN and NAS, including SMB 3.0 support?

I left Tegile that afternoon on November 5th feeling happy. I was pleased to catch up with Narayan Venkat, my old friend from NetApp, who is now their Chief Marketing Officer. I was equally pleased to see Tegile advancing ZFS further than the others I have known. With so much technological advancement and more coming, the world is their oyster.

Tagged CIFS, DRAM, Filesystem, Flash SSD, Inline Compression, Inline Deduplication, Intellicare, IntelliFlash, media management, metadata, NVMe, PCIe, SMB 3.0, Tegile, ZFS

What should be a Cloud Storage?

By cfheoh | December 8, 2011 - 1:43 pm | Analytics, Big Data, Filesystems, Object Storage

For us filesystem guys, NAS is the way to go. We are used to storing files in network file systems via the NFS and CIFS protocols, treating the NAS storage array like a refrigerator: taking stuff out and putting stuff back in. All that is fine and well as long as the data is what I would term corporate data.

Corporate data is generated by the employees, applications and users of the company, and for a long time the power of data creation has lain in the hands of the enterprise. That is why storage solutions are designed to address the needs of the enterprise, where the data is structured and well defined. How the data is stored, how it is formatted and how it is accessed form the “boundary” of how the data is used. Typically a database is used to “restrict” the data created so that the information can be retrieved and modified quickly. And of course, the SAN guys will tell you to put the structured data of the database into their SAN.

For the unstructured data in the enterprise, NAS file systems hold that responsibility. Files such as documents and presentations have more loosely defined “boundaries”, and hence filesystems are a more natural fit for unstructured data. Filesystems are like a free-for-all container, able to store and provide access to any file in the enterprise.

But today, as Web 2.0 applications take over the enterprise, the power of data creation no longer necessarily lies in the hands of the enterprise's applications and users. In fact, it is estimated that this kind of data has now exceeded 50% of the enterprise's total data capacity. With the proliferation of personal devices such as tablets, Blackberries, smartphones, PDAs and so on, individual contributors are generating plenty of data. The situation has been made more acute by Web 2.0 applications such as Facebook, blogs, social networking, Twitter, Google Search and so on.

Unfortunately, file systems in the NAS category are still pretty much the traditional file systems, and the needs of this new type of data cannot be met by them. The paradigm is definitely shifting. The new unstructured data world needs a new storage concept. I would term this type of storage “Cloud Storage”, because it breaks down the traditional concepts of NAS.

So what basically defines a Cloud Storage? I have already mentioned that the type of unstructured data has changed. The new requirements for this unstructured data type are:

  • The unstructured data type is capable of being globally distributed.
  • There will be billions and billions of unstructured data objects created, but each object, be it a Twitter tweet, an uploaded mobile video, or even the clandestine data collected by CarrierIQ, can be accessed easily via a single namespace.
  • The storage foundation for this new unstructured data type is easily provisioned and managed. Look at Facebook: it is easy to set up and get going, and the user (and probably the data administrator) can easily manage the user interface and the platform.
  • For the service provider of Cloud Storage, the file system must be secure and support multi-tenancy and virtualization of storage resources.
  • There should be some form of policy-driven content management. That is why development platforms such as Joomla!, Drupal and WordPress are slowly becoming enterprise-driven to address these unstructured data types.
  • It is highly searchable, with a high degree of search optimization. A Google search does have a strong degree of intelligence and relevance to the data being searched, as well as generating tons of by-product data that feeds the need to understand the consumers or users better. Hail Big Data!
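
Taken together, these requirements look a lot like what object storage delivers today. As a rough sketch only (the endpoint, bucket and key below are hypothetical), every object lives in one flat, global namespace and carries its own user-defined metadata:

```python
import boto3

# Hypothetical S3-compatible endpoint and bucket, purely for illustration.
s3 = boto3.client("s3", endpoint_url="https://objects.example.com")

s3.put_object(
    Bucket="web20-data",
    Key="tweets/2011/12/08/784512.json",          # one global namespace, no file tree to mount
    Body=b'{"text": "...", "geo": [3.139, 101.687]}',
    Metadata={                                    # user-defined metadata travels with the object
        "source": "twitter",
        "author": "someuser",
        "content-class": "microblog",
    },
)

obj = s3.get_object(Bucket="web20-data", Key="tweets/2011/12/08/784512.json")
print(obj["Metadata"])                            # fodder for search, feeds and analytics
```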

So when I look at traditional NAS storage solutions such as NetApp, EMC VNX or BlueArc, I ask whether their NAS solutions have the capabilities to meet the requirements of this new unstructured data type.

Most of them, no matter how they package it, are still relying on the file as the granular object of storage. And today, while most files have some form of metadata such as file name, owner, size and so on, they DO NOT possess any content-awareness. Here's an example of what I want to show you:

 

The file properties (part of the file metadata) tell you about the file, but little about the content of the file. Today, more than that is required, and the new unstructured data type should look more like this:

If you look at the diagram below, the object on the right (which is the new unstructured data type) displays much more information than a typical file in a NAS file system. This additional information becomes fodder for other applications such as search engines, RSS feeds, robots and spiders, and of course big data analytics.
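
Since the original screenshots are not reproduced here, a rough illustration of that contrast (with made-up field values) might look like this:

```python
# What a traditional file system keeps about a file...
traditional_file_metadata = {
    "name": "holiday.jpg",
    "owner": "cfheoh",
    "size_bytes": 2_483_117,
    "modified": "2011-12-02T14:21:05+08:00",
    # ...which says nothing about what is actually *in* the file.
}

# ...versus the extended, content-aware metadata the new unstructured data type carries.
extended_object_metadata = {
    **traditional_file_metadata,
    "content_type": "image/jpeg",
    "geo": {"lat": 3.139, "lon": 101.687},        # where it was captured
    "device": "smartphone camera",
    "tags": ["beach", "family", "sunset"],        # fodder for search engines, RSS feeds,
    "faces_detected": 2,                          # robots and spiders, and big data analytics
    "retention_policy": "7-years",
}

print(sorted(set(extended_object_metadata) - set(traditional_file_metadata)))
```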

Here's another example of what I mean by this extended metadata; a Cloud Storage array is required to work with this new set of parameters and this new set of requirements.

 

There’s a new unstructured data type in town. Traditional NAS systems may not have the right features to work with this new paradigm.

Don't be whitewashed by the fancy talk of the storage vendors in town. Learn the facts, and find out what a Cloud Storage really is.

It's time to think differently. It's time to think about what a Cloud Storage should be.

 

Tagged big data, Cloud Storage, databases, metadata, structured data, unstructured data