distributed file systems Archives

Where are your files living now?

By cfheoh | August 2, 2021 - 9:00 am |August 1, 2021 Algorithm, Analytics, Appliance, Artificial Intelligence, Backup, Big Data, Business Continuity, BYOD, Cloud, Data Availability, Data Fabric, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Edge Computing, EMC, Filesystems, Google, Google Anthos, Hyperconvergence, Interica, Machine Learning, Microsoft, Microsoft Azure, NAS, NetApp, WAFS, Wide Area File System

Down the rabbit hole with Kubernetes Storage

By cfheoh | May 19, 2020 - 9:30 am |May 16, 2020 Acquisition, Algorithm, Amazon Web Services, Analytics, API, Artificial Intelligence, Ceph, Cloud, Clusters, Containers, Data Management, Edge Computing, Elastifile, Filesystems, Flash, Google, Hyperconvergence, Kubernetes, Linux, Minio, NFS, Object Storage

Dell EMC Isilon is an Emmy winner!

By cfheoh | March 16, 2020 - 7:41 am |March 17, 2020 100Gigabit Ethernet, Acquisition, Analytics, Appliance, CIFS, Cloud, Clusters, Containers, Data Availability, Deduplication, Deep Learning, Dell, DellEMC, Disks, EMC, Flash, Gartner, High Performance Computing, Isilon, Mellanox, Mellanox Technologies, NAS, NetApp, NFS, Performance Caching, Pure Storage, Qumulo, Scale-out architecture, SMB, Snapshots, Software Defined Storage, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day, WekaIO

2 Comments

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in Silicon Valley, USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer, and I was not obligated to blog or promote the vendors’ technologies presented at this event. The content of this blog reflects my own opinions and views ]

And the Emmy® goes to …

Yes, the Emmy® goes to Dell EMC Isilon! It was indeed a well deserved accolade and an honour!

Dell EMC Isilon had just won a Technology & Engineering Emmy® Award a week before Storage Field Day 19, for its pioneering work on NAS platform tiering of media and broadcasting content according to business value.

A lasting true clustered NAS

This is not a blog to praise Isilon but one that instills respect for a truly clustered, scale-out file system. I have known of OneFS for a long time, but have never really taken the opportunity to put my hands on it since 2006 (there is a story). So here is a look at history …

Back in the early to mid-2000s, there was a lot of talk about large-scale NAS. There were several players in the nascent scalable NAS market. NetApp was the filer king, with several competitors such as Polyserve, Ibrix, Spinnaker, Panasas and the young upstart Isilon. There were also Procom, BlueArc and NetApp’s predecessor Auspex. From the second half of that decade, the market consolidated and most of these NAS players were acquired.

Spinnaker was acquired by NetApp in 2003
Part of Auspex was acquired by NetApp in 2003; the other part went to Glasshouse Technologies
Procom was picked up by Sun Microsystems in 2005
Polyserve went to HP in 2007
Ibrix joined HP as well in 2009
Isilon was acquired by EMC in 2010
BlueArc was gobbled up by HDS in 2011

Continue reading →

Of Object Storage, Filesystems and Multi-Cloud

By cfheoh | November 22, 2017 - 12:25 pm |November 22, 2017 Amazon, CIFS, Cloud, Cloudian, Data Availability, Data Fabric, Data Management, Elastifile, Filesystems, High Performance Computing, Hyperconvergence, Nasuni, NFS, Object Storage, OpenIO, Openstack, Performance Benchmark, Performance Caching, Reliability, Scale-out architecture, Scality, Server SAN, SMB, Software Defined Storage, Software-defined Datacenter, Storage Optimization, swiftstack, Uncategorized, Virtualization

1 Comment

Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.

Object storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. It was highly available, globally distributed, simple to access, and it fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged: OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM Cloud Object Storage), Inktank Ceph (which became Red Hat Ceph), Bycast (acquired by NetApp to become StorageGRID), Quantum Lattus, Amplidata, and many more. For a few years, it looked to me as if the popularity of object storage with an S3-compatible front end had overtaken that of distributed file systems.

What’s not to like? Object stores are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim to be self-healing (depending on data protection policies). But one area in which object storage rarely claimed dominance was high-performance I/O. There were some cases, but they were either fronted by a file system (e.g. NFSv4.1 with pNFS extensions), or used some host-based, SAN-client agent (e.g. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high-performance I/O storage.
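To make the "metadata rich" and "immutable" points a little more concrete, here is a minimal in-memory sketch in Python. It is not the S3 API or any vendor's implementation; the names `ToyObjectStore`, `StoredObject`, the key and the metadata fields are all hypothetical, chosen only for illustration. The assumption is that an object is written whole under a flat key, carries its own metadata, and is replaced rather than edited in place.

```python
import hashlib
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class StoredObject:
    """One immutable object: payload, read-only metadata, content hash."""
    data: bytes
    metadata: MappingProxyType
    etag: str


class ToyObjectStore:
    """Hypothetical flat key -> object map (no directories, no partial writes)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # A put always creates a brand-new object; nothing is edited in place.
        payload = bytes(data)
        obj = StoredObject(
            data=payload,
            metadata=MappingProxyType(dict(metadata or {})),
            etag=hashlib.sha256(payload).hexdigest(),
        )
        self._objects[key] = obj
        return obj.etag

    def get(self, key):
        return self._objects[key]


store = ToyObjectStore()
key = "videos/clip-001.mxf"  # hypothetical key; the slash is just part of the name
etag_v1 = store.put(key, b"raw frames", {"owner": "newsroom", "codec": "mxf"})
clip = store.get(key)

# Attempting to modify a stored object in place is rejected.
try:
    clip.data = b"tampered"
    mutated = True
except AttributeError:
    mutated = False

# Changing the content means writing a whole new object under the same key.
etag_v2 = store.put(key, b"re-edited frames", {"owner": "newsroom"})
print(mutated, etag_v1 != etag_v2)
```

The whole-object replace model is simple and easy to protect, but it also hints at why native object storage was rarely pitched for high-performance, small random I/O.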

A few weeks ago, I read an article by Dave Raffo on Storage Soup. When I read it, it felt oxymoronic. SwiftStack had just been named a Visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, SwiftStack did not want to be “associated” with object storage that much, even though SwiftStack’s technology underpinning was all object storage. Strange.

Continue reading →

Big data is big headache

By cfheoh | October 28, 2011 - 8:13 pm |October 23, 2012 Analytics, Big Data

1 Comment

IBM claims that we are responsible for creating 2.5 quintillion bytes of data every day. How much is 1 quintillion?

According to the web,

1 quintillion = 1,000,000,000,000,000,000

After billion, it is trillion, then quadrillion, and then quintillion. That’s what 1 quintillion is, with 18 zeroes!
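As a quick sanity check on the scale, the arithmetic can be worked through in a few lines of Python. This is only a sketch; it assumes decimal (SI) units, where 1 exabyte is 10^18 bytes and 1 terabyte is 10^12 bytes, and the variable names are mine.

```python
# Worked arithmetic for the quoted figure (decimal SI units assumed).
QUINTILLION = 10**18   # 1 followed by 18 zeroes
EXABYTE = 10**18       # bytes in 1 EB (SI definition)
TERABYTE = 10**12      # bytes in 1 TB (SI definition)

# 2.5 quintillion bytes, kept as an exact integer.
daily_bytes = 25 * QUINTILLION // 10

daily_exabytes = daily_bytes / EXABYTE
daily_terabytes = daily_bytes // TERABYTE
zero_count = len(str(QUINTILLION)) - 1

print(f"{daily_bytes:,} bytes per day")
print(f"= {daily_exabytes} EB per day")
print(f"= {daily_terabytes:,} TB per day")
print(f"1 quintillion has {zero_count} zeroes")
```

In other words, 2.5 quintillion bytes a day is 2.5 exabytes, or 2.5 million terabytes, every day.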

This data comes from everything from social networking updates, meteorology (weather reports), remote sensing maps (Google Maps, GPS, Geographical Information Systems), photos (Flickr), videos (YouTube), Internet search (Google) and so on. The term big data, according to Wikipedia, refers to data sets that are too large to be handled and processed by conventional data management tools. This presents a new set of difficulties when it comes to collecting this data, storing it and sharing it. Indexing and searching big data requires special technologies that can mine and extract valuable information from big data datasets within an acceptable period of time.

According to Wiki, “Technologies being applied to big data include massively parallel processing (MPP) databases, datamining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.” That is why EMC paid big money to acquire Greenplum and IBM acquired Netezza. Traditional data warehousing players such as Teradata, Oracle and Ingres are in the picture as well, setting a collision course between the storage and infrastructure companies and the data warehousing solutions companies.

The 2010 Gartner Magic Quadrant placed non-traditional players such as IBM/Netezza and EMC/Greenplum in its Leaders quadrant.

And the key word that is already on everyone’s lips is “ANALYTICS”.

The ability to extract valuable information that helps determine the next trend, along with personalized profiling, may already have arrived, as companies are clamouring to get more and more out of our personalities so that they can sell us more of their wares.

Meteorological organizations are using big data analytics to find out about weather patterns and climate change. Space exploration becomes more accurate and precise thanks to the tons and tons of data collected from space missions. Big data analytics is also helping pharmaceutical companies develop new biological and pharmaceutical breakthroughs. And the list goes on.

I am a newcomer to big data and I do not claim to know a lot. But terms such as scale-out NAS, distributed file systems, grid computing and massively parallel processing are certainly bringing the data storage world into a new frontier, and it is something we as storage professionals have to adapt to. I am eager to learn and know more about big data. It is a big headache, but change is inevitable.
