Deep Learning – Page 3

Where are your files living now?

By cfheoh | August 2, 2021 - 9:00 am |August 1, 2021 Algorithm, Analytics, Appliance, Artificial Intelligence, Backup, Big Data, Business Continuity, BYOD, Cloud, Data Availability, Data Fabric, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Edge Computing, EMC, Filesystems, Google, Google Anthos, Hyperconvergence, Interica, Machine Learning, Microsoft, Microsoft Azure, NAS, NetApp, WAFS, Wide Area File System


[ This is Part One of a longer conversation ]

EMC² (before the Dell® acquisition) had a tagline in the 2000s: “Where Information Lives™” [**]. This was before the time of cloud storage. The tagline was an adage of enterprise data storage, proper and contemporaneous with the persistent narrative at the time – Data Consolidation. Within the data consolidation stories, thousands of files and folders moved about the networks of the organizations, from servers to clients, clients to servers. NAS (Network Attached Storage) was, and still is, the workhorse of many, many organizations.

[ **Side story ] There was an internal anti-EMC joke within NetApp® called “Information has a new address”.

EMC tagline “Where Information Lives”

This was a time when there were almost no concerns about Shadow IT; ransomware was less known; and most importantly, almost everyone knew where their files and folders were, more or less (except in Oil & Gas upstream – to be told later in this blog). That was because there were concerted attempts to consolidate data, and inadvertently files and folders, in the organization.

Even when these organizations were spread across the world, there were distributed file technologies at the time that could deliver files and folders in an acceptable manner. Definitely not as good as what we have today in a cloudy world, but acceptable. I personally worked on a project setting up Andrew File System for Intel® in Penang in the mid-90s, almost joined Tacit Networks in the mid-2000s, and dabbled in Microsoft® Distributed File System with NetApp® and Windows File Servers while fixing the mountains of issues in deploying the worldwide GUSto (Global Unified Storage) Project at Shell in 2006. Somewhere in that chronology were also Acopia Networks (acquired by F5) and, of course, EMC² Rainfinity and the NetApp® NuView OEM, Virtual File Manager.

The point I am trying to make here is that most IT organizations had a good grip on where their files and folders were. I do not think this is very true anymore. Do you know where your files and folders are living today?


Storage IO straight to GPU

By cfheoh | July 5, 2021 - 9:00 am |July 3, 2021 100Gigabit Ethernet, Algorithm, Analytics, API, Artificial Intelligence, Composable Infrastructure, compression, CXL, Deduplication, Deep Learning, Filesystems, High Performance Computing, Hyperconvergence, Machine Learning, Mellanox, Mellanox Technologies, Microsoft, nVidia, NVMe, RDMA, Vast Data, WekaIO


The parallel processing power of the GPU (Graphics Processing Unit) cannot be denied. One year ago, nVidia® overtook Intel® in market capitalization. And today, they have doubled their market cap lead over Intel®, [as of July 2, 2021] USD$510.53 billion vs USD$229.19 billion.

Thus it is not surprising that storage architectures are changing from the CPU-centric paradigm to take advantage of the burgeoning prowess of the GPU. Two announcements in the storage news in recent weeks have caught my attention – the Windows 11 DirectStorage API and nVidia® Magnum IO GPUDirect® Storage.

nVidia GPU

Exciting the gamers

The Windows DirectStorage API feature is only available in Windows 11. It was announced as part of the Xbox® Velocity Architecture last year to take advantage of the high I/O capability of modern-day NVMe SSDs. DirectStorage-enabled applications and games have several technologies at their disposal, such as Direct3D (D3D) decompression/compression algorithms designed for the GPU, and Sampler Feedback Streaming (SFS), which uses the results of the previously rendered frame to decide which higher-resolution textures are to be loaded into the memory of the GPU and rendered for the real-time gaming experience.


Rethinking File Security Fundamentals

By cfheoh | May 24, 2021 - 9:00 am |May 24, 2021 Algorithm, Analytics, API, Artificial Intelligence, Business Continuity, Data Availability, Data Corruption, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Disaster Recovery, eDiscovery, Filesystems, iRODS, Machine Learning, Object Storage, Snapshots, Virtualization


I took a week off blogging last week but the lazy days were inundated by bad news. A few more devastating ransomware attacks. This time, Colonial Pipeline in the US was hacked and its networks were shut down by ransomware. These ransomware threats are never-ending, and they are getting more damaging than ever. It is like trying to plug a leaking boat with your hands, and more leaks appear as you plug them.

More ransomware news hitting healthcare around the world last week:

[ May 15, 2021 ] Ireland’s health service hit by ‘significant’ ransomware attack
[ May 20, 2021 ] Irish hospitals are latest to be hit by ransomware attacks
[ May 19, 2021 ] Ransomware attacks hit AXA’s Asia unit, New Zealand health provider
[ May 20, 2021 ] Ransomware attacks are spiking. Is your company prepared?
[ May 20, 2021 ] RansomCloud: It’s new, it’s here now and it’s coming to a server near you

We are forever chasing a solution, and forever losing, because almost all technology defenses that protect data against ransomware are reactive. Why is ransomware still such a big threat then? Time to rethink file security fundamentals.

Data everywhere


The other pandemic – Datanemic

By cfheoh | April 5, 2021 - 9:00 am |April 4, 2021 Algorithm, Analytics, Artificial Intelligence, Business Continuity, Cloud, Data, Data Corruption, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Disaster Recovery, IoT, Machine Learning, Uncategorized


It is a disaster. No matter what we do, the leaks and the cracks are appearing faster than we are fixing them. It is a global pandemic.

I am not talking about COVID-19, the pandemic that has affected our lives and livelihood for over a year. I am talking about the other pandemic – the compromise of security of data.

In the past 6 months, the data leaks, the security hacks, the ransomware scourge have been more devastating than ever. Here are a few big ones that happened on a global scale:

[ Thru 2020 ] Solarwinds Supply Chain Hack (aka Sunburst)
[ March 2021 ] Microsoft® Exchange Hack
[ March 2021 ] Acer® Ransomware Attack
[ April 2021 ] Asteelflash Electronics Ransomware Attack

Data Security Breach, Cyber Attack, Ransomware

Closer to home, here in South East Asia, we have:

[ March 2021 ] Malaysia Airlines Data Breach
[ March 2021 ] Singapore Airlines Data Security Breach


Discovering OpenZFS Fusion Pool

By cfheoh | January 25, 2021 - 9:00 am |January 21, 2021 Appliance, Data Availability, Deduplication, deduplication, Deep Learning, Delphix, Disks, Filesystems, FreeNAS, iXsystems, Linux, NetApp, OpenZFS, RAID, Solid State Devices, TrueNAS


Fusion Pool excites me, but unfortunately this new key feature of OpenZFS is hardly talked about. I would like to introduce the Fusion Pool feature as iXsystems™ expands the TrueNAS® Enterprise storage conversations.

I would not say that this technology is revolutionary. Other vendors already have a concept similar to Fusion Pool. The most notable (to me) is NetApp® Flash Pool, and I am sure other enterprise storage vendors have the same. But this is a big deal (for me) for an open source file system in OpenZFS.

What is Fusion Pool (aka ZFS Allocation Classes)?

To understand Fusion Pool, we have to understand the basics of the ZFS zpool. A zpool is the aggregation (borrowing the NetApp® terminology) of vdevs (virtual devices), and each vdev is a collection of physical drives configured with one of the OpenZFS RAID levels (RAID-0, RAID-1, RAID-Z1, RAID-Z2, RAID-Z3 and a few nested RAID permutations). A zpool can start with one vdev, and new vdevs can be added on-the-fly, expanding the capacity of the zpool online.
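To make the zpool and vdev relationship concrete, here is a minimal Python sketch. It is only a toy model under my own assumptions: the class names Vdev and Zpool, the layout strings and the capacity arithmetic are hypothetical illustrations, not OpenZFS code, and the numbers ignore real-world overheads such as metadata, padding and reserved space.

```python
from dataclasses import dataclass, field
from typing import List

# Parity drives per RAID-Z level (illustrative model, not OpenZFS code).
RAIDZ_PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


@dataclass
class Vdev:
    layout: str      # "stripe" (RAID-0), "mirror" (RAID-1), "raidz1/2/3"
    drives: int      # number of physical drives in this vdev
    drive_tb: float  # capacity of each drive, in TB

    def usable_tb(self) -> float:
        """Rough usable capacity, ignoring all filesystem overheads."""
        if self.layout == "stripe":
            return self.drives * self.drive_tb
        if self.layout == "mirror":
            return self.drive_tb  # every drive holds the same copy
        if self.layout in RAIDZ_PARITY:
            return (self.drives - RAIDZ_PARITY[self.layout]) * self.drive_tb
        raise ValueError(f"unknown layout: {self.layout}")


@dataclass
class Zpool:
    name: str
    vdevs: List[Vdev] = field(default_factory=list)

    def add_vdev(self, vdev: Vdev) -> None:
        """Adding a vdev grows the pool; existing vdevs are untouched."""
        self.vdevs.append(vdev)

    def usable_tb(self) -> float:
        # The pool is the aggregation of its vdevs.
        return sum(v.usable_tb() for v in self.vdevs)


pool = Zpool("tank")
pool.add_vdev(Vdev("raidz2", drives=6, drive_tb=4.0))
before = pool.usable_tb()   # (6 - 2) * 4.0 TB
pool.add_vdev(Vdev("mirror", drives=2, drive_tb=4.0))
after = pool.usable_tb()    # previous capacity + 4.0 TB
print(before, after)
```

The point of the sketch is the shape of the relationship: redundancy is decided inside each vdev, while the zpool simply sums whatever its vdevs provide and grows when another vdev is added.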

There were several types of vdevs prior to Fusion Pool, that is, before TrueNAS® version 12.0. As shown below, these are the types of vdevs available to the zpool at present.

OpenZFS zpool and vdev types – Credit: Jim Salter and Arstechnica

Fusion Pool is a zpool that integrates a new, special type of vdev alongside other normal vdevs. This special vdev is designed to work with small data blocks of between 4K and 16K, and is highly efficient in handling random reads and writes of these small blocks. This suits the OpenZFS file system metadata blocks and the blocks of small files. And the random nature of the Read/Write I/Os works best with SSDs (which can be read-intensive or write-intensive SSDs).
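The placement idea described above can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the function name choose_vdev_class and the 16K limit (taken from the upper end of the range mentioned above) are hypothetical, and this is not how the OpenZFS allocator is actually written. It also leaves out what happens when the special vdev runs out of space.

```python
def choose_vdev_class(is_metadata: bool,
                      block_size: int,
                      small_block_limit: int = 16 * 1024) -> str:
    """Decide which class of vdev a block lands on in this toy model.

    small_block_limit is an assumed tunable for illustration only;
    it is not presented here as an OpenZFS default.
    """
    if is_metadata:
        # File system metadata goes to the special (SSD) vdev.
        return "special"
    if block_size <= small_block_limit:
        # Small file blocks also benefit from SSD random I/O.
        return "special"
    # Everything else stays on the normal (typically HDD) vdevs.
    return "normal"


print(choose_vdev_class(True, 128 * 1024))   # metadata block
print(choose_vdev_class(False, 8 * 1024))    # small data block
print(choose_vdev_class(False, 128 * 1024))  # large data block
```

In this model the large sequential blocks stay on the capacity drives, while the small and random I/O is steered to flash, which is the whole appeal of the feature.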


The Edge is coming! The Edge is coming!

By cfheoh | October 12, 2020 - 9:15 am |October 11, 2020 100Gigabit Ethernet, Analytics, Big Data, Containers, Data, Deep Learning, Edge Computing, Flash, Industry 4.0, InfluxDB, Linux, Machine Learning, Mellanox, Mellanox Technologies, Minio, nVidia, NVMe, Pravega, SNIA, Solid State Devices


Actually, Edge Computing is already here. It has been on everyone’s lips for quite some time, but for me and for many others, Edge Computing is still a hodgepodge of many things. The proliferation of devices, IoT, sensors and endpoints being pulled into the ubiquitous term of Edge Computing has made the scope ever-changing and difficult to pin down. And it is this proliferation of edge devices that will generate voluminous amounts of data. Obvious questions emerge:

How do you store all the data?
How do you process all the data?
How do you derive competitive value from the data from these edge devices?
How do you securely transfer and share the data?

From the storage technology perspective, it might be easier to observe the traits of the data generated on the edge device. In this blog, we also look at some new storage technologies out there that could be part of the Edge Computing present and future.

Edge Computing overview – Cloud to Edge to Endpoint

Storage at the Edge

The mantra of putting compute as close to the data as possible, and processing the data where it is stored, is the main crux right now, at least where storage of the data is concerned. The latency of a round trip to computing resources in the cloud and back to the edge devices is not conducive to this, and in many older settings, the edge devices in a factory may not even be network-enabled. In my last encounter several years ago, there were more than 40 interfaces, specifications and protocols for the edge devices, most of them proprietary. And there is no industry-wide standard for these edge devices either.


Storage in a shiny multi-cloud space

By cfheoh | September 14, 2020 - 6:22 pm |September 14, 2020 Amazon, Amazon Web Services, Analytics, API, Artificial Intelligence, Backup, Big Data, Business Continuity, Cloud, Clumio, Containers, Data, Data Archiving, Data Availability, Data Fabric, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Disaster Recovery, Docker, Druva, Gartner, Google, Google Anthos, High Performance Computing, Kubernetes, Machine Learning, Microsoft, Microsoft Azure, Object Storage, Oracle, Oracle Cloud, Rackspace, Software Defined Storage, Software-defined Datacenter, Storage Tiering, Wasabi Cloud


The multi-cloud era for infrastructure-as-a-service (IaaS) is not here (yet). That is what the technology marketers want you to think. The hype, the vapourware, the frenzy. It is what they do. The same goes for technology analysts, who describe visions and futures, and the high-level constructs and strategies to get there. The hype of multi-cloud is often thought of as running applications and infrastructure services seamlessly in several public clouds such as Amazon AWS, Microsoft® Azure and Google Cloud Platform, and linking them to on-premises data centers and private clouds. Hybrid is the new black.

Multicloud connectivity to public cloud providers and on-premises private cloud

Multi-Cloud, on-premises, public and hybrid clouds

And the aspiration of multi-cloud is the right one, when it is truly ready. Gartner® wrote a high-level article titled “Why Organizations Choose a Multicloud Strategy”. Taking advantage of each individual cloud’s strengths and resiliency in its respective geographies makes good business sense, but there are many other considerations that cannot be an afterthought. In this blog, we look at a few of them from a data storage perspective.

In the beginning there was …

For this storage dinosaur, data storage and compute have always been coupled as one. In the mainframe DASD days, these two were together. Even with the rise of networking architectures and protocols, from IBM SNA, DECnet, Ethernet & TCP/IP, and Token Ring FC-SAN (sorry, this is just a joke), the SANs and the filers stayed close to the servers, albeit buffered by a network layer.

A decade ago, when the public clouds started appearing, data storage and compute were mostly inseparable. There was a demarcation between public clouds and private clouds. The notion of hybrid clouds meant public clouds and private clouds could intermix with on-premises computing and data storage, but in almost all cases this was confined to a single public cloud provider. That held until these public cloud providers realized they were not able to convincingly entice the larger enterprises to move their IT out of their on-premises data centers to the cloud. So, these public cloud providers decided to reverse their strategy and peddled their cloud services back to on-prem. Today, Amazon AWS has Outposts; Microsoft® Azure has Arc; and Google Cloud Platform launched Anthos.


Intel is still a formidable force

By cfheoh | August 17, 2020 - 9:15 am |August 17, 2020 Algorithm, Analytics, Artificial Intelligence, Big Data, Clusters, Composable Infrastructure, Cray Inc, Deep Learning, Disks, Edge Computing, Filesystems, Flash, High Performance Computing, Industry 4.0, Intel, IoT, Linux, Machine Learning, Performance Benchmark, Performance Caching, Scale-out architecture, SNIA, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day


It is easy to kick someone who is down. Bad news has stronger ripple effects than good news. Intel® is going through a rough patch, and perhaps the worst one so far. They delayed their 7nm manufacturing process, one which could have given Intel® the breathing room in the CPU war with rival AMD. With this delay, the process has been pushed back to 2021, possibly 2022.

Intel Apple Collaboration and Partnership started in 2005

Their association with Apple® is coming to an end after 15 years, and more security flaws surfaced after the Spectre and Meltdown debacle. Extremetech probably said it best (or worst) last month:

We’ve never seen Intel® struggle like this

If we look deeper (and I am sure you have), all this negative news was related to their processors. Intel® is much, much more than that.

Their Optane™ storage prowess

I have years of association with the folks at Intel® here in Malaysia, dating back 20 years. I have hardly seen Intel® beating its own drum when it comes to storage technologies, but they are beginning to. The Optane™ revolution in storage has been a game changer. Optane™ enables the implementation of persistent memory or storage class memory, a performance tier that sits between DRAM and the SSD. The speed and, more notably, the latency of Optane™ are several times better than those of enterprise SSDs.

Intel pyramid of tiers of storage medium

If you want to know more about Optane™’s latency and speed, here is a very geeky article from Intel®:

Restoring the Balance between Bandwidth and Latency

The list of storage vendors who have embedded Intel® Optane™ into their gear is long. Vast Data, StorOne™, NetApp® MAX Data, Pure Storage® DirectMemory Modules, HPE 3PAR and Nimble Storage, Dell Technologies PowerMax and PowerScale, and many more cement Intel®’s storage prowess with Optane™.

3D XPoint, the Phase Change Memory technology behind Optane™, came from the joint venture between Intel® and Micron®. That partnership was dissolved in 2019, but this has not diminished the momentum of next-generation Optane™. Alder Stream and Barlow Pass are going to be the Gen-2 SSD and Persistent Memory DC DIMM respectively. A screenshot of the Optane™ roadmap appeared in Blocks & Files last week.

Intel next generation Optane roadmap


Dell EMC Isilon is an Emmy winner!

By cfheoh | March 16, 2020 - 7:41 am |March 17, 2020 100Gigabit Ethernet, Acquisition, Analytics, Appliance, CIFS, Cloud, Clusters, Containers, Data Availability, Deduplication, deduplication, Deep Learning, Dell, DellEMC, Disks, EMC, Flash, Gartner, High Performance Computing, Isilon, Mellanox, Mellanox Technologies, NAS, NetApp, NFS, Performance Caching, Pure Storage, Qumulo, Scale-out architecture, SMB, Snapshots, Software Defined Storage, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day, WekaIO


[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at this event. The content of this blog is of my own opinions and views ]

And the Emmy® goes to …

Yes, the Emmy® goes to Dell EMC Isilon! It was indeed a well deserved accolade and an honour!

Dell EMC Isilon had just won a Technology & Engineering Emmy® Award a week before Storage Field Day 19, for their outstanding pioneering work on NAS platform technology that tiers media and broadcasting content according to business value.

A lasting true clustered NAS

This is not a blog to praise Isilon but one that instills respect for a truly clustered, scale-out file system. I have known of OneFS for a long time, but have never really taken the opportunity to put my hands on it since 2006 (there is a story). So here is a look at history …

Back in the early to mid-2000s, there was a lot of talk about large scale NAS. There were several players in the nascent scaling NAS market. NetApp was the filer king, with several competitors such as Polyserve, Ibrix, Spinnaker, Panasas and the young upstart Isilon. There were also Procom, BlueArc and NetApp’s predecessor Auspex. By the second half of the 2000s, the market consolidated and most of these NAS players were acquired.

NetApp acquired Spinnaker in 2003
Part of Auspex was acquired by NetApp in 2003; the other part by Glasshouse Technologies
Procom was picked up by Sun Microsystems in 2005
Polyserve went to HP in 2007
Ibrix joined HP as well in 2009
Isilon got acquired by EMC in 2010
BlueArc was gobbled up by HDS in 2011


StorageGRID gets gritty

By cfheoh | March 9, 2020 - 7:06 am |March 9, 2020 Acquisition, Amazon Web Services, Analytics, API, Appliance, Artificial Intelligence, Backup, Big Data, Cloud, Clusters, Data Archiving, Data Fabric, Data Management, Data Protection, Deep Learning, Filesystems, HDS, Hitachi Vantara, ILM, Machine Learning, NAS, NetApp, Object Storage, Software Defined Storage, Storage Field Day, Storage Market Share, Storage Optimization, Tech Field Day


[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at the event. The content of this blog is of my own opinions and views ]

NetApp® presented StorageGRID® Webscale (SGWS) at Storage Field Day 19 last month. It was timely when the general purpose object storage market, in my humble opinion, was getting disillusioned and almost about to deprive itself of the value of what it was supposed to be.

“Cheap and deep”, “Race to Zero” were some of the less storied calls I have come across when discussing object storage, and they were really devaluing the merits of object storage as vendors touted their superficial glory of being in the IDC MarketScape for Object-based Storage 2019.

Almost every single conversation I had in the past 3 years was either explaining what object storage is or “That is cheap storage right?”

