Tech Field Day Archives

Making Immutability the key factor in a Resilient Data Protection strategy

By cfheoh | May 20, 2024 - 7:30 am |May 19, 2024 Analytics, Appliance, Artificial Intelligence, Backup, Business Continuity, Cloud, Commvault, Data, Data Archiving, Data Availability, Data Corruption, Data Governance, Data Management, Data Protection, Data Security, Disaster Recovery, Futurum Group, Object Storage, RAID, Reliability, Security, Tape storage, Tech Field Day, Veeam, Veritas

It is no longer 3-2-1 anymore, Toto.

When it comes to backup, I always start with the 3-2-1 backup rule: 3 copies of the data; 2 different media; 1 offsite. This rule has been ingrained in me since the day I entered the industry over 3 decades ago. It is still the most important opening line for a data protection specialist or a solution architect. 3-2-1 is table stakes.

Yet, over the years, the cybersecurity threat landscape has moved closer and closer to the data protection, backup and recovery realm. The two have now merged into a super-segment Pangaea called cyber resilience. With it, the 3-2-1 conversation of the last few years is evolving into something like the 3-2-1-1-0 backup rule, a modern take on the original. Let’s take a look at the 3-2-1-1-0 rule (simplified by me).

The 3-2-1-1-0 Backup rule (Credit: https://www.dataprise.com/services/disaster-recovery/baas/)
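To make the five digits concrete, here is a minimal Python sketch that checks a backup plan against the rule. It assumes the common reading of 3-2-1-1-0: 3 copies (counting production), 2 media types, 1 offsite copy, 1 immutable or offline copy, and 0 errors after recovery verification. The `BackupCopy` class, the `check_3_2_1_1_0` function and the sample plan are all hypothetical names made up for illustration, not part of any vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str                  # e.g. "disk", "tape", "object"
    offsite: bool               # stored away from the primary site
    immutable_or_offline: bool  # air-gapped, offline, or write-once (immutable)
    verify_errors: int          # errors found in the last recovery test


def check_3_2_1_1_0(copies):
    """Return a pass/fail flag for each digit of the 3-2-1-1-0 rule."""
    return {
        "3 copies": len(copies) >= 3,
        "2 media": len({c.media for c in copies}) >= 2,
        "1 offsite": any(c.offsite for c in copies),
        "1 immutable or offline": any(c.immutable_or_offline for c in copies),
        "0 errors": all(c.verify_errors == 0 for c in copies),
    }


# Hypothetical plan: production data plus two backup copies.
plan = [
    BackupCopy("production", "disk", False, False, 0),
    BackupCopy("local-backup", "disk", False, False, 0),
    BackupCopy("cloud-vault", "object", True, True, 0),
]

for rule, ok in check_3_2_1_1_0(plan).items():
    print(f"{rule}: {'pass' if ok else 'FAIL'}")
```

Take the immutable flag off the vault copy and the plan still satisfies the old 3-2-1, but fails the extra "1". That gap is exactly what the newer rule is meant to expose.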

Continue reading →

AI is pushing storage and data management harder than ever

By cfheoh | April 29, 2024 - 7:30 am |April 28, 2024 100Gigabit Ethernet, Algorithm, Analytics, Artificial Intelligence, Big Data, Clusters, Data, Data Governance, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, DMTF, High Performance Computing, Machine Learning, nVidia, Scale-out architecture, Solidigm, Tech Field Day

2 Comments

I am on a learning streak again. The most prominent technology that keeps landing on my tray at present is, of course, Artificial Intelligence (AI).

AI is hot. Very hot. And overhyped. Everyone is an expert nowadays. Yeah, right. Not me.

Underneath that glossy veneer of the AI hype, there is much going on behind the scenes to make AI great. The 2 areas I have been involved in and practiced for a long time are data infrastructure a.k.a. storage, and data management. And both are playing prominent parts in the advancement of the AI ecosystem. This makes me very excited.

I am no expert, but learning from various sources is already telling me that AI is pushing both storage and data management harder than ever before, much harder than traditional enterprise on-premises use cases and even the cloud computing applications. I ask myself, “where do I start my learning again?” as I journal my process.

Storage performance in a Data Pipeline

The speed at which AI responds is Trust. The faster it arrives at accurate and relevant responses, the more trust we build in AI. Getting to the speed that we want is not an easy thing, and storage a.k.a. data infrastructure is doing its part. I pick up my learning from understanding the AI pipeline. One early help comes from my friend, Gina Rosenthal, who attended Solidigm’s presentation at AI Field Day in February 2024. Her article, titled “Why storage matters for AI – Solidigm”, kickstarted my learning juices again.

I was particularly captured by this slide in Gina’s article. It defines the laborious path data takes to become useful for AI applications.

Storage and the 5 stages of AI Work (reference: Solidigm presentation on AI Field Day)

Continue reading →

Disaggregation and Composability vital for AI/DL models to scale

By cfheoh | July 26, 2023 - 8:22 am |July 26, 2023 100Gigabit Ethernet, Algorithm, Analytics, API, Artificial Intelligence, Cloud, Clusters, Composable Infrastructure, Containers, CXL, Data Management, Deep Learning, Docker, Hyperconvergence, Liqid, Machine Learning, nVidia, NVMe, PCIe, Scale-out architecture, Software-defined Datacenter, Storage Field Day, Tech Field Day, Virtualization, VMware

Is there no end to the threat of ransomware?

By cfheoh | June 20, 2022 - 8:00 am |June 20, 2022 Appliance, Artificial Intelligence, Backup, Business Continuity, Cloud, Cohesity, Data, Data Archiving, Data Availability, Data Management, Data Privacy, Data Protection, Data Security, Digital Transformation, Disaster Recovery, Druva, Filesystems, HDS, Hitachi Vantara, ILM, iRODS, Racktop Systems, Reliability, Rubrik, SASE, Security, Sophos, Storage Field Day, Tech Field Day

Intel is still a formidable force

By cfheoh | August 17, 2020 - 9:15 am |August 17, 2020 Algorithm, Analytics, Artificial Intelligence, Big Data, Clusters, Composable Infrastructure, Cray Inc, Deep Learning, Disks, Edge Computing, Filesystems, Flash, High Performance Computing, Industry 4.0, Intel, IoT, Linux, Machine Learning, Performance Benchmark, Performance Caching, Scale-out architecture, SNIA, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day

1 Comment

It is easy to kick someone who is down. Bad news has stronger ripple effects than good news. Intel® is going through a rough patch, and perhaps the worst one so far. They delayed their 7nm manufacturing process, one which could have given Intel® breathing room in the CPU war with rival AMD. The process has now been pushed back to 2021, possibly 2022.

Intel Apple Collaboration and Partnership started in 2005

Their association with Apple® is coming to an end after 15 years, and more security flaws surfaced after the Spectre and Meltdown debacle. Extremetech probably said it best (or worst) last month:

We’ve never seen Intel® struggle like this

If we look deeper (and I am sure you have), all this negative news was related to their processors. Intel® is much, much more than that.

Their Optane™ storage prowess

I have years of association with the folks at Intel® here in Malaysia, dating back 20 years. I have hardly seen Intel® beating its own drum when it comes to storage technologies, but they are beginning to. The Optane™ revolution in storage has been a game changer. Optane™ enables the implementation of persistent memory or storage class memory, a performance tier that sits between DRAM and the SSD. The speed and, more notably, the latency of Optane™ are several times better than those of enterprise SSDs.

Intel pyramid of tiers of storage medium

If you want to know more about Optane™’s latency and speed, here is a very geeky article from Intel®:

Restoring the Balance between Bandwidth and Latency

The list of storage vendors who have embedded Intel® Optane™ into their gear is long. Vast Data, StorOne™, NetApp® MAX Data, Pure Storage® DirectMemory Modules, HPE 3PAR and Nimble Storage, Dell Technologies PowerMax and PowerScale, and many more, cement Intel®’s storage prowess with Optane™.

3D XPoint, the Phase Change Memory technology behind Optane™, came from the joint venture between Intel® and Micron®. That partnership was dissolved in 2019, but this has not diminished the momentum of next generation Optane™. Alder Stream and Barlow Pass are going to be the Gen-2 SSD and Persistent Memory DC DIMM respectively. A screenshot of the Optane™ roadmap appeared in Blocks & Files last week.

Intel next generation Optane roadmap

Continue reading →

Dell EMC Isilon is an Emmy winner!

By cfheoh | March 16, 2020 - 7:41 am |March 17, 2020 100Gigabit Ethernet, Acquisition, Analytics, Appliance, CIFS, Cloud, Clusters, Containers, Data Availability, deduplication, Deduplication, Deep Learning, Dell, DellEMC, Disks, EMC, Flash, Gartner, High Performance Computing, Isilon, Mellanox, Mellanox Technologies, NAS, NetApp, NFS, Performance Caching, Pure Storage, Qumulo, Scale-out architecture, SMB, Snapshots, Software Defined Storage, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day, WekaIO

2 Comments

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at this event. The content of this blog is of my own opinions and views ]

And the Emmy® goes to …

Yes, the Emmy® goes to Dell EMC Isilon! It was indeed a well deserved accolade and an honour!

Dell EMC Isilon had just won the Technology & Engineering Emmy® Awards a week before Storage Field Day 19, for their outstanding pioneering work on the NAS platform tiering technology of media and broadcasting content according to business value.

A lasting true clustered NAS

This is not a blog to praise Isilon but one that instills respect for a true clustered, scale-out file system. I have known of OneFS for a long time, but have never really taken the opportunity to put my hands on it since 2006 (there is a story). So here is a look at history …

Back in the early to mid-2000s, there was a lot of talk about large scale NAS. There were several players in the nascent scaling NAS market. NetApp was the filer king, with several competitors such as Polyserve, Ibrix, Spinnaker, Panasas and the young upstart Isilon. There were also Procom, BlueArc and NetApp’s predecessor Auspex. By the second half of the 2000s, the market had consolidated and most of these NAS players were acquired.

NetApp acquired Spinnaker in 2003
Part of Auspex was acquired by NetApp in 2003; the other part by Glasshouse Technologies
Procom was picked up by Sun Microsystems in 2005
Polyserve went to HP in 2007
Ibrix joined HP as well in 2009
Isilon got acquired by EMC in 2010
BlueArc was gobbled up by HDS in 2011

Continue reading →

Will there be Trust at Digital Events?

By cfheoh | March 13, 2020 - 5:07 am |March 13, 2020 Amazon Web Services, Analytics, Cloud, Digital Transformation, Storage Field Day, Tech Field Day

1 Comment

[ This article was published on LinkedIn on March 8, 2020. The original article link is here ]

關係 (Guan Xi) is ingrained in the psyche of many Asian cultures and businesses. It is fundamental to building connections and relationships, and consequently to forging trust in those relationships. And it works best face-to-face, where each side builds a common foundational belief in the other.

The COVID-19 outbreak is wreaking havoc and may become a global pandemic if the situation continues unabated in the coming months. In light of safety, many vendors are either canceling their physical events or switching to digital or virtual events. On my radar this past week, there are Dell Tech World, AWS Singapore Summit and Google Cloud Next, to name a few. How do we build trust from these digital and virtual events?

All about the experience

The experience of engaging at physical technology events is priceless. Putting a face to a name, shaking hands and rubbing shoulders to connect cannot be quantified by just being present. Sharing war stories over coffee or beer, and exchanging good jokes and bad ones over dinner, are experiences which cannot be taken away in our lifetime. That is why I have always thoroughly enjoyed my Field Day experiences since 2014.

I am old school. I believe in 關係, because the kind of camaraderie, the fellowship, the brotherhood or sisterhood built from trust is immeasurable. The chemistry mix of experience would be hard to reproduce. An old hand at EMC once said to my team and me, “I would go to war with you guys any day!”.

The question today is “Can Digital or Virtual Events replicate that experience and build trust?”.

Continue reading →

StorageGRID gets gritty

By cfheoh | March 9, 2020 - 7:06 am |March 9, 2020 Acquisition, Amazon Web Services, Analytics, API, Appliance, Artificial Intelligence, Backup, Big Data, Cloud, Clusters, Data Archiving, Data Fabric, Data Management, Data Protection, Deep Learning, Filesystems, HDS, Hitachi Vantara, ILM, Machine Learning, NAS, NetApp, Object Storage, Software Defined Storage, Storage Field Day, Storage Market Share, Storage Optimization, Tech Field Day

2 Comments

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at the event. The content of this blog is of my own opinions and views ]

NetApp® presented StorageGRID® Webscale (SGWS) at Storage Field Day 19 last month. It was timely when the general purpose object storage market, in my humble opinion, was getting disillusioned and almost about to deprive itself of the value of what it was supposed to be.

“Cheap and deep” and “Race to Zero” were some of the less storied calls I have come across when discussing object storage, and they were really de-valuing the merits of object storage as vendors touted their superficial glory of being in the IDC MarketScape for Object-based Storage 2019.

Almost every single conversation I had in the past 3 years was either explaining what object storage is or “That is cheap storage right?”

Continue reading →

Paradigm shift of Dev to Storage Ops

By cfheoh | March 2, 2020 - 5:47 am |March 2, 2020 Amazon Web Services, API, Artificial Intelligence, Ceph, Cloud, Composable Infrastructure, Containers, Data Management, Deep Learning, Docker, Drivescale, Edge Computing, Filesystems, Hadoop Clusters, High Performance Computing, IBM, Kubernetes, Linux, Liqid, Machine Learning, Minio, Object Storage, Performance Benchmark, Redhat, Scale-out architecture, Software Defined Storage, Storage Field Day, Tech Field Day, VMware

2 Comments

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at the event. The content of this blog is of my own opinions and views ]

A funny photo (below) came up on my Facebook feed a couple of weeks back. In an honest way, it depicted how a developer would think (or the lack of thinking) about the storage infrastructure designs and models for the applications and workloads. This also reminded me of how DBAs used to diss storage engineers. “I don’t care about storage, as long as it is RAID 10“. That was aeons ago 😉

The world of developers and the world of infrastructure people are vastly different. Since the birth of cloud computing, both worlds have collided, and programmable infrastructure-as-code (IaC) has become part and parcel of cloud native applications. Of course, there is no denying that there is friction.

Welcome to DevOps!

The Kubernetes factor

Containerized applications are quickly defining the cloud native applications landscape. The container orchestration machinery has one dominant engine – Kubernetes.

In the world of software development and delivery, DevOps has taken a liking to containers. Containers make it easier to host web applications and manage their life-cycle inside a portable environment. They package up application code and other dependencies into building blocks to deliver consistency, efficiency, and productivity. To scale to a multi-application, multi-cloud environment with thousands and even tens of thousands of microservices in containers, the Kubernetes factor comes into play. Kubernetes handles tasks like auto-scaling, rolling deployment, compute resources, volume storage and much, much more, and it is designed to run on bare metal, in the data center, public cloud or even a hybrid cloud.
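As a taste of what auto-scaling means in practice, the sketch below mirrors the scaling formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change while the ratio stays within a tolerance (0.1 by default). This is a simplified illustration in Python, not Kubernetes code. The function name, its parameters and the replica limits are my own assumptions, and the real controller also deals with pod readiness, missing metrics and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified take on the Horizontal Pod Autoscaler scaling rule."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (hypothetical defaults here).
    return max(min_replicas, min(max_replicas, wanted))


# 4 pods averaging 200m CPU against a 100m target: scale out.
print(desired_replicas(4, 200, 100))
# 4 pods averaging 50m CPU against a 100m target: scale in.
print(desired_replicas(4, 50, 100))
```

The point for storage folks is that the replica count moves on its own, so the volumes and data services behind those pods have to keep up without a human in the loop.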

Continue reading →

DellEMC Project Nautilus Re-imagine Storage for Streams

By cfheoh | February 24, 2020 - 5:56 am |February 25, 2020 Algorithm, Analytics, API, Artificial Intelligence, Big Data, Cloud, Confluent, Data, Data Management, Deep Learning, Dell, DellEMC, Edge Computing, EMC, Fog Computing, Industry 4.0, InfluxDB, IoT, Isilon, Kubernetes, Linux, Machine Learning, Pravega, Storage Field Day, Tech Field Day

2 Comments

Cloud computing will have challenges processing data at the outer reach of its tentacles. Edge Computing, as it melds with the Internet of Things (IoT), needs a different approach to data processing and data storage. Data generated at source has to be processed at source, to respond to the event or events which have happened. Cloud Computing, even with 5G networks, has latency that is too high for an autonomous vehicle reacting to pedestrians on the road at speed, for a sprinkler system activating in a fire, or even for a fraud detection system flagging money laundering activities as they occur.

Furthermore, not all sensors, devices, and IoT end-points are connected to the cloud at all times. To understand this new way of data processing and data storage, have a look at this video by Jay Kreps, CEO of Confluent for Kafka® to view this new perspective.

Data is continuously and infinitely generated at source, and this data has to be compiled, controlled and consolidated with nanosecond precision. At Storage Field Day 19, an interesting open source project, Pravega, was introduced to the delegates by DellEMC. Pravega is an open source storage framework for streaming data and is part of Project Nautilus.

Rise of streaming time series Data

Processing data at source has a lot of advantages and this has popularized Time Series analytics. Many time series and streams-based databases such as InfluxDB, TimescaleDB and OpenTSDB have sprouted over the years, along with open source projects such as Apache Kafka®, Apache Flink and Apache Druid.

The data generated at source (end-points, sensors, devices) is serialized, timestamped (as each event occurs), continuous and infinite. These are the properties of a time series data stream, and to make sense of the streaming data, new data formats such as Avro, Parquet and ORC pepper the landscape along with the more mature JSON and XML, each with its own strengths and weaknesses.
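Those four properties can be sketched in a few lines of Python. This is a toy illustration only: the `sensor_stream` generator, the sensor name, the fake clock and the fake readings are all hypothetical, and JSON is used for serialization simply because it ships with the standard library.

```python
import itertools
import json


def sensor_stream(sensor_id, read_value, clock):
    """Yield an endless sequence of serialized, timestamped readings."""
    for seq in itertools.count():  # continuous and infinite
        event = {"sensor": sensor_id, "seq": seq,
                 "ts": clock(), "value": read_value()}
        yield json.dumps(event)  # serialized form sent downstream


# Hypothetical stand-ins for a real clock and a real sensor.
ticks = itertools.count(1_700_000_000)
readings = itertools.cycle([21.5, 21.7, 22.0])
stream = sensor_stream("temp-01", lambda: next(readings), lambda: next(ticks))

# The stream never ends, so a consumer only ever sees a bounded window.
window = [json.loads(msg) for msg in itertools.islice(stream, 3)]
average = sum(e["value"] for e in window) / len(window)
print(window[0]["ts"], window[-1]["ts"], round(average, 2))
```

Because there is no "end of file", anything that stores or analyzes the stream has to work on windows and offsets rather than on a complete data set.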

You can learn more about these data formats in the 2 links below:

DIY is difficult

Many time series projects started as DIY projects in many organizations. And many of them are still DIY projects in production systems as well. They depend on tribal knowledge, and these databases are tied to unmanaged storage that is not congruent with the properties of streaming data.

At the storage end, the technologies today still rely on the SAN and NAS protocols, and in recent years, S3, with object storage. Block, file and object storage introduce layers of abstraction which may not be a good fit for streaming data.

Continue reading →
