Linux – Page 5 – Storage Gaga

Open Source and Open Standards open the Future

By cfheoh | February 3, 2020 - 7:19 am |February 3, 2020 API, Big Data, Cloud, Composable Infrastructure, Data Fabric, Data Security, Disks, Filesystems, Intel, Linux, Memory Cloud, nVidia, PCIe, RDMA, Solid State Devices, Storage Field Day, Tech Field Day


[Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

Western Digital dived into Storage Field Day 19 in full force, as they did at Storage Field Day 18, with a series of high-impact presentations, each curated for the diverse requirements of the audience. Several open source initiatives were shared, all based on open standards, designed to address present inefficiencies and developed with the future in mind.

Zoned Storage

One of the initiatives aims to increase efficiency around SMR and SSD zoning capabilities and to remove the complexities and overlaps between the two media. This is the Zoned Storage initiative, a technical working proposal to extend the existing NVMe standards. The outcome will give user-space applications more control over the placement of data blocks on zone-aware devices such as SMR drives and zoned SSDs, collectively known as Zoned Block Devices (ZBD). The implementation in the Linux user and kernel space is shown below:
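Separately from that diagram, the core idea of a zone can be illustrated with a toy model. The Python sketch below is not the Linux implementation or any Western Digital API; the `Zone` class and `place_sequentially` helper are hypothetical names invented for illustration. It assumes only the general behaviour of zoned devices: each zone is written sequentially at a write pointer, and space is reclaimed by resetting the whole zone.

```python
class Zone:
    """One zone of a hypothetical zoned block device (toy model only)."""

    def __init__(self, start_lba: int, size: int) -> None:
        self.start_lba = start_lba
        self.size = size
        self.write_pointer = start_lba  # next LBA that may be written

    def append(self, num_blocks: int) -> int:
        """Write num_blocks at the write pointer and return the LBA used."""
        if self.write_pointer + num_blocks > self.start_lba + self.size:
            raise ValueError("zone full: reset it or pick another zone")
        lba = self.write_pointer
        self.write_pointer += num_blocks
        return lba

    def write(self, lba: int, num_blocks: int) -> int:
        """Accept a write only if it lands exactly on the write pointer."""
        if lba != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        return self.append(num_blocks)

    def reset(self) -> None:
        """Discard the zone's contents and rewind the write pointer."""
        self.write_pointer = self.start_lba


def place_sequentially(zones: list, num_blocks: int) -> tuple:
    """First-fit placement: return (zone index, LBA) for the new data."""
    for index, zone in enumerate(zones):
        try:
            return index, zone.append(num_blocks)
        except ValueError:
            continue  # this zone has no room, try the next one
    raise RuntimeError("no zone has room for this write")
```

In this model the application, not the device, decides which zone receives the data, which is the kind of placement control the initiative aims to give user-space software.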


Zoned Technologies with Western Digital

By cfheoh | January 14, 2020 - 10:29 am |January 14, 2020 API, Composable Infrastructure, Disks, Drivescale, Dropbox, Filesystems, Flash, IoT, Linux, Liqid, Open Compute Project, Reliability, SATA, SCSI, Seagate, Solid State Devices, Storage Field Day, Tech Field Day, Western Digital


[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

Storage Field Day 19 is a week away, and one of the vendors presenting is Western Digital, who also presented at Storage Field Day 18 almost a year ago. Here is my blog where I received the full force of Western Digital. In the 10 months or so since, Western Digital has sold off their IntelliFlash assets to DataDirect Networks and left their ActiveScale object storage platform in limbo.

What is in store from Western D?

I am eager to find out what is coming from Western Digital. They have tons of storage technologies that I have yet to encounter, and this anticipation is keeping me excited for the Western D session at Storage Field Day 19.

For a few years I have been keen on a few of Western D’s technologies which were moving up the value chain. They are:

Symbotics Design™ (although I think they changed their marketing messaging)
OpenFlex architecture, Fabric devices and enclosures
KingFish™ API for composable infrastructure

In my patch, the signals from these 3 Western D technologies have gone weak in the past year. However, there is a lot of momentum right now for Zoned Storage and Zoned Namespaces, and I believe this could be what is in store for storage propeller heads like us at Storage Field Day 19.


ZFS Replication and Recovery with FreeNAS

By cfheoh | November 2, 2019 - 9:51 am |November 2, 2019 Appliance, Backup, Business Continuity, CIFS, Data Availability, Data Corruption, Data Management, Data Protection, Data Security, Disaster Recovery, Disks, Filesystems, FreeNAS, Linux, Microsoft, NAS, NetApp, NFS, Reliability, Snapshots, Virtualization


We get requests to recover data from a secondary platform all the time. An RPO (recovery point objective) of 30 minutes can be challenging for small to medium sized companies, especially if there is an SLA (service level agreement) to meet.

This week, my team and I took some time to create a FreeNAS replication demo for a potential client. I thought I would document the whole thing: ZFS replication, the key steps to set it up, and how recovery is done.

ZFS Snapshots

ZFS replication relies on periodic ZFS snapshots. The snapshot is an inherent feature of the ZFS file system, and is often used as a point-in-time copy of the existing ZFS file system tree in memory. Once a snapshot has been triggered, either manually or on a periodic schedule, the file system tree and its metadata in memory are committed to disk to ensure an updated and consistent state of the file system at all times.

To start, a running snapshot policy on a schedule must be in place. This snapshot policy can be on a specific dataset or zvol, or even the entire zpool. Yeah, I am using quite a bit of ZFS terminology here – zpool, zvol, dataset. You can read more about each of the structures and more here.

Once the ZFS replication task has been set up, every snapshot taken under the snapshot policy is automatically duplicated and copied to the target ZFS dataset. Usually, the target ZFS dataset is on a secondary FreeNAS storage server, serving as a disaster recovery platform. Sending and receiving the snapshot data relies on the SSH service.
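For readers who want to see what happens underneath, here is a minimal Python sketch of the snapshot-then-send idea using the standard `zfs snapshot`, `zfs send -i` and `zfs receive` commands piped over SSH. It only builds the command strings and does not run them. It is not FreeNAS's own replication code, and the dataset names, snapshot names, host name and helper function names are all assumptions made up for illustration.

```python
import shlex
from typing import Optional


def snapshot_cmd(dataset: str, snap_name: str) -> str:
    """Build the command that takes a point-in-time snapshot of a dataset."""
    return shlex.join(["zfs", "snapshot", f"{dataset}@{snap_name}"])


def replication_cmd(dataset: str, prev_snap: Optional[str], new_snap: str,
                    target_host: str, target_dataset: str) -> str:
    """Build a 'zfs send | ssh ... zfs receive' pipeline as a string.

    With prev_snap set, only the changes between the two snapshots are
    sent (incremental); with prev_snap=None the full snapshot is sent.
    """
    send = ["zfs", "send"]
    if prev_snap is not None:
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{new_snap}")
    # -F lets the target roll back to its latest snapshot before receiving.
    receive = ["ssh", target_host, "zfs", "receive", "-F", target_dataset]
    return " | ".join(shlex.join(part) for part in (send, receive))


if __name__ == "__main__":
    # Hypothetical names: pool "tank", DR server "dr-freenas".
    print(snapshot_cmd("tank/data", "auto-0930"))
    print(replication_cmd("tank/data", "auto-0900", "auto-0930",
                          "dr-freenas", "backup/data"))
```

The first replication of a dataset would pass `prev_snap=None` to send the full snapshot; every later run names the previous snapshot so that only the changed blocks cross the wire.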

This is the network diagram explaining the FreeNAS ZFS replication setup.


The waning light of OpenStack Swift

By cfheoh | August 17, 2019 - 4:11 pm |August 17, 2019 API, Ceph, Cloud, Clusters, Containers, Data Management, Edge Computing, Filesystems, iSCSI, Kubernetes, Linux, NAS, NFS, Object Storage, Openstack, Redhat, Software Defined Storage, swiftstack, Unified Storage

The writing is on the wall

Through the storage lens, I already griped about the conundrum of OpenStack storage in Malaysia in last year’s 8th anniversary. And in the thick of this conundrum is OpenStack Swift. The granddaddy of OpenStack storage has not gotten much attention from technology vendors and service providers alike. For one, storage vendors have their own object storage offerings, and have little incentive to include OpenStack Swift in their technology development.

WekaIO controls their performance destiny

By cfheoh | March 17, 2019 - 5:33 pm |March 17, 2019 Amazon Web Services, Analytics, Appliance, Big Data, CIFS, Cloud, Deep Learning, Filesystems, Flash, High Performance Computing, Infiniband, Linux, Lustre, Machine Learning, Mellanox Technologies, NAS, NetApp, NFS, NVMe, Object Storage, PCIe, Performance Benchmark, Performance Caching, RDMA, Scale-out architecture, SMB, Software Defined Storage, Storage Field Day, Storage Optimization, Storage Tiering, Tech Field Day, Virtualization, WekaIO, Western Digital


[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

I was first introduced to WekaIO back in Storage Field Day 15. I did not blog about them back then, but I have followed their progress quite attentively throughout 2018. Two Storage Field Days and a year later, they were back for Storage Field Day 18 with a new CTO, Andy Watson, and several performance benchmark records.

Blowout year

2018 was a blowout year for WekaIO. They have experienced over 400% growth, placed #1 in the Virtual Institute IO-500 10-node performance challenge, and also became #1 in the SPEC SFS 2014 performance and latency benchmark. (Note: This record was broken by NetApp a few days later but at a higher cost per client)

The Virtual Institute for I/O IO-500 10-node performance challenge was particularly interesting, because it pitted WekaIO against the Oak Ridge National Lab (ORNL) Summit supercomputer, and WekaIO won. Details of the challenge were listed in Blocks and Files, and the WekaIO Matrix Filesystem became the fastest parallel file system in the world to date.

Control, control and control

I studied WekaIO’s architecture prior to this Field Day. And I spent quite a bit of time digesting and understanding their data paths, I/O paths and control paths, in particular, the diagram below:

Starting from the top right corner of the diagram, applications run on the Linux client (running the Weka Client software), and WekaIO presents itself to the Linux client as a POSIX-compliant file system. Through the network, the Linux client interacts with the WekaIO kernel-based VFS (virtual file system) driver, which coordinates the Front End (grey box in upper right corner) with the Linux client. Other client-based protocols such as NFS, SMB, S3 and HDFS are also supported. The Front End then interacts with the NIC (which can be 10/100G Ethernet, Infiniband, and NVMeoF) through SR-IOV (single root IO virtualization), bypassing the Linux kernel for maximum throughput. This is done with WekaIO’s own networking stack in user space.

StorPool – Block storage managed well

By cfheoh | March 13, 2019 - 8:34 am |March 13, 2019 100Gigabit Ethernet, API, Cloud, Clusters, Data Availability, Data Management, Disks, Filesystems, High Performance Computing, iSCSI, Linux, Lustre, Nexenta, NVMe, Openstack, Performance Benchmark, Performance Caching, Scale-out architecture, Server SAN, Software Defined Storage, Storage Field Day, Storage Optimization, Storpool, Tech Field Day, Virtualization, VMware


Storage technology is complex. Storage infrastructure and data management operations are not trivial, despite what hyperscalers like Amazon Web Services and Microsoft Azure would like you to think. As the adoption of cloud infrastructure services grows, small and medium businesses/enterprises (SMB/SME) are usually left to their own devices to manage the virtual storage infrastructure. Cloud Service Providers (CSPs) addressing the SMB/SME market are looking for easier, worry-free, software-defined storage to elevate their value to their customers.

Managed high performance block storage

Enter StorPool.

StorPool is a scale-out block storage technology, capable of delivering 1 million+ IOPS with sub-millisecond response times. As described by fellow delegate Ray Lucchesi in his recent blog, they were able to achieve these impressive performance numbers in their demo without a high throughput RDMA network or the storage class memory of Intel Optane.

Minio – the minimalist object storage technology

By cfheoh | February 7, 2019 - 10:57 am |February 7, 2019 Amazon Web Services, Analytics, API, Appliance, Big Data, Cloud, Data, Data Archiving, Data Fabric, Data Management, Data Protection, Data Security, Deep Learning, Disaster Recovery, Edge Computing, Filesystems, Flash, FreeNAS, HDS, High Performance Computing, Hitachi Vantara, Industry 4.0, Katana Logic, Linux, Machine Learning, Minio, NAS, NetApp, NFS, Object Storage, Openstack, Redhat, Scale-out architecture, Tech Field Day


The Marie Kondo Konmari fever is sweeping the world. Her methods for decluttering and organizing the home are leading to a new way of life – Minimalism.

Complicated Storage Experience

Storage technology and its architecture are complex. We add layer upon layer of abstraction and virtualization to storage design until, at some stage, choke points lead to performance degradation, and management becomes difficult.

I recall a particular training I attended back in 2006. I had just joined Hitachi Data Systems for the Shell GUSto project. I was in Baltimore for the Hitachi NAS course. This was not their HNAS (their BlueArc acquisition) but their home grown NAS based on Linux. In the training, we were setting up the NFS service. There were 36 steps required to set up and provision NFS, and if there was a misstep, you had to start from the first command again. Coming from NetApp at the time, it was horrendous. NetApp ONTAP NFS setup and provisioning probably took 3 commands, and this Hitachi NAS setup and configuration was so much more complex. In the end, the experience was just unworldly for me.

Introducing Minio to my world, to Malaysia


Sleepless in Malaysia with Object Storage

By cfheoh | January 22, 2019 - 1:42 pm |January 22, 2019 Amazon, Amazon Web Services, Analytics, API, Big Data, Cloud, Cloudian, Clusters, Data Management, Deep Learning, DellEMC, Dropbox, Filesystems, Google, Hadoop, Hadoop Clusters, HDS, IDC, IoT, iSCSI, Linux, Machine Learning, Microsoft, Minio, NAS, NFS, Object Storage, OpenIO, Redhat, Security, swiftstack

Object Storage? What’s that?

For the past couple of months, I have been speaking with a few parties in Malaysia about object storage technology. And I was fairly surprised with the responses.

The 2 reports

For a start, I did not set out to talk about object storage. It kind of fell into my lap. 2 recent Hitachi Vantara reports revealed that countries like Australia, Hong Kong and even South East Asian countries were behind in their understanding of what object storage was, and the benefits it brought to the new generation of web scale and enterprise applications.

The first report, an IDC survey sponsored by Hitachi Vantara, mentioned that 41% of the enterprises in Australia are not aware of object storage technology. In a similar survey, this one covering Hong Kong and China, the percentages were 38% and 35% respectively. I would presume that the percentages for countries in South East Asia would not fall too far from the apple tree.

How is Malaysia doing?

However, I worry that the numbers could be far more dire in Malaysia. In the past 2 months, responses from several conversations painted a darker picture of object storage technology among companies in Malaysia. These included a reasonably sized hosting company, a well-established systems integrator, a software development company, several storage practitioners in Openstack and a DellEMC regional consultant for unstructured data. The collective conclusion was that object storage technology was not only relatively unknown (probably in line with the percentages in the IDC/Hitachi Vantara reports), but also appeared to be shunned at this juncture. In web scale applications, Redhat Ceph block and file storage appeared popular, in contrast to Openstack Swift. In enterprise applications, it was a toss-up between iSCSI and NFS.

Image from https://zdnet4.cbsistatic.com/hub/i/r/2018/04/24/c79e9dfb-b4a9-46bb-b831-f2c57fdf8a1d/resize/470xauto/5e4846d1bc7a034c382baf6dcbb612ed/cloud-storage.jpg


Sexy HPC storage is all the rage

By cfheoh | November 26, 2018 - 10:44 am |November 26, 2018 100Gigabit Ethernet, Analytics, API, Artificial Intelligence, BeeGFS, CIFS, Clusters, Data Management, Deep Learning, DellEMC, Disks, E8 Storage, EMC, Excelero, Filesystems, Hadoop Clusters, High Performance Computing, Hyperconvergence, IBM, Infiniband, Intel, Linux, Lustre, Machine Learning, Mellanox, Memory Cloud, NAS, NetApp, NFS, Panasas, Performance Benchmark, Performance Caching, Pure Storage, RDMA, Scale-out architecture, SMB, Software-defined Datacenter, Storage Field Day, Tech Field Day, ThinkParq, WekaIO

HPC is sexy

There is no denying it. HPC is sexy. HPC Storage is just as sexy.

Looking at the latest buzz from Super Computing Conference 2018, which happened in Dallas 2 weeks ago, the number of storage related vendors participating was staggering. Panasas, Weka.io, Excelero and BeeGFS are the ones that I know of because I have friends posting their highlights. Then there are the perennial vendors like IBM, Dell, HPE, NetApp, Huawei, Supermicro, and so many more. A quick check on the SC18 website showed that there were 391 exhibitors on the floor.

And this is driven by the relentless demand for higher and higher computing performance, and along with it, the demand for faster and faster storage performance. Commercialization of Artificial Intelligence (AI), Deep Learning (DL) and newer applications and workloads, together with the traditional HPC workloads, is driving these ever increasing requirements. However, contrary to what many have been led to believe, most enterprise storage platforms were not designed to meet the demands of this new generation of applications and workloads. Why so?

I had a couple of conversations with a few well known vendors around the topic of HPC Storage, and several of the responses thrown back were to use Flash and NVMe to solve the high demands of HPC storage performance. In my mind, these responses were too trivial, too irresponsible. So I wanted to write this blog to share my views on HPC storage, and not just about its performance.

The HPC lines are blurring

I picked up this video (below) a few days ago. It was insideHPC’s Rich Brueckner interviewing Dr. Goh Eng Lim, HPE CTO and renowned HPC expert, about the convergence of traditional and commercial HPC applications and workloads.

I liked the conversation in the video because it addressed the 2 different approaches. And I welcomed Dr. Goh’s invitation to the Commercial HPC community to work with the Traditional HPC vendors to help push the envelope towards Exascale SuperComputing.


Pondering Redhat’s future with IBM

By cfheoh | October 30, 2018 - 5:19 am |October 30, 2018 Acquisition, Artificial Intelligence, Deep Learning, High Performance Computing, IBM, Linux, Machine Learning, Openstack, Redhat, Software-defined Datacenter, Virtualization


I woke up yesterday morning to a shocker of a news item. IBM announced that they were buying Redhat for USD34 billion. Never did I imagine that Redhat would sell, but I guess that USD190.00 per share was too tempting. Redhat (RHT) was trading at USD116.68 on the previous Friday’s close.

Redhat is one of my favourite technology companies. I love their Linux development and progress, and I use a lot of Fedora and CentOS in my hobbies. I started with Redhat back in 2000, when I became obsessed with getting my RHCE (Redhat Certified Engineer). I recall being in the office almost every weekend (Saturday and Sunday) back in 2002, learning Redhat and hacking scripts to become really good at it. I got certified with RHCE 4 with a 96% passing mark, and I was very proud of my certification.

One of my regrets was not joining Redhat in 2006. Josep Garcia offered me a job as an SE, the very first such position in Malaysia. Instead, I took up the Hitachi Data Systems job to helm the project implementation and delivery for the Shell GUSto project. Things might have turned out differently if I had.

The IBM acquisition of Redhat left a poignant feeling in me. In many ways, Redhat has been the shining star of Linux. They are the only significant one left leading the charge of open source. They are the largest contributors to the Openstack projects and continue to support the project strongly whilst early protagonists like HPE, Cisco and Intel have reduced their support. They have, of course, been perennial top 3 contributors to the Linux kernel since the very early days. And Redhat continues to contribute to projects such as containers and Kubernetes, and made that commitment deeper with their acquisition of CoreOS a few months back.

