Linux – Page 4 – Storage Gaga

FreeNAS 11.2 & 11.3 eBook

By cfheoh | August 31, 2020 - 9:15 am |August 31, 2020 Apple, Appliance, CIFS, Cloud, Filesystems, FreeNAS, iconik, Intel, iSCSI, iXsystems, Linux, Microsoft, Minio, NAS, NFS, Object Storage, Oracle, RAID, SMB, Snapshots, TrueNAS, Virtualbox


[ Full disclosure: I work for iXsystems™ Inc. This eBook was 3/4 completed when I joined on July 1, 2020 ]

I am releasing my FreeNAS™ eBook today. It was completed about 4 weeks ago, but I wanted the release date to be significant, and that date is August 31, 2020.

FreeNAS logo

Why August 31st? Because today is Malaysia’s Independence Day.

Why the book?

I am an avid book collector, specifically of IT and storage technology books. Since I started working on FreeNAS™ several years ago, I have wanted to find a book to learn from. But the FreeNAS™ books in the market are based on old versions of FreeNAS™. And the FreeNAS™ documentation is a User Guide that explains every feature without going deeper into integration with real-life networking services, or into situational applications such as SMB or NFS client configuration.

Since I have been doing a significant amount of feature “testing” of FreeNAS™ on Virtualbox™, from version 9.10 to the present version 11.3, I decided to fill that gap by writing a cookbook-style FreeNAS™ on Virtualbox™ book. It covers most of the real-life integration work with various requirements, including Active Directory, cloud integration and so on. All of it extends beyond the FreeNAS™ documentation.


Intel is still a formidable force

By cfheoh | August 17, 2020 - 9:15 am |August 17, 2020 Algorithm, Analytics, Artificial Intelligence, Big Data, Clusters, Composable Infrastructure, Cray Inc, Deep Learning, Disks, Edge Computing, Filesystems, Flash, High Performance Computing, Industry 4.0, Intel, IoT, Linux, Machine Learning, Performance Benchmark, Performance Caching, Scale-out architecture, SNIA, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day


It is easy to kick someone who is down. Bad news has stronger ripple effects than good news. Intel® is going through a rough patch, perhaps its worst so far. They delayed their 7nm manufacturing process, one which could have given Intel® breathing room in the CPU war with rival AMD. The process has now been pushed back to 2021, possibly 2022.

Intel Apple Collaboration and Partnership started in 2005

Their association with Apple® is coming to an end after 15 years, and more security flaws surfaced after the Spectre and Meltdown debacle. Extremetech probably said it best (or worst) last month:

We’ve never seen Intel® struggle like this

If we look deeper (and I am sure you have), all this negative news is related to their processors. Intel® is much, much more than that.

Their Optane™ storage prowess

I have years of association with the folks at Intel® here in Malaysia, dating back 20 years. I have hardly ever seen Intel® beat its own drum when it comes to storage technologies, but they are beginning to. The Optane™ revolution in storage has been a game changer. Optane™ enables the implementation of persistent memory or storage class memory, a performance tier that sits between DRAM and the SSD. The speed and, more notably, the latency of Optane™ are several times better than those of enterprise SSDs.

Intel pyramid of tiers of storage medium

If you want to know more about Optane™’s latency and speed, here is a very geeky article from Intel®:

Restoring the Balance between Bandwidth and Latency

The list of storage vendors who have embedded Intel® Optane™ into their gear is long. Vast Data, StorOne™, NetApp® MAX Data, Pure Storage® DirectMemory Modules, HPE 3PAR and Nimble Storage, Dell Technologies PowerMax and PowerScale, and many more cement Intel®’s storage prowess with Optane™.

3D XPoint, the Phase Change Memory technology behind Optane™, came from the joint venture between Intel® and Micron®. That partnership was dissolved in 2019, but it has not diminished the momentum of next generation Optane™. Alder Stream and Barlow Pass are going to be the Gen-2 SSD and Persistent Memory DC DIMM respectively. A screenshot of the Optane™ roadmap appeared in Blocks & Files last week.

Intel next generation Optane roadmap


The True Value of TrueNAS CORE

By cfheoh | July 20, 2020 - 9:00 am |July 21, 2020 Appliance, BeeGFS, CIFS, Clusters, Data Direct Networks, Datto, Filesystems, FreeNAS, High Performance Computing, iXsystems, Joyent, Linux, Lustre, NAS, NetApp, Nexenta, NFS, Pure Storage, QNAP, Scale-out architecture, Software Defined Storage, Tegile, TrueNAS


A funny thing came up on my Twitter feed last week. There was an ongoing online voting battle pitting FreeNAS™ (now to be known as TrueNAS® CORE) against Unraid. I wasn’t aware of it before that, and I will not comment on Unraid because I have no experience with the software. But let me share my philosophy and my thoughts on why I would choose TrueNAS® CORE over Unraid, and of course TrueNAS® Enterprise along with it. We have to bear in mind that TrueNAS® SCALE is in development and is expected next year, in 2021.

The new TrueNAS CORE logo

The real proving grounds

I have been in enterprise storage for a long time. If I were to count from the day I entered the industry, that was more than 28 years ago. When people talk about their first PC (personal computer), they would say Atari or Commodore 64, or something retro that was meant for home use. Not me.

The first computer I was affiliated with was a SUN SPARC®station 2 (SS2). I took it home (from the company I was working with), took it apart, and learned about the SBUS. My computer life started with a technology that was meant for businesses, for the enterprise. Heck, I even installed and supported a few Sun E10000s for 2 years when I was with Sun Microsystems. Since that SS2, my pursuit of knowledge, experience and worldview has revolved around storage technologies for the enterprise.

Open source software has also always interested me. I have tried a few file systems, including Lustre®, the parallel file system that powers some of the world’s supercomputers, and I am a certified BeeGFS® Systems Engineer too. In the end, for me, and for many, the real proving ground isn’t personal and home use. It is about storage systems and an OS that are built for the enterprise.


Glusterific!

By cfheoh | May 25, 2020 - 9:30 am |May 22, 2020 Acquisition, Algorithm, Appliance, Ceph, CIFS, Cloud, Containers, Disks, Filesystems, FreeNAS, Gluster, High Performance Computing, Hyperconvergence, IBM, Infiniband, Intel, Isilon, iXsystems, Linux, Lustre, NAS, NetApp, NFS, Object Storage, Openstack, Panasas, Quantum Corporation, RAID, RDMA, Redhat, Scale-out architecture, Server SAN, SMB, Software Defined Storage, Storage Optimization, TrueNAS, Virtualization


A conversation with a storage executive last week brought up Gluster, a clustered file system I have not explored in many years. I had one interaction months before its acquisition by RedHat® in 2011.

I remember the Gluster demo at Jaring over a video call, because I was the lead consultant pitching the scale-out NAS solution. It did not go well, and there were “bugs” which made the Head of IT flinch in her seat. Despite Jaring being Malaysia’s technology trailblazer, the impression Gluster left was forgettable. I stayed with the GlusterFS architecture a little while and then it dropped off my radar.

Gluster Scale Out NAS

But after the conversation last week, I am elated to revive my interest in Gluster, knowing that something big and impressive is coming to the fore very soon. Studying the architecture (again!), there are 2 parts of Gluster which excite me. One is the Brick and the other is the lack of a Metadata service.
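To picture how a cluster can get by without a metadata service, here is a minimal sketch of hash-based placement, in which every client works out the brick from the file name alone. The brick names and the SHA-1-modulo scheme are assumptions for illustration only; Gluster's own distribution logic assigns hash ranges to bricks and uses its own hash function, so treat this as the idea rather than the implementation.

```python
import hashlib

# Hypothetical bricks (server:/export pairs), named for illustration only.
BRICKS = [
    "server1:/data/brick1",
    "server2:/data/brick1",
    "server3:/data/brick1",
]


def locate_brick(filename: str, bricks=BRICKS) -> str:
    """Pick a brick from the file name alone, with no metadata lookup."""
    digest = hashlib.sha1(filename.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(bricks)
    return bricks[index]


if __name__ == "__main__":
    for name in ("report.pdf", "video.mp4", "notes.txt"):
        print(name, "->", locate_brick(name))
```

Because the mapping is a pure function of the name, any client arrives at the same answer independently. A plain modulo like this would reshuffle most files when a brick is added, which is one reason real systems use hash ranges instead.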


Down the rabbit hole with Kubernetes Storage

By cfheoh | May 19, 2020 - 9:30 am |May 16, 2020 Acquisition, Algorithm, Amazon Web Services, Analytics, API, Artificial Intelligence, Ceph, Cloud, Clusters, Containers, Data Management, Edge Computing, Elastifile, Filesystems, Flash, Google, Hyperconvergence, Kubernetes, Linux, Minio, NFS, Object Storage


Kubernetes is on fire. Last week VMware® released the State of Kubernetes 2020 report, which surveyed companies with 1,000 employees and above. The results were not surprising, as adoption of this nascent technology is booming. But persistent storage remains the nagging concern for Kubernetes as it serves infrastructure resources to application instances running in the containers of a pod in a cluster.

The standardization of storage resources has settled on CSI (Container Storage Interface). Storage vendors have almost, kind of, sort of agreed that API objects such as PersistentVolumes, PersistentVolumeClaims and StorageClasses, along with their parameters, would be the way to request storage resources from the Pre-provisioned Volumes via the CSI driver plug-in. There are already more than 50 vendor-specific CSI drivers on GitHub.
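As a rough illustration of how these API objects fit together, the sketch below builds a StorageClass and a PersistentVolumeClaim as plain Python dictionaries and prints them as JSON manifests. The field names follow the Kubernetes API, but the provisioner name `csi.example.com`, the `tier` parameter and the object names are made up for this example; a real CSI driver documents its own provisioner string and parameters.

```python
import json


def storage_class(name, provisioner, parameters=None):
    """Build a StorageClass manifest pointing at a CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,  # the CSI driver's registered name
        "parameters": parameters or {},  # driver-specific, opaque to Kubernetes
    }


def persistent_volume_claim(name, storage_class_name, size,
                            access_modes=("ReadWriteOnce",)):
    """Build a PersistentVolumeClaim that requests storage from a class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": list(access_modes),
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical driver name and parameter, for illustration only.
    sc = storage_class("fast-csi", "csi.example.com", {"tier": "flash"})
    pvc = persistent_volume_claim("data-claim", "fast-csi", "10Gi")
    print(json.dumps(sc, indent=2))
    print(json.dumps(pvc, indent=2))
```

The claim never names a vendor; it only names the class. That indirection is what lets the same application manifest run against different CSI drivers.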

Kubernetes and the CSI (Container Storage Interface) logos

The CSI plug-in method is the only way for Kubernetes to scale and keep its dynamic, loadable storage resource integration with external 3rd party vendors, all clamouring to grab a piece of this burgeoning demand, both in the cloud and in the enterprise.


Cloud Sync Prowess of FreeNAS

By cfheoh | May 4, 2020 - 9:00 am |May 4, 2020 Appliance, Business Continuity, CIFS, Cloud, Data Availability, Data Management, Data Protection, Data Security, Disaster Recovery, Dropbox, Filesystems, FreeNAS, Google, iconik, iXsystems, Katana Logic, Linux, NAS, NetApp, NFS, SMB, Snapshots, Software Defined Storage, TrueNAS, Wasabi Cloud


The COVID-19 situation has driven technology to find new ways to adapt to the new digital workspace. Difficulty in remote access to content files and media assets has disrupted the workflow of the practitioners of many business segments. Many are trying to find ways to get the files and folders into their home computers and laptops to do work when they were used to getting them from the regular NAS shared drives.

These challenges have put hybrid cloud file sharing at the forefront, making it the best possible option to access the NAS folders and files inside and outside the boundaries of the company’s network. However, end users are pressured to invest in new technologies to adjust to this new normal. It does not have to be this way, because FreeNAS™ (and, for that matter, TrueNAS®) has plenty of cloud help to offer. Most of the features are Free!

TrueNAS Core replacing FreeNAS in version 12.0

[ Note: FreeNAS™ will become TrueNAS® Core in release 12. The news was announced 2 months ago ]

FreeNAS™ Cloud Sync

One of the underrated features of FreeNAS™ is Cloud Sync. It was released in version 11.1 and it is invaluable in extending hybrid cloud file sharing to the masses. Cloud Sync makes the shares available to public cloud services such as AWS S3, Dropbox, Google Cloud Storage, Google Drive, Microsoft Azure Blob Storage, Microsoft OneDrive, pCloud, Wasabi™ Cloud and more. This means that the files and folders used within the NAS space in the LAN can be synchronized and used through the public cloud services mentioned.

There are 2 steps to set up Cloud Sync:

Add the Cloud Credentials for the cloud provider to use
Create the Cloud Sync Task
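The two steps can be pictured with a small model: a credential record, a task that refers to it, and the decision a one-way "push" sync has to make about what to upload and what to remove. Everything here (class names, fields, the size-only comparison) is a hypothetical sketch for illustration and not the FreeNAS™ implementation or its API.

```python
from dataclasses import dataclass


@dataclass
class CloudCredential:
    """Step 1 (hypothetical model): how to authenticate to a provider."""
    name: str
    provider: str
    access_key: str
    secret_key: str


@dataclass
class CloudSyncTask:
    """Step 2 (hypothetical model): what to sync, where, and in which direction."""
    description: str
    credential: CloudCredential
    direction: str  # "PUSH" (NAS to cloud) or "PULL" (cloud to NAS)
    local_path: str
    remote_bucket: str


def plan_push_sync(local_files, remote_files):
    """Return (to_upload, to_delete) so that the remote mirrors the local side.

    Both arguments map a relative path to a file size in bytes. Comparing
    sizes only is a simplification; real tools also compare times or checksums.
    """
    to_upload = sorted(
        path for path, size in local_files.items()
        if remote_files.get(path) != size
    )
    to_delete = sorted(path for path in remote_files if path not in local_files)
    return to_upload, to_delete


if __name__ == "__main__":
    cred = CloudCredential("my-s3", "S3", "EXAMPLE_KEY", "EXAMPLE_SECRET")
    task = CloudSyncTask("Nightly push", cred, "PUSH", "/mnt/tank/share", "my-bucket")
    local = {"a.txt": 10, "b.txt": 20}
    remote = {"a.txt": 10, "b.txt": 5, "old.txt": 1}
    print(task.description, plan_push_sync(local, remote))
```

Keeping the credential separate from the task means one set of provider keys can back several tasks, which is the reason the setup is split into two steps.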


Falconstor Software Defined Data Preservation for the Next Generation

By cfheoh | April 27, 2020 - 9:56 am |April 27, 2020 Amazon Web Services, Analytics, API, Appliance, Artificial Intelligence, Backup, Big Data, Business Continuity, Cloud, Clusters, Composable Infrastructure, compression, Containers, Data, Data Archiving, Data Availability, Data Corruption, Data Domain, Data Fabric, Data Management, Data Privacy, Data Protection, Data Security, deduplication, Deduplication, Digital Transformation, Disaster Recovery, Disks, eDiscovery, Falconstor, HDS, Linux, LTFS, LTO, LTO-8, Microsoft, Microsoft Azure, NetApp, RAID, Scale-out architecture, Software Defined Storage, Software-defined Datacenter, Starwind, Storage Tiering, Tape storage, virtual tape library, Virtualization, VTL


Falconstor® Software is gaining momentum. Given its arduous climb back to the fore, it is beginning to soar again.

Tape technology and Digital Data Preservation

I mentioned that long term digital data preservation is a segment within the data lifecycle which has merits and prominence. SNIA® has shown that this is a strongly growing market segment through its 2007 and 2017 “100 Year Archive” surveys. The 3 critical challenges of this long, long-term digital data preservation are to keep the archives:

Accessible
Undamaged
Usable

For the longest time, tape technology has been the king of the hill for digital data preservation. The technology is cheap, mature, and many enterprises have built their long term strategy around it. And the pulse in the tape technology market is still very healthy.

The challenges of tape remain. Every 5 years or so, companies have to consider moving the data on the existing tape technology to the next generation. It is widely known that LTO drives can read tapes from the previous 2 generations, and write to tapes from one generation before. The tape transcription process of migrating digital data for the sake of data preservation is bad because it affects the structural integrity and quality of the content of the data.

In my time covering Oil & Gas subsurface data management, I have seen NOCs (national oil companies) with 500,000 tapes of all generations, from 1/2″ to DDS, DAT to SDLT, 3590 to LTO 1-7. Millions are spent to transcribe these tapes every few years, and we have folks like Katalyst DM, Troika and more hovering over this landscape for their fill.
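The compatibility rule is easy to express as a small check, which also shows why old tapes eventually become unreadable on new drives. This is a sketch under stated assumptions: the "read two generations back, write one back" rule comes from the text and applies to drives up to LTO-7, while the narrower one-generation read window for LTO-8 and later is my understanding of the newer drives and should be verified against the vendor's compatibility matrix. The function names are illustrative.

```python
def lto_can_read(drive_gen: int, tape_gen: int) -> bool:
    """True if a drive of generation drive_gen can read a tape of tape_gen."""
    # Assumption: drives up to LTO-7 read two generations back;
    # LTO-8 and later read only one generation back.
    read_back = 2 if drive_gen <= 7 else 1
    return max(1, drive_gen - read_back) <= tape_gen <= drive_gen


def lto_can_write(drive_gen: int, tape_gen: int) -> bool:
    """True if a drive of generation drive_gen can write a tape of tape_gen."""
    # Drives write their own generation and the one before it.
    return max(1, drive_gen - 1) <= tape_gen <= drive_gen


if __name__ == "__main__":
    for tape in range(3, 8):
        print(
            f"LTO-7 drive, LTO-{tape} tape:",
            "read" if lto_can_read(7, tape) else "no read",
            "/",
            "write" if lto_can_write(7, tape) else "no write",
        )
```

Under this rule an archive written on LTO-4 is already out of reach of an LTO-7 drive, which is what forces the periodic migration.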


btrfs butter gone bad?

By cfheoh | March 30, 2020 - 7:49 am |March 30, 2020 Appliance, CIFS, Cloud, Data Corruption, Filesystems, FreeNAS, Linux, NAS, NFS, QNAP, Redhat, Reliability, SMB, Snapshots, Software Defined Storage, SuSE


I wrote about btrfs 8 years ago.

Since then, it has made its way into several small to mid-end storage solutions (more NAS inclined solutions) including Rockstor, Synology, Terramaster, and Asustor. In the Linux world, SUSE® Linux Enterprise Server and OpenSUSE® use btrfs as the default OS file system. I have decided to revisit the btrfs filesystem to give some thoughts about its future.

Have you looked under the hood?

The sad part is that not many people look under the hood anymore, especially in the market the btrfs storage vendors are targeting. Small and medium businesses just want storage that is cheap. But cheap comes with a risk: storage reliability and data integrity are often overlooked.

The technical conversation is secondary and thus the lack of queries for strong enterprise features may be leading btrfs to be complacent in its development.


Paradigm shift of Dev to Storage Ops

By cfheoh | March 2, 2020 - 5:47 am |March 2, 2020 Amazon Web Services, API, Artificial Intelligence, Ceph, Cloud, Composable Infrastructure, Containers, Data Management, Deep Learning, Docker, Drivescale, Edge Computing, Filesystems, Hadoop Clusters, High Performance Computing, IBM, Kubernetes, Linux, Liqid, Machine Learning, Minio, Object Storage, Performance Benchmark, Redhat, Scale-out architecture, Software Defined Storage, Storage Field Day, Tech Field Day, VMware


[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at the event. The content of this blog is of my own opinions and views ]

A funny photo (below) came up on my Facebook feed a couple of weeks back. In an honest way, it depicted how a developer would think (or not think) about the storage infrastructure designs and models for the applications and workloads. This also reminded me of how DBAs used to diss storage engineers. “I don’t care about storage, as long as it is RAID 10”. That was aeons ago 😉

The world of developers and the world of infrastructure people are vastly different. Since the birth of cloud computing, both worlds have collided, and programmable infrastructure-as-code (IaC) has become part and parcel of cloud native applications. Of course, there is no denying that there is friction.

Welcome to DevOps!

The Kubernetes factor

Containerized applications are quickly defining the cloud native applications landscape. The container orchestration machinery has one dominant engine – Kubernetes.

In the world of software development and delivery, DevOps has taken a liking to containers. Containers make it easier to host and manage the life-cycle of web applications inside a portable environment. A container packages up application code and other dependencies into building blocks to deliver consistency, efficiency, and productivity. To scale to a multi-application, multi-cloud environment with thousands and even tens of thousands of microservices in containers, the Kubernetes factor comes into play. Kubernetes handles tasks like auto-scaling, rolling deployment, compute resources, volume storage and much, much more, and it is designed to run on bare metal, in the data center, public cloud or even a hybrid cloud.
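To make two of those tasks concrete, here is a hedged sketch that builds a Deployment manifest with a rolling-update strategy and a volume backed by a PersistentVolumeClaim. The field names follow the Kubernetes `apps/v1` API; the application name, image, claim name and mount path are placeholders invented for this example.

```python
import json


def deployment(name, image, replicas, claim_name, mount_path="/data"):
    """Build a Deployment manifest with a rolling update and one PVC-backed volume."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Replace pods one at a time and never drop below the desired count.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [
                            {"name": "data", "mountPath": mount_path},
                        ],
                    }],
                    "volumes": [{
                        "name": "data",
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    print(json.dumps(deployment("web", "example/web:1.0", 3, "data-claim"), indent=2))
```

Note that several replicas sharing one claim only works when the underlying volume's access mode allows it, which is exactly the kind of storage detail developers tend to leave to the infrastructure team.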


DellEMC Project Nautilus Re-imagine Storage for Streams

By cfheoh | February 24, 2020 - 5:56 am |February 25, 2020 Algorithm, Analytics, API, Artificial Intelligence, Big Data, Cloud, Confluent, Data, Data Management, Deep Learning, Dell, DellEMC, Edge Computing, EMC, Fog Computing, Industry 4.0, InfluxDB, IoT, Isilon, Kubernetes, Linux, Machine Learning, Pravega, Storage Field Day, Tech Field Day


[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at this event. The content of this blog is of my own opinions and views ]

Cloud computing will have challenges processing data at the outer reach of its tentacles. Edge Computing, as it melds with the Internet of Things (IoT), needs a different approach to data processing and data storage. Data generated at source has to be processed at source, to respond to the event or events which have happened. Cloud Computing, even with 5G networks, has latency that is too high for an autonomous vehicle reacting to pedestrians on the road at speed, a sprinkler system activating in a fire, or even a fraud detection system signalling money laundering activities as they occur.

Furthermore, not all sensors, devices, and IoT end-points are connected to the cloud at all times. To understand this new way of data processing and data storage, have a look at this video by Jay Kreps, CEO of Confluent for Kafka® to view this new perspective.

Data is continuously and infinitely generated at source, and this data has to be compiled, controlled and consolidated with nanosecond precision. At Storage Field Day 19, an interesting open source project, Pravega, was introduced to the delegates by DellEMC. Pravega is an open source storage framework for streaming data and is part of Project Nautilus.

Rise of streaming time series Data

Processing data at source has a lot of advantages and this has popularized Time Series analytics. Many time series and streams-based databases such as InfluxDB, TimescaleDB, OpenTSDB have sprouted over the years, along with open source projects such as Apache Kafka®, Apache Flink and Apache Druid.

The data generated at source (end-points, sensors, devices) is serialized, timestamped (as each event occurs), continuous and infinite. These are the properties of a time series data stream, and to make sense of the streaming data, new data formats such as Avro, Parquet and ORC pepper the landscape along with the more mature JSON and XML, each with its own strengths and weaknesses.
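These properties can be shown in a few lines: the sketch below is an unbounded generator that emits one timestamped, JSON-serialized reading after another. The sensor name, field names and value range are invented for illustration, and JSON is used only because it needs no extra libraries; the same events could equally be encoded in one of the binary formats mentioned above.

```python
import itertools
import json
import random
import time


def sensor_stream(sensor_id, seed=0):
    """Yield an endless sequence of serialized, timestamped readings."""
    rng = random.Random(seed)
    for seq in itertools.count():  # continuous and infinite: never terminates
        event = {
            "sensor": sensor_id,
            "seq": seq,              # position of the event in the stream
            "ts": time.time(),       # timestamped as the event occurs
            "value": round(rng.uniform(20.0, 30.0), 2),
        }
        yield json.dumps(event)      # serialized for transport or storage


if __name__ == "__main__":
    # A consumer can only ever look at a window of an infinite stream.
    for line in itertools.islice(sensor_stream("sensor-1"), 3):
        print(line)
```

The consumer has to decide how much to take with `islice`, because the producer never signals an end. That is the mismatch with file and object storage, which assume data has a final size.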


DIY is difficult

Many time series projects started as DIY projects in many organizations. And many of them are still DIY projects in production systems as well. They depend on tribal knowledge, and these databases are tied to unmanaged storage which is not congruent with the properties of streaming data.

At the storage end, the technologies today still rely on the SAN and NAS protocols, and in recent years, S3, with object storage. Block, file and object storage introduce layers of abstraction which may not be a good fit for streaming data.

