Crash consistent data recovery for ZFS volumes

While TrueNAS® CORE and TrueNAS® Enterprise are better known for their NAS (network attached storage) prowess, many organizations are also confidently placing enterprise applications such as hypervisors and databases on TrueNAS® via SANs (storage area networks). Both the iSCSI and Fibre Channel™ protocols are well supported, the latter on selected TrueNAS® Enterprise storage models.

To reliably protect these block-based applications served over the SAN protocols, the ZFS snapshot is the key technology that can be depended upon to restore enterprise applications quickly. However, there is still some confusion about the state of the data recovered from ZFS snapshots. This situation is not unique to ZFS environments. As with many other storage technologies, the confusion often stems from a (mis)understanding of the consistency state of the data in backups and snapshots.

Crash Consistency vs Application Consistency

To dispel this misunderstanding, we must first understand a generic, filesystem-agnostic snapshot. It is a point-in-time copy, just like a data copy on tape, on disk or in a cloud backup. It is a complete image of the data and the state of the data at the storage layer at the time the storage snapshot was taken. This means that the data and metadata in this snapshot copy/version have a consistent state at that point in time. This state is frozen for this particular snapshot version, and therefore it is often labeled as “crash consistent”.

In the event of a subsystem (application, compute, storage, rack, site, etc.) failure or a power loss, data recovery can be initiated using the last known “crash consistent” state, i.e. by restoring from the last good backup or snapshot copy. Depending on the applications, operating systems, hypervisors, filesystems and the subsystems (journals, transaction logs, protocol resiliency primitives, etc.) that are aligned with them, some workloads will simply continue from where they stopped. They may already have their own recovery mechanisms, or they can accept some data loss as long as there is no data corruption or inconsistency.

Some applications, especially databases, are more sensitive to data and state consistency because of how they are designed. Take, for instance, the Oracle® database. When an Oracle® database instance is online, there is an SGA (system global area) which handles all the running mechanics of the database. The SGA exists in the memory of the compute along with transaction logs, tablespaces, and open files that represent the Oracle® database instance. From time to time, often measured in seconds, the state of the Oracle® instance and the data it is processing have to be synced to non-volatile, persistent storage. This commit is important to ensure the integrity of the data at all times.
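To make the distinction concrete, here is a minimal Python sketch of the idea. It is a toy model only, not how Oracle® or ZFS are implemented: the `ToyDatabase` class, its `buffer` and `storage` dictionaries, and the `storage_snapshot` helper are all hypothetical names invented for illustration. The point it tries to show is that a storage-layer snapshot only sees what has already been committed to persistent storage.

```python
import copy


class ToyDatabase:
    """Hypothetical model: a volatile buffer in front of persistent storage."""

    def __init__(self):
        self.storage = {}  # stands in for blocks on the storage volume
        self.buffer = {}   # stands in for uncommitted data held in memory

    def write(self, key, value):
        # A write lands in memory first and is lost if the host crashes.
        self.buffer[key] = value

    def commit(self):
        # Sync the in-memory state to persistent storage.
        self.storage.update(self.buffer)
        self.buffer.clear()


def storage_snapshot(db):
    # A storage-layer snapshot sees only the persistent storage,
    # never the memory of the application.
    return copy.deepcopy(db.storage)


db = ToyDatabase()
db.write("order-1", "paid")
db.commit()
db.write("order-2", "paid")  # still only in memory

# Taken without coordinating with the application: crash consistent.
crash_consistent = storage_snapshot(db)

# Taken after asking the application to commit first: application consistent.
db.commit()
app_consistent = storage_snapshot(db)

print(crash_consistent)
print(app_consistent)
```

In this sketch the first snapshot is perfectly valid at the storage layer, yet it is missing `order-2`, which is the same outcome as a sudden power loss at that moment. The second snapshot contains both orders because the application was given the chance to commit before the snapshot was taken.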

The future of Fibre Channel in the Cloud Era

The world has pretty much settled that hybrid cloud is the way to go for IT infrastructure services today. Straddled between the enterprise data center and the infrastructure-as-a-service in public cloud offerings, hybrid clouds define the storage ecosystems and architecture of choice.

A recent Blocks & Files article, “Broadcom server-storage connectivity sales down but recovery coming”, caught my attention. One segment mentioned that server-storage connectivity sales were down 9%, leading me to think, “Is this a blip, or is it a signal that Fibre Channel, the venerable SAN (storage area network) protocol, is on the wane?”

Thus, I am pondering the position of Fibre Channel SANs in the cloud era. Where does it stand now and in the near future?

The rise of RDMA

I have known of RDMA (Remote Direct Memory Access) for quite some time, but never in depth. Since my contract work ended last week and I have some time off for personal development, I decided to look deeper into RDMA. Why RDMA?

Over the past year or so, RDMA has been appearing on my radar very frequently, and rightly so. The speedy development and adoption of NVMe (Non-Volatile Memory Express) have pushed All Flash Arrays to the next level. This shifts the I/O and throughput performance bottlenecks away from the NVMe storage medium and into the legacy world of SCSI.

Most storage interfaces and protocols in use today, such as SAS, SATA, iSCSI and Fibre Channel, still carry SCSI loads and would have to translate between NVMe and SCSI. NVMe-to-SCSI bridges have to be present to facilitate the translation.

In the slide below, shared at the Flash Memory Summit, there were numerous red boxes which laid out the SCSI connections and interfaces where SCSI-to-NVMe translation (and vice versa) would be required.

Brocade is ripe again

Like seasonal fruit, Brocade is ready to be plucked from the Fibre Channel tree (again). A few years ago, it put itself up for sale. There were suitors, but no one offered to take up Brocade. Over the last few days, the rumour mill has been at it again, and although Brocade has not commented, the news is doing the rounds once more.

Why is Brocade up for sale? One can only guess. Their stock has been pounded over the past months and, as of last Friday, stood at USD4.51. The news mentioned that Brocade’s market capitalization is around USD2.7-2.8 billion, low enough to be acquired.

Brocade has been a fantastic Fibre Channel company in the past, and still pretty much is. They survived the first Fibre Channel shake-up, while companies like Vixel, Gadzoox, and Ancor are no longer on the Fibre Channel industry map. They thrived throughout, until Cisco MDS started to make dents in Brocade’s armour.

Today, a big portion of their business still relies on Fibre Channel to drive revenues and profits. A few years ago, in 2008, they acquired Foundry Networks, a Gigabit Ethernet company, and it was the right move as the world was converging towards 10 Gigabit. However, it is only in the past 2-3 years that Brocade has come out with a more direct approach rather than spending most of their time on their OEM business in this region. Perhaps this laggard approach and their inaction in the past have cost them their prime position, and now they are primed to be scooped up by probable suitors.

Who will be the probable suitors now? IBM, Oracle, Juniper and possibly even Cisco could be strong candidates. IBM makes a lot of sense because I believe IBM wants to own technology, and Brocade has a lot of technology and patents to offer. Oracle, hmm … they are not a hardware company. It is true that they bought Sun, but from my internal sources, Oracle is not cool with hardware innovations. They just want to sell more Oracle software licenses, keeping R&D and innovation on a short leash, and keeping R&D costs on Sun’s hardware low.

Juniper makes sense too, because they have a sizeable Ethernet business. I was a tad disappointed when I got to know that Juniper had started selling entry-level Gigabit switches, because I have always placed them at lofty heights with their routers. But I guess, as far as business goes, Juniper did the only natural thing – if there is money to be made, why not? If Juniper takes up Brocade, they can have 2 formidable storage networking businesses, Fibre Channel and Data Center Ethernet (DCE). The question now is – does Juniper want the storage business?

If Cisco buys Brocade, that would set off alarm bells everywhere. It would prompt the US authorities to look into the anti-competitive implications of the purchase. Unfortunately, Cisco has become a stagnant giant, and John Chambers, their CEO, is dying to revive the networking juggernaut. There were also rumours of Cisco breaking up to unlock the value of the many, many companies and technologies they acquired in the past. I believe buying Brocade does not help Cisco because, as with their other acquisitions in the past, there are too many technology similarities for Cisco to extract Brocade’s value.

We will not know how Brocade will fare in 2012, suitors or not, because they are indeed profitable. Unfortunately, the stock options scandal last year plus the poor track record of their acquisitions such as NuView, Silverback, and even Foundry Networks are not helping to put Brocade in a different light.

If the rumours are true, putting itself up for sale only cheapens the Brocade image. Quid proxima, Brocade?