Ryussi MoSMB – High performance SMB

I am back in the Silicon Valley as a Storage Field Day 12 delegate.

One of the early presenters was Ryussi, who was sharing a proprietary SMB server implementation of Linux and Unix systems. The first thing which comes to my mind was why not SAMBA? It’s free; It works; It has the 25 years maturity. But my experience with SAMBA, even in the present 4.x, does have its quirks and challenges, especially in the performance of large file transfers.

One of my customers uses our FreeNAS box. It’s a 50TB box for computer graphics artists and a rendering engine. After running the box for about 3 months, one case escalated to us was the SMB shares couldn’t be mapped all of a sudden. All the Windows clients were running version 10. Our investigation led us to look at the performance of SMB in the SAMBA 4 of FreeNAS.

This led to other questions such as the vfs_aio_pthread, FreeBSD/FreeNAS implementation of asynchronous I/O to overcome the performance weaknesses of the POSIX AsyncIO interface. The FreeNAS forum is flooded with sightings of missing SMB service that during large file transfer. Without getting too deep into the SMB performance issue, we decided to set the “Server Minimum Protocol” and “Server Maximum Protocol” to be SMB 2.1. The FreeNAS box at the customer has been stable now for the past 5 months.

Continue reading

Disaster Recovery has changed

Simple and affordable Disaster Recovery? Sounds oxymoronic, right?

I have thronged the small medium businesses (SMBs) space in the past few months. I have seen many SMBs resort to the cheapest form they can get their hands on. It could be a Synology here or a QNAP there, and that’s their backup plan. That’s their DR plan. When disaster strikes, they just shrug their shoulders and accept their fate. It could be a human error, accidental data deletion, virus infection, data corruption and recently, RANSOMware! But these SMBs do not have the IT resources to deal with the challenges these “disasters” bring.

Recently I attended a Business Continuity Institute forum organized by the Malaysian Chapter. Several vendors and practitioners spoke about the organization’s preparedness and readiness for DR. And I would like to stress the words “preparedness” and “readiness”. In the infrastructure world, we often put redundancy into the DR planning, and this means additional cost. SMBs cannot afford this redundancy. Furthermore, larger organizations have BC and DR coordinators who are dedicated for the purpose of BC and DR. SMBs probably has a person who double up an the IT administrator.

However, for IT folks, virtualization and cloud technologies are beginning to germinate a new generation of DR solutions. DR solutions which are able to address the simplicity of replication and backup, and at the same time affordable. Many are beginning to offer DR-as-a-Service and indeed, DR-as-a-Service has become a Gartner Magic Quadrant category. Here’s a look at the 2016 Gartner Magic Quadrant for DR-as-a-Service.


And during these few months, I have encountered 3 vendors in this space. They are sitting in the Visionaries quadrant. One came to town and started smashing laptops to jazz up their show (I am not going to name that vendor). Another kept sending me weird emails, sounding kind of sleazy like “Got time for a quick call?”

Continue reading

The transcendence of Data Fabric

The Register wrote a damning piece about NetApp a few days ago. I felt it was irresponsible because this is akin to kicking a man when he’s down. It is easy to do that. The writer is clearly missing the forest for the trees. He was targeting NetApp’s Clustered Data ONTAP (cDOT) and missing the entire philosophy of NetApp’s mission and vision in Data Fabric.

I have always been a strong believer that you must treat Data like water. Just like what Jeff Goldblum famously quoted in Jurassic Park, “Life finds a way“, data as it moves through its lifecycle, will find its way into the cloud and back.

And every storage vendor today has a cloud story to tell. It is exciting to listen to everyone sharing their cloud story. Cloud makes sense when it addresses different workloads such as the sharing of folders across multiple devices, backup and archiving data to the cloud, tiering to the cloud, and the different cloud service models of IaaS, PaaS, SaaS and XaaS.

Continue reading

The reverse wars – DAS vs NAS vs SAN

It has been quite an interesting 2 decades.

In the beginning (starting in the early to mid-90s), SAN (Storage Area Network) was the dominant architecture. DAS (Direct Attached Storage) was on the wane as the channel-like throughput of Fibre Channel protocol coupled by the million-device addressing of FC obliterated parallel SCSI, which was only able to handle 16 devices and throughput up to 80 (later on 160 and 320) MB/sec.

NAS, defined by CIFS/SMB and NFS protocols – was happily chugging along the 100 Mbit/sec network, and occasionally getting sucked into the arguments about why SAN was better than NAS. I was already heavily dipped into NFS, because I was pretty much a SunOS/Solaris bigot back then.

When I joined NetApp in Malaysia in 2000, that NAS-SAN wars were going on, waiting for me. NetApp (or Network Appliance as it was known then) was trying to grow beyond its dot-com roots, into the enterprise space and guys like EMC and HDS were frequently trying to put NetApp down.

It’s a toy”  was the most common jibe I got in regular engagements until EMC suddenly decided to attack Network Appliance directly with their EMC CLARiiON IP4700. EMC guys would fondly remember this as the “NetApp killer“. Continue reading

Praying to the hypervisor God

I was reading a great article by Frank Denneman about storage intelligence moving up the stack. It was pretty much in line with what I have been observing in the past 18 months or so, about the storage pendulum having swung back to DAS (direct attached storage). To be more precise, the DAS form factor I am referring to are physical server hardware that houses many disk drives.

Like it or not, the hypervisor has become the center of the universe in the IT space. VMware has become the indomitable force in the hypervisor technology, with Microsoft Hyper-V playing catch-up. The seismic shift of these 2 hypervisor technologies are leading storage vendors to place them on to the altar and revering them as deities. The others, with the likes of Xen and KVM, and to lesser extent Solaris Containers aren’t really worth mentioning.

This shift, as the pendulum swings from networked storage back to internal “direct-attached” storage are dictated by 4 main technology factors:

  • The x86 server architecture
  • Software-defined
  • Scale-out architecture
  • Flash-based storage technology

Anyone remember Thumper? Not the Disney character from the Bambi movie!


When the SunFire X4500 (aka Thumper) was first released in (intermission: checking Wiki for the right year) in 2006, I felt that significant wound inflicted in the networked storage industry. Instead of the usual 4-8 hard disk drives in the all the industry servers at the time, the X4500 4U chassis housed 48 hard disk drives. The design and architecture were so astounding to me, I even went and bought a 1U SunFire X4150 for my personal server collection. Such was my adoration for Sun’s technology at the time.

Continue reading

No Flash in the pan

The storage networking market now is teeming with flash solutions. Consumers are probably sick to their stomach getting a better insight which flash solution they should be considering. There are so much hype, fuzz and buzz and like a swarm of bees, in the chaos of the moment, there is actually a calm and discerning pattern slowly, but surely, emerging. Storage networking guys would probably know this thing well, but for the benefit of the other readers, how we view flash (and other solid state storage) becomes clear with the picture below: Flash performance gap

(picture courtesy of  http://electronicdesign.com/memory/evolution-solid-state-storage-enterprise-servers)

Right at the top, we have the CPU/Memory complex (labelled as Processor). Our applications, albeit bytes and pieces of them, run in this CPU/Memory complex.

Therefore, we can see Pattern #1 showing up. Continue reading

SMB Witness Protection Program

No, no, FBI is not in the storage business and there are no witnesses to protect.

However, SMB 3.0 has introduced a RPC-based mechanism to inform the clients of any state change in the SMB servers. Microsoft calls it Service Witness Protocol [SWP], and its objective is provide a much faster notification service allow the SMB 3.0 clients to do a failover. In previous SMB 1.0 and even in SMB 2.x, the SMB clients rely on time-out services. The time-out services, either SMB or TCP, could take up as much as 30-45 seconds, and this creates a high latency that is disruptive to enterprise applications.

SMB 3.0, as mentioned in my previous post, had a total revamp, and is now enterprise ready. In what Microsoft calls “Continuously Available” File Service, the SMB 3.0 supports clustered or scale-out file servers. The SMB shares must be shared as “Continuously Available” shares and mapped to SMB 3.0 clients. As shown in the diagram below (provided by SNIA’s webinar),

SMB 3.0 CA Shares

Client A mapping to Server 1 share (\\srv1\CAshr). Client A has a share “handle” that establishes a connection with a corresponding state of the session. The state of the session is synchronously kept consistent with a corresponding state in Server 2.

The Service Witness Protocol is not responsible for the synchronization of the states in the SMB file server cluster. Microsoft has left the HA/cluster/scale-out capability to the proprietary technology method of the NAS vendor. However, SWP regularly observes the status of all services under its watch. Continue reading

SMB on steroids but CIFS lord isn’t pleased

I admit it!

I am one of the guilty parties who continues to use CIFS (Common Internet File System) to represent the Windows file sharing protocol. And a lot of vendors continue to use the “CIFS” word loosely without knowing that it was a something from a bygone era. One of my friends even pronounced it as “See Fist“, which sounded even funnier when he said it. (This is for you Adrian M!)

And we couldn’t be more wrong because we shouldn’t be using the CIFS word anymore. It is so 90’s man! And the tell-tale signs have already been there but most of us chose to ignore it with gusto. But a recent SNIA Webinar titled “SMB 3.0 – New opportunities for Windows Environment” aims to dispel our incompetence and change our CIFS-venture to the correct word – SMB (Server Message Block).

A selfie photo of Dennis Chapman, Senior Technical Director for Microsoft Solutions at NetApp from the SNIA webinar slides above, wants to inform all of us that … SMB History Continue reading

VMware in step 1 breaking big 6 hegemony

Happy Lunar New Year! This is the Year of the Water Snake, which just commenced 3 days ago.

I have always maintain that VMware has to power to become a storage killer. I mentioned that it was a silent storage killer in my blog post many moons ago.

And this week, VMware is not so silent anymore. Earlier this week, VMware had just acquired Virsto, a storage hypervisor technology company. News of the acquisition are plentiful on the web and can be found here and here. VMware is seriously pursuing its “Software-Defined Data Center (SDDC)” agenda and having completed its software-defined networking component with the acquisition of Nicira back in July 2012, the acquisition of Virsto represents another bedrock component of SDDC, software-defined storage.

Who is Virsto and what do they do? Well, in a nutshell, they abstract the underlying storage architecture and presents a single, global namespace for storage, a big storage pool for VM datastores. I got to know about their presence last year, when I was researching on the topic of storage virtualization.

I was looking at Datacore first, because I was familiar with Datacore. I got to know Roni Putra, Datacore’s CTO, through a mutual friend, when he was back in Malaysia. There was a sense of pride knowing that Roni is a Malaysian. That was back in 2004. But Datacore isn’t the only player in the game, because the market is teeming with folks like Tintri, Nutanix, IBM, HDS and many more. It just so happens that Virsto has caught the eye of VMware as it embarks its first high-profile step (the one that VMware actually steps on the toes of the Storage Big 6 literally) into the storage game. The Big 6 are EMC, NetApp, IBM, HP, HDS and Dell (maybe I should include Fujitsu as well, since it has been taking market share of late)

Virsto installs as a VSA (virtual storage appliance) into ESXi, and in version 2.0, it plugs right in as an almost-native feature of ESXi, not a vCenter tab like most other storage. It looks and feels very much like a vSphere functionality and this blurs the lines of storage and VM management. To the vSphere administrator, the only time it needs to be involved in storage administration is when he/she is provisioning storage or expanding it. Those are the only 2 common “touch-points” that a vSphere administrator has to deal with storage. This, therefore, simplifies the administration and management job.

Here’s a look at the Virsto Storage Hypervisor architecture (credits to Google Images):

What Virsto does, as I understand from high-level, is to take any commodity storage and provides a virtual storage layer and consolidate them into a very large storage pool. The storage pool is called vSpace (previously known as LiveSpace?) and “allocates” Virsto vDisks to each VMs. Each Visto vDisk will look like a native zeroed thick VMDK, with the space efficiency of Linked Clones, but without the performance penalty of provisioning them.  The Virsto vDisks are presented as NFS exports to each VM.

Another important component is the asynchronous write to Virsto vLogs. This is configured at the deployment stage, and this is basically a software-based write cache, quickly acknowledging all writes for write optimization and in the background, asynchronously de-staged to the vSpace. Obviously it will have its own “secret sauce” to optimize the writes.

Within the vSpace, as disk clone groups internal to the Virsto, storage related features such as tiering, thin provisioning, cloning and snapshots are part and parcel of it. Other strong features of Virsto are its workflow wizard in storage provisioning, and its intuitive built-in performance and management console.

As with most technology acquisitions, the company will eventually come to a fork where they have to decide which way to go. VMware has experienced it before with its Nicira acquisition. It had to decide between VxLAN (an IETF standard popularized by Cisco) or Nicira’s own STT (Stateless Transport Tunneling). There is no clear winner because choosing one over the other will have its rewards and losses.

Likewise, the Virsto acquisition will have to be packaged in a friendly manner by VMware. It does not want to step on all toes of its storage Big 6 partners (yet). It still has to abide to some industry “co-opetition” game rules but it has started the ball rolling.

And I see that 2 critical disruptive points about this acquisition in this:

  1. It has endorsed the software-defined storage/storage hypervisor/storage virtualization technology and started the commodity storage hardware technology wave. This could the beginning of the end of proprietary storage hardware. This is also helped by other factors such as the Open Compute Project by Facebook. Read my blog post here.
  2. It is pushing VMware into a monopoly ala-Microsoft of the yesteryear. But this time around, Microsoft Hyper-V could be the benefactor of the VMware agenda. No wonder VMware needs to restructure and streamline its business. News of VMware laying off about 900 staff can be read here. Its unfavourable news of its shares going down can be read here.

I am sure the Storage Big 6 is on the alert and is probably already building other technology and partnerships beyond VMware. It the natural thing to do but there is no stopping VMware if it wants to step on the Big 6 toes now!

Protogon File System

I was out shopping yesterday and I was tempted to have lunch at Bar-B-Q Plaza, a popular Thai, Japanese-style hot plate barbeque restaurant in this neck of the woods. The mascot of this restaurant is Bar-B-Gon, a dragon-like character and it is obviously a word play of barbeque and dragon.

As I was reading the news this morning about the upcoming Windows Server 8 launch, I found out that ever popular, often ridiculed NTFS (NT File System) of Windows will be going away. It will be replaced by Protogon, a codename for the new file system that Microsoft is about to release. Protogon? A word play of prototype and dragon?

The new file system, with backward compatibility with NTFS, will be called ReFS or Resilient File System. And the design objectives of what Microsoft calls “next generation” file system are clear and adept to the present day requirements. I notably mentioned present day requirements for a reason, because when I went through the key features of ReFS, the concepts and the ideas are not exactly “next generation“. Many of these features are already present with most storage vendors we know of, but perhaps for the people in the Windows world, these features might sound “next generation” to them.

ReFS, to me, is about time. NTFS has been around for a long, long time. It was first known in the wild in the 1993, and gain prominence and wide acceptance in Windows 2000 as the “enterprise-ready” file system. Indeed it was, because that was the time Microsoft Windows started its dominance into the data centers when the Unix vendors were still bickering about their version of open standards. Active Directory (AD) and NTFS were the 2 key technologies that slowly, but surely, removed Unix’s strengths in the data centers.

But over the years, as the storage networking technologies like SAN and NAS were developing and maturing, I see the NTFS being little developed to meet the strengths of these storage networking technologies and relevant protocols in the data world. When I did  a little bit of system administration on Windows (2000, 2003 notably), I could feel that NTFS was developed with direct-attached storage (DAS) or internal disks in mind. Definitely not full taking advantage of the strengths of Fibre Channel or iSCSI SAN. It was only in Windows Server 2008, that I felt Microsoft finally had enough pussyfooting with SAN and NAS, and introduced a more decent disk storage management that incorporates features that works well natively with SAN. Now, Microsoft can no longer sit quietly without acknowledging the need to build enterprise-ready technologies related to storage networking and data management. And the core in the new Microsoft Windows Server 8 engine for that is the ReFS.

One of the key technology objectives in the design of ReFS is backward compatibility. Windows has a huge market to address and they cannot just shove NTFS away. The way they did was to maintain the upper level API and file semantics and having a new core file system engine as shown in the diagram below:

ReFS is positioned with resiliency in mind. Here are a few resilient features:

  • Ability to isolate fault and perform data salvation on parts of the file system without taking the entire file system or volume offline. The goal of REFS here is to be ONLINE and serving data all the time!
  • Checksumming data and metadata for integrity. It verifies all data, and in some cases, auto-correcting corrupted data
  • Optional integrity streams that ensures protection for all forms of file-level data corruption. When enabled, whenever a file is changed, the modified copy is written to a different area of the disk than that of the original file. This way, even if the write operation is interrupted and the modified file is lost, the original file is still intact. (Doesn’t this sounds like COW with snapshots?) When combined with Storage Spaces (we will talk about this later), which can store a copy of all files in a storage array on more than one physical disk, ReFS gives Windows a way to automatically find and open an uncorrupted version of a file In the event that a file on one of the physical disks becomes corrupted. Microsoft does not recommend integrity streams for applications or systems with a specific type of storage layout or applications which want better control in the disk storage, for example databases.
  • Data scrubbing for latent disk errors. There is an tool, integrity.exe which runs and manages the data scrubbing and integrity policies. The file attribute, FILE_ATTRIBUTE_NO_SCRUB_DATA, will allow certain applications to skip this options and have these applications control integrity policies beyond what ReFS has to offer.
  • Shared storage pools across machines for additional fault tolerance and load balancing (ala Oracle RAC perhaps?)
  • Protection against bit rot. Silent data corruption, which I have blogged about many, many moons ago.

End-to-end resilient architecture is the goal in mind.

From a file structure standpoint, here’s how ReFS looks like:

ReFS is Copy-on-Write (COW). As you know, I am a big fan of any file systems but COW is one that I am most familiar with. NetApp’s Data ONTAP, Oracle Solaris, ZFS and the upcoming Linux BTRFS are all implementations of COW. Similar to BTRFS, ReFS uses a B+ tree implementation and as described in Wikipedia,

ReFS uses B+ trees for all on-disk structures including metadata and file data. The file size, total volume size, number of files in a directory and number of directories in a volume are limited by 64-bit numbers, which translates to maximum file size of 16 Exbibytes, maximum volume size of 1 Yobibyte (with 64 KB clusters), which allows large scalability with no practical limits on file and directory size (hardware restrictions still apply). Metadata and file data are organized into tables similar to relational database. Free space is counted by a hierarchal allocator which includes three separate tables for large, medium, and small chunks. File names and file paths are each limited to a 32 KB Unicode text string.

In ReFS, Microsoft introduces Storage Spaces. And the concept is very, very similar to what ZFS is, with the seamless implementation of a volume manager, RAID management, and highly resilient file system. And ZFS is 10 years old. So much for ReFS being “next generation“.  But here is a series of screenshots of how Storage Spaces looks like:

And similar to this “flexible volume management” ala ONTAP FlexVol and ZFS file systems, you can add disk drives on the fly, and grow your volumes online and real time.

ReFS inherits many of the NTFS features as it inches towards the Windows Server 8 launch date. Some of the features mentioned were the BitLocker encryption, Access Control List (ACL) for security (naturally), Symbolic Links, Volume Snapshots, File IDs and Opportunistic Locking (Oplocks).

ReFS is intended to scale to as what Microsoft says, “to extreme limits“. Here is a table describing those limits:

ReFS new technology will certainly bring Windows to the stringent availability and performance requirements of modern day file systems, but the storage networking world is also evolving into the cloud computing space. Object-based file systems are also getting involved as market trends dictate new requirements and file systems, in order to survive, must continue to evolve.

Microsoft’s file system, NTFS took a long time to come to this present version, ReFS, but can Microsoft continue to innovate to change the rules of the data storage game? We shall see …