Is there no one to challenge EMC?

It’s been a busy, busy month for me.

And when the IDC Worldwide Quarterly Disk Storage Systems Tracker for 3Q12 came out last week, I was reading in awe at how impressive EMC's figures were. But most impressive of all is how the storage market continues to grow despite very challenging and uncertain business conditions. With the Eurozone crisis, China experiencing lower economic growth and the uncertainty in the US economy, it is unbelievable that total disk storage capacity shipped grew 24.4% y-o-y to reach 7,104PB. Yes folks, for the first time, more than 7 exabytes were shipped in a single quarter!

In the Top 5 of the external disk storage market based on revenue, only EMC and HDS recorded respectable growth, at 8.7% and 13.8% respectively. NetApp, my “little engine that could”, seems to be running out of steam, managing only 0.9% growth. The rest of the field, IBM and HP, recorded negative growth. Here’s a look at the Top 5 and the rest of the pack:

HP’s 11% decline is shocking to me, and given the woes HP has been experiencing, one after another, I don’t think HP has seen the bottom yet. Let’s hope that the new slew of HP storage products and technologies announced at HP Discover 2012 will lift them up. It also looks like a total rebranding of the HP storage products, with a big play on the word “Store”. They have names like StoreOnce, StoreServ, StoreAll, StoreVirtual, StoreEasy and perhaps more coming.

The Open SAN market, which includes iSCSI, has EMC again at Number 1 with 29.8%, followed by IBM (14%), HDS (12.2%) and HP (11.8%). In the combined NAS + Open SAN market, EMC has 33.5% while NetApp has 13.7%.

Of course, it is not just about external storage, because the direct-attached storage numbers count too. With those included, the server vendors IBM, HP and Dell are still placed behind EMC. Here’s a look at that table from IDC:

Dell is highlighted in the table above. Dell actually grew by 4.0%, compared to the declines at HP and IBM, and gained 0.1% of market share. However, their numbers were seen as too tepid and led to the exit of Darren Thomas, Dell’s storage group head honcho. News of Darren’s exit was reported on The Register.

I also want to note that NAS growth numbers actually outpaced Open SAN numbers including iSCSI.

This leads me to say that there is a dire need for NAS technical and technology expertise in the local storage market. With the adoption of NFSv4 under way and SMB 2.0 and 3.0 coming into the picture, I urge all storage networking professionals who are more pro-SAN to step out of their comfort zone and look into NAS as well. The world is changing and it is no longer SAN vs NAS anymore. And NFSv4.1 is blurring the lines even more with its concept of layouts.

But back to the subject of the storage market: is there no one out there challenging EMC in a big way? NetApp, some years ago, recorded double-digit growth and challenged EMC neck and neck, but that mantle seems to have been taken over by HDS. And both still have a long way to go to get close to EMC.

Kudos to the EMC team for damn good execution!

Linking Apple to SAN

Serendipity is how I would describe this encounter. I was introduced to Promise Technology’s Channel Sales Director earlier this week. When I saw his face, I realized that I already knew him, a Malaysian who previously worked at EMC in Taiwan and has been residing in that country for many years. We laughed and joked like old buddies and hence, the story begins …

I have known of Promise through its popular VTrak storage, which many Apple users here mistakenly assume is an Apple product. Here it is, appearing on Apple’s website:

Yes, that’s the 3rd picture from your left.

Another very strong Apple-related product from Promise Technology is Pegasus, a line of direct-attached storage (DAS) sporting Thunderbolt, a very fast interface that connects peripherals to Macs through the Mini DisplayPort. I found it strange to have a graphics display port used to connect to a storage device, but as I looked deeper into Thunderbolt, I found that the technology was meant to carry the PCIe bus together with DisplayPort over a single conduit that delivers high throughput serially. Impressive!

The picture below shows the Thunderbolt link connections. Intel provides two types of Thunderbolt controllers, a 2-port type and a 1-port type.

But Thunderbolt is not a network-based technology. It is channel-based, and hence, connecting it to a Fibre Channel SAN is like mixing oil with water. Apple is not known for accessing storage through Fibre Channel, and since Apple products do not have a Fibre Channel interface, Promise Technology has come up with a Thunderbolt-to-Fibre Channel converter. They call it SANLink. And the picture below shows how it is done:

The SANLink can also be daisy-chained. In the example below, the Pegasus DAS is daisy-chained to a SANLink, which then extends and expands its capacity with the Fibre Channel-connected VTrak Ex30 or Ex10.

The connectivity can be 8Gbps for the VTrak Ex30 and 4Gbps for the Ex10, and it has been validated to work with Mac OS X, Final Cut Studio and Apple’s Xsan.

This is targeted at Apple’s presence in the video editing and video production environment. I have one customer using our storage for Apple file sharing, and I realized that these guys work in isolation most of the time. They are like a sheep-shearing house, taking one job, working on it a bit and then passing it on to a colleague for the next stage in the video production process. Sharing is clearly not well practised in this type of environment. And Promise wants to change that by getting those hermit-like video editors and producers to share and collaborate in their work.

I don’t know of many vendors besides Apple that push Thunderbolt technology. It is a very high-performance interface, capable of delivering 20Gbps, but I am afraid that Thunderbolt may suffer from the Apple-only syndrome.

Apple tends to be very cutting-edge when it comes to most technologies that go into their products. That makes Apple a high risk taker, and that puts Thunderbolt into the risky category where, if Apple fails, Thunderbolt fails. So far, I have not seen Thunderbolt spreading like wildfire, but opening Apple to SAN, both iSCSI and Fibre Channel, is good. It is time Apple embraced more of the storage networking technologies and standards out there, rather than being steadfast with their proprietary implementation of storage. Apple Filing Protocol (AFP) and Thunderbolt (for now) come to mind. It is good to be stubborn but …

Protogon File System

I was out shopping yesterday and I was tempted to have lunch at Bar-B-Q Plaza, a popular Thai, Japanese-style hot plate barbeque restaurant in this neck of the woods. The mascot of this restaurant is Bar-B-Gon, a dragon-like character, and the name is obviously a play on the words barbeque and dragon.

As I was reading the news this morning about the upcoming Windows Server 8 launch, I found out that the ever popular, often ridiculed NTFS (NT File System) of Windows will be going away. It will be replaced by Protogon, the codename for the new file system that Microsoft is about to release. Protogon? A play on the words prototype and dragon?

The new file system, with backward compatibility with NTFS, will be called ReFS or Resilient File System. And the design objectives of what Microsoft calls a “next generation” file system are clear and well suited to present day requirements. I deliberately say present day requirements for a reason, because when I went through the key features of ReFS, the concepts and the ideas are not exactly “next generation”. Many of these features are already present with most storage vendors we know of, but perhaps to the people in the Windows world, these features sound “next generation”.

ReFS, to me, is about time. NTFS has been around for a long, long time. It was first seen in the wild in 1993, and gained prominence and wide acceptance in Windows 2000 as the “enterprise-ready” file system. Indeed it was, because that was the time Microsoft Windows started its push into the data centers, while the Unix vendors were still bickering about their versions of open standards. Active Directory (AD) and NTFS were the two key technologies that slowly, but surely, eroded Unix’s strengths in the data centers.

But over the years, as storage networking technologies like SAN and NAS were developing and maturing, I saw NTFS doing little to take advantage of the strengths of these storage networking technologies and the relevant protocols in the data world. When I did a little bit of system administration on Windows (2000 and 2003 notably), I could feel that NTFS was developed with direct-attached storage (DAS) or internal disks in mind, and definitely not fully taking advantage of the strengths of Fibre Channel or iSCSI SAN. It was only in Windows Server 2008 that I felt Microsoft had finally had enough of pussyfooting around SAN and NAS, and introduced more decent disk storage management with features that work well natively with SAN. Now, Microsoft can no longer sit quietly without acknowledging the need to build enterprise-ready technologies for storage networking and data management. And the core of the new Microsoft Windows Server 8 engine for that is ReFS.

One of the key technology objectives in the design of ReFS is backward compatibility. Windows has a huge market to address and they cannot just shove NTFS away. The way they did it was to maintain the upper-level API and file semantics while building a new core file system engine underneath, as shown in the diagram below.
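To make that layering idea concrete, here is a minimal, hypothetical Python sketch of an upper-level file API staying constant while the core engine underneath is swapped. The class and method names are my own illustration, not anything from Microsoft:

```python
# Hypothetical sketch only: the same upper-level file semantics
# delegating to interchangeable core file system engines.

class NtfsEngine:
    def write_block(self, path: str, data: bytes) -> None:
        print(f"NTFS-style update-in-place for {path}")

class RefsEngine:
    def write_block(self, path: str, data: bytes) -> None:
        print(f"ReFS-style checksummed, allocate-on-write update for {path}")

class WindowsFileApi:
    """The upper-level API and file semantics that applications see."""
    def __init__(self, engine):
        self.engine = engine          # the core engine is swappable

    def write_file(self, path: str, data: bytes) -> None:
        # Applications call the same API regardless of the engine below.
        self.engine.write_block(path, data)

# Application code does not change when the engine does.
WindowsFileApi(NtfsEngine()).write_file(r"C:\data\report.docx", b"hello")
WindowsFileApi(RefsEngine()).write_file(r"C:\data\report.docx", b"hello")
```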

ReFS is positioned with resiliency in mind. Here are a few of its resiliency features:

  • Ability to isolate faults and perform data salvage on parts of the file system without taking the entire file system or volume offline. The goal of ReFS here is to be ONLINE and serving data all the time!
  • Checksumming of data and metadata for integrity. It verifies the data and, in some cases, auto-corrects corrupted data
  • Optional integrity streams that ensure protection against all forms of file-level data corruption. When enabled, whenever a file is changed, the modified copy is written to a different area of the disk than that of the original file. This way, even if the write operation is interrupted and the modified file is lost, the original file is still intact. (Doesn’t this sound like COW with snapshots?) When combined with Storage Spaces (we will talk about this later), which can store a copy of all files in a storage array on more than one physical disk, ReFS gives Windows a way to automatically find and open an uncorrupted version of a file in the event that a file on one of the physical disks becomes corrupted. A simple sketch of this checksum-and-repair idea follows this list. Microsoft does not recommend integrity streams for applications or systems with a specific type of storage layout, or for applications that want finer control over the disk storage, for example databases.
  • Data scrubbing for latent disk errors. There is a tool, integrity.exe, which runs and manages the data scrubbing and integrity policies. The file attribute FILE_ATTRIBUTE_NO_SCRUB_DATA allows certain applications to skip this option and control integrity policies beyond what ReFS has to offer.
  • Shared storage pools across machines for additional fault tolerance and load balancing (a la Oracle RAC perhaps?)
  • Protection against bit rot, the silent data corruption which I have blogged about many, many moons ago.
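Here is the simple sketch promised in the list above: a toy Python model of the checksum-and-repair idea behind integrity streams combined with a mirrored Storage Space. This is my own simplification for illustration, not Microsoft’s implementation, and SHA-256 is just a stand-in for whatever checksum ReFS uses internally.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class MirroredStore:
    """Toy model: every block lives on two 'disks' along with its checksum."""
    def __init__(self):
        self.disk_a = {}   # block_id -> data
        self.disk_b = {}   # mirror copy
        self.sums = {}     # block_id -> expected checksum

    def write(self, block_id: str, data: bytes) -> None:
        self.disk_a[block_id] = data
        self.disk_b[block_id] = data
        self.sums[block_id] = checksum(data)

    def read(self, block_id: str) -> bytes:
        data = self.disk_a[block_id]
        if checksum(data) != self.sums[block_id]:
            # Corruption detected: fetch the mirror copy and repair in place.
            good = self.disk_b[block_id]
            assert checksum(good) == self.sums[block_id], "both copies corrupt"
            self.disk_a[block_id] = good
            return good
        return data

store = MirroredStore()
store.write("blk1", b"important data")
store.disk_a["blk1"] = b"bit-rotted data"    # simulate silent corruption on one disk
print(store.read("blk1"))                    # still returns b'important data'
```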

End-to-end resilient architecture is the goal in mind.

From a file structure standpoint, here’s what ReFS looks like:

ReFS is Copy-on-Write (COW). As you know, I am a big fan of file systems, and COW is the approach I am most familiar with. NetApp’s Data ONTAP (WAFL), Oracle Solaris ZFS and the upcoming Linux BTRFS are all implementations of COW.
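To show what COW means in practice, here is a tiny Python sketch of the basic idea: a modified block is written to fresh space and the block map is flipped only after the write completes, so the old version stays intact (which is also why snapshots are cheap). This is my own toy model, not how ReFS, ONTAP, ZFS or BTRFS actually lay out data on disk:

```python
class CowFile:
    """Toy copy-on-write file: blocks are never overwritten in place."""
    def __init__(self):
        self.blocks = {}       # physical address -> data
        self.block_map = {}    # logical block number -> physical address
        self.next_addr = 0

    def _allocate(self, data: bytes) -> int:
        addr = self.next_addr
        self.blocks[addr] = data
        self.next_addr += 1
        return addr

    def write(self, logical_block: int, data: bytes) -> None:
        new_addr = self._allocate(data)            # write new version to fresh space first ...
        self.block_map[logical_block] = new_addr   # ... then flip the pointer

    def read(self, logical_block: int) -> bytes:
        return self.blocks[self.block_map[logical_block]]

f = CowFile()
f.write(0, b"version 1")
snapshot = dict(f.block_map)    # a "snapshot" is just a copy of the block map
f.write(0, b"version 2")
print(f.read(0))                # b'version 2'
print(f.blocks[snapshot[0]])    # b'version 1' is still intact
```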

Similar to BTRFS, ReFS uses a B+ tree implementation, and as described in Wikipedia:

ReFS uses B+ trees for all on-disk structures including metadata and file data. The file size, total volume size, number of files in a directory and number of directories in a volume are limited by 64-bit numbers, which translates to a maximum file size of 16 exbibytes and a maximum volume size of 1 yobibyte (with 64 KB clusters), which allows large scalability with no practical limits on file and directory size (hardware restrictions still apply). Metadata and file data are organized into tables similar to a relational database. Free space is counted by a hierarchical allocator which includes three separate tables for large, medium, and small chunks. File names and file paths are each limited to a 32 KB Unicode text string.
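The quoted limits are easy to sanity-check, assuming 64-bit counts and 64 KB clusters:

```python
# Quick back-of-the-envelope check of the quoted ReFS limits.
cluster_size = 64 * 2**10            # 64 KiB clusters
max_clusters = 2**64                 # cluster count held in a 64-bit number

max_file_size = 2**64                # bytes, from a 64-bit file size field
max_volume_size = max_clusters * cluster_size

print(max_file_size == 16 * 2**60)   # True -> 16 exbibytes
print(max_volume_size == 2**80)      # True -> 1 yobibyte
```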

Alongside ReFS, Microsoft introduces Storage Spaces. And the concept is very, very similar to ZFS, with the seamless combination of a volume manager, RAID management and a highly resilient file system. And ZFS is 10 years old. So much for ReFS being “next generation”. But here is a series of screenshots of what Storage Spaces looks like:

And with this “flexible volume management”, a la ONTAP FlexVol and ZFS, you can add disk drives on the fly and grow your volumes online, in real time.

ReFS inherits many NTFS features as it inches towards the Windows Server 8 launch date. Some of the features mentioned were BitLocker encryption, Access Control Lists (ACLs) for security (naturally), symbolic links, volume snapshots, file IDs and opportunistic locking (oplocks).

ReFS is intended to scale, as Microsoft puts it, “to extreme limits”. Here is a table describing those limits:

ReFS will certainly bring Windows up to the stringent availability and performance requirements of modern day file systems, but the storage networking world is also evolving into the cloud computing space. Object-based file systems are getting into the picture as market trends dictate new requirements, and file systems, in order to survive, must continue to evolve.

Microsoft’s file system took a long time to get from NTFS to ReFS, but can Microsoft continue to innovate and change the rules of the data storage game? We shall see …

Storage Architects no longer required

I picked up a new article this afternoon from SearchStorage – titled “Enterprise storage trends: SSDs, capacity optimization, auto tiering”. I cannot help but notice that it echoes some of the things I have been writing about: VMware being the storage killer and the rise of cloud computing taking away our jobs.

I did receive some feedback about what I wrote in the past and, after reading the SearchStorage article, I can’t help but feel justified. In the sidebar, it read:

 

The rise of virtual machine-specific and cloud storage suggest that other changes are imminent. In both cases …. and would no longer require storage architects and managers.

Things are changing at an extremely fast pace and for those of us still languishing in the realms of NAS and SAN, our expertise could be rendered obsolete pretty quickly.

But all is not lost, because it would be easier for a storage engineer, who already has the foundation, to move into the virtualization space than for a server virtualization engineer to come down and learn the storage fundamentals. We can either choose to be dinosaurs or be the species of the next generation.

The rise of the specialized appliance

Compute and storage are two components within the IT infrastructure that are surely converging. SAN and NAS are facing their greatest adversary yet, and could be made insignificant if the cloud and virtualization game has its way. This is giving rise to a new breed of solution, a specialized appliance where compute and storage are ONE. Rising from the ashes of shared storage (SAN and NAS, take note), we are beginning to see things going back to the way of direct, internal storage.

There were some scuffles in the bushes about 5 years ago, when Sun (now Oracle) was ahead of the game. The Sun Fire X4500 (aka Thumper) was one of the strong candidates to challenge the SAN/NAS duopoly in this networked storage period. The X4500 integrated the server and the storage components together, using ZFS as the file system and volume manager to deliver very high throughput across all the JBOD disks very efficiently. ZFS provided the RAID, so there was no need for specialized RAID hardware. This proved that a very high performance storage solution could be built easily from standard off-the-shelf components on the x86 architecture. By combining compute and storage together, there were hints that the industry was about to turn to Direct-Attached Storage (DAS) again, despite its perceived weaknesses against SAN and NAS.

Unfortunately, the applications were not ready for DAS then. ZFS aside, applications such as databases, email and file servers were not ready to jump onto the DAS bandwagon and ride off into the sunset. But the fairy tale seems to be retold again, and this time, the evidence that DAS could rise again is much stronger.

The catalyst to this disruptive force? Virtualization!

I mentioned that VMware is the silent storage killer a few blogs ago. Needless to say, that ruffled a few feathers among the readers. I have no doubt that virtualization is changing how we storage guys look at SAN and NAS. In a traditional setup, the SAN or NAS is set up to provision LUNs or mount points as the data storage for VMFS volumes in the VMware environment. It is then up to the storage array to provide snapshots, replication, thin provisioning and so on.

Perhaps VMware is nitpicking that managing storage arrays for VMFS volumes is difficult. From the VMware administrators’ view, they are right. They don’t want to know what’s going on below the VM level. All they want is storage, any kind of storage, and VMware will manage the volumes, snapshots, replication and thin provisioning. Indeed, they have been doing that since the vStorage APIs were introduced. In the new vSphere 5.0 release, the ante has been upped even higher, making networked storage less and less significant.

If you want to know about vStorage API and stuff, below is a diagram of the integration of the various components at the VMware API level.

 

VMware can now make direct, internal storage look like shared storage. The Virtual Storage Appliance (VSA) does just that. VMware already has a thriving market of virtual appliances from the community and hobbyists.

The appliance market has now evolved into new infrastructure too. Using the x86 architecture and off-the-shelf infrastructure components (sounds familiar?), companies such as Nutanix and Tintri are taking advantage of this booming trend to introduce specialized VMware appliances, as shown in the advertisements on their respective websites.

Here’s the Nutanix Ad:

 

Here’s the Tintri Ad:

 

Both Tintri and Nutanix are a new breed of appliances – specialized appliances for VMware.

At the same time, other applications are getting specialized appliances built for them as well. I have mentioned Oracle Exadata many times in the past, and Oracle Exadata is the perfect example of a fine-tuned, hardcore database engine built to make Oracle run at the best performance possible.

Likewise, HP has announced their E5000 Messaging System for Microsoft Exchange. The E5000 is a specialized appliance optimized and well tuned for Microsoft Exchange Server 2010. In the words of HP:

“HP E5000 Messaging System is the industry’s first fully self-contained platform built for the next generation of Microsoft Exchange to deliver enterprise-class messaging to businesses of all sizes. Built as a turnkey solution that can be up and running in a few hours vs. days, the HP E5000 Messaging System gives business users the experience they want most: large mailboxes, centralized archiving of mailbox files and 24×7 access from any device. IT staffs benefit from the solution’s simplicity to set up, scale and manage, and to meet new demands affordably. Ideal for multi-site enterprises as well as branch office and remote office environments, each HP Messaging System delivers greater simplicity and accelerates deployment with preconfigured solutions starting at 500 mailboxes up to 3000 mailboxes, while delivering large, 1 to 2.5GB mailbox sizes. Clients can grow by adding storage capacity or more appliances within the environment, up from hundreds to thousands of mailboxes.”

What are the specs of this E5000 box, you say? Here you go:

 

And look at Row#2 in the table above … Direct, Internal Disks! Look at Row #4, Xeon CPUs! Both Compute and Storage in the same appliance!

While the HP E5000 announcement is recent, Hitachi Data Systems was already in the game early with its Unified Compute Platform and its Converged Platform for Microsoft Exchange, built on much the same idea – specialized appliances.

Perhaps the HDS solutions aren’t exactly direct, internal storage, but the concept is still the same – a specialized appliance. The HDS Unified Compute Platform (UCP) has these components:

 

The HDS Converged Platform for MS Exchange provides their specialized “appliance” with Reference Architectures that can support up to 68,000 Microsoft Exchange mailboxes. Here’s an architecture diagram of their “appliance”:

 

There’s no denying that the networked storage landscape is changing. So are the computing platforms. We are already seeing the compute and storage components being integrated together, tighter than ever. The wave is rising for specialized appliances and it can only get more intense from now on.

No wonder HP’s Converged Infrastructure vision is betting on the x86 architecture, simple storage platforms with SAS/SATA disks, and virtualization. Other vendors are doing the same – Cisco, NetApp and VMware with their FlexPod solution, and EMC with their Vblocks of VMware, Cisco and EMC storage.

Hail to the Rise of the Specialized Appliance!

Will SAN or NAS matter if your customer’s storage is in the Cloud?

An interesting question popped into my head yesterday. With all this push into the cloud, the customer no longer owns most of the computing equipment. They are just getting services, and when they want storage, do you think they care whether it sits on a SAN or NAS?

I have mentioned this before: the cloud makes a lot of IT stuff irrelevant. Read my previous blog. This means that the demand for IT techies, sysadmins and consultants will suddenly be squeezed into the very good, the good, the not-so-good and the downright bad. Let the survival-of-the-fittest games begin!

Yes, the SAN and NAS, or even unified storage, story doesn’t hold much weight anymore. However, the cloud service providers will be out there looking for what is best for their bottom line, whether it is a branded box or just a white box they build on their own. For those providers who have strong financials, investing in premium brands like EMC, IBM, NetApp and so on obviously makes sense, because they need someone to blame and penalize when the shit hits the fan. For those who don’t have the financial prowess, this presents a whole new economy that resellers, partners and distributors can tap into – building for these cloud providers at a cheaper price (hint, hint).

However, storage relies on a strong storage operating system to do just that, and there are plenty of open source ones. Hey, you can practically build a simple iSCSI or NAS box with Linux. Consumer-grade NAS vendors such as NETGEAR, Synology and D-Link have been using open-source Linux to penetrate the low-end, home storage market for years. The cloud providers will be a different ballgame, but the storage piece is fundamentally the same.

Things are changing, folks, and for those of you who are consultants, product pre-sales, post-sales, sysadmins and operators of storage, you have to evolve to meet this new market. SAN and NAS do not matter anymore when customers are using cloud services.

p/s: I have been spending time looking at some very, very cool cloud-ready storage operating systems. If you have the time, leave me a comment and we’ll talk. :D