SNIA – Storage Gaga

Open Source Storage Technology Crafters

By cfheoh | October 4, 2021 - 8:00 am |October 4, 2021 Appliance, Ceph, Cloud, Data Fabric, Data Management, Delphix, Filesystems, FreeNAS, Hadoop, Hadoop Clusters, iXsystems, Linux, Lustre, Minio, NAS, Object Storage, Openstack, OpenZFS, Oracle, Redhat, SMB, Snapshots, SNIA, SoftIron, Software Defined Storage, TrueNAS, Unified Storage, Virtualbox, Virtualization

Storage Assemblers

Anybody can “build” a storage system with open source storage software. Put the software together with any commodity x86 server, and it can function with the basic storage services. Most open source storage software can do the job pretty well. However, once the completed storage technology is put together, can it do the job well enough to serve a business critical end user? I have plenty of sob stories from end users I have spoken to in these many years in the industry related to so-called “enterprise” storage vendors. I wrote a few blogs in the past that related to these sad situations:

We have such storage offerings rigged with cybersecurity risks and holes too. In a recent Unit 42 report, 250,000 NAS devices are vulnerable and exposed to the public Internet. The brands in question are mentioned in the report.

I would categorize these as storage assemblers.

Continue reading →

Encryption Key Management in TrueNAS

By cfheoh | March 29, 2021 - 9:00 am |March 28, 2021 API, Data Protection, Data Security, FreeNAS, iXsystems, OpenZFS, Security, TrueNAS

Key Management Interoperability Protocol (KMIP)

One of the prominent cybersecurity features in TrueNAS® Enterprise is KMIP support in version 12.0.

What is KMIP? KMIP is a client-server framework for encryption key management. It is a standard released in 2010 and governed by OASIS Open. OASIS stands for Organization for the Advancement of Structured Information Standards.

Continue reading →

Can NetApp do it a bit better?

By cfheoh | March 11, 2017 - 11:21 pm |March 12, 2017 Acquisition, Amazon, Appliance, Backup, Business Continuity, Cloud, Data Archiving, Data Fabric, Data Management, Disaster Recovery, Elastifile, Gartner, Hyperconvergence, Interica, NetApp, Nimble Storage, NVMe, Object Storage, SNIA, Software Defined Storage, Software-defined Datacenter, Storage Market Share, Storage Tiering, Virtualization

1 Comment

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

In Day 2 of Storage Field Day 12, I and the other delegates were hustled to NetApp’s Sunnyvale campus headquarters. That was a homecoming for me, and it was a bit ironic too.

Just 8 months ago, I was NetApp Malaysia Country Manager. That country sales lead role was my second stint with NetApp. I lasted almost 1 year.

17 years ago, my first stint with NetApp was the employee #2 in Malaysia as an SE. That SE stint went by quickly for 5 1/2 years, and I loved that time. Those Fall Classics NetApp used to have at the Batcave and the Fortress of Solitude left a mark with me, and the experiences still are as vivid as ever.

Despite what has happened in both stints and even outside the circle, I am still one of NetApp’s active cheerleaders in the Asia Pacific region. I even got accused by being biased as a community leader in the SNIA Malaysia Facebook page (unofficial but recognized by SNIA), because I was supposed to be neutral. I have put in 10 years to promote the storage technology community with SNIA Malaysia. [To the guy named Stanley, my response was be “Too bad, pick a religion“.]

The highlight of the SFD12 NetApp visit was of course, having lunch with Dave Hitz, one of the co-founders and the one still remaining. But throughout the presentations, I was unimpressed.

For me, the only one which stood out was CloudSync. I have read about CloudSync since NetApp Insight 2016 and yes, it’s a nice little piece of data shipping service between on-premise and AWS cloud.

Here’s how CloudSync looks like:

Continue reading →

Can CDMI emancipate an interoperable medical records cloud ecosystem?

By cfheoh | December 28, 2015 - 7:37 pm |December 28, 2015 Acquisition, API, Appliance, Cloud, Data Archiving, Data Management, ILM, Linux, NAS, NetApp, NFS, Object Storage, Scale-out architecture, SNIA, Software Defined Storage

Comments Off

PREFACE: This is just a thought, an idea. I am by no means an expert in this area. I have researched this to inspire a thought process of how we can bring together 2 disparate worlds of medical records and imaging with the emerging cloud services for healthcare.

Healthcare has been moving out of its archaic shell in the past few years, and digital healthcare technology and services are booming. And this movement is part of the digital transformation which could eventually lead to a secure and compliant distribution and collaboration of health data, medical imaging and electronic medical records (EMR).

It is a blessing that today’s medical imaging industry has been consolidated with the DICOM (Digital Imaging and Communications in Medicine) standard. DICOM dictates the how medical imaging information and pictures are used, stored, printed, transmitted and exchanged. It is also a communication protocol which runs over TCP/IP, and links up different service class providers (SCPs) and service class users (SCUs), and the backend systems such as PACS (Picture Archiving & Communications Systems) and RIS (Radiology Information Systems).

Another well accepted standard is HL7 (Health Level 7), a similar Layer 7, application-level communication protocol for transferring and exchanging clinical and administrative data.

The diagram below shows a self-contained ecosystem involving the front-end HIS (Hospital Information Systems), and the integration of healthcare, medical systems and other DICOM modalities.

(Picture courtesy of Meddiff Technologies)

Continue reading →

Oops, excuse me but your silo is showing

By cfheoh | April 8, 2015 - 8:07 am |April 10, 2015 Appliance, Backup, Cloud, Data, Data Archiving, Data Availability, Data Management, Filesystems, High Performance Computing, Hyperconvergence, Jon Toigo, NAS, NetApp, Nutanix, Object Storage, Performance Benchmark, Reliability, Scale-out architecture, Security, Simplivity, SNIA, Software Defined Storage, Software-defined Datacenter, Solid State Devices, Storage Tiering, Unified Storage, Virtualization, VMware

2 Comments

It is the morning that the SNIA Global Steering Committee reporting session is starting soon. I am in the office extremely early waiting for my turn to share the happenings in SNIA Malaysia.

And of late, I have been getting a lot of calls to catch up on hot technologies, notably All Flash Storage arrays and hyper-converged infrastructure. Even though I am now working for Interica, a company that focuses on Oil & Gas exploration and production software, my free coffee sessions with folks from the IT side have not diminished. And I recalled a week back in mid-March where I had coffee overdose!

Flash storage and hyperconvergence are HOT! Despite the hypes and frenzies of both flash storage and hyperconvergence, I still believe that integrating either or, or both, still have an effect that many IT managers overlook. The effect is a data silo.

Continue reading →

The Prophet has arrived

By cfheoh | January 8, 2014 - 10:09 pm |January 8, 2014 Cloud, EMC, Falconstor, Filesystems, Object Storage, Openstack, Software-defined Datacenter, Storage Tiering, Unified Storage, Virtualization

And Cloud Storage will make us even stranger

By cfheoh | January 12, 2013 - 3:08 pm |January 12, 2013 Amazon, API, Cloud, Dropbox, NetApp, Object Storage, Openstack, Oracle

1 Comment

It was a dark and stormy night ….

I was in a car with my host in the stifling traffic jams on the streets of Jakarta. We had just finished dinner and his driver was taking me back to the hotel. It was about 9pm and we were making conversation trying to figure out how we can work together. My host, a wonderful Singaporean who has been residing in Jakarta for more than a decade and a half, owns a distributorship focusing mainly on IT security solutions. He had invited me over to Jakarta to give a talk on Cloud Storage at the Indonesia CIO Network event on January 9th 2013.

I was there to represent SNIA South Asia to give a talk about CDMI (Cloud Data Management Interface), and my host also took the opportunity to introduce Nutanix, a SAN-less 2-tier, high-performance, virtualized data center platform. (Note: That’s quite a mouthful, but gotta include all the buzz-words in there). It was my host’s first foray into storage networking solutions, away from his usual security solutions spread. As the conversation went on in the car, he said “You storage guys are so strange!“.

To many of the IT folks who have been involved in OS, applications, security, and networking, to say a few, storage is like a dark art, some mumbo jumbo, voodoo-like science known to a select few. That’s great, because this perception will keep us relevant, and still have the value and a job. To me, that just fine and dandy, and I like it that way. 🙂

In preparation to the event, I have to learn up SNIA CDMI. Cloud and Storage … Cloud and Storage … Cloud and Storage. Hmmm …. Continue reading →

The Storage Compass

By cfheoh | July 17, 2012 - 9:45 pm |July 17, 2012 SNIA

7 Comments

I am sure many people in IT get pissed with IT jargons and terminologies. More so if it is a customer, especially when he or she is not well versed with the fundamental concept behind the technology architecture.

Even after 20 years, with most of it in storage, I have a hard time switching from one vendor’s jargon to another (sometimes). But it has gotten harder for me lately, since I teach ONTAP courses for NetApp, EMC Cloud Infrastructure and doing my work with the ZFS stuff. Soon, I will take on EMC VNX, Information Storage Management (ISM), Big Data courses as well, and I also plan to do some Nexenta training too.

Who would know that an ONTAP NAS volume would be known as file system in EMC VNX for File (aka Celerra), and a data set in ZFS? Or a ONTAP aggregate is almost like a ZFS pool but with some differences or a clone might be called a replica in HDS and so on …

In fact, all the definitions above could be wrong because I am getting confused. 😉 You would be too if you have to switch from one vendor’s jargon to another. And the poor EMC pre-sales who has not been with any other vendor except for EMC all his career would have a hard time rewiring his brain if he had joined another vendor like NetApp. Or IBM, or Dell, or Oracle or anyone for that matter. No wonder the customers are pissed. Continue reading →

Not all SSDs are the same

By cfheoh | January 24, 2012 - 7:28 am |October 28, 2012 Disks, Performance Benchmark, SNIA, Uncategorized

3 Comments

Happy Lunar New Year! The Chinese around world has just ushered in the Year of the Water Dragon yesterday. To all my friends and family, and readers of my blog, I wish you a prosperous and auspicious Chinese New Year!

Over the holidays, I have been keeping up with the progress of Solid State Drives (SSDs). I am sure many of us are mesmerized by SSDs and the storage vendors are touting the best of SSDs have to offer. But let me tell you one thing – you are probably getting the least of what the best SSDs have to offer. You might be puzzled why I say things like this.

Let me share with a common sales pitch. Most (if not all) storage vendors will tout performance (usually IOPS) as the greatest benefits of SSDs. The performance numbers have to be compared to something, and that something is your regular spinning Hard Disk Drives (HDDs). The slowest SSDs in terms of IOPS is about 10-15x faster than the HDDs. A single SSD can at least churn 5,000 IOPS when compared to the fastest 15,000 RPM HDDs, which churns out about 200 IOPS (depending on HDD vendors). Therefore, the slowest SSDs can be 20-25x faster than the fastest HDDs, when measured in IOPS.

But the intent of this blogger is to share with you more about SSDs. There’s more to know because SSDs are not built the same. There are write-bias SSDs, read-bias SSDs; there are SLC (single level cell) and MLC (multi level cell) SSDs and so on. How do you differentiate them if Vendor A touts their SSDs and Vendor B touts their SSDs as well? You are not comparing SSDs and HDDs anymore. How do you know what questions to ask when they show you their performance statistics?

SNIA has recently released a set of methodology called “Solid State Storage (SSS) Performance Testing Specifications (PTS)” that helps customers evaluate and compare the SSD performance from a vendor-neutral perspective. There is also a whitepaper related to SSS PTS. This is something very important because we have to continue to educate the community about what is right and what is wrong.

In a recent webcast, the presenters from the SNIA SSS TWG (Technical Working Group) mentioned a few questions that I think we as vendors and customers should think about when working with an SSD sales pitch. I thought I share them with you.

Was the performance testing done at the SSD device level or at the file system level?
Was the SSD pre-conditioned before the testing? If so, how?
Was the performance results taken at a steady state?
How much data was written during the testing?
Where was the data written to?
What data pattern was tested?
What was the test platform used to test the SSDs?
What hardware or software package(s) used for the testing?
Was the HBA bandwidth, queue depth and other parameters sufficient to test the SSDs?
What type of NAND Flash was used?
What is the target workload?
What was the percentage weight of the mix of Reads and Writes?
Are there warranty life design issue?

I thought that these questions were very relevant in understanding SSDs’ performance. And I also got to know that SSDs behave differently throughout the life stages of the device. From a performance point of view, there are 3 distinct performance life stages

Fresh out of the box (FOB)
Transition
Steady State

As you can see from the graph below, a SSD, fresh out of the box (FOB) displayed considerable performance numbers. Over a period of time (the graph shown minutes), it transitioned into a mezzanine stage of lower IOPS and finally, it normalized to the state called the Steady State. The Steady State is the desirable test range that will give the most accurate type of IOPS numbers. Therefore, it is important that your storage vendor’s performance numbers should be taken during this life stage.

Another consideration when understanding the SSDs’ performance numbers are what type of tests used? The test could be done at the file system level or at the device level. As shown in the diagram below, the test numbers could be taken from many different elements through the stack of the data path.

Performance for cached data would given impressive numbers but it is not accurate. File system performance will not be useful because the data travels through different layers, masking the true performance capability of the SSDs. Therefore, SNIA’s performance is based on a synthetic device level test to achieve consistency and a more accurate IOPS numbers.

There are many other factors used to determine the most relevant performance numbers. The SNIA PTS test has 4 main test suite that addresses different aspects of the SSD’s performance. They are:

Write Saturation test
Latency test
IOPS test
Throughput test

The SSS PTS would be able to reveal which is a better SSD. Here’s a sample report on latency.

Once again, it is important to know and not to take vendors’ numbers in verbatim. As the SSD market continue to grow, the responsibility lies on both side of the fence – the vendor and the customer.

The recipe for storage performance modeling

By cfheoh | November 22, 2011 - 6:02 pm |November 22, 2011 Data, Disks, Performance Benchmark, RAID, SNIA

2 Comments

Good morning, afternoon, evening, Ladies & Gentlemen, wherever you are.

Today, we are going to learn how to bake, errr … I mean, make a storage performance model. Before we begin, allow me to set the stage.

Don’t you just hate it when you are asked to do storage performance sizing and you don’t have a freaking idea how to get started? A typical techie would probably say, “Aiya, just use the capacity lah!”, and usually, they will proceed to size the storage according to capacity. In fact, sizing by capacity is the worst way to do storage performance modeling.

Bear in mind that storage is not a black box, although some people wished it was. It is not black magic when it comes to performance sizing because things can be applied in a very scientific and logical manner.

SNIA (Storage Networking Industry Association) has made a storage performance modeling methodology (that’s quite a mouthful), and basically simplified it into these few key ingredients. This recipe is for storage performance modeling in general and I am advising you guys out there to engage your storage vendors professional services. They will know their storage solutions best.

And I am going to say to you – Don’t be cheap and not engage professional services – to get to the experts out there. I was having a chat with an consultant just now at McDonald’s. I have known this friend of mine for about 6-7 years now and his name is Sugen Sumoo, the Director of DBORA Consulting. They specialize in Oracle and database performance tuning and performance forecasting and it is something that a typical DBA can’t do, because DBORA Consulting is the Professional Service that brings expertise and value to Oracle customers. Likewise, you have to engage your respective storage professional services as well.

In a cook book or a cooking show, you are presented with the ingredients used and in this recipe for storage performance modeling, the ingredients (in no particular order) are:

Application block size
Read and Write ratio
Application access patterns
Working set size
IOPS or throughput
Demand intensity

Application Block Size

First of all, the storage is there to serve applications. We always have to look from the applications’ point of view, not storage’s point of view. Different applications have different block size. Databases typically range from 8K-64K and backup applications usually deal with larger block sizes. Video applications can have 256K block sizes or higher. It all depends.

The best way is to find out from the DBA, email administrator or application developers. The unfortunate thing is most so-called technical people or administrators in Malaysia doesn’t have a clue about the applications they manage. So, my advice to you storage professionals, do your research on the application and take the default value. These clueless fellas are likely to take the default.

Read and Write ratio

Applications behave differently at different times of the day, and at different times of the month (no, it’s not PMS). At the end of the financial year or calendar, there are some tasks that these applications do as well. But in a typical day, there are different weightage or percentage of read operations versus write operations.

Most OLTP (online transaction processing)-based applications tend to be read heavy and write light, but we need to find out the ratio. Typically, it can be a 2:1 ratio or 60%:40%, but it is best to speak to the application administrators about the ratio. DSS (Decision Support Systems) and data warehousing applications could have much higher reads than writes while a seismic-analysis applications can have multiple writes during the analysis periods. It all depends.

To counter the “clueless” administrators, ask lots of questions. Find out the workflow of several key tasks and ask what that particular tasks do at different checkpoints of the application’s processing. If you are lazy (please don’t be lazy, because it degrades your value as a storage professional), use a rule of thumb.

Application access patterns

Applications behave differently in general. They can be sequential, like backup or video streaming. They can be random like emails, databases at certain times of the day, and so on. All these behavioral patterns affect how we design and size the disks in the storage.

Some RAID levels tend to work well with sequential access and others, with random access. It is not difficult to find out about the applications’ pattern and if you read more about the different RAID-levels in storage, you can easily identify the type of RAID levels suitable for each type of behavioral patterns.

Working set size

This variable is a bit more difficult to determine. This means that a chunk of the application has to be loaded into a working area, usually memory and cache memory, to be used and abused by the application users.

Unless someone is well versed with the applications, one would not be able to determine how much of the applications would be placed in memory and in cache memory. Typically, this can only be determined after the application has been running for some time.

The flexibility of having SSDs, especially the DRAM-type of SSDs, are very useful to ensure that there is sufficient “working space” for these applications.

IOPS or Throughput

According to SNIA model, for I/O less than 64K, IOPS should be used as a yardstick to do storage performance modeling. Anything larger, use throughput, in which MB/sec is the measurement unit.

The application guy would be able to tell you what kind of IOPS their application is expecting or what kind of throughput they want. Again, ask a lot of questions, because this will help you determine the type of disks and the kind of performance you give to the application guys.

If the application guy is clueless again, ask someone more senior or ask the vendor. If the vendor engineers cannot give you an answer, then they should not be working for the vendor.

Demand intensity

This part is usually overlooked when it comes to performance sizing. Demand intensity refers to how intense is the I/O requests. It could come from 1 channel or 1 part of the applications, or it could come from several parts of the applications in parallel. It is as if the storage is being ‘bombarded’ by applications and this is the part that is hard to determine as well.

In some applications, the degree of intensity or parallelism can be tuned and to find out, ask the application administrator or developer. If not, ask the vendor. Also do a lot of research on the application’s architecture.

And one last thing. What I have learned is to add buffers to the storage performance model. Typically I would add about 10-20% extra but you never know. As storage professionals, I would strongly encourage to engage professional services, because it is worthwhile, especially in the early stages of the sizing. It is usually a more expensive affair to size it after the applications have been installed and running.

“Failure to plan is planning to fail”. The recipe isn’t that difficult. Go figure it out.

Tag Archives: SNIA

Open Source Storage Technology Crafters

Storage Assemblers

Encryption Key Management in TrueNAS

Key Management Interoperability Protocol (KMIP)

Oops, excuse me but your silo is showing

The Storage Compass

Not all SSDs are the same

The recipe for storage performance modeling

Recent Posts

Sponsored Ads

Google Adsense

Recent Comments

Google Adsense

Storage Assemblers

Share this:

Key Management Interoperability Protocol (KMIP)

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Recent Posts

Sponsored Ads

Google Adsense

Recent Comments

Google Adsense