Digital Transformation is again a big word for 2020. As more and more organizations becoming digitalized, the opportunity to communicate, interact and collaborate has become easier, faster, more convenient than ever.
File Sharing forever
Working in projects, file sharing is a fundamental activity that underpins communication and collaboration. Network drives via NAS (network attached storage) for file sharing are common within the confines of the company network. The perimeter of the company’s network is further extended via VPN (virtual private network) access, allowing branch offices and remote individuals to access the files from the central NAS server. It is a workable solution albeit poor network performance in delivery, challenges of siloed data management and difficult scalability.
The phenomenon of Dropbox
When Dropbox arrived circa 2008-2009, it took the industry by storm. They practically invented the term BYOD (bring your own device) and capture the imagination of the file sharing market. Gartner recognized this and coined EFSS (enterprise file sync and share) to consolidate the burgeoning file sharing market. Pretenders and challengers flooded the market, and after the shakedown, Box.net, Microsoft OneDrive, Google Drive and of course, Dropbox, are some of the market leaders today.
It is from one of my FreeNAS customers daily security run logs, emailed to our email@example.com alias. It is attempting a brute force attack trying to crack the authentication barrier via the exposed SSH port.
Just days after the installation was completed months ago, a bot has been doing IP port scans on our system, and found the SSH port open. (We used it for remote support). It has been trying every since, and we have been observing the source IP addresses.
The new Ransomware attack vector
This is not surprising to me. Ransomware has become more sophisticated and more damaging than ever because the monetary returns from the ransomware are far more effective and lucrative than other cybersecurity threats so far. And the easiest preys are the weakest link in the People, Process and Technology chain. Phishing breaches through social engineering, emails are the most common attack vectors, but there are vhishing (via voicemail) and smshing (via SMS) out there too. Of course, we do not discount other attack vectors such as mal-advertising sites, or exploits and so on. Anything to deliver the ransomware payload.
Something triggered my thoughts a few days ago. A few of us got together talking about climate change and a friend asked how green was the datacenter in IT. With cloud computing booming, I would say that green computing isn’t really the hottest thing at present. That in turn, leads us to one of the most voracious energy beasts in the datacenter, storage. Where is green storage in the equation?
What is green?
Over the past decade, several storage related technologies were touted as more energy efficient. These include
Tape – when tapes are offline, they do not consume power and do not require cooling
Virtualization – Virtualization reduces the number of servers and desktops, and of course storage too
MAID (Massive Array of Independent Disks) – the arrays spin down the HDDs if idle for a period of time
SSD (Solid State Drives) – Compared to HDDs, SSDs consume much less power, and overall reduce the cooling needs
Data Footprint Reduction – Deduplication, compression and other technologies to reduce copies of data
SMR (Shingled Magnetic Recording) Drives – Higher areal density means less drives but limited by physics.
The largest gorilla in storage technology
HDDs still dominate the market and they are the biggest producers of heat and vibration in a storage array, along with the redundant power supplies and fans. Until and unless SSDs dominate, we have to live with the fact that storage disk drives are not green. The statistics from Statistica below forecasts that in 2021, the shipment of SSDs will surpass HDDs.
Today the areal density of HDDs have increased. With SMR (shingled magnetic recording), the areal density jumped about 25% more than the 1Tb/inch (Terabit per inch) in the CMR (conventional magnetic recording) drives. The largest SMR in the market today is 16TB from Seagate with 18TB SMR in the horizon. That capacity is going to grow significantly when EAMR (energy assisted magnetic recording) – which counts heat assisted and microwave assisted – drives enter the market next year. The areal density will grow to 1.6Tb/inch with a roadmap to 4.0Tb/inch. Continue reading →
Cloud Data Management is a tricky word. Often vague, ambigious, how exactly would you define “Cloud Data Management“?
Fresh off the boat from Commvault GO 2019 in Denver, Colorado last week, I was invited to sample Veeam a few days ago at their Solution Day and soak into their rocketing sales in Asia Pacific, and strong market growth too. They reported their Q3 numbers this week, impressing many including yours truly.
I went to the seminar early in the morning, quite in awe of their vibrant partners and resellers activities and ecosystem compared to the tepid Commvault efforts in Malaysia over the past decade. Veeam’s presence in Malaysia is shorter than Commvault’s but they are able to garner a stronger following with partners and customers alike.
I wrote about Digital Transformation a few weeks ago. In the heart of it, People are the real key to the transformation of every organization. Following up what I described earlier, Change is the factor that People in every organization have to embrace.
Drowning and going blind
We are swarmed by technology. We are inundated with everything digital and we are attracted to the latest buzz and hype. In the sea of it all, these things have made us, the People reliant of technology. This reliance, this needy dependency, has made us complacent. We settle because the boring and mundane tasks have been taken away from us. Moreover, the constant firehose feeding our lives has created “digital drowning“, a situation I would like describe as gasping for a breather to think clearly. We are bogged by digital quagmire, blinded by what shiny things and we lose sight of the strategic focus.
We shrivel and we go back to what we think is our comfort zone.
Change is constant and uncomfortable
I once read that our known comfort zone is no longer our safety zone. That idea of everyone’s safety zone has been obliterated aeons ago. I love the following quote from Seth Godin, my absolute marketing guru.
As he rightly pointed out, “There is no ‘ever after’. There’s just the chaos of now“. We don’t arrive at a comfortable place after the change. There is no comfortable place or safety place for that matter … at all. The Digital Transformation or what ever Information Age we described our generation earlier, is constant change. We have to ride the hungry bear and we have to saddle the ferocious dragon at all times. We have to learn to ride the bucking bronco!
The hype of Deep Learning (DL), Machine Learning (ML) and Artificial Intelligence (AI) has reached an unprecedented frenzy. Every infrastructure vendor from servers, to networking, to storage has a word to say or play about DL/ML/AI. This prompted me to explore this hyped ecosystem from a storage perspective, notably from a storage performance requirement point-of-view.
One question on my mind
There are plenty of questions on my mind. One stood out and that is related to storage performance requirements.
Reading and learning from one storage technology vendor to another, the context of everyone’s play against their competitors seems to be “They are archaic, they are legacy. Our architecture is built from ground up, modern, NVMe-enabled“. And there are more juxtaposing, but you get the picture – “We are better, no doubt“.
Are the data patterns and behaviours of AI different? How do they affect the storage design as the data moves through the workflow, the data paths and the lifecycle of the AI ecosystem?
I like LTFS (Linear Tape File System). I was hoping it would take off but it has not. And looking at its future, its significance is becoming less and less relevant. I look if Cloud has been a factor in the possible demise of LTFS in the next few years.
What is LTFS?
In a nutshell, Linear Tape File System makes LTO tapes look like a disk with a file system. It takes a tape and divides it into 2 partitions:
Index Partition (XML Index Schema with file names, metadata and attributes details)
Data Partition (where the data resides)
Diagram from https://www.snia.org/sites/default/orig/SDC2011/presentations/tuesday/DavidPease_LinearTape_File_System.pdf
It has a File System module which is implemented in supported OS of Unix/Linux, MacOS and Windows. And the mounted file system “tape partition” shows up as a drive or device.
There were many attempts to kill off tapes and so far, none has been successful.
Let’s start this story with 2 supposed friends – Dave and Hal.
How do we become friends?
We have friends and we have enemies. We become friends when trust is established. Trust is established when there is an unsaid pact, a silent agreement that I can rely on you to keep my secrets private. I will know full well that you will protect my personal details with a strong conviction. Your decisions and your actions towards me are in my best interest, unbiased and would benefit both me and you.
I feel secure with you.
AI is my friend
When the walls of uncertainty and falsehood are broken down, we trust our friends more and more. We share deeper secrets with our friends when we believe that our privacy and safety are safeguarded and protected. We know well that we can rely on them and their decisions and actions on us are reliable and unbiased.
AI, can I count on you to protect my privacy and give me security that my personal data is not abused in the hands of the privileged few?
AI, can I rely on you to be ethical, unbiased and give me the confidence that your decisions and actions are for the benefit and the good of me, myself and I?
My AI friends (maybe)
As I have said before, I am not a skeptic. When there is plenty of relevant, unbiased data fed into the algorithms of AI, the decisions are fair. People accept these AI decisions when the degree of accuracy is very close to the Truth. The higher the accuracy, the greater the Truth. The greater the Truth, the more confident people are towards the AI system.
But we have to careful here as well. Accuracy can be subjective, paradoxical and enigmatic. When ethics are violated, we terminate the friendship and we reject the “friend”. We categorically label him or her as an enemy. We constantly have to check, just like we might, once in a while, investigate on our friends too.
To me, containers and container orchestration (CO) engines such as Kubernetes, Mesos, Docker Swarm are fantastic. They scale effortlessly and are truly designed for cloud native applications (CNA).
But one thing irks me. Storage management for containers and COs. It was as if when they designed and constructed containers and the containers orchestration (CO) engines, they forgot about the considerations of storage and storage management. At least the persistent part of storage.
Over a year ago, I was in two minds about persistent storage, especially when it comes to the transient nature of microservices which was so prevalent and were inundating the cloud native applications landscape. I was searching for answers in my blog. The decentralization of microservices in containers means mass deployment at the edge, but to have the pre-processed and post-processed data stick to the persistent storage at the edge device is a challenge. The operative word here is “STICK”.
Two different worlds
Containers were initially designed and built for lightweight applications such as microservices. The runtime, libraries, configuration files and dependencies are all in one package. They were meant to do simple tasks quickly and scales to thousands easily. They could be brought up and brought down in little time and did not have to bother about the persistent data stored by the host. The state of the containers were also not important to the application tasks at hand.
Today containers like Docker have matured to run enterprise applications and the state of the container is important. The applications must know the state and the health of the container. The container could be in online mode, online but not accepting data mode, suspended mode, paused mode, interrupted mode, quiesced mode or halted mode. Each mode or state of the container is important to the running applications and the container can easily brought up or down in an instance of a command. The stateful nature of the containers and applications is critical for the business. The same situation applies to container orchestration engines such as Kubernetes.
Container and Kubernetes Storage
Docker provides 3 methods to local storage. In the diagram below, it describes: