A storage revolution? The time is now!

Thanks to game-changing companies such as VMware, the IT industry’s “status quo” has been rapidly evolving. Virtualization and cloud computing technologies have enabled businesses to become more agile and to adapt to changing business needs quickly and efficiently, without worrying whether their IT infrastructure can keep up.

The explosion of virtualization and cloud computing technologies has been helped tremendously by changes in thinking, as well as by more powerful hardware. Three of the “core four” components (CPU, memory, networking, and storage) have seen huge upgrades in the past few years. CPUs now have 6+ cores per socket, motherboards support terabytes of memory, and on the networking side, 10 gigabit connections and deep integration with hypervisors have become more mainstream. Unfortunately, I feel as if storage has been left behind.

We all know that storage area networks (SANs) have been around for quite a while now. They are mainstream in larger businesses and enterprise environments, but SMBs typically cannot afford to include SAN infrastructure in their IT budgets. SAN technologies have evolved minimally over the past few years. Sure, they have faster drives via SSDs and a choice of interconnects via Fibre Channel and iSCSI; however, they still remain prohibitively expensive for most SMBs. I would also argue that a SAN is a single point of failure, unless you go the really expensive route and have SAN-to-SAN replication set up (I know there are those out there who would like to refute this point, but that is another discussion altogether).

Lately there has also been some development around so-called “storage hypervisors”. The premise of a storage hypervisor is that an additional layer of abstraction is added to the storage environment, allowing multiple SANs to be pooled into one manageable layer. In my opinion, storage hypervisors are more of a management tool than a game changer. I can see where they would help an organization achieve an even higher ROI on its existing SAN infrastructure; however, they do not solve the issue of up-front costs.

VMware has taken some steps to help SMBs that would like to take advantage of some of the more advanced features of vSphere by creating the vSphere Storage Appliance (or VSA). This appliance allows up to three ESXi hosts to use their local disks in a redundant and replicated fashion, allowing for the use of vMotion, Storage vMotion, etc. This is a step in the right direction and solves a niche issue; however, it does not represent a storage revolution that will be a game changer for the IT industry as a whole.

Many SMBs have resorted to hosting their applications in the cloud (which is a good thing!); however, cloud service providers themselves face unique storage challenges as well. Cloud service providers have a great need for large quantities of shared storage. Again, SANs become prohibitively expensive at a certain point. Cloud service providers taking advantage of advanced features, such as Auto Deploy for deploying stateless systems, run into growing pains when it comes to the scalability of the underlying storage systems. Nowadays it seems as if CPU, memory, and networking are the lowest-cost parts of a large-scale deployment.

What the IT industry needs is a block-based storage solution that can be built upon distributed, commodity off-the-shelf hardware that can be scaled up to thousands of nodes. Basically, a (preferably open source) version of Amazon’s EBS product. If you were to couple that sort of technology with VMware and their vCloud offerings – THAT would be a game changer.
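To make the "distributed, commodity hardware" idea a little more concrete, here is a minimal sketch of how chunks of a volume could be spread across many nodes with replication, so that no single box is a point of failure. It uses rendezvous (highest-random-weight) hashing, which is only one possible placement technique and is not taken from any of the products mentioned in this post. The function name `place_chunk`, the node names, and the chunk ID format are all hypothetical.

```python
import hashlib


def place_chunk(chunk_id: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Pick `replicas` distinct nodes for a chunk using rendezvous hashing.

    Every node gets a pseudo-random score for this chunk; the highest
    scoring nodes hold the copies. No central lookup table is needed,
    and removing a node only affects the chunks that node was holding.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}:{chunk_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    ranked = sorted(nodes, key=score, reverse=True)
    return ranked[:replicas]


# Hypothetical cluster of 1,000 commodity nodes.
nodes = [f"node-{i:04d}" for i in range(1000)]
owners = place_chunk("volume-a/chunk-42", nodes)
```

Because any client can compute the placement itself, adding capacity is mostly a matter of adding nodes to the list, which is the kind of scale-out behavior a SAN cannot offer cheaply.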

As of today, there are three very promising projects in existence that aim to solve this very issue: the Ceph project, the Sheepdog project, and pNFS.

As taken from their web site, “Ceph is a distributed network storage and file system designed to provide excellent performance, reliability, and scalability.” Ceph is actually an object storage system at heart, but the project provides a block device driver (RBD, the RADOS Block Device), meaning you would have block-level access into the distributed data store. As far as I know, iSCSI access is not built in and would require exporting that block device through a separate iSCSI target. Ceph's distributed nature is what makes it so appealing. Ceph is sponsored by DreamHost, a web hosting and cloud services company.
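For readers wondering how a block device can sit on top of an object store, the rough idea is that the virtual disk is cut into fixed-size pieces and each piece is stored as its own object. The sketch below illustrates that mapping in a simplified form; it is not Ceph's actual code. The 4 MiB object size matches the default I understand Ceph to use for block images, while the `map_extent` function and the object naming scheme are assumptions made purely for illustration.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MiB per object (assumed default)


def map_extent(volume: str, offset: int, length: int,
               object_size: int = OBJECT_SIZE) -> list[tuple[str, int, int]]:
    """Split a byte range of a virtual block device into per-object pieces.

    Returns a list of (object_name, offset_in_object, piece_length) tuples.
    The object names are hypothetical, not a real on-disk format.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")

    pieces = []
    end = offset + length
    while offset < end:
        index = offset // object_size           # which object holds this byte
        within = offset % object_size           # position inside that object
        piece = min(object_size - within, end - offset)
        pieces.append((f"{volume}.{index:016x}", within, piece))
        offset += piece
    return pieces


# A 1 KiB write that straddles the boundary between the first two objects.
pieces = map_extent("vol", OBJECT_SIZE - 512, 1024)
```

Each resulting object can then be placed and replicated independently across the cluster, which is why a single large volume does not have to live on a single machine.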

Sheepdog is a distributed storage system that is primarily aimed at the QEMU/KVM hypervisor; however, it has high potential and could be adapted to work with other environments, such as VMware’s ESXi.

pNFS (parallel NFS, part of the NFSv4.1 standard) is important because it brings together the benefits of parallel I/O with the benefits of the ubiquitous standard for network file systems (NFS). It allows for massively scalable storage without diminished performance. Even though pNFS is not a block-based storage system, it addresses many of the same problems discussed above, albeit in the NAS world.

Unfortunately, at this point in time all three of these projects are in their early development phases and are therefore considered unstable. Each project has the potential to completely revolutionize the way we procure and utilize storage. Let’s hope that storage is on its way to a full-blown revolution. If and when it happens, I can only imagine the possibilities that will come to light!