San Francisco – October 26th, 2012
As Internet use continues to skyrocket, the volume of files being produced, shared and stored climbs at a tremendous pace.
The traditional methods of storing this kind of ‘unstructured data’ show signs of failing under the strain of this completely unanticipated load. In reaction, system administrators work furiously to plug the fissures appearing in their infrastructure, and CIOs cast about urgently trying to assess and source potential solutions to these tectonic shifts.
Although this is a new challenge within the data storage world, other industries have been through equivalent shifts in scale, and the solutions they arrived at might serve as illuminating parallels for these beleaguered decision makers.
The shipping container industry might seem an unlikely source of relevant information, but it holds some pertinent insights on scalability, flexibility, automation and cost efficiency.
The story of this parallel system is as old as civilization itself. For most of history, trade was based on the slow, arduous and often unpredictable transport of relatively small quantities of high-value items along established trade routes. But as populations grew and technology evolved, the traditional supply and demand chains broke and reformed, and the markets for manufactured goods exploded.
It took quite some time for the transportation industry to respond to this radical shift in scale, but in the mid-1950s modern container shipping was introduced and the shipping industry entered a new era. Instead of transporting individual bags and boxes on ships, trains and camel caravans, packing and unpacking them at each stage of their journey, the container consolidated vast quantities of goods. These were transported en masse aboard massive ships to centralized transportation hubs, where whole containers could be picked up and forwarded by train or truck to their final destinations.
Probably the most obvious lesson to be derived from this history is that the scale of everything had to change radically: the size of the ‘boxes’, the size of the transporting vessels, and the technology necessary to move these enormous containers on and off ships, trains, and trucks.
Of course, such massive changes in the scale of a system’s operations require equivalent changes in the way that system is automated and administered. If you are trying to get several thousand containers onto and off a single vessel, and to manage dozens of such moves a day, you are going to need some serious planning, precision scheduling and some very intelligent automation systems.
Today container shipping is a highly automated industry. Computers determine the optimal placement of boxes, to keep the contents safe and ensure that loading and unloading are expedited. Computers schedule the arrival and departure of trucks, trains and ships, and computers chart courses and more or less run the container ships, with not much more than a skeleton crew.
The whole operation needs to run with complete predictability and split-second timing because, at these scales, damaged or lost containers (or ships) or seemingly minor delays in unloading can cost millions, even billions, of dollars in lost revenue, spoiled goods, and angry customers. All of this leaves very little room for human error.
Probably the most important lessons to be derived from the container shipping industry, though, relate to the commoditization and standardization of its components and systems. The interchangeable nature of the containers, the ships, the cranes, the trucks, and the trains keeps infrastructure and management costs to a minimum, ensures complete flexibility in the event that one of these components needs to be ‘swapped out’, and allows for the smooth transition of goods from source to destination.
This interchangeability is fundamental to the industry’s success. It is what makes the system work.
So, in summary, the container shipping industry teaches us to take scale very seriously: to choose an infrastructure that scales linearly, without introducing new costs associated with that growth. In storage, these lessons must be applied to both capacity and performance, which adds a second dimension to the problem of scale. So we need to select infrastructure that can natively scale the number and size of files, and the speed at which they are accessed and moved, by design and not as an afterthought.
Automation is not optional. At scale, human beings become a liability: they work too serially to keep up with the parallel demands and split-second timing that systems like these require. The systems themselves need to be intelligent, not just automated but problem-solving. Ideally they should be able to identify and work around problems. In our software-defined, virtual world they can even heal data, rebalance loads and retire infrastructure.
Commodity hardware and standard interfaces are essential in rapid-growth, low-margin environments. There is simply no room for the costs associated with vendor lock-in, and absolutely no time to be wasted on developing custom interfaces between individual applications or appliances.
Scality RING delivers native scalability and high performance cost-effectively with its Object Storage, matching all the characteristics we just mentioned. Its organic, self-healing architecture is highly automated, minimizing the risk of human error.
It is completely hardware-vendor agnostic, running on commodity x86 hardware, which keeps CAPEX to a minimum and allows hardware of appropriate density to be used as necessary. As hardware technology evolves, data can also be moved easily and seamlessly to the newer hardware.
It also offers a range of standard interfaces, including REST/HTTP, CDMI and a Scale Out File System.