Growing a SaaS service means adding users, and even a small increase in the user base can cause stored data to expand by multiple orders of magnitude. Unstructured data (photos, blogs, videos, CAD/CAM diagrams, and the like) accounts for a highly significant share of this new data growth, and traditional storage is simply not architected to handle data at this “Internet scale.”
In their attempt to manage the exponential increase in unstructured data, traditional storage systems frequently resort to byzantine workarounds that inevitably break, causing significant data loss and lost productivity. These workarounds also tend to produce system sprawl, leading to significant increases in operational complexity. For this reason, traditional storage systems inevitably come up short when dealing with the rapid growth of data and the increasing demands of Software-as-a-Service applications. SaaS workloads are far better suited to pay-per-use pricing than to the upfront capacity costs of traditional solutions.
Key Challenges for SaaS Applications
Traditional or legacy storage tends to be overly complex and exceedingly difficult to scale beyond low-petabyte capacity. It cannot deliver the performance users expect, nor grow the number of managed files and objects to the levels commonly required for SaaS. This would be problematic enough if all SaaS clients were co-located with the SaaS storage, but it becomes especially unmanageable when users in the same organization must access the same files or data concurrently from different locales across the globe.
The following obstacles, which are particularly resistant to traditional storage solutions, must be overcome to make SaaS a sustainable business:
- Inability to scale storage to the levels required by SaaS: Traditional solutions are plagued by bottlenecks that cause unexpected customer service disruptions. They typically lead to storage system sprawl, with excessive expenditures of time and money on management, infrastructure, and ongoing data migrations—all in an ultimately futile effort to chase down the root causes of data storage and retrieval problems.
- Incompatibility with distributed and shared geographic access: Geographically dispersed users must be routed to the data origin site to access shared SaaS data. Data should instead be moved automatically to the user based on policy, something traditional storage solutions do not handle well.
- Excessive HA and DR costs and complications: With traditional storage, high availability and disaster recovery are predicated on storing multiple copies of data. Expensive hardware and the doubling (or even tripling) of data centers to accommodate the explosion in data add to storage system complexity and ultimately lead to operator and customer frustration.
- Exceedingly high TCO: Total cost of ownership is based on a model that does not work for SaaS service providers, whose margins are thin and whose pricing must be extremely competitive.
SaaS applications can never be offline, or the service provider loses customers. Traditional storage, however, is designed to be taken down for scheduled tech refreshes and migrations. Arranging for traditional storage to never be down guarantees incredibly high costs that are impractical for the typical SaaS service provider; during a tech refresh, data migration alone severely impacts storage uptime and availability. In those cases where traditional storage can meet the challenges of scaling, distributed access, and availability, it does so at such a high cost (both upfront and ongoing) that it becomes extremely damaging to the continued financial viability of the business.
Essential SaaS Requirements
SaaS customers start with GBs of storage capacity, but rapidly expand to TBs. It doesn’t take that many customers to quickly consume PBs of capacity and billions of stored objects. EBs of capacity is not just a theoretical concern to the SaaS provider—it is real and imminent.
Geographically dispersed users also challenge the SaaS provider to distribute copies of the data across vast distances. This requires a storage system that can distribute data to the location that best meets SLAs, and that provides the best combination of performance, service, and availability to the SaaS customer. SaaS storage systems must meet all of the following minimal requirements:
- Ability to scale to billions of objects while maintaining performance for all users without disruption: SaaS storage systems must be able to accommodate PBs to EBs of capacity, and billions of objects or files, with satisfactory performance for millions of concurrent users.
- Ability to share files across geographically distributed locations: Files and data must be movable, based on policy, to where they are required, when they are required.
- Always available and online: Five nines (99.999%) availability, all the time.
- Competitive TCO: Tightening margins and higher technology costs are placing ever-increasing pressure on IT storage budgets.
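It is worth making the five-nines requirement concrete. The short calculation below (an illustrative sketch, not Scality code; the function name is our own) converts an availability level into the maximum downtime it permits per year:

```python
# Allowed downtime per year at a given availability level.
# Illustrative arithmetic only; not part of any Scality product.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes_per_year(availability: float) -> float:
    """Return the maximum yearly downtime (in minutes) for a given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability)

for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    print(f"{label}: {downtime_minutes_per_year(availability):.2f} min/year")
```

At 99.999%, the budget is roughly 5.26 minutes of downtime per year—far less time than a single traditional data migration consumes, which is why transparent tech refresh and online maintenance matter so much.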
A geographically dispersed customer base creates the “always up and available” storage requirement, 7 x 24 x 365. Meeting this requirement dictates that SaaS storage be adaptive, flexible, and always online, providing for transparent tech refreshes and enabling all scheduled and unscheduled maintenance to be performed without user disruption.
Decreasing margins and increasing competition have put added pressure on SaaS providers to meet or exceed these storage imperatives at the lowest possible TCO. Given the SaaS industry’s broad revenue variability, a storage system with pay-by-the-drink (pay-per-use) pricing would seem to be a much better fit than the upfront pricing of traditional storage (pricing for fixed, preset storage capacity).
The Solution: Scality RING™ Organic Storage
Scality RING Organic Storage is architected from the ground up to meet and exceed all SaaS provider requirements. It scales capacity into the exabytes and files or objects into the billions, and can do so over a geographically dispersed area. The scalability of the RING solution is the direct result of its unique Distributed Hash Table (DHT), an extraordinarily efficient lookup structure that enables storage and retrieval of very large numbers of files or objects at a very high level of performance.
Scality RING Organic Storage provides unparalleled data, nodal, and system availability by leveraging its distinctive industry-hardened, carrier-grade peer-to-peer technology. The RING also comes with unequalled built-in system data resilience similar to an organic immune system. Every node constantly monitors a limited number of its peers, automatically rebalancing replicas and load to make the system fully self-healing without human intervention. Consistent hashing guarantees that only a small subset of keys is ever affected by a node failure or removal.
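To illustrate the consistent-hashing property described above, here is a minimal sketch (hypothetical code, not Scality's implementation; node and object names are invented) showing that removing a node remaps only the keys that node owned, while every other key keeps its original owner:

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """Map a string to a point on the hash ring (MD5 used for illustration)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Toy consistent-hash ring; a real system would add virtual nodes."""

    def __init__(self, nodes):
        self._points = sorted((ring_hash(n), n) for n in nodes)

    def lookup(self, key: str) -> str:
        """Return the node responsible for key: first node clockwise of its hash."""
        points = [p for p, _ in self._points]
        i = bisect.bisect(points, ring_hash(key)) % len(self._points)
        return self._points[i][1]

    def remove(self, node: str):
        self._points = [(p, n) for p, n in self._points if n != node]

ring = ConsistentHashRing(["node-a", "node-b", "node-c", "node-d"])
before = {f"object-{i}": ring.lookup(f"object-{i}") for i in range(1000)}

ring.remove("node-b")  # simulate a node failure or removal
after = {key: ring.lookup(key) for key in before}

moved = sum(1 for key in before if before[key] != after[key])
print(f"{moved} of 1000 keys remapped")  # only node-b's share of keys moves
```

Every key that was not on node-b still hashes to the same clockwise successor, so it stays put; only node-b's share (on average 1/N of the keys for N nodes) is redistributed to its neighbors.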
The RING also rebalances the data load automatically when a node fails, is removed or upgraded, or when new nodes are added. RING makes technology refresh a simple, online process with no application disruptions, eliminating data migration, long nights, and sleepless weekends. The result is a very high level of fault tolerance because the system stays reliable even with nodes joining or leaving the ring. Scality RING keeps costs low by enabling the use of standard off-the-shelf commodity server nodes, and through the use of a paradigm-shifting pay-by-the-drink pricing model. Unlike traditional storage, Scality RING charges are based on used capacity, not raw storage capacity, thereby assuring the lowest possible storage TCO.
© 2012 Scality. All rights reserved. Specifications are subject to change without notice. Scality, the Scality logo, Organic Storage, RING, and RING Organic Storage are trademarks or registered trademarks of Scality in the United States and/or other countries.