RAID used to be the accepted norm for ensuring the resilience and accessibility of stored data, but it does not scale to the extent required by today’s data storage needs. As data volumes grow, RAID becomes a nightmare to manage and introduces tremendous administrative complexity. Furthermore, with high-capacity disk drives, rebuilds take much longer. A longer rebuild window raises the chance that further drives fail before the rebuild completes, which results in an unacceptably high probability of data loss. Double parity does not change this fact; it simply delays the problem.
Replication, on the other hand, does a very good job, providing resilience and reliability even at tremendous scale. The catch is that, at very large capacities, keeping two, three or even more copies becomes cost-prohibitive.
Erasure coding addresses both the problem of multiple concurrent failures and the cost concerns associated with large-scale deployments. Scality’s Advanced Resilience Configuration (ARC) provides erasure coding for RING. By default, ARC is configured to create 14 data fragments and 4 checksum (parity) fragments, but both values are configurable.
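To make the idea of data and parity fragments concrete, here is a deliberately simplified Python sketch. It is not ARC's algorithm: it uses a single XOR parity fragment, which can recover only one lost fragment, whereas a code such as ARC(14,4) uses several parity fragments to survive several losses. The function names and the sample payload are hypothetical, chosen only for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, k: int) -> list:
    """Split data into k equal fragments plus one XOR parity fragment."""
    frag_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(frag_len * k, b"\0")      # pad so fragments are equal length
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: list, missing: int) -> bytes:
    """Rebuild the fragment at index `missing` by XOR-ing all the others."""
    survivors = [f for i, f in enumerate(fragments) if i != missing]
    return reduce(xor_bytes, survivors)


# Hypothetical payload: 4 data fragments + 1 parity fragment.
stored = split_with_parity(b"erasure coding demo", k=4)
lost = stored[2]
stored[2] = None                                  # simulate a failed disk or node
rebuilt = recover(stored, missing=2)
print(rebuilt == lost)
```

The same principle, generalized with more parity fragments, is what allows any lost fragment to be recomputed from the surviving ones without keeping full extra copies.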
ARC(14,4) tolerates 4 simultaneous storage, node or even rack failures for roughly 30% storage overhead (4 parity fragments for every 14 data fragments), as opposed to 200% for standard three-copy replication, which consumes 300% of the data size in raw capacity. ARC maintains one full, unscrambled copy of each object, so that data can be read directly without rebuild overhead and a very high I/O rate can be sustained. ARC is delivered as a built-in feature of the RING core product, and the administrator can choose between replication and ARC based on storage protection needs and cost considerations. For geo-redundancy, ARC can be combined with replication to build a 1:N topology of ARC sites, capable of reaching very high (99.99999999999%) durability levels.
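The overhead comparison can be checked with a few lines of arithmetic. The Python sketch below counts overhead as the extra raw capacity needed beyond the data itself, so three copies cost 200% extra (300% of the data size in total). The helper names are hypothetical and the 1 PB figure is only an example.

```python
def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Extra capacity as a fraction of the data size for a (k, m) erasure code."""
    return parity_fragments / data_fragments


def replication_overhead(copies: int) -> float:
    """Extra capacity as a fraction of the data size when keeping N full copies."""
    return float(copies - 1)


arc = erasure_overhead(14, 4)          # 4 / 14, about 0.286
rep = replication_overhead(3)          # 2.0

usable_pb = 1.0                        # example: 1 PB of user data
print(f"ARC(14,4):   {arc:.1%} overhead, {usable_pb * (1 + arc):.2f} PB raw, survives 4 losses")
print(f"3 replicas:  {rep:.1%} overhead, {usable_pb * (1 + rep):.2f} PB raw, survives 2 losses")
```

Under these assumptions, ARC(14,4) needs about 1.29 PB of raw capacity per petabyte of data, against 3 PB for three-copy replication, while tolerating more simultaneous failures.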
The following diagram illustrates the difference between ARC, replication and dispersed approaches:
For more information, please see our ARC data sheet.