Active Archives at Petabyte Scale with Access and Performance on Demand
Customers consistently choose the Scality RING for:
- Hardware Freedom: Pure software, runs on any standard x86 Linux-based server
- High Performance: Line-rate performance, scales linearly across mixed workloads
- 100% Reliability: Reliability with minimal intervention at massive scale
- It Just Works: Production-proven in the most demanding environments
Nearline, or Active, Archives sit in the Goldilocks Zone between real-time online storage, which is expensive, and offline storage, which is inexpensive but has very high access latency. Nearline archives deliver the same data durability as online storage and, unlike legacy cold archives, create new opportunities to mine data, extract value, and generate new revenue streams. Characterized by primary data that is mostly written, at petabyte scale, and providing near-online availability at an offline price point, nearline is here to stay.
This Performance Spotlight highlights why high throughput with space- and cost-efficient erasure coding matters for active archives.
RING throughput saturates the 10GbE link on multi-generation hardware, with both large fixed-size objects and a mixed object size distribution. Aggregate performance reaches over 1000 MB/s using a single native REST connector, and the results include erasure coding to highlight the high data durability. The RING is designed to address active archive challenges and performance workloads across a broad spectrum of applications at petabyte scale while supporting optimal availability and durability.
A disaster recovery or backup retrieval situation almost instantly changes the data access requirements: retrieval becomes bound by access time and data transfer speed. High throughput with erasure coding enables faster retrieval and helps ensure that data is safe and can be accessed with maximum efficiency. Active Archives provide significant added value: they are searchable (unlike cold/offline archives) and can be offered as a service for research. Preserved as a digital library, Active Archives enable faster legal compliance actions and create new revenue streams for paid service models such as video archives.
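To make the link between throughput and retrieval time concrete, here is a minimal back-of-the-envelope sketch. It takes the 1000 MB/s aggregate figure quoted in this document as the sustained rate; the 10 TB restore size and the function name `retrieval_time_seconds` are illustrative assumptions, and the estimate ignores access latency and protocol overhead.

```python
def retrieval_time_seconds(data_bytes: int, throughput_mb_per_s: float) -> float:
    """Seconds needed to transfer data_bytes at a sustained rate in decimal MB/s."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_bytes / (throughput_mb_per_s * 1_000_000)


TB = 10**12  # decimal terabyte

# Assumption: a hypothetical 10 TB restore at the 1000 MB/s aggregate rate.
restore_seconds = retrieval_time_seconds(10 * TB, 1000.0)
print(f"{restore_seconds / 3600:.2f} hours")  # prints "2.78 hours"
```

Under these assumptions the transfer time scales linearly with the amount of data restored, so halving the sustained throughput doubles the retrieval window.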
Active Archives serve a wide spectrum of verticals: Media & Entertainment, Financial Enterprise, Government, Public and Commercial HPC. All seek the benefits of deploying their own active archive at cloud scale.
An Active Archive is a primary copy of infrequently accessed information that has been moved off the primary production storage to a tier-two system without impacting user or application workflows. Workloads are characterized by large file sizes (see table): videos, imaging research data, scientific archives such as meteorological data, and cloud backups all call for instant accessibility. Media and entertainment industries build their own private clouds and leverage active video archives at petabyte scale. High-resolution video files must be ingested rapidly into the archives and made quickly accessible for broadcasting. Legacy archival system latencies no longer suffice when client queries scale: offline storage (tapes) will incur lead times of minutes, hours or even days before the first byte of data can be made available.
The RING erasure coding mode enables significantly lower overhead for data storage and minimizes network traffic compared to replication. It is designed for high failure tolerance at a lower cost than replication. Users may use any combination of data and coding chunks for different classes of service to provide high data durability in production deployments.
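The storage overhead difference between erasure coding and replication can be illustrated with simple arithmetic. The sketch below assumes a generic scheme with k data chunks and m coding chunks that tolerates the loss of any m chunks (as a Reed-Solomon style code would); the specific (k, m) pairs and function names are hypothetical and are not taken from RING documentation.

```python
from fractions import Fraction


def ec_overhead(data_chunks: int, coding_chunks: int) -> Fraction:
    """Raw bytes stored per byte of user data for k data + m coding chunks."""
    if data_chunks <= 0 or coding_chunks < 0:
        raise ValueError("need k > 0 and m >= 0")
    return Fraction(data_chunks + coding_chunks, data_chunks)


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data when keeping full copies."""
    if copies <= 0:
        raise ValueError("need at least one copy")
    return Fraction(copies, 1)


# Hypothetical classes of service: (data chunks, coding chunks).
for k, m in [(4, 2), (9, 3)]:
    print(f"EC {k}+{m}: {float(ec_overhead(k, m)):.2f}x raw storage, "
          f"survives {m} lost chunks")

# Three full copies survive the loss of two of them.
print(f"3 copies: {float(replication_overhead(3)):.2f}x raw storage, "
      f"survives 2 lost copies")
```

With these example parameters, a 9+3 scheme stores about 1.33 bytes per byte of user data and survives three lost chunks, while three-way replication stores 3 bytes per byte and survives two lost copies.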
Why Choose Scality RING for Active Archive?
- Instant access to archived data
- Scales as a single, uniform system
- Lower TCO vs tier-1 online and offline tape
- No hidden data access fees
- Choose any x86 standard hardware
- Line rate throughput performance
- Limitless storage
The performance was validated at a customer lab, and more details are available upon request. Throughput scales up to saturate the 10GbE link with both larger fixed-size objects and a distribution of mixed-size objects (512KB to 64MB). Both workloads quickly maximized bandwidth utilization and reached a plateau in write throughput. Consistently high throughput performance was observed across mixed and fixed size workloads using only a single native REST connector.
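For readers who want to relate a measured throughput to the link capacity, the following sketch converts a nominal link speed to decimal MB/s and computes utilization. It uses the raw bit rate only and ignores Ethernet, IP, and HTTP framing overhead; the function names are illustrative, and the 1000 MB/s input is the aggregate figure quoted earlier in this document.

```python
def line_rate_mb_per_s(link_gbps: float) -> float:
    """Raw link bit rate converted to decimal megabytes per second."""
    return link_gbps * 1000 / 8


def utilization(observed_mb_per_s: float, link_gbps: float) -> float:
    """Fraction of the raw link rate consumed by the observed throughput."""
    return observed_mb_per_s / line_rate_mb_per_s(link_gbps)


raw_rate = line_rate_mb_per_s(10)   # 10GbE raw rate: 1250.0 MB/s
share = utilization(1000.0, 10)     # 1000 MB/s observed on a 10GbE link
print(f"raw rate {raw_rate:.0f} MB/s, utilization {share:.0%}")
```

Because framing overhead is excluded, the usable payload rate of the link is somewhat below the raw figure, so the true utilization at a given payload throughput is higher than this simple ratio suggests.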