SOFTWARE-DEFINED OBJECT STORAGE

WHAT IS OBJECT STORAGE?

There are two types of traditional storage systems: block storage, which manages data as blocks within sectors and tracks, and file storage, which manages files organized into hierarchical file systems.

Block storage is used by Storage Area Networks (SANs), where a SAN disk array is connected to servers via a SCSI, iSCSI (SCSI over TCP/IP, typically on Ethernet) or Fibre Channel network.

File storage exists in two forms: file servers and Network Attached Storage (NAS), which is a file server appliance. File storage uses standard network file sharing protocols, such as NFS and SMB (formerly known as CIFS), to exchange file content between systems. File systems rely on two kinds of index: inode tables, which record where the data resides on the physical storage devices or appliances, and file paths, which provide the addresses of those files. Standard file system metadata, stored separately from the file itself, records basic file attributes such as the file name, the length of the file contents, and the file creation date.

Block Storage versus File Storage versus Object Storage

Object storage is designed to be massively scalable and as such is fundamentally different from traditional block or file storage systems. Object storage organizes information into containers of flexible sizes, referred to as objects. Each object includes the data itself as well as its associated metadata and has a globally unique identifier, instead of a file name and a file path. These unique identifiers are arranged in a flat address space, which removes the complexity and scalability challenges of a hierarchical file system based on complex file paths.

Metadata in object storage systems can be augmented with custom attributes (known as “extended attributes”) to handle additional file-related information. Doing the same with a traditional storage system would require a custom application and database to manage the metadata.
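The object model described above can be sketched in a few lines of Python. This is a toy, in-memory illustration only: the class name FlatObjectStore, its methods, and the sample metadata keys are all hypothetical and do not correspond to any product's API.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace, no directories."""

    def __init__(self):
        # Maps object ID -> (data, metadata). There is no path hierarchy.
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # The identifier is generated by the store, not derived from a file path.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        # Data and metadata travel together as one object.
        return self._objects[object_id]


store = FlatObjectStore()
oid = store.put(
    b"scan bytes",
    {"content-type": "image/png", "patient-id": "hypothetical-123"},  # custom attribute
)
data, meta = store.get(oid)
```

The custom "patient-id" attribute is stored with the object itself, so no separate database is needed to keep track of it.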

Object Storage Protocols:

Natively, object storage systems speak RESTful HTTP protocols, the same ‘language’ as the Web. Because of this native support for Web protocols, an object storage system is well suited to Web 2.0, cloud-native and XaaS use cases. Historically, this Web-centricity was considered an impediment to adoption by mainstream enterprise applications, which use traditional NFS, SMB, or SCSI interfaces. However, this has changed with the rise of cloud computing, mobile applications, and cloud-native applications, which all use HTTP to provide and access services.

To provide universal information access within an object storage system, some object storage vendors have added support for enterprise file sharing protocols such as NFS and SMB, either natively, as Scality does, or by using a cloud gateway. In addition, some object storage systems support two other important HTTP-based protocols: the Amazon Web Services Simple Storage Service (S3) API, which is a de facto standard; and CDMI, the Cloud Data Management Interface, an industry standard API for accessing cloud storage, specified and promoted by the Storage Networking Industry Association (SNIA).
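To make the HTTP-based access model concrete, here is a minimal sketch that builds (but does not send) an S3-style PUT request using only the Python standard library. The endpoint, bucket, and key are hypothetical, and the request is unsigned: a real S3-compatible service would also require authentication headers, which are omitted here. The x-amz-meta- prefix is the S3 convention for user-defined metadata.

```python
from urllib.request import Request

ENDPOINT = "https://objects.example.com"  # hypothetical endpoint, not a real service


def build_put_request(bucket: str, key: str, body: bytes, user_metadata: dict) -> Request:
    """Build an unsigned, path-style PUT request for one object."""
    headers = {"Content-Type": "application/octet-stream"}
    # S3-style user-defined metadata is carried in x-amz-meta-* headers.
    for name, value in user_metadata.items():
        headers["x-amz-meta-" + name] = value
    url = f"{ENDPOINT}/{bucket}/{key}"
    return Request(url, data=body, headers=headers, method="PUT")


req = build_put_request(
    "reports", "2024/q1.pdf", b"example body", {"department": "finance"}
)
print(req.get_method(), req.full_url)
```

Note that the slash in the key "2024/q1.pdf" is simply part of the object's name; it does not imply a directory on the server.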

Data Protection:

RAID has been a classic solution for smaller storage arrays holding terabytes of data. However, as systems have grown to hundreds or thousands of disks, and drives have become much denser, RAID has become problematic: it can’t protect data across these larger fault domains, and rebuilding a single 8-terabyte drive can take days and degrade overall system performance. Rather than using RAID to protect data, object storage provides redundancy and high availability in two ways:

Replication is a data protection technique that stores multiple copies of each object on different nodes and, potentially, across multiple, geographically dispersed data centers. It is particularly appropriate for the protection of large numbers of small files.

Large files, on the other hand, are best protected using a technology called Erasure Coding. Erasure Coding divides an object into pieces and calculates multiple parity pieces from them. If some of the pieces are lost, the system can use the parities and the remaining pieces to recalculate the original data. Some object storage implementations store only parities, which requires a processor-intensive recalculation and decoding of the data on every access.
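The recovery idea behind erasure coding can be illustrated with the simplest possible code: a single XOR parity computed over k data pieces, which can rebuild any one lost piece. This is a deliberately reduced sketch, not how any particular product implements it; production systems typically use codes such as Reed-Solomon that compute several parities and tolerate several simultaneous losses. The function names below are illustrative.

```python
def split(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size pieces, zero-padding the last one."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_parity(pieces: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length pieces."""
    parity = bytearray(len(pieces[0]))
    for piece in pieces:
        for i, byte in enumerate(piece):
            parity[i] ^= byte
    return bytes(parity)


original = b"object storage payload"
pieces = split(original, 4)
parity = xor_parity(pieces)

# Simulate losing piece 2, then rebuild it from the survivors plus the parity.
survivors = pieces[:2] + pieces[3:]
rebuilt = xor_parity(survivors + [parity])

# Reassemble the object and strip the padding added by split().
restored = b"".join(pieces[:2] + [rebuilt] + pieces[3:])[:len(original)]
```

With four data pieces and one parity, the stored volume is 1.25 times the object size, compared with 3 times for three full replicas, which is why erasure coding is attractive for large objects.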

Software-defined:

Most object storage solutions are architected to run on inexpensive commodity x86 hardware. Each server constitutes a node, which provides both compute and storage resources. This allows both capacity and performance to scale linearly simply by adding nodes. Although object storage is often sold as a storage appliance (hardware with installed software), pure object storage is ‘software-only’ and is typically hardware-agnostic.

Distributed Architecture:

Object storage solutions are often designed as a distributed architecture: a collection of servers operating in parallel, with no special machine or machines required to provide or manage specific services. Instead, all responsibilities are divided among the machines, and no central ‘control’ machine is needed. As a result, the architecture has no single point of failure (SPOF).

When evaluating object storage, you should ask how metadata is accessed and how it scales. Although object storage systems may be designed to avoid a SPOF, many still rely on a specific set of metadata nodes, or are based on non-scalable relational databases. These types of designs will suffer from performance and availability degradation at scale (e.g. tens of millions of objects, which is really only a starting point for object stores).
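One general technique for locating objects without a dedicated metadata node is to compute placement from the object identifier itself, for example with consistent hashing. The sketch below is a generic illustration under stated assumptions, not a description of any vendor's implementation: the HashRing class, the node names, and the choice of 64 virtual positions per node are all hypothetical.

```python
import bisect
import hashlib


def _position(label: str) -> int:
    """Map a string to a point on the ring using the first 8 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(label.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring: any client can compute an object's node locally."""

    def __init__(self, nodes, vnodes=64):
        # Each node owns several points on the ring to even out the load.
        self._ring = sorted(
            (_position(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, object_id: str) -> str:
        # The owner is the first ring point clockwise from the object's position.
        idx = bisect.bisect_right(self._positions, _position(object_id))
        return self._ring[idx % len(self._ring)][1]


before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])

keys = [f"object-{i}" for i in range(1000)]
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} objects move when node-d joins")
```

Because placement is a pure function of the identifier and the node list, no lookup table has to be consulted, and adding a node relocates only the objects that the new node takes over.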

The distributed nature of object storage enables two characteristics essential to massive scalability:

Shared-nothing architecture: A distributed design that combines independent and autonomous nodes into a federated data store. Because none of the nodes share memory or disk storage, there are no single points of contention, which makes the design well suited to massive scale. Furthermore, because nodes are independent, they can be easily added and removed to accommodate changing performance and scalability requirements.

Parallel tasks: Distributed systems can be designed to allow very large numbers of tasks to be run in parallel. In Scality’s RING, this capability has been developed to support very high levels of aggregate throughput.
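As a small illustration of parallel tasks, the sketch below fetches objects from several independent nodes at the same time using a thread pool. The nodes are hypothetical in-memory stand-ins; in a real system each fetch would be a network request to a separate server.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for three independent storage nodes.
NODES = {
    "node-a": {"obj-1": b"alpha"},
    "node-b": {"obj-2": b"beta"},
    "node-c": {"obj-3": b"gamma"},
}


def fetch(location):
    """Read one object from the node that holds it."""
    node, object_id = location
    return NODES[node][object_id]


locations = [("node-a", "obj-1"), ("node-b", "obj-2"), ("node-c", "obj-3")]

# Each fetch runs in its own worker; map() returns results in input order.
with ThreadPoolExecutor(max_workers=len(locations)) as pool:
    results = list(pool.map(fetch, locations))
```

Because no node depends on another to serve its objects, the number of concurrent fetches can grow with the number of nodes.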

Object Storage in the Cloud

What is object storage in the context of cloud? An object storage system is the infrastructure used by many if not most cloud storage services. In fact, these services might be termed cloud object storage. As explained earlier, with software-defined object storage, server resources are aggregated into a single massively scalable pool—an object storage service—distributing capacity to applications and users on demand.

You don’t have to rely on the public cloud for object storage. You can build the equivalent in your own data center with an object storage system like the Scality RING. The result is what could be called private cloud object storage, or an enterprise cloud. You gain advantages comparable to those of public cloud data storage: high scalability, higher reliability, and lower costs, thanks to the ability to use standard x86 hardware. All in all, the Scality RING is an attractive on-premises alternative for companies that feel their data is too sensitive for a public cloud object storage option.

If you’re a cloud storage provider, or would like to become one, the Scality RING object storage system is the perfect choice for your core infrastructure. Deliver an object storage service to enterprises or individuals, for a variety of use cases such as capacity-driven workloads and cloud backup services.

Object Storage Use Cases:

Object storage is best suited to unstructured data rather than to transactional data in databases, which requires serial operations. It is most commonly used for active archive applications, content distribution, and public and private cloud services. Although object storage was historically used only for “cold” data, the performance requirements of primary storage are well within the reach of some contemporary object storage products, including Scality’s RING.
 