By Gregoire Doumergue, global support engineer, Scality
The unending barrage of data continues, coming faster than ever. With data backups, surveillance streams, video, email, files, genomics and more, most of that data is unstructured data. Object storage systems are designed for this type of data at petabyte scale.
As AWS has become the leading public cloud provider, its S3 (Simple Storage Service) has grown into the largest and most popular object storage service. Scality provides S3-compatible storage, so the data it stores can be accessed and managed over an S3-compliant interface. Let’s examine three features of Scality’s S3-compatible storage.
Bucket versioning simplifies preserving, retrieving and restoring
Bucket versioning is a key feature of S3. Scality implemented it to be 100% compatible with AWS specifications. When a bucket is versioned, a single S3 object can have multiple payloads, and each payload is identified by a version ID. The version ID is assigned by the S3 cluster when an upload succeeds.
At its core, bucket versioning is a way to keep multiple variants of an object in the same bucket. This feature is useful for preserving, retrieving and restoring each version of every object stored in your buckets. Versioning helps you to more easily recover from application failures and unintended user actions.
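As a minimal sketch of the behavior described above, the helpers below enable versioning on a bucket, upload a payload and read back the version ID the cluster assigned. They take an S3 client as a parameter; with boto3 that would typically be created with boto3.client("s3", endpoint_url="http://node"), where the endpoint is a placeholder. The function names and the bucket and key values are illustrative assumptions, not part of any Scality tooling.

```python
def enable_versioning(s3, bucket):
    """Turn on versioning for an existing bucket."""
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


def put_and_get_version(s3, bucket, key, body):
    """Upload a payload and return the version ID assigned to it."""
    response = s3.put_object(Bucket=bucket, Key=key, Body=body)
    return response["VersionId"]


def list_version_ids(s3, bucket, key):
    """Return the version IDs stored for one key, in the order the service lists them."""
    response = s3.list_object_versions(Bucket=bucket, Prefix=key)
    # Prefix also matches longer keys, so filter on the exact key.
    return [v["VersionId"] for v in response.get("Versions", []) if v["Key"] == key]
```

Uploading the same key twice should yield two different version IDs, and both payloads remain retrievable by passing VersionId to get_object. The listing helper reads a single page of results only, so a key with a very large number of versions would need a paginator.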
When an object in a versioned bucket is deleted, it is not removed but hidden behind a special version called a DeleteMarker. This mechanism prevents accidental data deletion. To actually delete a payload, one must provide the ID of that specific version.
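The difference between the two kinds of delete can be sketched with boto3-style calls. The helper names are hypothetical and the client is passed in as a parameter; a plain delete returns the DeleteMarker's own version ID, while a delete that names a version ID removes that payload for good.

```python
def hide_object(s3, bucket, key):
    """Plain delete: in a versioned bucket this only adds a DeleteMarker.

    Returns (is_delete_marker, version_id_of_the_marker).
    """
    response = s3.delete_object(Bucket=bucket, Key=key)
    return response.get("DeleteMarker", False), response.get("VersionId")


def delete_version(s3, bucket, key, version_id):
    """Permanently remove one payload by naming its version ID."""
    s3.delete_object(Bucket=bucket, Key=key, VersionId=version_id)
```

Calling delete_version with the version ID of the DeleteMarker itself removes the marker, which makes the previous version of the object visible again.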
If you need to empty a versioned bucket, Scality technical services can provide a Python 3 script, which depends on the boto3 Python library. Here’s how to run it:
$ ./bucket_clean.py --endpoint http://node --bucket buckettobedeleted
are you sure? (y/N): y
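For readers curious what emptying a versioned bucket involves, here is an illustrative sketch. It is not the Scality script mentioned above, and its function names are assumptions. It lists every version and DeleteMarker with a boto3-style paginator, then removes them in batches of at most 1,000 entries, the per-request limit of the S3 multi-object delete call.

```python
def iter_all_versions(s3, bucket):
    """Yield {"Key", "VersionId"} for every version and DeleteMarker in the bucket."""
    paginator = s3.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket):
        for entry in page.get("Versions", []) + page.get("DeleteMarkers", []):
            yield {"Key": entry["Key"], "VersionId": entry["VersionId"]}


def empty_versioned_bucket(s3, bucket, batch_size=1000):
    """Delete every version and marker; return how many entries were removed."""
    # Listing is completed before deleting, so entries are held in memory.
    entries = list(iter_all_versions(s3, bucket))
    for start in range(0, len(entries), batch_size):
        batch = entries[start:start + batch_size]
        response = s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": batch, "Quiet": True},
        )
        errors = response.get("Errors", [])
        if errors:
            raise RuntimeError(
                "%d deletions failed, first: %r" % (len(errors), errors[0])
            )
    return len(entries)
```

Because the sketch collects the full listing first, a bucket with many millions of versions would call for a streaming variant. Versions protected by Object Lock are reported in the Errors list and stop the run.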
Read our blog about discovering data from existing buckets for more info.
Object Lock is the key to WORM
S3 Object Lock is the S3 feature that provides write-once-read-many (WORM) capabilities. As with bucket versioning, Scality has implemented it to be 100% compatible with AWS specifications. Through the S3 API, objects can be stored using a WORM model, which prevents an object from being deleted or overwritten for a fixed amount of time or indefinitely. Object Lock helps you meet regulations that require WORM storage, or simply adds another layer of protection against object changes and deletion.
Object Lock has a few requirements. First, you need to set the Object Lock flag when creating the bucket, and versioning must be enabled on that bucket. In addition, a lock configuration can only be written with the PUT Object Lock Configuration API on a bucket that has its Object Lock flag set.
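These requirements translate into three boto3-style calls, sketched below under the assumption of a 30-day default retention rule. The helper names, the default mode and the retention length are illustrative choices, and the client is passed in as a parameter.

```python
def default_lock_configuration(mode, days):
    """Build the body for PUT Object Lock Configuration with a default retention rule."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    if days <= 0:
        raise ValueError("days must be positive")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


def create_locked_bucket(s3, bucket, mode="GOVERNANCE", days=30):
    """Create a bucket with the Object Lock flag, versioning and a default rule."""
    # The flag has to be set at creation time.
    s3.create_bucket(Bucket=bucket, ObjectLockEnabledForBucket=True)
    # Explicitly make sure versioning is on, since Object Lock depends on it.
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3.put_object_lock_configuration(
        Bucket=bucket,
        ObjectLockConfiguration=default_lock_configuration(mode, days),
    )
```

With a default rule in place, new object versions written to the bucket are locked automatically for the configured period unless the upload specifies its own retention settings.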
S3 has several means to set the lock configuration of an object. Retention modes, including Governance and Compliance modes, retain the lock on an object until a set period of time expires. Governance mode lets you delegate permission to certain users to override the lock settings. Only the root account or a user with the s3:BypassGovernanceRetention permission can send a delete request with the x-amz-bypass-governance-retention:true header to override the lock and delete the object. Governance mode can be useful for protecting a backup file from a ransomware attack or accidental deletion. It is also used to test retention-period settings before creating a compliance-mode retention period.
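As a hedged sketch of Governance mode in practice, the helpers below place a retention period on one object version and then delete it with the bypass flag. In boto3, passing BypassGovernanceRetention=True is what sends the x-amz-bypass-governance-retention header. The helper names and the way the retention date is computed are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def lock_version(s3, bucket, key, version_id, days, mode="GOVERNANCE"):
    """Retain one object version for `days` days; return the retain-until date."""
    retain_until = datetime.now(timezone.utc) + timedelta(days=days)
    s3.put_object_retention(
        Bucket=bucket,
        Key=key,
        VersionId=version_id,
        Retention={"Mode": mode, "RetainUntilDate": retain_until},
    )
    return retain_until


def delete_governed_version(s3, bucket, key, version_id):
    """Delete a Governance-locked version before its retention period expires.

    The caller needs the s3:BypassGovernanceRetention permission.
    """
    s3.delete_object(
        Bucket=bucket,
        Key=key,
        VersionId=version_id,
        BypassGovernanceRetention=True,
    )
```

The same lock_version call with mode="COMPLIANCE" creates a lock that the bypass flag cannot override.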
If you place a lock on an object using Compliance mode, the object version cannot be deleted by any user until the retention period expires. This includes the account’s root user (account credentials), and no user can be given permission to override the settings or delete the version.
Track storage use more easily with UTAPI
Short for Utilization API, UTAPI was initially built as a metering application for Scality customers to track storage use. A custom API was built because comparable services in AWS did not have a full set of features that met the requirements at that time.
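To give a feel for how a metering query might look, the sketch below builds the JSON body for a bucket-level ListMetrics request. It assumes the request format described in the open-source UTAPI project's v1 README: a POST to /buckets?Action=ListMetrics with a list of bucket names and a time range in epoch milliseconds aligned to 15-minute intervals. The helper names are hypothetical, and the format accepted by a given deployment or UTAPI version may differ.

```python
import json
from datetime import datetime, timezone

INTERVAL_MS = 15 * 60 * 1000  # assumed 15-minute metering interval


def aligned_time_range(start, end):
    """Snap two aware datetimes to 15-minute interval boundaries.

    Returns [start_ms, end_ms], where start_ms is the beginning of the
    interval containing `start` and end_ms is the last millisecond of
    the interval containing `end`.
    """
    start_ms = int(round(start.timestamp() * 1000))
    end_ms = int(round(end.timestamp() * 1000))
    aligned_start = start_ms - start_ms % INTERVAL_MS
    aligned_end = end_ms - end_ms % INTERVAL_MS + INTERVAL_MS - 1
    return [aligned_start, aligned_end]


def list_metrics_body(buckets, start, end):
    """Build the JSON body for an assumed bucket-level ListMetrics request."""
    return json.dumps(
        {"buckets": list(buckets), "timeRange": aligned_time_range(start, end)}
    )
```

For example, a range from 00:15 to 00:40 UTC on October 12, 2016 is widened to 00:15:00.000 through 00:44:59.999. Sending the request also requires signing it with the account's credentials, which is left out of this sketch.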
Version 1 had certain limitations and issues. Requirements changed; the initial design was done to manage buckets in the range of thousands. Scalability issues arose when hundreds of thousands of buckets were created and deleted at random. There was high CPU and RAM usage from the key-value database (Redis) used by UTAPI during heavy usage.
In addition, incorrect metrics were pushed from Cloudserver APIs to Redis for reasons such as:
- Corner cases of API usage like object overwrite, buckets deleted with incomplete multipart upload objects, and so on.
- Bugs in reindexer that resulted in incorrect metrics, even after reindexing.
Here’s what’s different about the V2 design: It removes Redis and introduces Warp10, a time series database (TSDB), for persistent storage of metrics. Warp10’s time series data format is efficient both for storage on disk and for retrieval. Finally, V2 addresses the scalability issues faced in V1 when hundreds of thousands of buckets are created and destroyed on the platform.
Ideal object storage
This is a quick overview of three key features of Scality’s S3 Connector. We offer the S3 Connector for RING as the ideal solution for organizations that require an on-premises deployment model to maintain control over sensitive data, for performance optimization, or for reasons of security or compliance. Scality continues to enhance the RING Supervisor UI to provide simplified access to these S3 bucket capabilities.
For more details, take a look at this informative post that dives into the question, “what is S3 compatible storage?”