From POSIX to AWS and S3 – an Evolution
From the Portable Operating System Interface (POSIX), first standardized in 1988, to Amazon Web Services’ Simple Storage Service (S3), introduced in 2006, data persistence is experiencing a revolution. In 2013, AWS announced that its S3 platform housed over 2 trillion objects and that the number was doubling every year. Three further doublings would bring the count to something like 16 trillion objects today. If we assume an average object size of only 256KB, the platform would currently hold roughly 4 exabytes, which puts it somewhere around 1% of all the world’s digital data, probably more.
This is surely the single largest data storage platform on earth.
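The estimate above is easy to reproduce. The following is only a back-of-the-envelope sketch: the 2013 object count comes from the AWS announcement, while the three yearly doublings and the 256KB average object size are our assumptions, not published figures.

```python
# Back-of-the-envelope capacity estimate. Every input is either the 2013
# figure quoted in the text or an assumption, not a measured value.
OBJECTS_2013 = 2 * 10**12        # "over 2 trillion objects" reported in 2013
DOUBLINGS = 3                    # assumption: one doubling per year for three years
AVG_OBJECT_BYTES = 256 * 1024    # assumption: 256KB average object size

def estimate_capacity_bytes(objects, doublings, avg_bytes):
    """Project the object count forward and multiply by the average size."""
    return objects * 2**doublings * avg_bytes

total_objects = OBJECTS_2013 * 2**DOUBLINGS
total_bytes = estimate_capacity_bytes(OBJECTS_2013, DOUBLINGS, AVG_OBJECT_BYTES)

# 1 exabyte is taken here as 10**18 bytes.
print(f"{total_objects:.1e} objects, about {total_bytes / 10**18:.1f} EB")
```

Doubling the assumed average object size doubles the result, so the estimate is only as good as that guess.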
The success of AWS and its storage platform can no longer be ignored, but does S3 make sense for the future? It’s interesting to reflect on why AWS introduced such a protocol to the market and why it does or doesn’t make sense.
We cannot know everything Amazon was thinking when the protocol was introduced, but traditional filesystem constraints are well known. AWS was designed to provide on-demand IT infrastructure, what is today called Infrastructure as a Service (IaaS). Being able to start and stop services on demand was wonderful, but the information those services created needed to be made persistent. The data persistence system needed special properties, among them:
- Infinite scalability: there could be no limit to how large the system could grow
- The protocol had to be WAN friendly
- The system had to provide simple, reliable security
- The system had to be highly available and robust
- The storage needed to fit into an increasingly web-centric world.
POSIX filesystems have been the mainstay of storage since their introduction, and the hierarchical organization of a filesystem is certainly a model well understood by humans. They nevertheless have constraints that make scaling difficult, if not impossible, as they grow. Many storage administrators have experienced watching inode counts (an inode being the record that uniquely identifies each file or directory) like milk warming on the stove, knowing that beyond certain limits the system becomes unstable. The users of POSIX filesystems are also a demanding lot. Full consistency is expected across the entire data tree. Any modification to any file or to the directory structure must be visible to all users of the filesystem simultaneously. Application error handling mechanisms are not even expected to entertain the possibility that this is not true. The POSIX filesystem must publicly admit any deviation from the rules by returning a dreaded I/O error.
It’s not hard to see that AWS needed to break the rules in one way or another in order to scale to the dizzying heights they have attained today, and what better way to break the rules than to simply change them? By introducing an altogether new protocol, AWS was able to establish a new set of rules and conventions that allowed them to scale and provide services that made more sense to their customers. Here’s a high-level view:
- S3 is not hierarchical: objects are stored by name in a collection called a bucket. The bucket is indexed and can be listed with a prefix-based search. This provides a clear scaling advantage: the consistency domain is constrained to a single bucket instead of the entire hierarchical tree, as is the case for POSIX.
- The bucket is guaranteed to be eventually consistent. The application must accept that an object that was just updated may or may not appear instantly in the list and may or may not contain the latest information; the platform is only required to make a best effort at being immediately consistent. Over time, the system’s behavior has shown that data is in general fully consistent, but the architectural freedom to relax the constraint has real performance and scalability advantages.
- Requests are REST-based and self-contained. This method is very well adapted to WAN-based interactions.
- Requests are signed with a shared secret and communicated over an SSL connection. This provides protection against man-in-the-middle attacks and prevents passwords from being communicated in requests.
- The contents of a bucket can be presented using a virtual hosting model where the contents are presented as https://bucket.s3.amazonaws.com/object_name. This allows an AWS S3 bucket to behave directly as a static website.
- Due to this virtual hosting model, bucket names must respect DNS naming conventions and are unique across all of the AWS platform. Each end user is also limited to 100 buckets.
- These constraints have prompted a model where applications generally put thousands or millions of objects in a single bucket. Hierarchical filesystem models are often emulated by choosing object names whose prefixes resemble directory paths, like /foo/bar/foo/object. While this permits GUI applications to give the impression of browsing a hierarchy, the principal usage of S3-based storage is application-based and not user-based.
- Access to objects or entire buckets can be granted, permanently or temporarily, to any identified user of the platform or to anonymous users.
- Highly granular billing is available based on the quantity of data stored and the frequency of its access.
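To make the flat namespace and prefix-based listing described above more concrete, here is a minimal in-memory sketch. It is a toy model and not the S3 API: the `Bucket` class, its `put` and `list` methods, and the sample object names are all hypothetical, and the optional `delimiter` grouping is our addition to show how a GUI could present "folders" on top of flat names.

```python
from bisect import bisect_left

class Bucket:
    """Toy stand-in for a bucket: a flat, sorted index of object names."""

    def __init__(self):
        self._keys = []   # object names, kept sorted
        self._data = {}   # object name -> contents

    def put(self, key, body):
        # New names are inserted in sorted position; existing ones are overwritten.
        if key not in self._data:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._data[key] = body

    def list(self, prefix="", delimiter=None):
        """Return (keys, common_prefixes) for names starting with prefix."""
        keys, common = [], []
        # Names sharing a prefix are contiguous in the sorted index.
        for key in self._keys[bisect_left(self._keys, prefix):]:
            if not key.startswith(prefix):
                break
            rest = key[len(prefix):]
            if delimiter and delimiter in rest:
                # Roll up everything below the next delimiter into one "folder".
                folder = prefix + rest.split(delimiter, 1)[0] + delimiter
                if not common or common[-1] != folder:
                    common.append(folder)
            else:
                keys.append(key)
        return keys, common

bucket = Bucket()
for name in ["foo/bar/a.txt", "foo/bar/b.txt", "foo/baz.txt", "other.txt"]:
    bucket.put(name, b"")

print(bucket.list(prefix="foo/", delimiter="/"))
# (['foo/baz.txt'], ['foo/bar/'])
```

There are no directories anywhere in this model. The hierarchy exists only in how the names are chosen and how a listing is grouped, which is why the index can stay a simple sorted collection per bucket.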
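The idea of signing a request with a shared secret, rather than sending a password, can also be sketched in a few lines. This is a generic HMAC illustration under our own assumptions and not the actual AWS signature algorithm: the canonical string layout, the function names, and the sample key are all hypothetical.

```python
import hashlib
import hmac

def sign_request(secret_key: bytes, method: str, path: str, date: str) -> str:
    """Return a hex HMAC-SHA256 over a simple canonical form of the request."""
    # Hypothetical canonical form: method, path and date joined by newlines.
    string_to_sign = "\n".join([method.upper(), path, date])
    return hmac.new(secret_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

def verify_request(secret_key: bytes, method: str, path: str, date: str, signature: str) -> bool:
    """Server side: recompute the signature with the same secret and compare."""
    expected = sign_request(secret_key, method, path, date)
    return hmac.compare_digest(expected, signature)  # constant-time comparison

secret = b"example-shared-secret"   # known to both client and server, never sent
sig = sign_request(secret, "GET", "/object_name", "2016-01-01T00:00:00Z")

print(verify_request(secret, "GET", "/object_name", "2016-01-01T00:00:00Z", sig))
print(verify_request(secret, "GET", "/other_object", "2016-01-01T00:00:00Z", sig))
```

Only the signature travels with the request. Because it covers the method, path and date, changing any of them in transit invalidates it, and the secret itself never appears on the wire.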
In the next installment we will discuss how applications use AWS S3 storage and the many improvements made to the model over the 10 years of its existence. We’ll also discuss how Scality’s industry-leading object storage backend is embracing this data persistence revolution.