Distributed storage combines hardware and software into a scale-out storage system designed to absorb the growth of unstructured data. Data can be split across many physical servers and even multiple data centers. It generally relies on a cluster of storage nodes, with synchronization and coordination mechanisms between them.
Distributed storage is one of the foundational technologies behind cloud service providers as well as on-premises storage deployments. It can serve file storage, block storage, and object storage.
Common features of distributed storage include partitioning, which distributes data across the nodes of a cluster, and replication, which keeps copies of the data on several nodes and keeps those copies consistent with one another. Distributed storage is also fault tolerant: availability is maintained if a node goes down. It also offers elastic scalability, meaning the system can add or remove storage capacity as needed. Distributed object storage, as the name suggests, is object storage implemented in a distributed fashion. This implementation combines the features of object storage with the benefits of distributed storage.
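The partitioning and replication described above can be illustrated with a minimal sketch. It is not how any particular product places data: the function names `replica_nodes` and `read_from`, the node names, and the choice of simple modulo hashing are all assumptions made for illustration.

```python
import hashlib


def replica_nodes(key: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Pick distinct nodes for a key: hash the key to a starting node
    (partitioning), then take the following nodes as copies (replication)."""
    if not nodes:
        raise ValueError("at least one node is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def read_from(key: str, nodes: list[str], down: set[str], replicas: int = 3):
    """Return the first replica that is still up, or None if all are down.
    This is the fault-tolerance idea: a read survives individual node failures."""
    for node in replica_nodes(key, nodes, replicas):
        if node not in down:
            return node
    return None


# Hypothetical four-node cluster.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = replica_nodes("photos/cat.jpg", nodes)
fallback = read_from("photos/cat.jpg", nodes, down={placement[0]})
```

Modulo hashing is used here only because it is short. It remaps most keys whenever the node list changes, which is why production systems typically prefer schemes such as consistent hashing.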
A closer look at one distributed object storage system, Amazon S3, makes the benefits clearer. An S3 object consists of data plus metadata: system metadata such as the last-modified date, and optional custom metadata defined by the user. Objects are organized into buckets, which are logical containers for data. Each object in S3 is identified by a bucket, a key, and a version ID. The key uniquely identifies the object within its bucket, and the version ID distinguishes the individual versions of that object.
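The bucket, key, and version ID relationship can be made concrete with a small in-memory model. This is a toy sketch, not the S3 API: the `Bucket` and `ObjectVersion` classes, their method names, and the use of random hex strings as version IDs are assumptions for illustration, and unlike this sketch, S3 only keeps multiple versions when versioning is enabled on the bucket.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ObjectVersion:
    data: bytes
    version_id: str          # hypothetical format: random hex string
    last_modified: datetime  # system metadata
    user_metadata: dict      # custom metadata supplied by the user


class Bucket:
    """Toy bucket: maps each key to its list of versions, oldest first."""

    def __init__(self, name: str):
        self.name = name
        self._versions: dict[str, list[ObjectVersion]] = {}

    def put(self, key: str, data: bytes, user_metadata=None) -> str:
        """Store a new version under `key` and return its version ID."""
        version = ObjectVersion(
            data=data,
            version_id=uuid.uuid4().hex,
            last_modified=datetime.now(timezone.utc),
            user_metadata=dict(user_metadata or {}),
        )
        self._versions.setdefault(key, []).append(version)
        return version.version_id

    def get(self, key: str, version_id=None) -> ObjectVersion:
        """Return the latest version, or a specific one if `version_id` is given."""
        versions = self._versions[key]  # raises KeyError for an unknown key
        if version_id is None:
            return versions[-1]
        for version in versions:
            if version.version_id == version_id:
                return version
        raise KeyError(f"no version {version_id!r} for key {key!r}")


bucket = Bucket("demo-bucket")
first = bucket.put("reports/q1.txt", b"draft", {"owner": "alice"})
second = bucket.put("reports/q1.txt", b"final")
latest = bucket.get("reports/q1.txt")          # the "final" version
original = bucket.get("reports/q1.txt", first)  # the "draft" version
```

Writing to the same key twice does not overwrite anything in this model. Each write adds a version, and a read without a version ID returns the most recent one.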
Users specify which bucket to store objects in or retrieve them from. The actual data, though, is distributed across a number of storage nodes in various zones within a single fixed region. For example, a user requests a piece of data by its unique ID, while the data itself may be spread across storage nodes in the US-East region. This design gives the data high elasticity and scalability along with high uptime and reliability.
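To illustrate the idea of spreading one object across zones inside a single region, the sketch below picks one storage node per zone for a given bucket and key. S3's real placement logic is internal and not described in this text, so everything here is an assumption: the function name `place_across_zones`, the zone and node names, and the hash-based selection.

```python
import hashlib


def place_across_zones(bucket: str, key: str, zones: dict[str, list[str]]) -> dict[str, str]:
    """Choose one storage node in every zone for the object `bucket/key`.
    The caller only names the bucket and key; the placement is derived from them."""
    object_id = f"{bucket}/{key}"
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big")
    placement = {}
    for zone, nodes in sorted(zones.items()):
        if nodes:  # skip zones that currently have no storage nodes
            placement[zone] = nodes[slot % len(nodes)]
    return placement


# Hypothetical zones and nodes inside one region.
region_zones = {
    "zone-1": ["z1-node-a", "z1-node-b"],
    "zone-2": ["z2-node-a", "z2-node-b"],
    "zone-3": ["z3-node-a", "z3-node-b"],
}
placement = place_across_zones("demo-bucket", "reports/q1.txt", region_zones)
```

Because every zone holds a copy, losing one zone still leaves the object readable from the others, which is the source of the uptime and reliability mentioned above.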