Object Storage
Storage comes in three basic types, each with particular uses:
Block storage
Description : “Data is stored on a device in fixed-sized blocks”
Function : Applications and file systems regulate how blocks are accessed, combined, and modified
Characteristics :
Breaks up data into blocks and stores them as separate pieces with unique identifiers
Blocks stored wherever most efficient
Can be stored across different systems
Each block can be configured to work with different operating systems
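The block-storage characteristics above can be sketched in a few lines of Python. This is only an illustration, not how any particular storage product is implemented: the 4-byte block size, the use of UUIDs as block identifiers, and the helper names `split_into_blocks`, `reassemble`, and `update_block` are all assumptions made for the example.

```python
import uuid

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks, each stored under a unique identifier.

    Returns (store, order): store maps block ID -> block bytes, and order
    lists the block IDs in the sequence needed to rebuild the data.
    """
    store = {}
    order = []
    for offset in range(0, len(data), block_size):
        block_id = uuid.uuid4().hex  # hypothetical identifier scheme
        store[block_id] = data[offset:offset + block_size]
        order.append(block_id)
    return store, order


def reassemble(store, order) -> bytes:
    """Combine the blocks back into the original byte sequence."""
    return b"".join(store[block_id] for block_id in order)


def update_block(store, order, index: int, new_block: bytes) -> None:
    """Modify a single block; every other block is left untouched."""
    store[order[index]] = new_block


store, order = split_into_blocks(b"hello block storage")
update_block(store, order, 0, b"HELL")
rebuilt = reassemble(store, order)
```

The point of the sketch is the contrast with object storage: here an update touches one block, while an object store replaces the whole object.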
File storage
Description : “Data is stored in a hierarchical structure”
Function : Methodology that helps users, applications, and services access data in a shared file system
Characteristics : Similar to a centralized shared network drive in a company where employees store and access files
Object storage
Description : “Data is stored as objects based on attributes and metadata”
Function : Files stored as objects with data, metadata, and object key
Characteristics :
Metadata has information about the data (object size, object purpose, and more)
Object key is the unique identifier of the object
When you update a file, the entire object is replaced rather than a single piece of it
Unlimited Storage : “Amazon S3 stores massive (unlimited) amounts of unstructured data”
Object-Based : “Amazon S3 stores data files as objects in a bucket that you define”
Maximum File Size : “Five TB is the maximum file size of a single object”
Global Namespace : “Objects have a globally unique URL (universal namespace)”
Bucket Naming : Every bucket must have a name that is globally unique across Regions and all AWS customer accounts
Each object has five consistent characteristics:
Key
The name you assign to an object. Used to retrieve the object. Includes the full path relative to the bucket root (Amazon S3 has no real directories; the path is simply part of the key).
Version ID
In a bucket, a key and version ID uniquely identify an object.
Value
The actual content that you store. Can be any sequence of bytes. Object values are immutable: you cannot modify them after upload, only replace the whole object.
Metadata
Set of name-value pairs to store information about the object. Includes user-defined metadata and system metadata.
Sub-resources
Amazon S3 uses sub-resources to store additional object-specific information.
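To make the relationship between key, version ID, value, and metadata concrete, here is a minimal in-memory model. It is a teaching sketch, not the Amazon S3 API: `StoredObject`, `ToyBucket`, the `v1`, `v2` version IDs, and the `content-length` system metadata entry are hypothetical names chosen for the example, and sub-resources are left out.

```python
import itertools
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class StoredObject:
    key: str
    version_id: str
    value: bytes
    metadata: Mapping[str, str]


class ToyBucket:
    """Hypothetical bucket where (key, version ID) identifies one object."""

    def __init__(self) -> None:
        self._versions = {}  # (key, version_id) -> StoredObject
        self._latest = {}    # key -> most recent version_id
        self._counter = itertools.count(1)

    def put(self, key: str, value: bytes,
            metadata: Optional[Mapping[str, str]] = None) -> StoredObject:
        # Every upload creates a new version; existing values are never edited.
        version_id = f"v{next(self._counter)}"
        merged = {**(metadata or {}), "content-length": str(len(value))}
        obj = StoredObject(key, version_id, bytes(value), MappingProxyType(merged))
        self._versions[(key, version_id)] = obj
        self._latest[key] = version_id
        return obj

    def get(self, key: str, version_id: Optional[str] = None) -> StoredObject:
        return self._versions[(key, version_id or self._latest[key])]


bucket = ToyBucket()
first = bucket.put("photos/2022/catpiano.jpg", b"old", {"purpose": "demo"})
second = bucket.put("photos/2022/catpiano.jpg", b"new bytes")
latest = bucket.get("photos/2022/catpiano.jpg")
original = bucket.get("photos/2022/catpiano.jpg", first.version_id)
```

Uploading under the same key does not change the earlier value; it adds a second version, and the pair of key and version ID still points at the original bytes.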
Definition : “A bucket is a container for objects that are stored in Amazon S3”
Functions :
Organize the Amazon S3 namespace at the highest level
Identify the account responsible for storage and data transfer charges
Play a role in access control
Serve as the unit of aggregation for usage reporting
Regional Nature : Each bucket is Regional. You choose the AWS Region where Amazon S3 stores the bucket.
Data Locality : “Objects stored in an AWS Region never leave the Region unless you explicitly transfer them to another Region”
Definition : “An object is the fundamental entity that is stored in Amazon S3”
Content : Can be any kind of file (text, video, photo, or other binary format)
Components : Consist of object data and metadata
Metadata Purpose : Describes the object (e.g., content type included in response header so browser knows how to render the file)
Function : “An object key uniquely identifies the object in a bucket”
Uniqueness : Each object in a bucket has exactly one key
Each bucket has a Region-specific endpoint in this format:
https://s3-<aws-region>.amazonaws.com/<bucket-name>/<object-key>
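As a small illustration, the template above can be filled in programmatically. The sketch follows the format exactly as given in these notes; the helper name `object_url` and the example Region are assumptions, and AWS documentation also describes other URL styles (for example, virtual-hosted style with the bucket name in the host), so check the current documentation before relying on a specific host form.

```python
from urllib.parse import quote


def object_url(region: str, bucket: str, key: str) -> str:
    """Fill in the endpoint template from the notes (hypothetical helper).

    The key is percent-encoded, but "/" is kept so prefixes stay readable.
    """
    return f"https://s3-{region}.amazonaws.com/{bucket}/{quote(key, safe='/')}"


url = object_url("us-west-2", "graphics-bucket", "photos/2022/cat piano.jpg")
print(url)
```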
Amazon S3 supports the folder concept by grouping objects that share a key name prefix. To represent a folder, the Amazon S3 console creates a zero-byte object whose key is the folder name followed by a slash (/).
Objects in a bucket named graphics-bucket:
photos/2022/catpiano.jpg
photos/2022/catonphone.jpg
photos/2022/ninepuppies.png
photos/2021/lakefront.png
video-source/9984.mp4
A GET query with the prefix photos/2022 returns:
graphics-bucket/photos/2022/catpiano.jpg
graphics-bucket/photos/2022/catonphone.jpg
graphics-bucket/photos/2022/ninepuppies.png
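The prefix query above amounts to a string match on the start of each key. The following sketch reproduces the example with plain Python; `list_keys` is a hypothetical helper, not an Amazon S3 API call, and it ignores paging and result ordering.

```python
BUCKET = "graphics-bucket"

KEYS = [
    "photos/2022/catpiano.jpg",
    "photos/2022/catonphone.jpg",
    "photos/2022/ninepuppies.png",
    "photos/2021/lakefront.png",
    "video-source/9984.mp4",
]


def list_keys(keys, prefix=""):
    """Return the keys that start with the given prefix, in input order."""
    return [key for key in keys if key.startswith(prefix)]


matches = list_keys(KEYS, prefix="photos/2022")
for key in matches:
    print(f"{BUCKET}/{key}")
```

Because there are no real directories, "photos/2022" is not a location that gets opened; it is just the leading part of three key names.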
Durability
11 nines (99.999999999%) durability
Helps ensure data is not lost
Average annual expected loss: 0.000000001% of stored objects per year
Example: storing 10,000 objects, expect to lose 1 object every 10,000,000 years on average
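The arithmetic behind the durability example can be checked directly. This sketch uses exact fractions to avoid floating-point rounding; the function name `years_per_lost_object` is an assumption, and the calculation treats the durability figure as a simple per-object annual loss rate.

```python
from fractions import Fraction

# 11 nines of durability, written as an exact fraction of 1.
durability = Fraction(99_999_999_999, 100_000_000_000)

# Expected fraction of objects lost per year.
annual_loss_rate = 1 - durability


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / (annual_loss_rate * object_count)


print(float(annual_loss_rate * 100))      # loss rate as a percentage
print(years_per_lost_object(10_000))      # years per lost object, 10,000 objects
print(years_per_lost_object(10_000_000))  # years per lost object, 10,000,000 objects
```

Scaling the object count by 1,000 shortens the expected interval by the same factor, so 10,000 objects and 10,000,000 objects are two views of the same loss rate.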
Availability
4 nines (99.99%) availability
Provides access to data when needed
Scalability
Unlimited capacity to store data
Scalable storage solution
High Performance
Thousands of transactions per second when uploading and retrieving data
Automatically scales to high request rates
“Amazon S3 redundantly stores your objects on multiple devices across multiple facilities in the Amazon S3 Region you designate”
“Amazon S3 is designed to sustain concurrent device failures by quickly detecting and repairing any lost redundancy”
“Amazon S3 regularly verifies the integrity of your data by using checksums”
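The idea of checksum verification combined with redundant copies can be illustrated with a short sketch. It does not describe Amazon S3's internal mechanism, which these notes do not specify: SHA-256, the three in-memory replicas, and the helper names are all assumptions for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex digest used to detect corruption (SHA-256 chosen for illustration)."""
    return hashlib.sha256(data).hexdigest()


def find_corrupt_replicas(replicas, expected: str):
    """Return the indexes of replicas whose checksum no longer matches."""
    return [i for i, data in enumerate(replicas) if checksum(data) != expected]


def repair(replicas, expected: str):
    """Replace every corrupt replica with a copy of a healthy one."""
    healthy = next(data for data in replicas if checksum(data) == expected)
    return [data if checksum(data) == expected else healthy for data in replicas]


original = b"object bytes"
expected = checksum(original)  # recorded when the object was stored

# Simulate three copies on separate devices, one of which has been damaged.
replicas = [original, b"object bytez", original]

corrupt = find_corrupt_replicas(replicas, expected)
repaired = repair(replicas, expected)
```

Redundancy is what makes the repair possible: detection only needs the recorded checksum, but restoring the lost copy needs at least one replica that still verifies.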
Security : Provides many ways to control access to data and encrypt data
Performance : “Your applications can achieve thousands of transactions per second when uploading and retrieving storage from Amazon S3”
Amazon S3 is a foundational object storage service that provides massive scalability, high durability, and strong performance for modern cloud applications. Its bucket-and-object architecture, combined with a global namespace, makes it suitable for a wide variety of storage use cases.