Pablo Rodriguez

Defining Amazon S3

Object Storage

Storage comes in three basic types, each with particular uses:

Block Storage

  • Description: “Data is stored on a device in fixed-sized blocks”
  • Function: Applications and file systems regulate how blocks are accessed, combined, and modified
  • Characteristics:
    • Breaks up data into blocks and stores them as separate pieces with unique identifiers
    • Blocks are stored wherever it is most efficient
    • Blocks can be stored across different systems
    • Each block can be configured to work with different operating systems

File Storage

  • Description: “Data is stored in a hierarchical structure”
  • Function: Methodology that helps users, applications, and services access data in a shared file system
  • Characteristics: Similar to a centralized shared network drive in a company where employees store and access files

Object Storage

  • Description: “Data is stored as objects based on attributes and metadata”
  • Function: Files are stored as objects with data, metadata, and an object key
  • Characteristics:
    • Metadata has information about the data (object size, object purpose, and more)
    • Object key is the unique identifier of the object
    • When you update files, the entire file object is updated instead of a piece

Amazon S3 is an object storage service with the following characteristics:

  • Unlimited Storage: “Amazon S3 stores massive (unlimited) amounts of unstructured data”
  • Object-Based: “Amazon S3 stores data files as objects in a bucket that you define”
  • Maximum File Size: “Five TB is the maximum file size of a single object”
  • Global Namespace: “Objects have a globally unique URL (universal namespace)”
  • Bucket Naming: Every bucket must have a name that is globally unique across Regions and all AWS customer accounts

Each object has five consistent characteristics:

Key

The name you assign to an object. Used to retrieve the object. Includes the full path relative to the bucket root (Amazon S3 doesn’t know about directories).

Version ID

In a bucket, a key and version ID uniquely identify an object.

Value

The actual content that you store. Can be any sequence of bytes. Object values are immutable: they cannot be modified after upload, so changing an object means uploading a complete replacement.

Metadata

Set of name-value pairs to store information about the object. Includes user-defined metadata and system metadata.

Sub-resources

Amazon S3 uses sub-resources to store additional object-specific information.

Buckets

  • Definition: “A bucket is a container for objects that are stored in Amazon S3”
  • Functions:
    • Organize the Amazon S3 namespace at the highest level
    • Identify the account responsible for storage and data transfer charges
    • Play a role in access control
    • Serve as the unit of aggregation for usage reporting
  • Regional Nature: Each bucket is Regional - you choose the AWS Region where Amazon S3 stores buckets
  • Data Locality: “Objects stored in an AWS Region never leave the Region unless you explicitly transfer them to another Region”

Objects

  • Definition: “An object is the fundamental entity that is stored in Amazon S3”
  • Content: Can be any kind of file (text, video, photo, or other binary format)
  • Components: Consist of object data and metadata
  • Metadata Purpose: Describes the object (e.g., content type included in response header so browser knows how to render the file)

Object Keys

  • Function: “An object key uniquely identifies the object in a bucket”
  • Uniqueness: Each object in a bucket has exactly one key
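
To make the relationship between keys, version IDs, values, and metadata concrete, here is a minimal in-memory sketch. It is a toy model, not the AWS API: the names `S3Object` and `ToyBucket`, the random version IDs, and the sample Region are all assumptions made for illustration, and sub-resources are left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class S3Object:
    """Hypothetical model of one stored object version."""
    key: str                 # full path relative to the bucket root
    version_id: str          # together with the key, identifies the object
    value: bytes             # the content; never changed after creation
    metadata: Dict[str, str] = field(default_factory=dict)


class ToyBucket:
    """Hypothetical bucket that keeps every version of every key."""

    def __init__(self, name: str, region: str) -> None:
        self.name = name
        self.region = region
        self._objects: Dict[Tuple[str, str], S3Object] = {}
        self._latest: Dict[str, str] = {}

    def put(self, key: str, value: bytes,
            metadata: Optional[Dict[str, str]] = None) -> S3Object:
        # An upload never edits existing bytes; it stores a new version.
        obj = S3Object(key, uuid.uuid4().hex, value, dict(metadata or {}))
        self._objects[(key, obj.version_id)] = obj
        self._latest[key] = obj.version_id
        return obj

    def get(self, key: str, version_id: Optional[str] = None) -> S3Object:
        # Without a version ID, return the most recent upload for the key.
        return self._objects[(key, version_id or self._latest[key])]


bucket = ToyBucket("graphics-bucket", "us-east-1")
first = bucket.put("photos/2022/catpiano.jpg", b"first upload",
                   {"Content-Type": "image/jpeg"})
second = bucket.put("photos/2022/catpiano.jpg", b"second upload")
```

After the two uploads, `bucket.get("photos/2022/catpiano.jpg")` returns the second version, while passing `first.version_id` still retrieves the original bytes. The slashes in the key are just characters in a string; the model has no directories.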

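Each bucket has a Region-specific endpoint. In path-style form, an object URL looks like this:

https://s3.<aws-region>.amazonaws.com/<bucket-name>/<object-key>

Some older Regions also accept the legacy dash form, s3-<aws-region>.amazonaws.com.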
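
As a rough illustration, an object URL can be assembled from its three parts. This sketch uses the dot-separated host name (s3.<aws-region>.amazonaws.com) found in current AWS documentation; the function name `path_style_url` and the sample Region are assumptions for the example, not part of any AWS tool.

```python
from urllib.parse import quote


def path_style_url(region: str, bucket: str, key: str) -> str:
    """Build a path-style object URL (hypothetical helper).

    The key is percent-encoded so characters such as spaces are URL-safe,
    while '/' is kept because it is part of the key itself.
    """
    return f"https://s3.{region}.amazonaws.com/{bucket}/{quote(key, safe='/')}"


url = path_style_url("us-west-2", "graphics-bucket", "photos/2022/catpiano.jpg")
```

Building the URL this way only produces an address; whether a request to it succeeds still depends on the bucket's access settings.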

Amazon S3 supports the folder concept as a way to group objects that share a name prefix. To represent a folder, the Amazon S3 console creates a zero-byte object whose key is the folder name followed by a slash (/).

Objects in bucket named graphics-bucket:

  • photos/2022/catpiano.jpg
  • photos/2022/catonphone.jpg
  • photos/2022/ninepuppies.png
  • photos/2021/lakefront.png
  • video-source/9984.mp4

A GET request that lists the bucket with the prefix photos/2022 returns:

  • graphics-bucket/photos/2022/catpiano.jpg
  • graphics-bucket/photos/2022/catonphone.jpg
  • graphics-bucket/photos/2022/ninepuppies.png
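
The prefix behavior above can be reproduced with plain string matching, which shows why no real directories are needed. This is a local simulation, not a call to Amazon S3; the helper names `list_keys` and `common_prefixes` are made up for the sketch.

```python
from typing import Iterable, List

KEYS = [
    "photos/2022/catpiano.jpg",
    "photos/2022/catonphone.jpg",
    "photos/2022/ninepuppies.png",
    "photos/2021/lakefront.png",
    "video-source/9984.mp4",
]


def list_keys(keys: Iterable[str], prefix: str = "") -> List[str]:
    """Return the keys that start with the given prefix."""
    return [k for k in keys if k.startswith(prefix)]


def common_prefixes(keys: Iterable[str], prefix: str = "",
                    delimiter: str = "/") -> List[str]:
    """Return the distinct 'folders' directly under the prefix.

    A folder here is just the part of a key up to the next delimiter.
    """
    found: List[str] = []
    for k in keys:
        if not k.startswith(prefix):
            continue
        rest = k[len(prefix):]
        if delimiter in rest:
            folder = prefix + rest.split(delimiter, 1)[0] + delimiter
            if folder not in found:
                found.append(folder)
    return found


matches = list_keys(KEYS, "photos/2022")
top_level = common_prefixes(KEYS)  # ["photos/", "video-source/"]
```

Here `matches` holds the three 2022 photo keys, and `common_prefixes(KEYS, "photos/")` yields `photos/2022/` and `photos/2021/`, which is the folder view a console would show.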

Durability

11 nines (99.999999999%) durability

  • Helps ensure data is not lost
  • Average annual expected loss: 0.000000001% of objects, that is, a 1 in 100 billion chance of losing a given object in a year
  • Example: if you store 10,000 objects, you can expect to lose one object every 10,000,000 years on average
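
The durability example follows from simple arithmetic, sketched below with exact fractions to avoid floating-point rounding. The function names are illustrative, and the calculation assumes each object is lost independently at the stated annual rate.

```python
from fractions import Fraction

# 99.999999999% durability written as an exact fraction.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
ANNUAL_LOSS_RATE = 1 - DURABILITY  # 1 in 100 billion per object per year


def expected_losses_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year."""
    return object_count * ANNUAL_LOSS_RATE


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


years = years_per_lost_object(10_000)  # 10,000,000
```

The same formula shows how the figure scales: with 10,000,000 objects stored, the average interval drops to 10,000 years.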

Availability

4 nines (99.99%) availability

  • Provides access to data when needed
  • Unlimited capacity to store data
  • Scalable storage solution
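
To get a feel for what 99.99% means in practice, the percentage can be converted into time. This is only arithmetic on the figure quoted above, assuming a 365-day year; the function name is illustrative and the result is not a statement about any service commitment.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def unavailable_minutes_per_year(availability_percent: str) -> float:
    """Convert an availability percentage into minutes of unavailability per year."""
    unavailable_fraction = 1 - Fraction(availability_percent) / 100
    return float(unavailable_fraction * MINUTES_PER_YEAR)


minutes = unavailable_minutes_per_year("99.99")  # about 52.56
```

Four nines therefore leaves room for a little under an hour of unavailability across a whole year.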

High Performance

Thousands of transactions per second

  • Achieves thousands of transactions per second when uploading and retrieving data
  • Automatically scales to high request rates
  • “Amazon S3 redundantly stores your objects on multiple devices across multiple facilities in the Amazon S3 Region you designate”
  • “Amazon S3 is designed to sustain concurrent device failures by quickly detecting and repairing any lost redundancy”
  • “Amazon S3 regularly verifies the integrity of your data by using checksums”
  • Security: Provides many ways to control access to data and encrypt data
  • Performance: “Your applications can achieve thousands of transactions per second when uploading and retrieving storage from Amazon S3”
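
The checksum-based integrity verification mentioned in the list above can be illustrated in a few lines. This sketch shows the general idea only: SHA-256 is chosen here for illustration, the source does not say which algorithm Amazon S3 uses internally, and the function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest that changes if any byte of the data changes."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_checksum: str) -> bool:
    """Compare the data's current checksum with the one recorded at write time."""
    return checksum(data) == recorded_checksum


original = b"object bytes as written"
recorded = checksum(original)              # stored alongside the data
corrupted = b"object bytes as writteN"     # one byte differs

ok = is_intact(original, recorded)
damaged = not is_intact(corrupted, recorded)
```

A storage system that records a checksum at write time can periodically recompute it and, when the values differ, restore the data from a redundant copy.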

Amazon S3 serves as a foundational object storage service that provides massive scalability, high durability, and strong performance characteristics essential for modern cloud applications. The service’s architecture around buckets and objects with global namespace capabilities makes it suitable for a wide variety of storage use cases.