Storage Classes
Amazon S3 provides different storage classes aligned to different customer requirements:
S3 Standard : “Offers high durability, availability, and high-performing object storage for frequently accessed data”
Delivers low latency and high throughput
Appropriate for cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics
Provides durability across at least three Availability Zones
S3 Intelligent-Tiering : “The first cloud storage that automatically reduces your storage costs on a granular object level”
Automatically moves data to most cost-effective access tier based on access frequency
No performance impact, retrieval fees, or operational overhead
Delivers milliseconds latency and high throughput performance
Can be used as default storage class for virtually any workload
S3 Standard-IA
Infrequently accessed data, such as older digital images or log files
All benefits of Amazon S3 Standard with different cost model
A 30-day minimum storage duration charge applies
Higher cost to retrieve data than S3 Standard
S3 One Zone-IA
Single Availability Zone storage for lower-cost option
Stores data in single Availability Zone
Good for secondary backup copies or data that can be recreated
Cost-effective for data replicated from another AWS Region
S3 Glacier Instant Retrieval
Purpose : “Archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed and requires retrieval in milliseconds”
Use Cases : Medical images, news media assets, user-generated content archives
Access : Upload objects directly or use S3 lifecycle policies to transfer data
S3 Glacier Flexible Retrieval
Purpose : “For data that does not require immediate access but needs the flexibility to retrieve large sets of data 1-2 times per year”
Retrieval : Data is retrieved asynchronously; bulk retrievals are free
Use Cases : Backup, disaster recovery, offsite data storage needs, occasional data retrieval in minutes
S3 Glacier Deep Archive
Purpose : “The lowest-cost storage class in Amazon S3”
Use Cases : Long-term retention and digital preservation for data accessed once or twice per year
Target Industries : Financial services, healthcare, public sectors that retain data sets for 7-10 years or longer
Compliance : Meets regulatory compliance requirements
S3 on Outposts
Purpose : “Delivers object storage to your on-premises AWS Outposts environment”
APIs : Uses S3 APIs and features available in AWS Regions
Storage Class : Provides single Amazon S3 storage class named OUTPOSTS
Use Cases : Workloads with local data residency requirements and demanding performance needs
| Storage Class | Availability Zones | Min Capacity Charge | Min Storage Duration | Retrieval Charge |
|---|---|---|---|---|
| S3 Standard | ≥3 | N/A | N/A | N/A |
| S3 Intelligent-Tiering | ≥3 | N/A | N/A | N/A |
| S3 Standard-IA | ≥3 | 128 KB | 30 days | Per GB retrieved |
| S3 One Zone-IA | 1 | 128 KB | 30 days | Per GB retrieved |
| S3 Glacier Instant Retrieval | ≥3 | 128 KB | 90 days | Per GB retrieved |
| S3 Glacier Flexible Retrieval | ≥3 | N/A | 90 days | Per GB retrieved |
| S3 Glacier Deep Archive | ≥3 | N/A | 180 days | Per GB retrieved |
Common Characteristics
Availability : 99.9% for most classes (S3 One Zone-IA: 99.5%)
SLA : 99% availability service-level agreement
Latency : Millisecond latency for first byte of data
Type : Object storage type
Transitions : Support lifecycle transitions
Lifecycle Management
“An S3 lifecycle configuration is a set of rules that defines the actions that Amazon S3 applies to a group of objects.”
Define when objects transition to another storage class
Example: Transition objects to S3 Standard-IA storage class 30 days after creation
Example: Archive objects to S3 Glacier Flexible Retrieval storage class 1 year after creation
Lifecycle transition requests incur a per-request charge
Define when objects expire
Amazon S3 deletes expired objects automatically
Lifecycle expiration costs depend on when objects are set to expire
Automatic Transfer : “After an S3 lifecycle policy is set, your data will automatically transfer to a different storage class without any changes to your application”
Cost Reduction : Cycle data at regular intervals among different storage types to reduce costs
Flexibility : Lifecycle rules can be scoped to an entire bucket or filtered by object prefix or tags
Cost Optimization : “You pay less for data as it becomes less important over time”
Periodic Logs
Upload periodic logs to bucket
Application needs them for a week or month
Delete them after time period expires
Archival Data
Upload data primarily for archival purposes
Examples: Digital media, financial and healthcare records, raw genomics sequence data, long-term database backups
Data retained for regulatory compliance
Document Access Patterns
Documents frequently accessed for limited period
Become infrequently accessed over time
Eventually don’t need real-time access but require archival
Can delete after specific retention period
Note
S3 Lifecycle Benefits : With S3 Lifecycle configuration rules, you can tell Amazon S3 to transition objects to less-expensive storage classes, archive objects, or delete objects.
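The transition and expiration rules described above can be sketched as a single lifecycle configuration. The snippet below builds the dictionary that boto3's `put_bucket_lifecycle_configuration` accepts; the rule ID, the `logs/` prefix, and the 7-year expiration are illustrative assumptions, and actually applying the configuration requires AWS credentials.

```python
# Sketch of a lifecycle configuration: transition to Standard-IA after
# 30 days, archive to Glacier Flexible Retrieval after 1 year, and
# expire after roughly 7 years (an assumed retention period).
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "transition-then-archive",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # scope the rule by key prefix
            "Transitions": [
                # Move objects to S3 Standard-IA 30 days after creation
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Archive to S3 Glacier Flexible Retrieval after 1 year
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
            # Amazon S3 deletes expired objects automatically
            "Expiration": {"Days": 2555},
        }
    ]
}

# To apply it (requires AWS credentials and a real bucket name):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket",
#     LifecycleConfiguration=lifecycle_configuration,
# )
```

Because the configuration is plain data, it can be built and validated in tests before ever touching AWS.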
Versioning
“Amazon S3 versioning protects objects from accidental overwrites and deletes.” You can use versioning to recover from both unintended user actions and application failures.
An S3 bucket can be in one of three versioning states:
Versioning Enabled
Default behavior when enabled
Upload same key: Creates new object with different version ID, both retrievable
Delete action: Adds delete marker, object still retrievable by version ID
Versioning Disabled
Default setting for new buckets
Upload same key: Overwrites original object, previous not retrievable
Delete action: Deletes object permanently, not retrievable
Versioning Suspended
Temporary state
Versions of existing objects maintained
Bucket temporarily behaves as if versioning were disabled
Process : “Amazon S3 generates a new version ID and adds this newer version of the object to the bucket”
Result : “The original version remains in the bucket”
Example : When new version of photo.gif is PUT into bucket, original object (ID = 111111) remains, new version gets ID = 121212
Process : “When a request is made to delete an object in a version-enabled bucket, all versions remain in the bucket, but Amazon S3 inserts a delete marker”
Result : Can still retrieve prior versions using version ID
Recovery : Previous versions remain accessible by version ID
Default Behavior : “Requests for an object key return the most recent version”
Delete Marker : “If the most recent version is a delete marker, the request is not successful” (returns 404 Not Found)
Specific Version : “You can GET a noncurrent version of an object by specifying its version ID”
Process : “Owners of the bucket can permanently delete an object by using delete with the version ID”
Result : “No delete marker is added, and the specified version is not recoverable”
Authorization : Only the owner of an S3 bucket can permanently delete a version
Cost : “There is no cost for using versioning, but because each version is a copy of an object, each version contributes to storage costs”
Permanent Setting : “When versioning has been enabled on a bucket, it cannot be disabled and can only be suspended”
Version ID : When versioning disabled, object version ID is null
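A minimal in-memory model can make the versioning semantics above concrete. The `VersionedBucket` class below is purely illustrative, not an AWS API: a PUT on an existing key adds a new version with a fresh version ID, a simple DELETE adds a delete marker, and prior versions stay retrievable by version ID.

```python
import itertools

class VersionedBucket:
    """Toy model of S3 versioning semantics (illustrative only)."""

    def __init__(self):
        self._ids = itertools.count(1)
        # key -> list of (version_id, body); body None means delete marker
        self._versions = {}

    def put(self, key, body):
        # Each PUT creates a new version; the original version remains.
        vid = str(next(self._ids))
        self._versions.setdefault(key, []).append((vid, body))
        return vid

    def delete(self, key):
        # A simple DELETE inserts a delete marker; all versions remain.
        vid = str(next(self._ids))
        self._versions.setdefault(key, []).append((vid, None))
        return vid

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is not None:
            # A noncurrent version can be fetched by its version ID.
            for vid, body in versions:
                if vid == version_id and body is not None:
                    return body
            raise KeyError("404 Not Found")
        # Without a version ID, the most recent version is returned;
        # if that is a delete marker, the request fails.
        if not versions or versions[-1][1] is None:
            raise KeyError("404 Not Found")
        return versions[-1][1]

bucket = VersionedBucket()
v1 = bucket.put("photo.gif", b"original")
v2 = bucket.put("photo.gif", b"updated")   # both versions retrievable
bucket.delete("photo.gif")                  # adds a delete marker
assert bucket.get("photo.gif", version_id=v1) == b"original"
```

The usage at the bottom mirrors the photo.gif example: after the delete, a plain GET fails with 404, but the original version is still recoverable by its version ID.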
Cross-Origin Resource Sharing (CORS)
“Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.”
Create CORS configuration (XML document) with rules identifying:
Origins : Origins allowed to access your bucket
Operations : HTTP methods supported for each origin (e.g., GET requests from all origins)
Operation-specific information : Additional details for specific operations
Scenario : Host web font in S3 bucket
Challenge : Webpage in alternate domain tries to use this web font
Process : “Before the browser loads this webpage, it performs a CORS check to make sure that the domain from which the page is being loaded is allowed to access S3 bucket resources”
Result : Browser evaluates CORS configuration and uses matching rule to allow cross-origin request
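The web-font scenario above might be expressed as a CORS rule like the following, shown in the dictionary form boto3's `put_bucket_cors` accepts (S3 also accepts an equivalent XML document). The origin and bucket name are placeholders.

```python
# Sketch: allow one external domain to GET objects (e.g., a web font)
# from the bucket. Origin and bucket name are hypothetical.
cors_configuration = {
    "CORSRules": [
        {
            # Origins allowed to access the bucket
            "AllowedOrigins": ["https://www.example.com"],
            # HTTP methods supported for this origin
            "AllowedMethods": ["GET"],
            "AllowedHeaders": ["*"],
            # How long (seconds) browsers may cache the preflight result
            "MaxAgeSeconds": 3000,
        }
    ]
}

# To apply it (requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_cors(
#     Bucket="example-bucket", CORSConfiguration=cors_configuration
# )
```

Listing only the origin and method the page actually needs keeps the rule narrow, matching the security guidance later in this section.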
Strong Consistency
Consistency Scope : “Is consistent for all new and existing objects in all Regions”
Read-After-Write : “Provides read-after-write consistency for all GET, LIST, PUT, and DELETE operations on objects in S3 buckets”
Big Data Advantage : “Offers an advantage for big data workloads”
Migration Simplification : “Simplifies the migration of on-premises analytics workloads”
High Availability : “Amazon S3 achieves high availability by replicating data across multiple servers within AWS data centers”
Successful PUT : “If a PUT request is successful, the data is safely stored”
Immediate Consistency : “Any read (GET or LIST) that is initiated following a successful PUT response will return the data written by the PUT”
Automatic Implementation : Strong read-after-write consistency exists automatically for all applications without changes to performance or availability
Application Migration : “Strong consistency simplifies the migration of on-premises analytics workloads by removing the need to make changes to support applications”
Infrastructure Reduction : “Removes the need for extra infrastructure, such as S3Guard, to provide strong consistency”
Cost Savings : Reduces costs by removing need for extra infrastructure
No Custom Code : Without strong consistency, you would need to “insert custom code into these applications or provision databases to keep objects consistent”
Example : “If you delete a bucket and immediately list all buckets, the deleted bucket might still appear in the list”
Resolution : “Within a short period of time, if you run the list bucket command again, the deleted bucket will no longer appear in the results”
Best Practices
Frequent Access : Use S3 Standard for frequently accessed data requiring low latency
Infrequent Access : Choose S3 Standard-IA or S3 One Zone-IA based on availability requirements
Archive Needs : Select appropriate Glacier class based on retrieval time requirements
Unknown Patterns : Use S3 Intelligent-Tiering for data with unknown or changing access patterns
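The selection guidance above can be condensed into a small decision sketch. The helper below is illustrative only, not an official AWS tool; real choices also depend on object size, request rates, and pricing in your Region.

```python
def suggest_storage_class(access, multi_az=True, retrieval="ms"):
    """Illustrative mapping from access pattern to S3 storage class.

    access: 'frequent' | 'infrequent' | 'archive' | 'unknown'
    multi_az: False if a single-AZ class is acceptable (recreatable data)
    retrieval: for archives, 'ms' | 'minutes-to-hours' | 'hours'
    """
    if access == "unknown":
        # Unknown or changing access patterns
        return "S3 Intelligent-Tiering"
    if access == "frequent":
        return "S3 Standard"
    if access == "infrequent":
        # Choose by availability requirements
        return "S3 Standard-IA" if multi_az else "S3 One Zone-IA"
    if access == "archive":
        # Choose by retrieval-time requirements
        return {
            "ms": "S3 Glacier Instant Retrieval",
            "minutes-to-hours": "S3 Glacier Flexible Retrieval",
            "hours": "S3 Glacier Deep Archive",
        }[retrieval]
    raise ValueError(f"unknown access pattern: {access}")

assert suggest_storage_class("infrequent", multi_az=False) == "S3 One Zone-IA"
```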
Cost Optimization : Implement lifecycle policies to automatically transition objects to cheaper storage classes
Data Retention : Set expiration actions for data that doesn’t need permanent retention
Business Requirements : Align lifecycle rules with business data retention policies
Monitoring : Regular review of lifecycle effectiveness and cost impact
Data Protection : Enable versioning for critical data requiring protection from accidental changes
Storage Costs : Consider storage cost impact when enabling versioning
Lifecycle Integration : Combine versioning with lifecycle policies to manage non-current versions
Access Patterns : Understand how applications will access versioned objects
Web Applications : Configure CORS for web applications accessing S3 resources from different domains
Security : Carefully define allowed origins to maintain security
Methods : Specify only necessary HTTP methods for each origin
Testing : Thoroughly test CORS configuration with actual web applications
This section covered the various storage classes available in Amazon S3, lifecycle management for cost optimization, versioning for data protection, CORS support for web applications, and the strong consistency model that ensures reliable data access across all S3 operations.