
Storing Content in S3

Storage Classes

Amazon S3 provides different storage classes aligned to different customer requirements:

  • S3 Standard: “Offers high durability, availability, and high-performing object storage for frequently accessed data”
    • Delivers low latency and high throughput
    • Appropriate for cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics
    • Provides durability across at least three Availability Zones
  • S3 Intelligent-Tiering: “The first cloud storage that automatically reduces your storage costs on a granular object level”
    • Automatically moves data to the most cost-effective access tier based on access frequency
    • No performance impact, retrieval fees, or operational overhead
    • Delivers millisecond latency and high throughput
    • Can be used as the default storage class for virtually any workload

S3 Standard-IA

Infrequent Access storage for older digital images or log files

  • All the benefits of Amazon S3 Standard, with a different cost model
  • A 30-day minimum storage charge applies
  • Charges a per-GB fee to retrieve data, unlike S3 Standard
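
You choose a storage class per object at upload time. As a minimal illustration using boto3 (the AWS SDK for Python), with placeholder bucket and key names:

```python
import boto3

s3 = boto3.client("s3")

# Store an object directly in S3 Standard-IA; omitting StorageClass
# defaults to S3 Standard. Other values include "INTELLIGENT_TIERING"
# and "ONEZONE_IA". Bucket and key names are placeholders.
s3.put_object(
    Bucket="example-bucket",
    Key="images/archive-photo.png",
    Body=b"...image bytes...",
    StorageClass="STANDARD_IA",
)
```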

S3 One Zone-IA

A lower-cost option that stores data in a single Availability Zone

  • Stores data in a single Availability Zone
  • Good for secondary backup copies or data that can be recreated
  • Cost-effective for data replicated from another AWS Region

S3 Glacier Instant Retrieval

  • Purpose: “Archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed and requires retrieval in milliseconds”
  • Use Cases: Medical images, news media assets, user-generated content archives
  • Access: Upload objects directly or use S3 lifecycle policies to transfer data

S3 Glacier Flexible Retrieval

  • Purpose: “For data that does not require immediate access but needs the flexibility to retrieve large sets of data 1-2 times per year”
  • Retrieval: Retrieved asynchronously at no cost (see the sketch below)
  • Use Cases: Backup, disaster recovery, offsite data storage needs, occasional data retrieval in minutes
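
Retrieval from S3 Glacier Flexible Retrieval (and from S3 Glacier Deep Archive, described next) is a two-step process: you issue a restore request, then read the temporarily restored copy once the job completes. A boto3 sketch with placeholder names:

```python
import boto3

s3 = boto3.client("s3")

# Start an asynchronous restore job for an archived object.
# "Tier" may be "Expedited", "Standard", or "Bulk".
s3.restore_object(
    Bucket="example-bucket",
    Key="archives/backup-2023.tar",
    RestoreRequest={
        "Days": 7,  # how long the restored copy remains available
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)

# head_object reports restore progress in its "Restore" field.
status = s3.head_object(Bucket="example-bucket", Key="archives/backup-2023.tar")
print(status.get("Restore"))  # e.g. 'ongoing-request="true"'
```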

S3 Glacier Deep Archive

  • Purpose: “The lowest-cost storage class in Amazon S3”
  • Use Cases: Long-term retention and digital preservation for data accessed once or twice per year
  • Target Industries: Financial services, healthcare, and public-sector organizations that retain data sets for 7-10 years or longer
  • Compliance: Meets regulatory compliance requirements

S3 on Outposts

  • Purpose: “Delivers object storage to your on-premises AWS Outposts environment”
  • APIs: Uses the same S3 APIs and features that are available in AWS Regions
  • Storage Class: Provides a single Amazon S3 storage class named OUTPOSTS
  • Use Cases: Workloads with local data residency requirements and demanding performance needs
Storage Class Comparison

| Storage Class | Availability Zones | Min Capacity Charge | Min Storage Duration | Retrieval Charge |
| --- | --- | --- | --- | --- |
| S3 Standard | ≥3 | N/A | N/A | N/A |
| S3 Intelligent-Tiering | ≥3 | N/A | N/A | N/A |
| S3 Standard-IA | ≥3 | 128 KB | 30 days | Per GB retrieved |
| S3 One Zone-IA | 1 | 128 KB | 30 days | Per GB retrieved |
| S3 Glacier Instant Retrieval | ≥3 | 128 KB | 90 days | Per GB retrieved |
| S3 Glacier Flexible Retrieval | ≥3 | N/A | 90 days | Per GB retrieved |
| S3 Glacier Deep Archive | ≥3 | N/A | 180 days | Per GB retrieved |

Additional characteristics:

  • Availability: 99.9% (S3 One Zone-IA: 99.5%)
  • SLA: Backed by a 99% availability service-level agreement
  • Latency: Millisecond first-byte latency
  • Type: Object storage
  • Transitions: Support lifecycle transitions
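
To see which storage class each object currently uses, one option from Python is list_objects_v2, which reports a storage class per object; a boto3 sketch with a placeholder bucket name:

```python
import boto3

s3 = boto3.client("s3")

# Print each object's key and its current storage class.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="example-bucket"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj.get("StorageClass", "STANDARD"))
```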

Lifecycle Policies

“An S3 lifecycle configuration is a set of rules that defines the actions that Amazon S3 applies to a group of objects.”

Lifecycle rules define when objects transition to another storage class:

  • Example: Transition objects to S3 Standard-IA storage class 30 days after creation
  • Example: Archive objects to S3 Glacier Flexible Retrieval storage class 1 year after creation
  • Lifecycle transition requests incur a per-request cost
  • Automatic Transfer: “After an S3 lifecycle policy is set, your data will automatically transfer to a different storage class without any changes to your application”
  • Cost Reduction: Cycle data at regular intervals among different storage types to reduce costs
  • Flexibility: Lifecycle rules can apply to an entire bucket or to a subset of its objects (selected by prefix or tag)
  • Cost Optimization: “You pay less for data as it becomes less important over time”
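
The two example transitions above could be configured with boto3 as follows; the bucket name is a placeholder:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects to Standard-IA 30 days after creation, then to
# Glacier Flexible Retrieval ("GLACIER" in the API) after 1 year.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-over-time",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = whole bucket
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```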
Common lifecycle use cases:

  1. Periodic Logs

    • Upload periodic logs to bucket
    • Application needs them for a week or month
    • Delete them after time period expires
  2. Archival Data

    • Upload data primarily for archival purposes
    • Examples: Digital media, financial and healthcare records, raw genomics sequence data, long-term database backups
    • Data retained for regulatory compliance
  3. Document Access Patterns

    • Documents frequently accessed for limited period
    • Become infrequently accessed over time
    • Eventually don’t need real-time access but require archival
    • Can delete after specific retention period
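
For the periodic-logs pattern, an expiration action deletes objects automatically once the retention period passes. A boto3 sketch with placeholder names; note that put_bucket_lifecycle_configuration replaces the bucket’s existing configuration, so include every rule you want to keep:

```python
import boto3

s3 = boto3.client("s3")

# Expire (delete) objects under the logs/ prefix 30 days after creation.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Expiration": {"Days": 30},
            }
        ]
    },
)
```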

Versioning

“Amazon S3 versioning protects objects from accidental overwrites and deletes.” You can use versioning to recover from both unintended user actions and application failures.

An S3 bucket can be in one of three versioning states:

Versioning Enabled

Behavior while versioning is enabled

  • Upload same key: Creates a new object with a different version ID; both versions remain retrievable
  • Delete action: Adds a delete marker; the object is still retrievable by version ID

Versioning Disabled

Default setting for new buckets

  • Upload same key: Overwrites the original object; the previous content is not retrievable
  • Delete action: Deletes the object permanently; it is not retrievable

Versioning Suspended

Temporary state

  • Versions of existing objects maintained
  • Bucket temporarily behaves as if versioning were disabled
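
Versioning is enabled at the bucket level. A boto3 sketch (placeholder bucket name); recall that once enabled, versioning can only be suspended, not disabled:

```python
import boto3

s3 = boto3.client("s3")

# Turn on versioning; use "Suspended" to suspend it later.
s3.put_bucket_versioning(
    Bucket="example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)
```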

Adding Objects in a Versioning-Enabled Bucket

  • Process: “Amazon S3 generates a new version ID and adds this newer version of the object to the bucket”
  • Result: “The original version remains in the bucket”
  • Example: When a new version of photo.gif is PUT into the bucket, the original object (ID = 111111) remains and the new version receives ID = 121212
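
This behavior is easy to observe from code. A boto3 sketch with placeholder names:

```python
import boto3

s3 = boto3.client("s3")

# Each PUT of the same key creates a new version in a
# versioning-enabled bucket.
first = s3.put_object(Bucket="example-bucket", Key="photo.gif", Body=b"v1")
second = s3.put_object(Bucket="example-bucket", Key="photo.gif", Body=b"v2")
print(first["VersionId"], second["VersionId"])  # two distinct IDs

# A plain GET returns the newest version; pass VersionId to
# retrieve the original.
original = s3.get_object(
    Bucket="example-bucket", Key="photo.gif", VersionId=first["VersionId"]
)
```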

Deleting Objects in a Versioning-Enabled Bucket

  • Process: “When a request is made to delete an object in a version-enabled bucket, all versions remain in the bucket, but Amazon S3 inserts a delete marker”
  • Result: Can still retrieve prior versions by using the version ID
  • Recovery: Previous versions remain accessible by version ID

Retrieving Object Versions

  • Default Behavior: “Requests for an object key return the most recent version”
  • Delete Marker: “If the most recent version is a delete marker, the request is not successful” (returns 404 Not Found)
  • Specific Version: “You can GET a noncurrent version of an object by specifying its version ID”

Permanently Deleting Versions

  • Process: “Owners of the bucket can permanently delete an object by using delete with the version ID”
  • Result: “No delete marker is added, and the specified version is not recoverable”
  • Authorization: Only the owner of an S3 bucket can permanently delete a version

Versioning Considerations

  • Cost: “There is no cost for using versioning, but because each version is a copy of an object, each version contributes to storage costs”
  • Permanent Setting: “When versioning has been enabled on a bucket, it cannot be disabled and can only be suspended”
  • Version ID: When versioning is disabled, the object version ID is null
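
A boto3 sketch of both delete paths, with placeholder names:

```python
import boto3

s3 = boto3.client("s3")

# Without a version ID, delete only inserts a delete marker;
# all versions remain and a plain GET now returns 404.
s3.delete_object(Bucket="example-bucket", Key="photo.gif")

# Older versions are still listed and retrievable by version ID.
versions = s3.list_object_versions(Bucket="example-bucket", Prefix="photo.gif")
oldest = versions["Versions"][-1]  # versions are listed newest first

# With an explicit version ID, the version is permanently removed:
# no delete marker is added and it is not recoverable.
s3.delete_object(
    Bucket="example-bucket",
    Key="photo.gif",
    VersionId=oldest["VersionId"],
)
```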

Support for Cross-Origin Resource Sharing (CORS)


“Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.”

Create CORS configuration (XML document) with rules identifying:

  • Origins: Origins allowed to access your bucket
  • Operations: HTTP methods supported for each origin (e.g., GET requests from all origins)
  • Operation-specific information: Additional details for specific operations
  • Scenario: Host a web font in an S3 bucket
  • Challenge: A webpage served from a different domain tries to use this web font
  • Process: “Before the browser loads this webpage, it performs a CORS check to make sure that the domain from which the page is being loaded is allowed to access S3 bucket resources”
  • Result: The browser evaluates the CORS configuration and uses the matching rule to allow the cross-origin request
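
For this scenario, a minimal CORS rule could allow GET requests from the webpage’s origin. Although the REST API accepts the configuration as an XML document, boto3 expresses it as a dictionary; bucket and origin names are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Allow GET requests for bucket resources from one specific origin.
s3.put_bucket_cors(
    Bucket="example-bucket",
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedOrigins": ["https://www.example.com"],
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,  # how long browsers may cache the check
            }
        ]
    },
)
```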
Strong Consistency

  • Consistency Scope: “Is consistent for all new and existing objects in all Regions”
  • Read-After-Write: “Provides read-after-write consistency for all GET, LIST, PUT, and DELETE operations on objects in S3 buckets”
  • Big Data Advantage: “Offers an advantage for big data workloads”
  • Migration Simplification: “Simplifies the migration of on-premises analytics workloads”
  • High Availability: “Amazon S3 achieves high availability by replicating data across multiple servers within AWS data centers”
  • Successful PUT: “If a PUT request is successful, the data is safely stored”
  • Immediate Consistency: “Any read (GET or LIST) that is initiated following a successful PUT response will return the data written by the PUT”
  • Automatic Implementation: Strong read-after-write consistency applies automatically to all applications, with no change to performance or availability
  • Application Migration: “Strong consistency simplifies the migration of on-premises analytics workloads by removing the need to make changes to support applications”
  • Infrastructure Reduction: “Removes the need for extra infrastructure, such as S3Guard, to provide strong consistency”
  • Cost Savings: Reduces costs by removing the need for extra infrastructure
  • No Custom Code: Without strong consistency, you would need to “insert custom code into these applications or provision databases to keep objects consistent”

Bucket configurations, by contrast, are eventually consistent:

  • Example: “If you delete a bucket and immediately list all buckets, the deleted bucket might still appear in the list”
  • Resolution: “Within a short period of time, if you run the list bucket command again, the deleted bucket will no longer appear in the results”
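
The read-after-write guarantee means the following boto3 snippet (placeholder names) is safe: the GET issued after a successful PUT response always returns the data just written:

```python
import boto3

s3 = boto3.client("s3")

# A successful PUT is immediately visible to subsequent reads.
s3.put_object(Bucket="example-bucket", Key="report.csv", Body=b"id,total\n1,42\n")
body = s3.get_object(Bucket="example-bucket", Key="report.csv")["Body"].read()
assert body == b"id,total\n1,42\n"  # strong read-after-write consistency
```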
Best Practices

Storage class selection:

  • Frequent Access: Use S3 Standard for frequently accessed data requiring low latency
  • Infrequent Access: Choose S3 Standard-IA or S3 One Zone-IA based on availability requirements
  • Archive Needs: Select the appropriate Glacier class based on retrieval time requirements
  • Unknown Patterns: Use S3 Intelligent-Tiering for data with unknown or changing access patterns

Lifecycle management:

  • Cost Optimization: Implement lifecycle policies to automatically transition objects to cheaper storage classes
  • Data Retention: Set expiration actions for data that doesn’t need permanent retention
  • Business Requirements: Align lifecycle rules with business data retention policies
  • Monitoring: Regularly review lifecycle effectiveness and cost impact

Versioning:

  • Data Protection: Enable versioning for critical data requiring protection from accidental changes
  • Storage Costs: Consider the storage cost impact when enabling versioning
  • Lifecycle Integration: Combine versioning with lifecycle policies to manage noncurrent versions
  • Access Patterns: Understand how applications will access versioned objects

CORS:

  • Web Applications: Configure CORS for web applications accessing S3 resources from different domains
  • Security: Carefully define allowed origins to maintain security
  • Methods: Specify only the necessary HTTP methods for each origin
  • Testing: Thoroughly test the CORS configuration with the actual web applications

This section covered the various storage classes available in Amazon S3, lifecycle management for cost optimization, versioning for data protection, CORS support for web applications, and the strong consistency model that ensures reliable data access across all S3 operations.