Storage Classes
Amazon S3 provides different storage classes aligned to different customer requirements:
S3 Standard : “Offers high durability, availability, and high-performing object storage for frequently accessed data”
Delivers low latency and high throughput
Appropriate for cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics
Provides durability across at least three Availability Zones
S3 Intelligent-Tiering : “The first cloud storage that automatically reduces your storage costs on a granular object level”
Automatically moves data to most cost-effective access tier based on access frequency
No performance impact, retrieval fees, or operational overhead
Delivers milliseconds latency and high throughput performance
Can be used as default storage class for virtually any workload
S3 Standard-IA
Infrequently accessed data, such as older digital images or log files
All benefits of Amazon S3 Standard with different cost model
A 30-day minimum storage duration charge applies
Higher cost to retrieve data than S3 Standard
S3 One Zone-IA
Single Availability Zone storage for lower-cost option
Stores data in single Availability Zone
Good for secondary backup copies or data that can be recreated
Cost-effective for data replicated from another AWS Region
S3 Glacier Instant Retrieval
Purpose : “Archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed and requires retrieval in milliseconds”
Use Cases : Medical images, news media assets, user-generated content archives
Access : Upload objects directly or use S3 lifecycle policies to transfer data
S3 Glacier Flexible Retrieval
Purpose : “For data that does not require immediate access but needs the flexibility to retrieve large sets of data 1-2 times per year”
Retrieval : Data is retrieved asynchronously; bulk retrievals are free
Use Cases : Backup, disaster recovery, offsite data storage needs, occasional data retrieval in minutes
S3 Glacier Deep Archive
Purpose : “The lowest-cost storage class in Amazon S3”
Use Cases : Long-term retention and digital preservation for data accessed once or twice per year
Target Industries : Financial services, healthcare, public sectors that retain data sets for 7-10 years or longer
Compliance : Meets regulatory compliance requirements
S3 on Outposts
Purpose : “Delivers object storage to your on-premises AWS Outposts environment”
APIs : Uses S3 APIs and features available in AWS Regions
Storage Class : Provides single Amazon S3 storage class named OUTPOSTS
Use Cases : Workloads with local data residency requirements and demanding performance needs
| Storage Class | Availability Zones | Min Capacity Charge | Min Storage Duration | Retrieval Charge |
|---|---|---|---|---|
| S3 Standard | ≥3 | N/A | N/A | N/A |
| S3 Intelligent-Tiering | ≥3 | N/A | N/A | N/A |
| S3 Standard-IA | ≥3 | 128 KB | 30 days | Per GB retrieved |
| S3 One Zone-IA | 1 | 128 KB | 30 days | Per GB retrieved |
| S3 Glacier Instant Retrieval | ≥3 | 128 KB | 90 days | Per GB retrieved |
| S3 Glacier Flexible Retrieval | ≥3 | N/A | 90 days | Per GB retrieved |
| S3 Glacier Deep Archive | ≥3 | N/A | 180 days | Per GB retrieved |
Common Characteristics
Availability : 99.9% for most classes (S3 One Zone-IA: 99.5%)
SLA : 99% availability service-level agreement
Latency : Millisecond latency for first byte of data
Type : Object storage type
Transitions : Support lifecycle transitions
Lifecycle Management
“An S3 lifecycle configuration is a set of rules that defines the actions that Amazon S3 applies to a group of objects.”
Define when objects transition to another storage class
Example: Transition objects to S3 Standard-IA storage class 30 days after creation
Example: Archive objects to S3 Glacier Flexible Retrieval storage class 1 year after creation
Lifecycle transition requests incur a per-request charge
Define when objects expire
Amazon S3 deletes expired objects automatically
Lifecycle expiration costs depend on when objects are set to expire
Automatic Transfer : “After an S3 lifecycle policy is set, your data will automatically transfer to a different storage class without any changes to your application”
Cost Reduction : Cycle data at regular intervals among different storage types to reduce costs
Flexibility : Lifecycle rules can be scoped to an entire bucket or filtered by object prefix or tags
Cost Optimization : “You pay less for data as it becomes less important over time”
Periodic Logs
Upload periodic logs to bucket
Application needs them for a week or month
Delete them after time period expires
Archival Data
Upload data primarily for archival purposes
Examples: Digital media, financial and healthcare records, raw genomics sequence data, long-term database backups
Data retained for regulatory compliance
Document Access Patterns
Documents frequently accessed for limited period
Become infrequently accessed over time
Eventually don’t need real-time access but require archival
Can delete after specific retention period
Note
S3 Lifecycle Benefits : With S3 Lifecycle configuration rules, you can tell Amazon S3 to transition objects to less-expensive storage classes, archive objects, or delete objects.
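The transition and expiration rules described above can be sketched as a single lifecycle configuration. The snippet below builds the dictionary that boto3's `put_bucket_lifecycle_configuration` accepts; the rule ID, the `logs/` prefix, and the 7-year expiration are illustrative assumptions, and actually applying the configuration requires AWS credentials.

```python
# Sketch of a lifecycle configuration: transition to Standard-IA after
# 30 days, archive to Glacier Flexible Retrieval after 1 year, and
# expire after roughly 7 years (an assumed retention period).
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "transition-then-archive",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # scope the rule by key prefix
            "Transitions": [
                # Move objects to S3 Standard-IA 30 days after creation
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Archive to S3 Glacier Flexible Retrieval after 1 year
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
            # Amazon S3 deletes expired objects automatically
            "Expiration": {"Days": 2555},
        }
    ]
}

# To apply it (requires AWS credentials and a real bucket name):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket",
#     LifecycleConfiguration=lifecycle_configuration,
# )
```

Because the configuration is plain data, it can be built and validated in tests before ever touching AWS.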
Versioning
“Amazon S3 versioning protects objects from accidental overwrites and deletes.” You can use versioning to recover from both unintended user actions and application failures.
An S3 bucket can be in one of three versioning states:
Versioning Enabled
Default behavior when enabled
Upload same key: Creates new object with different version ID, both retrievable
Delete action: Adds delete marker, object still retrievable by version ID
Versioning Disabled
Default setting for new buckets
Upload same key: Overwrites original object, previous not retrievable
Delete action: Deletes object permanently, not retrievable
Versioning Suspended
Temporary state
Versions of existing objects maintained
Bucket temporarily behaves as if versioning were disabled
Process : “Amazon S3 generates a new version ID and adds this newer version of the object to the bucket”
Result : “The original version remains in the bucket”
Example : When new version of photo.gif is PUT into bucket, original object (ID = 111111) remains, new version gets ID = 121212
Process : “When a request is made to delete an object in a version-enabled bucket, all versions remain in the bucket, but Amazon S3 inserts a delete marker”
Result : Can still retrieve prior versions using version ID
Recovery : Previous versions remain accessible by version ID
Default Behavior : “Requests for an object key return the most recent version”
Delete Marker : “If the most recent version is a delete marker, the request is not successful” (returns 404 Not Found)
Specific Version : “You can GET a noncurrent version of an object by specifying its version ID”
Process : “Owners of the bucket can permanently delete an object by using delete with the version ID”
Result : “No delete marker is added, and the specified version is not recoverable”
Authorization : Only the owner of an S3 bucket can permanently delete a version
Cost : “There is no cost for using versioning, but because each version is a copy of an object, each version contributes to storage costs”
Permanent Setting : “When versioning has been enabled on a bucket, it cannot be disabled and can only be suspended”
Version ID : When versioning disabled, object version ID is null
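A minimal in-memory model can make the versioning semantics above concrete. The `VersionedBucket` class below is purely illustrative, not an AWS API: a PUT on an existing key adds a new version with a fresh version ID, a simple DELETE adds a delete marker, and prior versions stay retrievable by version ID.

```python
import itertools

class VersionedBucket:
    """Toy model of S3 versioning semantics (illustrative only)."""

    def __init__(self):
        self._ids = itertools.count(1)
        # key -> list of (version_id, body); body None means delete marker
        self._versions = {}

    def put(self, key, body):
        # Each PUT creates a new version; the original version remains.
        vid = str(next(self._ids))
        self._versions.setdefault(key, []).append((vid, body))
        return vid

    def delete(self, key):
        # A simple DELETE inserts a delete marker; all versions remain.
        vid = str(next(self._ids))
        self._versions.setdefault(key, []).append((vid, None))
        return vid

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is not None:
            # A noncurrent version can be fetched by its version ID.
            for vid, body in versions:
                if vid == version_id and body is not None:
                    return body
            raise KeyError("404 Not Found")
        # Without a version ID, the most recent version is returned;
        # if that is a delete marker, the request fails.
        if not versions or versions[-1][1] is None:
            raise KeyError("404 Not Found")
        return versions[-1][1]

bucket = VersionedBucket()
v1 = bucket.put("photo.gif", b"original")
v2 = bucket.put("photo.gif", b"updated")   # both versions retrievable
bucket.delete("photo.gif")                  # adds a delete marker
assert bucket.get("photo.gif", version_id=v1) == b"original"
```

The usage at the bottom mirrors the photo.gif example: after the delete, a plain GET fails with 404, but the original version is still recoverable by its version ID.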
Cross-Origin Resource Sharing (CORS)
“Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.”
Create CORS configuration (XML document) with rules identifying:
Origins : Origins allowed to access your bucket
Operations : HTTP methods supported for each origin (e.g., GET requests from all origins)
Operation-specific information : Additional details for specific operations
Scenario : Host web font in S3 bucket
Challenge : Webpage in alternate domain tries to use this web font
Process : “Before the browser loads this webpage, it performs a CORS check to make sure that the domain from which the page is being loaded is allowed to access S3 bucket resources”
Result : Browser evaluates CORS configuration and uses matching rule to allow cross-origin request
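The web-font scenario above might be expressed as a CORS rule like the following, shown in the dictionary form boto3's `put_bucket_cors` accepts (S3 also accepts an equivalent XML document). The origin and bucket name are placeholders.

```python
# Sketch: allow one external domain to GET objects (e.g., a web font)
# from the bucket. Origin and bucket name are hypothetical.
cors_configuration = {
    "CORSRules": [
        {
            # Origins allowed to access the bucket
            "AllowedOrigins": ["https://www.example.com"],
            # HTTP methods supported for this origin
            "AllowedMethods": ["GET"],
            "AllowedHeaders": ["*"],
            # How long (seconds) browsers may cache the preflight result
            "MaxAgeSeconds": 3000,
        }
    ]
}

# To apply it (requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_cors(
#     Bucket="example-bucket", CORSConfiguration=cors_configuration
# )
```

Listing only the origin and method the page actually needs keeps the rule narrow, matching the security guidance later in this section.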
Strong Consistency
Consistency Scope : “Is consistent for all new and existing objects in all Regions”
Read-After-Write : “Provides read-after-write consistency for all GET, LIST, PUT, and DELETE operations on objects in S3 buckets”
Big Data Advantage : “Offers an advantage for big data workloads”
Migration Simplification : “Simplifies the migration of on-premises analytics workloads”
High Availability : “Amazon S3 achieves high availability by replicating data across multiple servers within AWS data centers”
Successful PUT : “If a PUT request is successful, the data is safely stored”
Immediate Consistency : “Any read (GET or LIST) that is initiated following a successful PUT response will return the data written by the PUT”
Automatic Implementation : Strong read-after-write consistency exists automatically for all applications without changes to performance or availability
Application Migration : “Strong consistency simplifies the migration of on-premises analytics workloads by removing the need to make changes to support applications”
Infrastructure Reduction : “Removes the need for extra infrastructure, such as S3Guard, to provide strong consistency”
Cost Savings : Reduces costs by removing need for extra infrastructure
No Custom Code : Without strong consistency, you would need to “insert custom code into these applications or provision databases to keep objects consistent”
Example : “If you delete a bucket and immediately list all buckets, the deleted bucket might still appear in the list”
Resolution : “Within a short period of time, if you run the list bucket command again, the deleted bucket will no longer appear in the results”
Best Practices
Frequent Access : Use S3 Standard for frequently accessed data requiring low latency
Infrequent Access : Choose S3 Standard-IA or S3 One Zone-IA based on availability requirements
Archive Needs : Select appropriate Glacier class based on retrieval time requirements
Unknown Patterns : Use S3 Intelligent-Tiering for data with unknown or changing access patterns
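The selection guidance above can be condensed into a small decision sketch. The helper below is illustrative only, not an official AWS tool; real choices also depend on object size, request rates, and pricing in your Region.

```python
def suggest_storage_class(access, multi_az=True, retrieval="ms"):
    """Illustrative mapping from access pattern to S3 storage class.

    access: 'frequent' | 'infrequent' | 'archive' | 'unknown'
    multi_az: False if a single-AZ class is acceptable (recreatable data)
    retrieval: for archives, 'ms' | 'minutes-to-hours' | 'hours'
    """
    if access == "unknown":
        # Unknown or changing access patterns
        return "S3 Intelligent-Tiering"
    if access == "frequent":
        return "S3 Standard"
    if access == "infrequent":
        # Choose by availability requirements
        return "S3 Standard-IA" if multi_az else "S3 One Zone-IA"
    if access == "archive":
        # Choose by retrieval-time requirements
        return {
            "ms": "S3 Glacier Instant Retrieval",
            "minutes-to-hours": "S3 Glacier Flexible Retrieval",
            "hours": "S3 Glacier Deep Archive",
        }[retrieval]
    raise ValueError(f"unknown access pattern: {access}")

assert suggest_storage_class("infrequent", multi_az=False) == "S3 One Zone-IA"
```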
Cost Optimization : Implement lifecycle policies to automatically transition objects to cheaper storage classes
Data Retention : Set expiration actions for data that doesn’t need permanent retention
Business Requirements : Align lifecycle rules with business data retention policies
Monitoring : Regular review of lifecycle effectiveness and cost impact
Data Protection : Enable versioning for critical data requiring protection from accidental changes
Storage Costs : Consider storage cost impact when enabling versioning
Lifecycle Integration : Combine versioning with lifecycle policies to manage non-current versions
Access Patterns : Understand how applications will access versioned objects
Web Applications : Configure CORS for web applications accessing S3 resources from different domains
Security : Carefully define allowed origins to maintain security
Methods : Specify only necessary HTTP methods for each origin
Testing : Thoroughly test CORS configuration with actual web applications
This section covered the various storage classes available in Amazon S3, lifecycle management for cost optimization, versioning for data protection, CORS support for web applications, and the strong consistency model that ensures reliable data access across all S3 operations.