Moving Data to Amazon S3

Data Transfer
  • Unlimited Objects: “There is no limit to the number of objects in a bucket”
  • Permission Requirement: “Uploading an object requires write permission to the bucket”
  • File Types: You can upload any file type, including images, backups, data, and movies
  • Upload Process: “During upload, objects are automatically encrypted by using server-side encryption”
  • Download Process: “During download, objects are decrypted”
  • Default Method: Server-side encryption with Amazon S3 managed keys (SSE-S3)
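The default encryption behavior is easy to confirm with an SDK call. Below is a minimal boto3 sketch, with placeholder bucket and key names: upload an object, then read back the encryption S3 applied at rest (SSE-S3 reports as AES256).

```python
import boto3

s3 = boto3.client("s3")

# Uploading requires write (s3:PutObject) permission on the bucket.
s3.put_object(Bucket="example-bucket", Key="notes/hello.txt", Body=b"hello")

# HeadObject reports the server-side encryption applied at rest;
# with the SSE-S3 default, ServerSideEncryption is "AES256".
response = s3.head_object(Bucket="example-bucket", Key="notes/hello.txt")
print(response["ServerSideEncryption"])  # -> "AES256"
```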

Options for Uploading Objects to Amazon S3


AWS Management Console

Wizard-based approach with drag and drop capability

  • Maximum file size: 160 GB
  • User-friendly interface for smaller files

AWS CLI

Command line interface for terminal or script usage

  • Upload or download from terminal command prompt
  • Can be called from scripts for automation

AWS SDKs

Programmatic access using software development kits

  • Wrapper libraries for uploading data
  • Multiple programming language support
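For example, boto3 (the AWS SDK for Python) wraps uploads in a transfer manager that switches to multipart automatically past a size threshold. A sketch with placeholder file and bucket names:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# The transfer manager switches to multipart upload once the file
# exceeds multipart_threshold and sends parts concurrently.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # 100 MB
    max_concurrency=8,
)

s3.upload_file(
    Filename="backup.tar.gz",     # placeholder local file
    Bucket="example-bucket",      # placeholder bucket
    Key="backups/backup.tar.gz",
    Config=config,
)
```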

Amazon S3 REST API

Direct API requests for programmatic control

  • Send PUT request to upload data in single operation
  • Requires write permissions on bucket
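One way to exercise the single-operation PUT without hand-signing requests is to have boto3 presign the URL and then issue a plain HTTP PUT. This is a sketch with placeholder names; it assumes the requests package is installed.

```python
import boto3
import requests

s3 = boto3.client("s3")

# Presign a PUT request; signing is only useful if the caller has
# write permission on the bucket.
url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "example-bucket", "Key": "data/report.csv"},
    ExpiresIn=300,
)

# The REST API uploads the whole object in one PUT operation.
with open("report.csv", "rb") as f:
    requests.put(url, data=f)
```
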
Upload Size Limits

  • Console Limit: Maximum 160 GB for AWS Management Console uploads
  • Large Files: “To upload a file larger than 160 GB, use the AWS Command Line Interface (AWS CLI), AWS SDKs, or Amazon S3 REST API”
  • Maximum Object Size: A single object can be as large as 5 TB when uploaded with the multipart upload API

Multipart Upload

“You can use multipart upload to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data.”

  1. Part Division: Object divided into multiple parts
  2. Independent Upload: Parts uploaded independently and in any order
  3. Failure Recovery: If transmission of any part fails, retransmit only that part
  4. Assembly: “After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object”

Parallel Processing: Parts uploaded in parallel to improve throughput
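The same flow is visible in the low-level SDK calls. Here is a sketch of the three multipart API operations with placeholder names; note that each part except the last must be at least 5 MiB.

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-bucket", "video/movie.mp4"  # placeholders
part_size = 8 * 1024 * 1024                        # 8 MiB parts

upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
upload_id = upload["UploadId"]

parts = []
with open("movie.mp4", "rb") as f:                 # placeholder file
    part_number = 1
    while chunk := f.read(part_size):
        # Parts may be sent independently, in any order, or in parallel;
        # a failed part is retransmitted on its own.
        result = s3.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id,
            PartNumber=part_number, Body=chunk,
        )
        parts.append({"PartNumber": part_number, "ETag": result["ETag"]})
        part_number += 1

# S3 assembles the uploaded parts into a single object.
s3.complete_multipart_upload(
    Bucket=bucket, Key=key, UploadId=upload_id,
    MultipartUpload={"Parts": parts},
)
```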

Amazon S3 Transfer Acceleration

  • Purpose: “Provides fast and secure transfers of files over long distances”
  • Optimization: “Optimizes transfer speeds from across the world into S3 buckets”
  • Infrastructure: “Uses globally distributed edge locations in CloudFront”
  • Performance: “Improves speed by 50–500 percent on average for cross-country transfer of larger objects”

How Transfer Acceleration Works

  1. Edge Location Upload: Data is uploaded to the nearest CloudFront edge location
  2. Optimized Routing: “As the data arrives at an edge location, the data is routed to Amazon S3 over an optimized network path”
  3. Global Distribution: Takes advantage of globally distributed edge locations

When to Use Transfer Acceleration

  • Global Uploads: “Your customers upload to a centralized bucket from all over the world”
  • Large Data Transfers: “You transfer gigabytes to terabytes of data on a regular basis across continents”
  • Bandwidth Utilization: “You can’t use all of your available bandwidth over the internet when uploading to Amazon S3”

By comparison, a direct upload to an S3 bucket without acceleration travels the public internet and is slower over long distances.
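Acceleration is a bucket-level setting. A boto3 sketch with a placeholder bucket name: enable the feature, then route an upload through the accelerate endpoint. The setting may not take effect immediately, and the bucket name must be DNS-compliant without dots.

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Turn on Transfer Acceleration for the bucket (a one-time setting).
s3.put_bucket_accelerate_configuration(
    Bucket="example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients using the accelerate endpoint send data to the nearest
# CloudFront edge location, which forwards it to S3 over the
# optimized network path.
accelerated = boto3.client(
    "s3", config=Config(s3={"use_accelerate_endpoint": True})
)
accelerated.upload_file("dataset.bin", "example-bucket", "ingest/dataset.bin")
```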

AWS Transfer Family

  • Service Type: “Is a fully managed AWS service”
  • Purpose: “Used to transfer files into and out of Amazon S3 storage or Amazon Elastic File System (Amazon EFS) file systems”

Supported storage:

  • Amazon S3 storage
  • Amazon Elastic File System (Amazon EFS) Network File System (NFS) file systems

Supported protocols:

  • SFTP: Secure Shell (SSH) File Transfer Protocol (SFTP) version 3
  • FTPS: File Transfer Protocol Secure (FTPS)
  • FTP: File Transfer Protocol (FTP)
  • AS2: Applicability Statement 2 (AS2)
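Here is a sketch of how such an endpoint might be provisioned with the boto3 Transfer Family client. The server settings are real API parameters; the user name, role ARN, home directory, and public key are placeholders.

```python
import boto3

transfer = boto3.client("transfer")

# Create a service-managed SFTP endpoint that stores files in S3.
server = transfer.create_server(
    Protocols=["SFTP"],
    IdentityProviderType="SERVICE_MANAGED",
    Domain="S3",
)

# Each user maps to an IAM role and a home directory in a bucket.
transfer.create_user(
    ServerId=server["ServerId"],
    UserName="partner-upload",                                 # placeholder
    Role="arn:aws:iam::123456789012:role/transfer-s3-access",  # placeholder
    HomeDirectory="/example-bucket/inbound",                   # placeholder
    SshPublicKeyBody="ssh-ed25519 AAAA... user@example.com",   # placeholder
)
```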

Managed Service

Scales in real time to meet your needs without infrastructure management

No Modifications

Don’t need to modify applications or run file transfer protocol infrastructure

Native Integration

Use native AWS services for processing, analytics, reporting, auditing, and archival functions

Elastic File System

Managed elastic file system for AWS Cloud services and on-premises resources

Additional Characteristics

  • Serverless Workflows: “Fully managed, serverless file transfer workflow service that you can use to set up, run, automate, and monitor file uploads”
  • Cost Model: “There are no upfront costs, and you pay for only the use of the service”
  • Automatic Scaling: When backed by Amazon EFS, storage “grows and shrinks automatically as you add and remove files to help eliminate the need to provision and manage capacity”
Common Use Cases

  • Data Lakes: Data lakes in AWS for uploads from third parties such as vendors and partners
  • Data Distribution: Subscription-based data distribution with your customers
  • Internal Transfers: Internal transfers within your organization
  • Content Distribution: Content distribution across systems
  • Supply Chain: Supply chain data management
  • Content Management: Content management systems
  • Web Applications: Web-serving applications
Choosing an Upload Method

  • Small Files (<160 GB): Use the AWS Management Console for simplicity
  • Large Files (>160 GB): Use the AWS CLI, SDKs, or REST API with multipart upload
  • Automated Processes: Use the AWS CLI or SDKs for scripted uploads
  • Custom Applications: Use AWS SDKs or the REST API for programmatic integration
  • Global Users: Implement S3 Transfer Acceleration for a worldwide user base
  • Large Objects: Use multipart upload for files approaching 100 MB or larger
  • Network Issues: Multipart upload provides resilience against network failures
  • Bandwidth Optimization: Transfer Acceleration maximizes available bandwidth utilization
  • Legacy Systems: Use AWS Transfer Family to support existing FTP/SFTP workflows
  • Security Requirements: Choose an appropriate protocol (SFTP for encryption, FTP only for legacy compatibility)
  • Compliance Needs: AS2 protocol support for business-to-business communications

Moving data efficiently to and from Amazon S3 requires understanding the various upload methods, optimization features, and transfer services available to match specific use case requirements and performance needs.