Pablo Rodriguez

Well Architected Framework

Applying the AWS Well-Architected Framework Principles to Data Pipelines


The Data Analytics Lens is a collection of customer-proven best practices for designing well-architected analytics workloads, drawing on insights gathered from real-world case studies. It gives IT architects and developers a way to evaluate analytics workloads without needing to become subject matter experts. Applying the lens follows three steps:

  1. Define the Workload: Identify the set of components that together deliver business value (marketing websites, e-commerce applications, mobile backends, analytics platforms)

  2. Evaluate Against Pillar Design Principles: Prioritize pillars by importance and identify most important design principles for each pillar

  3. Implement Best Practices: Proceed with implementation and continue regular evaluations to add more best practices

The tools and techniques covered in this module align to AWS Well-Architected Framework pillars. Continual evaluation ensures solutions remain the best fit for analytics workloads.

Best Practice: Control Access to Workload Infrastructure


Implementation: “Implement policies of least privilege for source and downstream systems”

The security pillar encompasses the protection of data, systems, and assets, and describes how to use cloud technologies to improve security. Analytics environments change as data processing and distribution requirements evolve.

  • Least Privilege Access: Give only enough access for systems to perform their jobs
  • Permission Boundaries: Base permissions on the actions a system must perform on the data
  • Role-Based Access: Identify minimum privileges each user requires
  • Granular Controls: Grant only necessary permissions (e.g., read-only table access for business analysts)
  • Analytics environments change frequently with evolving requirements
  • Keep the environment accessible while granting only the minimum necessary permissions
  • Regularly review and update access controls as requirements change
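As a concrete illustration of least privilege, the sketch below builds a read-only policy document for a business analyst in Python. The table ARN, account ID, and action list are hypothetical, not a production-ready policy.

```python
import json

# Hypothetical read-only policy for a business analyst: read table metadata
# and run queries, but no create/update/delete permissions. All resource
# names and the account ID are placeholders for illustration.
ANALYST_READ_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOnlyTableAccess",
            "Effect": "Allow",
            "Action": [
                "glue:GetTable",              # table metadata, read-only
                "glue:GetPartitions",
                "athena:StartQueryExecution",  # run queries against the table
                "athena:GetQueryResults",
            ],
            "Resource": "arn:aws:glue:us-east-1:123456789012:table/sales_db/orders",
        }
    ],
}

print(json.dumps(ANALYST_READ_ONLY_POLICY, indent=2))
```

Reviewing such documents for write-style verbs (Create, Put, Delete, Update) is one simple way to audit that granted permissions stay minimal.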

Best Practice: Choose the Best-Performing Compute Solution


Implementation: “Identify analytics solutions that best suit your technical challenges”

The performance efficiency pillar focuses on efficient use of resources to meet requirements as demand changes and technologies evolve.

  • Amazon Redshift: Data warehousing for structured analytics
  • Amazon Kinesis: Streaming data processing for real-time analytics
  • Amazon QuickSight: Data visualization and business intelligence
  • Purpose-Built Services: Each designed to overcome specific challenges
  • Business Requirements: Match tools to business and technical requirements
  • Use Case Fit: Identify right tool for specific jobs
  • Technical Challenges: Choose services that address particular analytical challenges
  • Example: Café clickstream visualization using QuickSight and Athena
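The idea of matching purpose-built services to use cases can be sketched as a simple lookup. The workload categories below are a simplified, illustrative taxonomy, not an official AWS classification.

```python
# Illustrative mapping from workload characteristics to the purpose-built
# analytics services discussed above; the category names are assumptions.
SERVICE_FOR_WORKLOAD = {
    "structured_warehouse_analytics": "Amazon Redshift",
    "real_time_streaming": "Amazon Kinesis",
    "business_intelligence_dashboards": "Amazon QuickSight",
    "ad_hoc_sql_on_s3": "Amazon Athena",
}

def choose_service(workload_type: str) -> str:
    """Return the purpose-built service for a workload type; raises
    KeyError for unknown types so mismatches surface early."""
    return SERVICE_FOR_WORKLOAD[workload_type]

# The café clickstream example pairs ad hoc SQL on S3 with dashboards:
print(choose_service("ad_hoc_sql_on_s3"))                  # Amazon Athena
print(choose_service("business_intelligence_dashboards"))  # Amazon QuickSight
```

A real selection process weighs more dimensions (latency, data volume, cost), but the principle is the same: let the technical challenge pick the tool.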

Implementation: Two key approaches support cost management: managing data retention and tiering data storage. These practices align with the cost optimization pillar, which focuses on avoiding unnecessary costs.

Manage data retention:

  • Data Retention: Delete data past its retention period to reduce storage costs
  • Metadata Catalog: Use the metadata catalog to identify data outside retention periods
  • Automation: Use Amazon S3 lifecycle configurations for automatic data expiration
  • Regular Cleanup: Implement a standardized process to identify and remove unused resources
  • Utilization Monitoring: Track resource utilization changes over time

Tier data storage:

  • Data Movement: Move infrequently used data from the data warehouse to the data lake
  • Query Optimization: Use Redshift Spectrum to query S3 data without moving it
  • Storage Tiering: Use Athena to query data at rest in Amazon S3

Consider data retention periods when making storage decisions, balancing cost efficiency with query performance requirements.
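An S3 lifecycle configuration can encode a retention policy directly. The sketch below assumes a hypothetical 365-day retention period and a placeholder prefix; it only builds the configuration document, and the commented boto3 call shows where it would be applied.

```python
# Sketch of an S3 lifecycle configuration enforcing an assumed 365-day
# retention period: transition data to colder storage first, then expire it.
# The prefix, rule ID, and day counts are illustrative placeholders.
RETENTION_DAYS = 365

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "expire-raw-clickstream-past-retention",
            "Filter": {"Prefix": "raw/clickstream/"},
            "Status": "Enabled",
            # Move infrequently accessed data to cheaper storage first...
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            # ...then delete it once the retention period has elapsed.
            "Expiration": {"Days": RETENTION_DAYS},
        }
    ],
}

# Applying it would use boto3 (not executed here):
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-analytics-bucket",
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```

Keeping the transition well before the expiration ensures data spends its final months in the cheapest tier rather than being deleted straight from standard storage.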

Best Practice: Design Resilience for Analytics Workloads


Implementation: “Understand the business requirements of analytics and ETL jobs”

The reliability pillar encompasses a workload's ability to perform its intended functions correctly and consistently when expected.

Extract → Transform → Load (ETL)

  • Transforms data before loading it into the target
  • Good for structured data with known requirements

Extract → Load → Transform (ELT)

  • Loads raw data into the target, then transforms it there
  • Good for large data volumes or evolving transformation requirements

Understanding business requirements helps determine appropriate patterns for moving data from source systems to target data stores, ensuring reliability matches business needs.
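The ETL pattern can be sketched end to end in a few lines. This is a toy example: the source records, cleaning rules, and in-memory SQLite target are all stand-ins for real source systems and data stores.

```python
import sqlite3

# Toy "source system" of raw order records; schema is illustrative.
source_rows = [
    {"order_id": "1", "amount": " 12.50 "},
    {"order_id": "2", "amount": "7.25"},
    {"order_id": "3", "amount": ""},  # bad record, dropped in transform
]

def extract():
    """Pull raw records from the source system."""
    return source_rows

def transform(rows):
    """Clean and type-cast records BEFORE loading, as in the ETL pattern."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if amount:  # drop records that fail validation
            cleaned.append((int(row["order_id"]), float(amount)))
    return cleaned

def load(rows):
    """Write transformed records into the target data store."""
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse table
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn

conn = load(transform(extract()))
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)  # 19.75
```

Because transformation happens before load, only valid, typed records ever reach the target, which is what makes the pattern a good fit for structured data with known requirements.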

  • Regular Evaluation: Continually assess whether solutions fit analytics workloads
  • Best Practice Implementation: Add as many best practices as possible over time
  • Pillar Prioritization: Order pillars by importance for specific use cases
  • Iterative Refinement: Refine and improve systems over entire lifecycle

The AWS Well-Architected Framework supplies foundational questions to understand if architecture aligns with cloud best practices, helping evaluate trade-offs in workload design, operation, and maintenance.

Tools and techniques learned throughout this module align with Well-Architected Framework pillars, providing practical implementation guidance for data engineering patterns.

Applying the AWS Well-Architected Framework to data pipelines requires continuous evaluation across security, performance efficiency, cost optimization, and reliability pillars. Implementation focuses on least privilege access, appropriate tool selection, cost management, and resilient data movement patterns.