Scaling Your Compute Resources
The Need for Reactive Architectures
Modern applications are expected to handle petabytes of data, provide close to 100 percent uptime, and deliver sub-second response times to users. A growing number of enterprises have adopted reactive architectures and reactive systems to meet these demands.
Reactive Application Characteristics
To meet these requirements, you can implement a reactive application that is:
- Elastic - scales its resources dynamically, adding or removing capacity to react to changes in demand and to avoid downtime or traffic bottlenecks
- Resilient - stays responsive in the face of failure, with the capability to recover when stressed by load, attacks, or the failure of any component of the workload
- Responsive - responds in a timely manner with the lowest latency possible and remains responsive under varying workloads
- Message-driven - perhaps the most important characteristic. To establish boundaries between services, reactive applications rely on asynchronous message-passing to help ensure loose coupling, isolation, and location transparency
Achieve Elasticity with Scaling
Elasticity means that the infrastructure can expand and contract when capacity requirements change. You can acquire resources when you need them and release resources when you do not.
Scaling is the ability to increase or decrease the compute capacity of your application. Scaling is a technique that is used to achieve elasticity.
Vertical Scaling
Vertical scaling is where you increase or decrease the specifications of an individual resource:
- You could upgrade to a new server with a larger hard drive or a faster CPU
- Typically, the application and data have to be transferred to the new resource, which can lead to application downtime
- With Amazon EC2, you can stop an instance and resize it to an instance type that has more RAM, CPU, I/O, or networking capabilities (see the sketch after this list)
- Vertical scaling can eventually reach a limit because it is hardware bound
- Not always a cost-efficient or a highly available approach
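As an illustration of the resize step mentioned above, here is a minimal boto3 sketch of vertical scaling. The instance ID and target instance type are hypothetical, and the instance must tolerate the downtime of a stop/start cycle:

```python
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"  # hypothetical instance ID

# Vertical scaling: the instance must be stopped before its type can change.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

# Change to a larger instance type, then start the instance again.
ec2.modify_instance_attribute(InstanceId=instance_id, InstanceType={"Value": "m5.xlarge"})
ec2.start_instances(InstanceIds=[instance_id])
```

The stop/start cycle is exactly the downtime the list above warns about, which is one reason vertical scaling is not always a highly available approach.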
Horizontal Scaling
Horizontal scaling is where you add or remove resources available to the application:
- Adding resources is referred to as scaling out
- Removing resources is referred to as scaling in
- Good way to build internet-scale applications that take advantage of the elasticity of cloud computing
- Applications, data, or both are automatically transferred to added resources
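To make the scale-out and scale-in terminology concrete, here is a minimal boto3 sketch of manual horizontal scaling; the AMI ID and instance type are hypothetical. Later sections show how Amazon EC2 Auto Scaling automates this:

```python
import boto3

ec2 = boto3.client("ec2")

# Scale out: launch two more instances from the same image.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI ID
    InstanceType="t3.micro",
    MinCount=2,
    MaxCount=2,
)
new_ids = [i["InstanceId"] for i in response["Instances"]]

# Scale in: terminate the extra instances when demand drops.
ec2.terminate_instances(InstanceIds=new_ids)
```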
Amazon EC2 Auto Scaling
Amazon EC2 Auto Scaling scales capacity by grouping EC2 instances into a logical collection called an Amazon EC2 Auto Scaling group. The group can span multiple Availability Zones.
Key Features
- Manages a logical collection of Amazon EC2 instances called an Amazon EC2 Auto Scaling group across Availability Zones
- Launches or terminates EC2 instances configured by launch templates
- Resizes the group based on events from scaling policies, load balancer health check notifications, or scheduled actions
- Integrates with Elastic Load Balancing (ELB) to register new instances and receive health check notifications
- Balances the number of instances across Availability Zones
- Is available free of charge
Benefits
With Amazon EC2 Auto Scaling, your applications gain the following benefits:
- Better fault tolerance - can detect when an instance is unhealthy, terminate it, and launch an instance to replace it
- Better availability - helps ensure that your application always has the right amount of capacity to handle the current traffic demand
- Better cost management - can dynamically increase and decrease capacity as needed
Amazon EC2 Auto Scaling Group Components
Capacity Settings
The number of instances in the group is determined by the capacity settings:
- Minimum capacity - the smallest number of instances needed to run the application
- Maximum capacity - the largest number of instances permitted for the group
- Desired capacity - the optimal number of instances needed to run the application under normal circumstances
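The capacity settings map directly onto the group creation call in boto3. A minimal sketch, assuming a group named web-asg and an existing launch template named web-template (both hypothetical):

```python
import boto3

autoscaling = boto3.client("autoscaling")

# The group never shrinks below MinSize, never grows beyond MaxSize,
# and targets DesiredCapacity under normal conditions.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=4,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # hypothetical subnets in two AZs
)
```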
Launch Templates
To launch an EC2 instance, the group needs to know which type of EC2 instance to launch. The group's launch template specifies the EC2 instance configuration details:
- Instance type and the Amazon Machine Image (AMI) ID
- What percentage of the desired capacity should be fulfilled with On-Demand Instances, Reserved Instances, and Spot Instances
- Launch templates can be versioned
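A minimal boto3 sketch of creating and versioning a launch template; the template name, AMI ID, and instance types are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# Version 1: the AMI and instance type the group should launch.
ec2.create_launch_template(
    LaunchTemplateName="web-template",  # hypothetical template name
    LaunchTemplateData={"ImageId": "ami-0123456789abcdef0", "InstanceType": "t3.micro"},
)

# Launch templates are versioned: publish version 2 with a larger instance type.
ec2.create_launch_template_version(
    LaunchTemplateName="web-template",
    SourceVersion="1",
    LaunchTemplateData={"InstanceType": "t3.large"},
)
```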
Scaling Mechanisms
A new Amazon EC2 Auto Scaling group has no scaling mechanisms. You can add scaling mechanisms, such as:
- Scheduled actions
- Dynamic scaling policies
- Predictive scaling policies
Amazon EC2 Auto Scaling Mechanisms
Scheduled actions - scale based on a date and time
- Useful for predictable workloads when you know exactly when to increase or decrease the number of instances
- Example: Traffic increases on Wednesday, remains high on Thursday, starts to decrease on Friday
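A scheduled action for the Wednesday-to-Friday pattern above might look like the following boto3 sketch; the group name, capacities, and UTC times are hypothetical:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Scale out ahead of the Wednesday traffic increase.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",            # hypothetical group name
    ScheduledActionName="wednesday-scale-out",
    Recurrence="0 8 * * 3",                    # 08:00 UTC every Wednesday
    DesiredCapacity=8,
)

# Scale back in as traffic decreases on Friday.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",
    ScheduledActionName="friday-scale-in",
    Recurrence="0 20 * * 5",                   # 20:00 UTC every Friday
    DesiredCapacity=4,
)
```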
Dynamic scaling policies - scale based on tracked metrics
- Are for moderately spiky workloads
- More advanced way to scale your resources
- Define parameters that control the scaling process
- Gives you extra capacity to handle traffic spikes without maintaining an excessive number of idle resources
Predictive scaling policies - scale based on previous traffic patterns with machine learning
- Is for workload traffic that can be predicted with a pattern
- Analyzes historical load data to detect daily or weekly patterns in traffic flows
- Uses this information to forecast future capacity needs
Target Tracking Scaling
Target tracking scaling policies increase or decrease the current capacity of the group based on a target value for a specific metric. This type of scaling is similar to the way that your thermostat maintains the temperature of your home:
- You select a temperature, and the thermostat does the rest
- You select a scaling metric and set a target value
- Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that invoke the scaling policy
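A minimal boto3 sketch of a target tracking policy that keeps average CPU utilization near 50 percent; the group name, policy name, and target value are hypothetical:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: pick a metric and a target value; the service creates
# and manages the CloudWatch alarms behind the scenes.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    PolicyName="cpu-target-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)
```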
Step Scaling and Simple Scaling
- Step scaling - adjusts capacity in increments that vary with the size of the alarm breach
- Simple scaling - waits for the current scaling activity to finish before responding to additional alarms
- Both use scaling metrics and threshold values for the CloudWatch alarms that invoke the scaling process
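As an illustration of step scaling, here is a boto3 sketch that adds more instances the further the metric breaches the alarm threshold. The names, bounds, and adjustments are hypothetical, and the CloudWatch alarm that triggers the policy is created separately and not shown:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Step scaling: the adjustment grows with the size of the alarm breach.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    PolicyName="cpu-step-scale-out",
    PolicyType="StepScaling",
    AdjustmentType="ChangeInCapacity",
    StepAdjustments=[
        # Breach of 0-20 above the alarm threshold: add 1 instance.
        {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 20.0, "ScalingAdjustment": 1},
        # Breach of 20 or more above the threshold: add 3 instances.
        {"MetricIntervalLowerBound": 20.0, "ScalingAdjustment": 3},
    ],
)
```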
Predictive Scaling Use Cases
Predictive scaling is well suited for situations where you have:
- Cyclical traffic, such as high use of resources during regular business hours and low use during evenings and weekends
- Recurring on-and-off workload patterns, such as batch processing, testing, or periodic data analysis
- Applications that take a long time to initialize, causing a noticeable latency impact on application performance during scale-out events
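A minimal boto3 sketch of a predictive scaling policy driven by CPU utilization; the group name, policy name, and target value are hypothetical:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Predictive scaling: forecast capacity from historical load and scale
# ahead of the predicted demand.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # hypothetical group name
    PolicyName="predictive-cpu",
    PolicyType="PredictiveScaling",
    PredictiveScalingConfiguration={
        "MetricSpecifications": [
            {
                "TargetValue": 50.0,
                "PredefinedMetricPairSpecification": {"PredefinedMetricType": "ASGCPUUtilization"},
            }
        ],
        "Mode": "ForecastAndScale",  # "ForecastOnly" reviews forecasts without acting on them
    },
)
```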
More AWS Scaling Options
AWS Auto Scaling
Uses a scaling plan to configure auto scaling for multiple resources:
- Scale multiple AWS services:
- Amazon Aurora
- Amazon EC2 Auto Scaling
- Amazon ECS
- Amazon DynamoDB
- Use tags to group resources in categories such as production, testing, or development
- Search for and set up scaling plans for scalable resources that belong to each category
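A sketch of a scaling plan built from a tag-based application source, using the boto3 autoscaling-plans client; the plan name, tag, and resource IDs are hypothetical:

```python
import boto3

plans = boto3.client("autoscaling-plans")

# One scaling plan for every resource tagged environment=production.
plans.create_scaling_plan(
    ScalingPlanName="production-plan",  # hypothetical plan name
    ApplicationSource={"TagFilters": [{"Key": "environment", "Values": ["production"]}]},
    ScalingInstructions=[
        {
            "ServiceNamespace": "autoscaling",
            "ResourceId": "autoScalingGroup/web-asg",  # hypothetical group
            "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
            "MinCapacity": 2,
            "MaxCapacity": 10,
            "TargetTrackingConfigurations": [
                {
                    "PredefinedScalingMetricSpecification": {
                        "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                    },
                    "TargetValue": 50.0,
                }
            ],
        }
    ],
)
```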
AWS Application Auto Scaling
Scale multiple resources with target tracking, step scaling, or scheduled scaling:
- Scale multiple AWS services:
- AWS Auto Scaling services
- AWS Lambda functions
- Amazon SageMaker
- Amazon ElastiCache for Redis
- Similar to Amazon EC2 Auto Scaling groups but for individual AWS services beyond Amazon EC2
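A minimal Application Auto Scaling sketch with boto3, scaling an Amazon SageMaker endpoint variant with target tracking; the endpoint name, variant name, capacity limits, and target value are hypothetical:

```python
import boto3

appscaling = boto3.client("application-autoscaling")

# Register the endpoint variant's instance count as a scalable target.
appscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId="endpoint/my-endpoint/variant/AllTraffic",  # hypothetical endpoint/variant
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Target tracking: keep invocations per instance near the target value.
appscaling.put_scaling_policy(
    PolicyName="endpoint-invocations-tracking",
    ServiceNamespace="sagemaker",
    ResourceId="endpoint/my-endpoint/variant/AllTraffic",
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "TargetValue": 1000.0,
    },
)
```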
Amazon EC2 Auto Scaling creates and manages logical collections of EC2 instances called Auto Scaling groups. Groups have capacity settings that specify the minimum, maximum, and desired number of instances. Group size can be scaled in and out with scheduled actions, dynamic scaling policies, and predictive scaling policies, while AWS Auto Scaling and Application Auto Scaling extend scaling capabilities to services beyond EC2 instances.