Auto-Scaling
There are three types of auto scaling services:
AWS Auto Scaling Service: Holistic scaling view + predictive scaling feature
EC2 Auto Scaling Service: Focused specifically on EC2 instance scaling
Application Auto Scaling Service: API for scaling non-EC2 AWS services
AWS Auto Scaling Service
Provides a holistic view of all auto scaling activities
Manages scaling across different application layers
Accessed through a dedicated console
Offers business-oriented scaling strategies (optimize for availability, optimize for cost, or balance the two)
Application Auto Scaling Service
API-based service for non-EC2 resources. Manages scaling for services like:
DynamoDB
ECS
EMR
Application Auto Scaling consolidates various service-specific scaling capabilities into a single API.
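To illustrate what "a single API" means in practice, the sketch below builds the keyword arguments that could be passed to the `register_scalable_target` call of the boto3 `application-autoscaling` client for three different services. Only the namespace, resource ID, and scalable dimension change between services. The table, cluster, service, and instance group names are hypothetical placeholders, and nothing is sent to AWS.

```python
from typing import Dict, Union

Kwargs = Dict[str, Union[str, int]]


def scalable_target(namespace: str, resource_id: str, dimension: str,
                    min_capacity: int, max_capacity: int) -> Kwargs:
    """Build keyword arguments for register_scalable_target."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": namespace,
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


# Hypothetical resources: same call shape, different service namespaces.
TARGETS = [
    scalable_target("dynamodb", "table/orders",
                    "dynamodb:table:ReadCapacityUnits", 5, 20000),
    scalable_target("ecs", "service/demo-cluster/web",
                    "ecs:service:DesiredCount", 2, 10),
    scalable_target("elasticmapreduce", "instancegroup/j-EXAMPLE/ig-EXAMPLE",
                    "elasticmapreduce:instancegroup:InstanceCount", 1, 8),
]
```

With AWS credentials configured, each dictionary could be passed as `boto3.client("application-autoscaling").register_scalable_target(**target)`.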
Application Auto Scaling supports three policy types:
Target Tracking Policy
Initiates scaling events to keep a chosen metric as close as possible to a target value.
"I want my ECS hosts to stay at or below 70% CPU utilization."
Step Scaling Policy
Adjusts capacity in steps as a metric crosses defined thresholds.
"I want to increase my EC2 Spot Fleet by 20% every time I add another 10,000 connections on my ELB"
Scheduled Scaling Policy
Initiates scaling events based on a predefined time, day or date.
"Every Monday at 0800, I want to increase the Read Capacity Units of my DynamoDB Table to 20,000"
EC2 Auto Scaling Service
Primary service for EC2 resource scaling
Manages core scaling functionality
Handles policies, metrics, configurations
EC2 Auto Scaling Groups (ASG)
Logical grouping of EC2 instances
Implements scaling mechanisms
Manages instance monitoring
Controls min/max instances
Applies policies and health checks
Responds to real-time demand through three policy types:
Target Tracking Scaling Policy
Scale based on a predefined or custom metric in relation to a target value
"When CPU utilization gets to 70% on current instances, scale up."
Simple Scaling Policy
Waits for the health check and the cooldown period to complete before evaluating whether more scaling is needed
"Let's add new instances slow and steady."
Step Scaling Policy
Applies different adjustment sizes depending on how far the metric has breached the alarm threshold
"AGG! Add ALL the instances!"
Predictive Scaling
Leverages machine learning algorithms
Analyzes historical data to forecast scaling needs
Offers automatic scaling and advisory insights
Optional data collection feature
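The forecasting model itself is not described here, so the sketch below uses a deliberately naive stand-in: it averages the load seen at each hour of the day across past days and converts that forecast into an hourly capacity plan. The function names, the per-instance load figure, and the averaging rule are all assumptions made for illustration. This is not the algorithm AWS uses.

```python
import math
from statistics import mean
from typing import List, Sequence


def forecast_hourly_load(history: Sequence[Sequence[float]]) -> List[float]:
    """Naive forecast: mean load per hour of day across past days.

    history holds one sequence of 24 hourly load values per past day.
    """
    return [mean(day[hour] for day in history) for hour in range(24)]


def capacity_plan(forecast: Sequence[float], per_instance_load: float,
                  min_size: int, max_size: int) -> List[int]:
    """Instances needed per hour, clamped to the group's min/max size."""
    plan = []
    for load in forecast:
        needed = math.ceil(load / per_instance_load)
        plan.append(max(min_size, min(needed, max_size)))
    return plan
```

The plan can be computed ahead of time, which is the point of predictive scaling: capacity is added before the expected load arrives instead of in reaction to it.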
Scheduled Scaling
Time-based pattern scaling
Useful for known traffic patterns
Maintain Mode
Keeps a fixed instance count
No scaling types involved
Simply maintains desired capacity
Manual Mode
User-controlled scaling
No automatic scaling types
Changes made through manual intervention
Schedule-based Mode
Uses Scheduled Scaling
Time-based pattern scaling
Useful for known traffic patterns
Dynamic Mode
Uses both Dynamic and Predictive Scaling
Dynamic Scaling Options:
Target Tracking Policy
Simple Scaling Policy
Step Scaling Policy
Predictive Scaling Features:
ML-based forecasting
Historical data analysis
Optional data collection
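The dynamic scaling options differ mainly in how they size an adjustment. As a rough sketch of step scaling, the function below picks a percentage increase based on how far a metric sits above the alarm threshold. The `Step` type and the example bounds are hypothetical. The sketch handles scale-out only, treats lower bounds as inclusive and upper bounds as exclusive, and rounds the added capacity up, which is a simplification of the real rounding rules.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Step:
    lower: float            # offset above the alarm threshold, inclusive
    upper: Optional[float]  # offset above the threshold, exclusive; None = unbounded
    percent_change: float   # capacity increase, in percent of current capacity


def step_scale_out(current: int, metric: float, threshold: float,
                   steps: Sequence[Step], max_size: int) -> int:
    """Return the new desired capacity after one step-scaling evaluation."""
    breach = metric - threshold
    if breach < 0:
        return current  # alarm not breached, nothing to do
    for step in steps:
        if breach >= step.lower and (step.upper is None or breach < step.upper):
            added = math.ceil(current * step.percent_change / 100)
            return min(current + added, max_size)
    return current


# Hypothetical policy: the larger the breach, the larger the adjustment.
STEPS = [Step(0, 10, 10), Step(10, 20, 20), Step(20, None, 50)]
```

A simple scaling policy corresponds to a single adjustment with no bands, followed by a wait for the cooldown before the next evaluation.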
AMI specifications
VPC settings
Load balancer integration
Network configurations
Health Check Grace Period
Allows time for instance initialization
Prevents premature health check failures
Default: 300 seconds
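The effect of the grace period can be sketched as a small decision function: an unhealthy result is ignored while the instance is still inside its initialization window. The function name and signature are hypothetical, and this is a simplified model, not the service's actual health check logic.

```python
from datetime import datetime, timedelta


def should_replace(launched_at: datetime, now: datetime, healthy: bool,
                   grace_seconds: int = 300) -> bool:
    """Return True if an unhealthy instance should be replaced.

    Failures reported before the grace period has elapsed are ignored,
    which gives the instance time to finish initializing.
    """
    if healthy:
        return False
    in_grace = now - launched_at < timedelta(seconds=grace_seconds)
    return not in_grace
```

With the 300-second default, a failed check two minutes after launch is ignored, while the same failure six minutes after launch marks the instance for replacement.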
Cooldown Period
Purpose: Stabilization time between scaling events
Applies to:
Dynamic scaling (mandatory for simple scaling policies; target tracking and step scaling use an instance warmup time instead)
Manual scaling (optional)
Scheduled scaling (not supported)
Can be customized for specific scenarios
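The three cases listed above can be expressed as a small gate that decides whether a scaling request may proceed. The function and its `source` labels are hypothetical and only model the rules as stated in this section. The `honor_cooldown` flag is loosely modeled on the `HonorCooldown` option of the EC2 Auto Scaling `SetDesiredCapacity` call.

```python
def scaling_allowed(source: str, seconds_since_last_activity: float,
                    cooldown: float = 300, honor_cooldown: bool = True) -> bool:
    """Decide whether a scaling request may run right now.

    source is one of "dynamic", "manual", or "scheduled".
    """
    if source not in ("dynamic", "manual", "scheduled"):
        raise ValueError(f"unknown scaling source: {source!r}")
    if source == "scheduled":
        return True  # cooldown is not supported for scheduled scaling
    if source == "manual" and not honor_cooldown:
        return True  # cooldown is optional for manual changes
    # Dynamic scaling always waits; manual scaling waits when asked to.
    return seconds_since_last_activity >= cooldown
```

Shortening `cooldown` makes the group react faster at the risk of overshooting, while lengthening it gives new instances more time to absorb load before the next evaluation.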