Auto-Scaling

There are three types of auto scaling services:

AWS Auto Scaling Service: Holistic scaling view + predictive scaling feature
EC2 Auto Scaling Service: Focused specifically on EC2 instance scaling
Application Auto Scaling Service: API for scaling non-EC2 AWS services

AWS Auto Scaling Service

Provides holistic view of all auto scaling activities
Manages scaling across different application layers
Accessed through dedicated console
Offers business-oriented scaling strategies

Application Auto Scaling

API-based service for non-EC2 resources. Manages scaling for services like:

DynamoDB
ECS
EMR

Application Auto Scaling consolidates various service-specific scaling capabilities into a single API.

There are three Application Auto Scaling policy types:

| Scaling | What | When |
| --- | --- | --- |
| Target Tracking Policy | Initiates scaling events to try to track as closely as possible a given target metric. | "I want my ECS hosts to stay at or below 70% CPU utilization." |
| Step Scaling Policy | Based on a metric, adjusts capacity given certain defined thresholds. | "I want to increase my EC2 Spot Fleet by 20% every time I add another 10,000 connections on my ELB." |
| Scheduled Scaling Policy | Initiates scaling events based on a predefined time, day or date. | "Every Monday at 0800, I want to increase the Read Capacity Units of my DynamoDB Table to 20,000." |

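As a rough sketch of how two of the examples above might be expressed, the snippet below builds request parameters for the Application Auto Scaling `PutScalingPolicy` and `PutScheduledAction` operations. The cluster, service, table and policy names are hypothetical, the ECS example is modelled as an ECS service (an assumption about what "ECS hosts" means), and pinning both `MinCapacity` and `MaxCapacity` to 20,000 is only one possible reading of the DynamoDB example. The `apply` helper assumes `boto3`, valid credentials, and scalable targets that have already been registered.

```python
from typing import Any, Dict


def ecs_target_tracking_policy(cluster: str, service: str, target_cpu: float = 70.0) -> Dict[str, Any]:
    """Build kwargs for the Application Auto Scaling PutScalingPolicy call."""
    return {
        "PolicyName": f"{service}-cpu-target-tracking",  # hypothetical name
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    }


def dynamodb_scheduled_action(table: str, read_capacity: int = 20000) -> Dict[str, Any]:
    """Build kwargs for the Application Auto Scaling PutScheduledAction call."""
    return {
        "ServiceNamespace": "dynamodb",
        "ScheduledActionName": f"{table}-monday-0800",  # hypothetical name
        "ResourceId": f"table/{table}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        # Six-field cron expression: every Monday at 08:00 (UTC unless a Timezone is set).
        "Schedule": "cron(0 8 ? * MON *)",
        # Assumption: pin capacity by setting min and max to the same value.
        "ScalableTargetAction": {"MinCapacity": read_capacity, "MaxCapacity": read_capacity},
    }


def apply(policy: Dict[str, Any], action: Dict[str, Any]) -> None:
    """Send both requests; needs boto3, credentials, and registered scalable targets."""
    import boto3  # imported lazily so the builders above run without it

    client = boto3.client("application-autoscaling")
    client.put_scaling_policy(**policy)
    client.put_scheduled_action(**action)
```

Keeping the parameter builders separate from the API calls makes it possible to inspect or unit-test the policy definitions without touching an AWS account.
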
EC2 Auto Scaling Service

Core Components

EC2 Auto Scaling Service
- Primary service for EC2 resource scaling
- Manages core scaling functionality
- Handles policies, metrics, configurations
EC2 Auto Scaling Groups (ASG)
- Logical grouping of EC2 instances
- Implements scaling mechanisms
- Manages instance monitoring
- Controls min/max instances
- Applies policies and health checks

ASG Scaling Types

1. Dynamic Scaling

Responds to real-time demand through three policy types:

| Scaling Policies | What | When |
| --- | --- | --- |
| Target Tracking Scaling Policy | Scales based on a predefined or custom metric in relation to a target value. | "When CPU utilization gets to 70% on current instances, scale out." |
| Simple Scaling Policy | Waits until the health check and cooldown period expire before evaluating whether more capacity is needed. | "Let's add new instances slow and steady." |
| Step Scaling Policy | Responds to scaling needs with more sophistication and logic. | "AGG! Add ALL the instances!" |

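To make the difference between a single fixed adjustment and a step policy more concrete, here is a small pure-Python simulation. It does not call AWS. The `Step` class, the function names and the example thresholds are all hypothetical, and the bounds are modelled as offsets from the alarm threshold, which is how step adjustments are generally described.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    lower: Optional[float]  # offset above the alarm threshold, inclusive; None = unbounded
    upper: Optional[float]  # offset above the alarm threshold, exclusive; None = unbounded
    add_instances: int      # absolute change in capacity for this step


def step_adjustment(metric: float, threshold: float, steps: List[Step]) -> int:
    """Return the instance delta for the step whose interval contains the breach size."""
    breach = metric - threshold
    for step in steps:
        above_lower = step.lower is None or breach >= step.lower
        below_upper = step.upper is None or breach < step.upper
        if above_lower and below_upper:
            return step.add_instances
    return 0  # the metric falls outside every step, so nothing changes


def new_capacity(current: int, delta: int, min_size: int, max_size: int) -> int:
    """Apply the delta, then clamp the result to the group's min/max bounds."""
    return max(min_size, min(max_size, current + delta))


# Hypothetical policy: the larger the breach above 70% CPU, the more instances are added.
STEPS = [Step(0, 10, 1), Step(10, 20, 2), Step(20, None, 4)]

delta = step_adjustment(metric=85.0, threshold=70.0, steps=STEPS)
print(new_capacity(current=4, delta=delta, min_size=1, max_size=10))  # prints 6
```

A simple scaling policy would correspond to a single step with one fixed adjustment, whereas the step policy sizes its response to how far the metric has moved past the threshold.
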
2. Predictive Scaling

Leverages machine learning algorithms
Analyzes historical data to forecast scaling needs
Offers automatic scaling and advisory insights
Optional data collection feature
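
As an illustration only, the sketch below builds parameters for an EC2 Auto Scaling `PutScalingPolicy` request with the `PredictiveScaling` policy type. The group name, policy name and 40% CPU target are hypothetical. The `ForecastOnly` mode appears to correspond to the advisory use described above, and `ForecastAndScale` to automatic scaling.

```python
from typing import Any, Dict


def predictive_policy(group_name: str, target_cpu: float = 40.0, scale: bool = False) -> Dict[str, Any]:
    """Build kwargs for an EC2 Auto Scaling PutScalingPolicy call (predictive type)."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-predictive",  # hypothetical name
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [
                {
                    "TargetValue": target_cpu,
                    "PredefinedMetricPairSpecification": {
                        "PredefinedMetricType": "ASGCPUUtilization"
                    },
                }
            ],
            # ForecastOnly produces forecasts without acting on them.
            "Mode": "ForecastAndScale" if scale else "ForecastOnly",
        },
    }


advisory = predictive_policy("web-asg")          # forecast only
active = predictive_policy("web-asg", scale=True)  # forecast and scale
```

With `boto3` installed, a dictionary like this could be passed as `boto3.client("autoscaling").put_scaling_policy(**advisory)`.
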

3. Scheduled Scaling

Time-based pattern scaling
Useful for known traffic patterns

ASG Operating Modes

Maintain Mode
- Keeps a fixed instance count
- No scaling types involved
- Simply maintains desired capacity
Manual Mode
- User-controlled scaling
- No automatic scaling types
- Changes made through manual intervention
Schedule-based Mode
- Uses Scheduled Scaling
- Time-based pattern scaling
- Useful for known traffic patterns
Dynamic Mode
- Uses both Dynamic and Predictive Scaling
- Dynamic Scaling Options:
  - Target Tracking Policy
  - Simple Scaling Policy
  - Step Scaling Policy
- Predictive Scaling Features:
  - ML-based forecasting
  - Historical data analysis
  - Optional data collection

Configuration Elements

Launch Configuration

AMI specifications
VPC settings
Load balancer integration
Network configurations

Health Check Grace Period

Allows time for instance initialization
Prevents premature health check failures
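
The effect of the grace period can be sketched with a simplified model: a failed health check is ignored while the instance is still within its grace window. The function name and the 300-second default below are assumptions for illustration, not the service's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def should_replace(launched_at: datetime, now: datetime, healthy: bool,
                   grace_seconds: int = 300) -> bool:
    """Return True only if the instance is unhealthy AND its grace period is over."""
    in_grace = now - launched_at < timedelta(seconds=grace_seconds)
    return (not healthy) and (not in_grace)


launched = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)

# 60 seconds after launch: still booting, so the failed check is ignored.
print(should_replace(launched, launched + timedelta(seconds=60), healthy=False))   # False
# 10 minutes after launch: the grace period is over, so the instance is replaced.
print(should_replace(launched, launched + timedelta(minutes=10), healthy=False))   # True
```
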

Cooldown Periods

Default: 300 seconds
Purpose: Stabilization time between scaling events
Applies to:
- Dynamic scaling (mandatory for simple scaling policies; target tracking and step scaling policies do not use the cooldown)
- Manual scaling (optional)
- Scheduled scaling (not supported)
Can be customized for specific scenarios
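
A minimal simulation of the stabilization idea, for the scaling types that honor a cooldown: a new scaling activity is refused until the cooldown has elapsed since the previous one. The `CooldownGate` class is hypothetical, and it simplifies matters by starting the clock when an activity is accepted rather than when it completes.

```python
from typing import Optional


class CooldownGate:
    """Refuses scaling activities that arrive during the cooldown window."""

    def __init__(self, cooldown_seconds: float = 300.0) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._last_activity: Optional[float] = None  # time of the last accepted activity

    def try_scale(self, now: float) -> bool:
        """Return True and record the activity if the cooldown has elapsed."""
        if (self._last_activity is not None
                and now - self._last_activity < self.cooldown_seconds):
            return False  # still stabilizing after the previous activity
        self._last_activity = now
        return True


gate = CooldownGate()          # default of 300 seconds, as noted above
print(gate.try_scale(0.0))     # True: first activity is accepted
print(gate.try_scale(120.0))   # False: only 120 s since the last activity
print(gate.try_scale(300.0))   # True: the full cooldown has elapsed
```
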

Scaling Based on Amazon SQS


Last updated 6 months ago
