Kinesis
Kinesis is a suite of services designed for processing streaming data with the following characteristics:
Processes data in shards
Each shard accepts up to 1,000 records per second for writes (capped at 1 MB per second)
Default quota of 500 shards per account (varies by Region; can be increased upon request)
Functions as a transient data store
Each record consists of:
Partition key
Sequence number
Data payload (up to 1 MB)
Default retention: 24 hours
Maximum retention: 365 days (7 days with the original extended retention setting)
Not designed for persistent storage
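The per-shard limits above can be turned into a rough sizing estimate. The sketch below is illustrative only: the function name `shards_needed` is hypothetical, and the 1 MB per second write cap is treated as 1 MiB, which is an assumption about how the limit is measured.

```python
import math

# Per-shard write limits (1 MB/s is treated as 1 MiB here; assumption).
RECORDS_PER_SEC_PER_SHARD = 1_000
BYTES_PER_SEC_PER_SHARD = 1024 * 1024


def shards_needed(records_per_sec: int, avg_record_bytes: int) -> int:
    """Estimate the shards required to absorb a steady write workload."""
    # Shards required by the record-count limit.
    by_count = math.ceil(records_per_sec / RECORDS_PER_SEC_PER_SHARD)
    # Shards required by the byte-throughput limit.
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / BYTES_PER_SEC_PER_SHARD
    )
    # A stream always has at least one shard; the tighter limit wins.
    return max(1, by_count, by_bytes)


if __name__ == "__main__":
    # Many small records: the record-count limit dominates.
    print(shards_needed(5_000, 100))
    # Fewer large records: the byte limit dominates.
    print(shards_needed(500, 10_000))
```

The estimate ignores read throughput and uneven key distribution, so a real stream may need extra headroom.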
Kinesis Data Streams
Ingests high-volume data
Multiple processing options
Primary focus for exam
Kinesis Data Firehose (now Amazon Data Firehose)
Automated data delivery
Multiple destination options
No immediate processing required
Kinesis Data Analytics (now Amazon Managed Service for Apache Flink)
Real-time data analysis
Processing during ingestion
Pre-warehouse analytics
Kinesis Video Streams
Video stream processing
Less relevant for exam
Shards function like highway lanes
More shards = higher throughput
Data distributed across shards
Each record's partition key is hashed with MD5 to a 128-bit value; each shard owns a range of those hash values
Sequence numbers are assigned in increasing order within a shard
Unique identification requires both:
Partition key
Sequence number
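To make the partition-key routing concrete, here is a minimal sketch of how a key could map to a shard. It assumes the shards split the 128-bit hash space evenly and that the MD5 digest is read as a big-endian integer; the helper names (`hash_key`, `even_shard_ranges`, `shard_for_key`) are hypothetical and not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Hash a partition key to an integer in [0, 2**128)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")  # big-endian reading is an assumption


def even_shard_ranges(shard_count: int) -> list[tuple[int, int]]:
    """Split the hash space into contiguous, inclusive ranges, one per shard."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key: str, ranges: list[tuple[int, int]]) -> int:
    """Return the index of the shard whose hash range contains the key."""
    h = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("hash value not covered by any shard range")


if __name__ == "__main__":
    ranges = even_shard_ranges(4)
    for key in ("user-1", "user-2", "user-3"):
        print(key, "-> shard", shard_for_key(key, ranges))
```

Because the mapping depends only on the key, records sharing a partition key always land on the same shard, which is why a single hot key can overload one shard while others sit idle.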
Real-world architecture:
Twitter API → Kinesis ingestion
Firehose → S3 storage
Lambda processing:
DynamoDB storage
Sentiment analysis
Text processing
This architecture shows Kinesis ingesting data in real time and handing it to other AWS services (Firehose, S3, Lambda, DynamoDB) for storage and processing.
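The Lambda step of this pipeline might look roughly like the sketch below. It relies on the Kinesis event shape that Lambda delivers (base64-encoded `data` plus `partitionKey` and `sequenceNumber` under each record's `kinesis` key). Everything else is an assumption for illustration: the JSON payload with a `text` field, the word-list `toy_sentiment` scorer, and the `save` callback standing in for a DynamoDB write.

```python
import base64
import json

# Tiny word lists standing in for a real sentiment model (hypothetical).
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def toy_sentiment(text: str) -> str:
    """Classify text by counting positive and negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def handler(event, context=None, save=None):
    """Decode each Kinesis record, score it, and pass the result to `save`."""
    results = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        # Kinesis payloads arrive base64-encoded; JSON content is assumed.
        payload = json.loads(base64.b64decode(kinesis["data"]))
        item = {
            "partition_key": kinesis["partitionKey"],
            "sequence_number": kinesis["sequenceNumber"],
            "text": payload["text"],
            "sentiment": toy_sentiment(payload["text"]),
        }
        if save is not None:
            save(item)  # placeholder for a DynamoDB put in a real deployment
        results.append(item)
    return results
```

Passing the storage step in as a callback keeps the handler testable without AWS credentials; in a deployed function it would typically wrap a DynamoDB table write.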