ML Example
Real-world Example: Let's say you're building a document classification system:
Glue processes raw documents, extracting text and metadata
Ground Truth workers label these processed documents
The combined processed and labeled data becomes your training dataset
The key is that Glue handles data preparation/transformation, while Ground Truth handles the labeling aspect.
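As a rough sketch of the last step above, the Python below turns labeled output into training records. It assumes the labels arrive in the JSON Lines layout that Ground Truth classification jobs write to their output manifest, where each line carries a source-ref plus a label attribute and a matching "-metadata" object. The label attribute name doc-class, the bucket name, and the function name load_labeled_records are hypothetical, and the sample lines are invented for illustration rather than taken from a real job.

```python
import json


def load_labeled_records(manifest_lines, label_attribute="doc-class"):
    """Convert output-manifest lines into simple {source, label} records.

    label_attribute is a hypothetical name; use whatever attribute name
    the labeling job was actually configured with.
    """
    records = []
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the manifest
        entry = json.loads(line)
        metadata = entry.get(f"{label_attribute}-metadata", {})
        class_name = metadata.get("class-name")
        if class_name is None:
            continue  # skip items that carry no label
        records.append({"source": entry["source-ref"], "label": class_name})
    return records


# Invented sample lines (other metadata fields omitted for brevity).
sample_manifest = [
    '{"source-ref": "s3://example-bucket/processed/doc-001.txt", "doc-class": 0, '
    '"doc-class-metadata": {"class-name": "invoice", "confidence": 0.92}}',
    '{"source-ref": "s3://example-bucket/processed/doc-002.txt", "doc-class": 1, '
    '"doc-class-metadata": {"class-name": "contract", "confidence": 0.88}}',
    '{"source-ref": "s3://example-bucket/processed/doc-003.txt"}',
]

records = load_labeled_records(sample_manifest)
for record in records:
    print(record["source"], record["label"])
```

The third sample line has no label, so it is skipped; a real pipeline might instead route such items back for labeling.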
SageMaker Ground Truth:
Used for data labeling/annotation
Creates training datasets
Supports both human labeling and automated labeling
Focuses on producing labeled data for ML training
AWS Glue:
Handles ETL (Extract, Transform, Load) operations
Data cataloging and discovery
Schema management
Data preprocessing and transformation
Integration Pattern Example:
Raw data stored in S3
Glue catalogs and processes the raw data
Ground Truth is used to label the processed data
Labeled data is stored back in S3
SageMaker uses the final labeled dataset for training
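The hand-off from Glue to Ground Truth in this pattern can be sketched as follows. A labeling job reads an input manifest, a JSON Lines file in which each line points at one S3 object through a source-ref key. The snippet below builds such a manifest from a list of processed object keys. The bucket name, the keys, and the function name build_input_manifest are hypothetical; in practice the keys would come from listing the Glue job's output location, and the manifest would then be uploaded to S3.

```python
import json


def build_input_manifest(bucket, keys):
    """Return manifest text with one {"source-ref": ...} JSON object per line."""
    lines = [json.dumps({"source-ref": f"s3://{bucket}/{key}"}) for key in keys]
    return "\n".join(lines) + "\n"


# Hypothetical output keys written by the Glue processing step.
processed_keys = [
    "processed/doc-001.txt",
    "processed/doc-002.txt",
]

manifest_text = build_input_manifest("example-bucket", processed_keys)
print(manifest_text, end="")
```

Keeping the manifest builder as a pure function makes it easy to check locally before any AWS call is involved.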