Real-time interfaces

Here are the key steps to deploy real-time interfaces in SageMaker:

  1. Prepare Model

  • Package your trained model with inference code

  • Create model.tar.gz containing the model artifacts and inference.py script

  1. Create SageMaker Model

  • Use CreateModel API/console to register your model

  • Specify container image and model artifacts location in S3

  1. Create Endpoint Configuration

  • Define instance type (e.g. ml.t2.medium)

  • Set number of instances for scaling

  • Configure production variants if A/B testing

  1. Create Endpoint

  • Deploy using the endpoint configuration

  • Wait for status to become "InService"

  • Get endpoint name/URL for making predictions

  1. Invoke Endpoint

  • Use InvokeEndpoint API for real-time predictions

  • Send properly formatted input data

  • Get predictions in response

The minimum required components are:

# Basic deployment flow
model = Model(
    model_data='s3://bucket/model.tar.gz',
    image_uri=container_uri,
    role=role)

predictor = model.deploy(
    instance_type='ml.t2.medium',
    initial_instance_count=1)

Last updated

Was this helpful?