SageMaker Inference

SageMaker real-time inference deploys your models to containers behind a persistent endpoint, so they are always ready to serve inference requests with low latency.

Terminology

- An inference package is a gzip-compressed tar archive (.tar.gz).
  - The root of this package is the model_dir.
  - The package must contain a folder named code containing a file named inference.py.
- A SageMaker model references an inference package and a Docker image stored in Amazon ECR.
- A SageMaker endpoint configuration describes a set of models and the type and number of instances that serve them.
- A SageMaker endpoint is a deployment of an endpoint configuration that is live for inference.
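
The package layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's own tooling: it assumes the package is a gzip-compressed tar archive (the model.tar.gz layout), and the helper names build_inference_package and list_package are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def build_inference_package(model_dir: Path, output_path: Path) -> Path:
    """Pack the contents of model_dir into a .tar.gz whose root is model_dir.

    Hypothetical helper. output_path should live outside model_dir,
    otherwise the archive would try to include itself.
    """
    entry_point = model_dir / "code" / "inference.py"
    if not entry_point.is_file():
        raise FileNotFoundError(f"missing required file: {entry_point}")
    with tarfile.open(output_path, "w:gz") as archive:
        # Add each top-level entry under its own name so that the archive
        # root holds the contents of model_dir, not a wrapping folder.
        for child in sorted(model_dir.iterdir()):
            archive.add(child, arcname=child.name)
    return output_path


def list_package(package_path: Path) -> list:
    """Return the sorted member names of an inference package."""
    with tarfile.open(package_path, "r:gz") as archive:
        return sorted(archive.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp) / "model"
        (model_dir / "code").mkdir(parents=True)
        (model_dir / "code" / "inference.py").write_text("# handlers go here\n")
        package = build_inference_package(model_dir, Path(tmp) / "model.tar.gz")
        print(list_package(package))
```

Checking for code/inference.py before archiving catches the most common packaging mistake locally, before anything is uploaded.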

Create a model

Invoke a local inference package extracted to a folder
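
As a rough sketch of what local invocation can look like, the code below imports code/inference.py from an extracted package and calls its handlers in order. It assumes the script follows the model_fn / input_fn / predict_fn / output_fn handler convention used by several SageMaker framework containers; this page does not state which functions inference.py must define. The names load_inference_module, invoke_local and EXAMPLE_INFERENCE_PY are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical stand-in for a real code/inference.py, used only for the demo.
EXAMPLE_INFERENCE_PY = '''
import json

def model_fn(model_dir):
    return {"scale": 2}

def input_fn(request_body, request_content_type):
    return json.loads(request_body)

def predict_fn(input_data, model):
    return [x * model["scale"] for x in input_data]

def output_fn(prediction, accept):
    return json.dumps(prediction)
'''


def load_inference_module(model_dir: Path):
    """Import code/inference.py from an extracted package as a module."""
    script = model_dir / "code" / "inference.py"
    spec = importlib.util.spec_from_file_location("inference", str(script))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def invoke_local(model_dir: Path, body: str,
                 content_type: str = "application/json",
                 accept: str = "application/json"):
    """Run one request through the handlers (assumed handler convention)."""
    handlers = load_inference_module(model_dir)
    model = handlers.model_fn(str(model_dir))        # load the model once
    data = handlers.input_fn(body, content_type)     # deserialize the request
    prediction = handlers.predict_fn(data, model)    # run inference
    return handlers.output_fn(prediction, accept)    # serialize the response


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp)
        (model_dir / "code").mkdir()
        (model_dir / "code" / "inference.py").write_text(EXAMPLE_INFERENCE_PY)
        print(invoke_local(model_dir, "[1, 2, 3]"))
```

Running the handlers in-process like this exercises the script's logic only; it does not reproduce the container's web server or its installed dependencies.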

Create, invoke, then destroy a remote endpoint
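
The create, invoke, destroy sequence might look roughly like the following when written against boto3. This is a sketch under assumptions rather than the project's own workflow: the function name run_endpoint_lifecycle, the reuse of one name for all three resources, the "AllTraffic" variant name and the ml.m5.large default are illustrative choices. The clients are passed in so the function can be exercised without AWS access.

```python
def run_endpoint_lifecycle(sm, runtime, *, name, image_uri, model_data_url,
                           role_arn, payload,
                           instance_type="ml.m5.large", instance_count=1):
    """Create a model, endpoint configuration and endpoint, invoke once, clean up.

    sm and runtime are expected to behave like boto3.client("sagemaker") and
    boto3.client("sagemaker-runtime"). Hypothetical helper for illustration.
    """
    # A model pairs the inference package (in S3) with a container image (in ECR).
    sm.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # An endpoint configuration names the models and the instances serving them.
    sm.create_endpoint_config(
        EndpointConfigName=name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    )
    # An endpoint is a live deployment of that configuration.
    sm.create_endpoint(EndpointName=name, EndpointConfigName=name)
    try:
        sm.get_waiter("endpoint_in_service").wait(EndpointName=name)
        response = runtime.invoke_endpoint(
            EndpointName=name,
            ContentType="application/json",
            Body=payload,
        )
        return response["Body"].read()
    finally:
        # Cleanup covers failures after all three resources were requested;
        # a failure in an earlier create call is not handled here.
        sm.delete_endpoint(EndpointName=name)
        sm.delete_endpoint_config(EndpointConfigName=name)
        sm.delete_model(ModelName=name)
```

With real clients, the call would pass boto3.client("sagemaker") and boto3.client("sagemaker-runtime"). Deleting the endpoint in a finally block matters because a running endpoint is billed until it is removed.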

Build Docker images for models