Elastic Inference (EI)

EI is used to decrease the cost of SageMaker inference.

EI speeds up throughput and decreases the latency of real-time inferences from models deployed on SageMaker hosted services, while still using only CPU-based instances. It is much more cost-effective than a full GPU instance.

EI must be configured when you create a deployable model. It is not yet available for all algorithms.
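As a rough sketch of what that configuration can look like: in the boto3 SageMaker API, the accelerator is named in the `AcceleratorType` field of a production variant in the endpoint configuration. The helper name `build_endpoint_config`, the config and model names, and the instance and accelerator sizes below are illustrative assumptions, not values from these notes.

```python
def build_endpoint_config(config_name, model_name,
                          instance_type="ml.m5.large",
                          accelerator_type="ml.eia2.medium"):
    """Build the keyword arguments for create_endpoint_config.

    Hypothetical helper: the defaults are example sizes only.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InitialInstanceCount": 1,
                # CPU-based host instance
                "InstanceType": instance_type,
                # EI accelerator attached to the CPU instance
                "AcceleratorType": accelerator_type,
            }
        ],
    }


# Placeholder names; replace with your own config and model names.
request = build_endpoint_config("my-ei-config", "my-model")

# To actually create the config (needs AWS credentials and an existing model):
# import boto3
# boto3.client("sagemaker").create_endpoint_config(**request)
```

The request is built as a plain dictionary so it can be inspected before anything is sent to AWS.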