Starting a deployment

To deploy a model, click the Deploy button in your sandbox.

Deploy button

Configuring a deployment

A modal will appear showing the deployment version. You can optionally configure the deployment by clicking Advanced deployment options.

You can find more information on using advanced deployment options here.

This modal appears after clicking Deploy

Configure various deployment settings, if you wish

Clicking Deploy will package and deploy your application to the Slai backend.

Monitoring a deployment

After deploying, you’ll be redirected to a dashboard to monitor the status of the deployment.

On the monitoring page, you can view APM-style metrics:

  • Total API calls
  • Total errors
  • Average inference time
  • Average cold start time
  • Requests per second
  • Container logs

Monitor the deployment

Container Logs

You can view all console output on your deployed container by scrolling down the page to the Deployment Logs section.


Calling the API

Once the model is deployed, it can be called via cURL or via our Python or Node clients. Your client_id, client_secret, and model name are filled in dynamically.
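Because the generated snippet embeds your client_id and client_secret, you may prefer to load them from the environment rather than committing them to source control. A minimal sketch — the SLAI_CLIENT_ID / SLAI_CLIENT_SECRET variable names are our own convention, not part of the slai client:

```python
import os

# Illustrative only: read API credentials from the environment instead of
# hardcoding them. The environment variable names here are hypothetical.
def load_credentials():
    return (
        os.environ.get("SLAI_CLIENT_ID", ""),
        os.environ.get("SLAI_CLIENT_SECRET", ""),
    )
```

How these credentials are passed to the client depends on the snippet the Integrate button generates for you.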

Click the Integrate button to copy the integration code in any of our three client libraries.


Just copy and paste the snippet, and you can start running inference.

# Example usage of the model API
import slai

model = slai.model("GPT-2/initial")

# This is just an example of how to call your model;
# parameters vary based on your handler inputs.
prediction = model(text="Once upon a time I was reading the docs")
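In production you may want to retry transient network errors around the inference call. A minimal sketch of a generic retry wrapper — the helper below is our own, not part of the slai client:

```python
import time

def call_with_retries(fn, attempts=3, backoff=0.5):
    """Call fn(), retrying on any exception with exponential backoff."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(backoff * (2 ** i))
```

Usage with a deployed model might look like `call_with_retries(lambda: model(text="Once upon a time"))`; tune `attempts` and `backoff` to your latency budget, since cold starts can make the first call slower.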