Starting a deployment
To deploy a model, click the Deploy button in your sandbox.
Configuring a deployment
A modal will appear showing the deployment version. You can optionally configure the deployment by clicking Advanced deployment options.
You can find more information on using advanced deployment options here.
This modal appears after clicking Deploy
Configure various deployment settings, if you wish
Clicking Deploy will package and deploy your application to the Slai backend.
Monitoring a deployment
After deploying, you’ll be redirected to a dashboard to monitor the status of the deployment.
On the monitoring page, you can view APM-style metrics:
- Total API calls
- Total errors
- Average inference time
- Average cold start time
- Requests per second
- Logs on the container
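The dashboard computes these metrics server-side; as a rough illustration of what each number represents, here is a minimal sketch that derives them from hypothetical per-request records (the field names and window length are assumptions, not the real Slai data model):

```python
# Hypothetical request records; the real dashboard aggregates these for you.
requests = [
    {"ok": True, "inference_ms": 42.0, "cold_start_ms": 0.0},
    {"ok": False, "inference_ms": 55.0, "cold_start_ms": 910.0},
    {"ok": True, "inference_ms": 38.0, "cold_start_ms": 0.0},
]
window_seconds = 60  # assumed length of the metrics window

total_calls = len(requests)
total_errors = sum(1 for r in requests if not r["ok"])
avg_inference_ms = sum(r["inference_ms"] for r in requests) / total_calls

# Average cold start only over requests that actually cold-started
cold_starts = [r["cold_start_ms"] for r in requests if r["cold_start_ms"] > 0]
avg_cold_start_ms = sum(cold_starts) / len(cold_starts) if cold_starts else 0.0

requests_per_second = total_calls / window_seconds

print(total_calls, total_errors, avg_inference_ms, requests_per_second)
```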
You can view all console output on your deployed container by scrolling down the page to the Deployment Logs section.
Calling the API
Once the model is deployed, it can be called via our cURL, Python, or Node clients. Your client_secret and model name will be filled in dynamically.
Click the Integrate button to copy the integration code for any of our three client libraries.
Paste it into your shell to start running inference.
```python
# Example usage of model API
import slai

slai.login(
    client_id="c3cf0b3d30f03579f9d8e7b416ce81a3",
    client_secret="51b42996f59d795f12c80c45ac4f4ffe"
)

model = slai.model("GPT-2/initial")

# This is just an example of how to call your model
# Parameters vary based on your handler inputs
prediction = model(text="Once upon a time I was reading the docs")
```
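Since inference calls go over the network, you may want simple retry handling around the model call. The sketch below uses a hypothetical flaky_model stub in place of the real slai model callable (the retry helper itself is generic; swap in model(...) from the snippet above):

```python
import time

def call_with_retries(predict, retries=3, backoff_s=0.0, **inputs):
    """Call a model function, retrying on transient errors.

    `predict` stands in for the slai model callable; a real client
    may expose narrower error types than Exception.
    """
    last_err = None
    for attempt in range(retries):
        try:
            return predict(**inputs)
        except Exception as err:
            last_err = err
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise last_err

# Stub model that fails once, then succeeds -- purely for illustration.
calls = {"n": 0}
def flaky_model(**inputs):
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient network error")
    return {"prediction": inputs["text"].upper()}

result = call_with_retries(flaky_model, text="once upon a time")
print(result["prediction"])
```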