Deploying an ML Model
A short tutorial on building a sentiment analysis model and deploying it as a REST API with Beam
Define the environment
The first thing we’ll do is define the environment that our app will run in. For this example, we’re building a sentiment analysis model using Hugging Face.
First, create a file with your Beam App definition. You can name this whatever you want. In this example, we’ll call it app.py.
import beam

app = beam.App(
    name="sentiment-analysis-app",
    cpu=4,
    memory="4Gi",
    gpu=0,
    python_version="python3.8",
    python_packages=["transformers", "torch", "numpy"],
)
Invoking the Hugging Face Model
Now, we’ll write some code to predict the sentiment of a given text prompt.
Create a new file. Again, you can name this whatever you want. We’ll name ours inference.py. Our function takes keyword arguments, passed in as **inputs.
from transformers import pipeline

def predict_sentiment(**inputs):
    # Load the pre-trained sentiment model from Hugging Face
    model = pipeline(
        "sentiment-analysis", model="siebert/sentiment-roberta-large-english"
    )
    result = model(inputs["text"], truncation=True, top_k=1)
    # Map each returned label to its score
    prediction = {i["label"]: i["score"] for i in result}
    print(prediction)
    return {"prediction": prediction}
Setting a REST API Trigger
To deploy the API, we’ll create a REST API Trigger in our app.py.
Our trigger requires three things:

- Inputs - the name and type of all inputs to the API
- Outputs - the name and type of all outputs returned from the API
- Handler - the file and function to be invoked when the API is called
Add the following lines to your app.py file:
app.Trigger.RestAPI(
    inputs={"text": beam.Types.String()},
    outputs={"prediction": beam.Types.String()},
    handler="inference.py:predict_sentiment",
)
(Optional) Caching the Model on Disk
For performance reasons, you’ll want to store the model on disk rather than downloading it from Hugging Face on every request.
Create a Persistent Volume to store the model weights.
Add the following lines to your app.py (note that os.environ requires an import os at the top of the file):
app.Mount.PersistentVolume(app_path="./cached_models", name="cached_model")
os.environ["TRANSFORMERS_CACHE"] = "/workspace/cached_models"
The complete app.py file will look like this:
import os

import beam

app = beam.App(
    name="sentiment-analysis-app",
    cpu=4,
    memory="4Gi",
    gpu=0,
    python_version="python3.8",
    python_packages=["transformers", "torch", "numpy"],
)

app.Trigger.RestAPI(
    inputs={"text": beam.Types.String()},
    outputs={"prediction": beam.Types.String()},
    handler="inference.py:predict_sentiment",
)

app.Mount.PersistentVolume(app_path="./cached_models", name="cached_model")
os.environ["TRANSFORMERS_CACHE"] = "/workspace/cached_models"
Deploying the app
To deploy the model, open your terminal and cd into the directory you’re working in.
Then, run the following:
beam deploy app.py
Once you deploy, you’ll see output in your console. At the bottom, you’ll find a URL for invoking your function. Here’s what a cURL request would look like:
curl -X POST --compressed "https://beam.slai.io/kajru" \
-H 'Accept: */*' \
-H 'Accept-Encoding: gzip, deflate' \
-H 'Authorization: Basic YWJmNWVhYjhjY2VkZTQ3ZDJmZWU4NTYyNTliYWU3NDA6ZTE0ZTE0MTY0M2NmYzkxNDdkMDM1MzZkNDdjYzRkMGI=' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{"text": "If we override the bandwidth, we can get to the SMTP capacitor through the cross-platform RSS alarm!"}'
Requests are authenticated with basic auth: your username is your Client ID, and your password is your Client Secret.
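If you’d rather call the API from Python, here’s a minimal sketch using the requests library (the URL and credentials are placeholders; substitute the values from your own deployment):

import requests

# Placeholders: use the URL, Client ID, and Client Secret from your deployment
url = "https://beam.slai.io/kajru"
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"

response = requests.post(
    url,
    auth=(client_id, client_secret),  # basic auth
    json={"text": "I love this product!"},
)
print(response.json())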
Congrats - you just deployed your first function on Beam!