Recommendation System
In this example, let’s train a movie recommendation system on Slai, and deploy the finished model as a REST API.
We’re using the MovieLens dataset, which contains a large number of movie ratings by users.
We’ll start by uploading the dataset to Slai.
Here, we have uploaded the data directly to the Slai model sandbox as a static file. However, in a real world use case, we can take advantage of Slai’s many integrations with external dynamic data sources, such as S3, BigQuery, and Snowflake. This allows us to continuously ingest new data, retrain the model, and deploy updates.
Exploring the data
It is important to gain an understanding of the dataset before beginning to design or train a model. This stage is often referred to as exploratory data analysis, or EDA. Conveniently, the Slai sandbox includes a built-in notebook, the EDA tool of choice for most data scientists.
For example, we can visualize sample counts for both users and movies using a histogram. In this case, we see that the majority of both users and movies have a relatively small number of interactions. This motivates us to use a smaller embedding size to prevent overfitting.
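As a sketch of what that looks like in the notebook, the histograms could be produced with a few lines of pandas and matplotlib; the file name and column names below ("ratings.csv", "userId", "movieId") are assumptions based on the standard MovieLens ratings file, not the tutorial's exact code.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load the raw ratings file (assumed to follow the standard MovieLens layout).
ratings = pd.read_csv("ratings.csv")

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Number of ratings per user and per movie.
ratings.groupby("userId").size().hist(bins=50, ax=axes[0])
axes[0].set_title("Interactions per user")

ratings.groupby("movieId").size().hist(bins=50, ax=axes[1])
axes[1].set_title("Interactions per movie")

plt.tight_layout()
plt.show()
```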
Movielens dataset
Now onto the code. To start, let's take a look at the MovielensDataset class. This class implements the PyTorch dataset interface, allowing our custom data preparation logic to be easily integrated into standard data loading and training routines.
We load all user ratings from the source CSV, organize them by user, and split them temporally into a "train" and "test" set.
To simplify the problem, we assume that any rating or interaction between a user and a movie is a binary "1". We also generate a fixed set of negative samples for each user from movies they have not seen.
By preprocessing the dataset, we have formed a simpler problem: given a user ID and movie ID, predict whether they interacted. The intention, of course, is that the model will also learn to predict high scores for movies the user would be likely to interact with in the future.
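A simplified sketch of what such a dataset class might look like is shown below. The column names, the leave-last-out temporal split, and the number of negatives per user are assumptions for illustration, not the tutorial's exact implementation.

```python
import random
import pandas as pd
import torch
from torch.utils.data import Dataset

class MovielensDataset(Dataset):
    """Illustrative sketch: binary interactions with per-user negative sampling."""

    def __init__(self, csv_path, split="train", negatives_per_user=100):
        # Sort by timestamp so the train/test split is temporal.
        ratings = pd.read_csv(csv_path).sort_values("timestamp")
        all_movies = set(ratings["movieId"].unique())

        self.samples = []
        for user_id, group in ratings.groupby("userId"):
            movies = group["movieId"].tolist()
            # Hold out each user's most recent interaction for the test set.
            train_movies, test_movies = movies[:-1], movies[-1:]
            positives = train_movies if split == "train" else test_movies

            # Any interaction is treated as a binary "1".
            self.samples += [(user_id, m, 1.0) for m in positives]

            # Fixed set of negative samples drawn from movies the user has not seen.
            unseen = list(all_movies - set(movies))
            for m in random.sample(unseen, min(negatives_per_user, len(unseen))):
                self.samples.append((user_id, m, 0.0))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        user_id, movie_id, label = self.samples[idx]
        return (
            torch.tensor(user_id, dtype=torch.long),
            torch.tensor(movie_id, dtype=torch.long),
            torch.tensor(label, dtype=torch.float32),
        )
```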
Building the neural network
Next, let’s dig into the most interesting section of this code, the neural net definition.
We use an embedding layer for both the user and movie, to compress the respective one-hot encoded vectors into rich, compact representations that are easier to model.
These two representations are concatenated into a single vector and then passed into a simple fully connected neural network.
For the forward pass, the user and item IDs are first projected to the user and item embeddings respectively.
These vectors are then concatenated and passed into the series of dense layers. After each fully connected inner layer, we apply a ReLU activation (to promote nonlinear learning) and dropout (to prevent overfitting). Finally, we pass this into a linear output layer with a sigmoid activation function to predict a single probability value between zero and one.
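Put together, the architecture described above might look roughly like this in PyTorch; the embedding size, hidden layer widths, and dropout rate are illustrative assumptions rather than the tutorial's exact values.

```python
import torch
import torch.nn as nn

class RecommenderNet(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim=32, hidden_dims=(64, 32)):
        super().__init__()
        # Compress one-hot user/item IDs into dense embeddings.
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Fully connected layers with ReLU and dropout after each inner layer.
        layers, in_dim = [], 2 * embedding_dim
        for h in hidden_dims:
            layers += [nn.Linear(in_dim, h), nn.ReLU(), nn.Dropout(0.2)]
            in_dim = h
        self.hidden = nn.Sequential(*layers)

        # Linear output layer; sigmoid maps the score to a probability.
        self.output = nn.Linear(in_dim, 1)

    def forward(self, user_ids, item_ids):
        user_vec = self.user_embedding(user_ids)
        item_vec = self.item_embedding(item_ids)
        x = torch.cat([user_vec, item_vec], dim=-1)
        x = self.hidden(x)
        return torch.sigmoid(self.output(x)).squeeze(-1)
```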
To begin the training process, the model weights are randomly initialized, though it would certainly be possible to pre-train the embedding layers on any other meaningful information about the users or movies that was available.
We use an Adam optimizer with a binary cross entropy loss function to minimize error in predicting interactions between users and movies.
The training data is shuffled and processed in batches, until all samples have been seen. This is repeated for 20 epochs.
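A minimal training loop along these lines, reusing the sketches above, could look like the following; the batch size, learning rate, and vocabulary sizes are placeholder assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

NUM_USERS = 200_000   # placeholder: one more than the highest user ID in the data
NUM_ITEMS = 200_000   # placeholder: one more than the highest movie ID in the data

train_dataset = MovielensDataset("ratings.csv", split="train")
train_loader = DataLoader(train_dataset, batch_size=256, shuffle=True)

model = RecommenderNet(num_users=NUM_USERS, num_items=NUM_ITEMS)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCELoss()

for epoch in range(20):
    epoch_loss = 0.0
    for user_ids, item_ids, labels in train_loader:
        optimizer.zero_grad()
        preds = model(user_ids, item_ids)
        loss = criterion(preds, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    print(f"epoch {epoch + 1}: loss = {epoch_loss / len(train_loader):.4f}")
```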
What’s our hit rate?
Evaluation and comparison of recommender systems can be difficult. The loss value is usually not directly meaningful as a measure of real-world performance, and it can vary widely depending on the training approach and model.
One common metric that is easy to compute and interpret across systems is "hit rate": given N total samples, of which exactly 1 is a positive sample, what is the probability that the positive sample appears in the top K results? We refer to this as "hit rate @ K / N".
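As a concrete sketch (assuming the model interface from the earlier snippets), hit rate @ K / N can be computed by scoring one held-out positive against N − 1 sampled negatives per user and checking whether the positive lands in the top K:

```python
import torch

def hit_rate_at_k(model, test_cases, k=10):
    """test_cases: list of (user_id, positive_item, negative_items) tuples."""
    hits = 0
    for user_id, positive_item, negative_items in test_cases:
        # N candidates total: the positive sample at index 0, plus the negatives.
        candidates = [positive_item] + list(negative_items)
        users = torch.full((len(candidates),), user_id, dtype=torch.long)
        items = torch.tensor(candidates, dtype=torch.long)

        with torch.no_grad():
            scores = model(users, items)

        # A "hit" means the positive sample ranks within the top K scores.
        top_k = torch.topk(scores, k).indices.tolist()
        hits += int(0 in top_k)
    return hits / len(test_cases)
```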
Training the model
To execute the actual model training on Slai, we have a few options. The simplest is just running a single job to immediately train and package a new model. This is a convenient option for tutorials and during development.
Alternatively, we can schedule a recurring training job for continuous learning. This is an extremely powerful option in a production setting, when training on a dynamic dataset, or for long-running jobs.
Writing the handler
The handler is the code that actually receives and processes an incoming request to the deployed model. Slai allows us to customize this handler logic, meaning we can implement some interesting functionality natively in the API, without any need for wrapping calls or an additional backend.
As an example, here we have built an endpoint to return the top N unseen movie recommendations for a specific user. This is accomplished by loading the user viewing history, filtering out any previously viewed movies, scoring all unseen movie candidates, then returning the top N results.
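A rough sketch of that handler logic is shown below. The function signature, request format, and the way the model and viewing histories are loaded are assumptions for illustration; Slai's actual handler interface may differ.

```python
import torch

def handler(request, model, user_histories, all_movie_ids):
    user_id = request["user_id"]
    n = request.get("n", 10)

    # Filter out movies the user has already seen.
    seen = user_histories.get(user_id, set())
    candidates = [m for m in all_movie_ids if m not in seen]

    # Score every unseen candidate in a single batch.
    users = torch.full((len(candidates),), user_id, dtype=torch.long)
    items = torch.tensor(candidates, dtype=torch.long)
    with torch.no_grad():
        scores = model(users, items)

    # Return the top N movie IDs by predicted score.
    top = torch.topk(scores, n).indices.tolist()
    return {"user_id": user_id, "recommendations": [candidates[i] for i in top]}
```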
Local testing
Slai also provides hooks to implement testing and validation of the trained model and request handler. In this movie recommender example, we have created tests to verify the output schema, high-scoring recommendations, unique results between users, and unseen recommendations.
This test suite can be run to facilitate development, test the handler, or to verify a newly trained model.
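For illustration, one of those checks might look roughly like this (reusing the hypothetical handler sketched above; this is not the tutorial's actual test suite):

```python
def test_recommendations_are_unseen():
    request = {"user_id": 1, "n": 5}
    response = handler(request, model, user_histories, all_movie_ids)

    # Output schema and result count.
    assert set(response.keys()) == {"user_id", "recommendations"}
    assert len(response["recommendations"]) == 5

    # No previously viewed movie should be recommended.
    assert not set(response["recommendations"]) & user_histories.get(1, set())
```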
Deploying
Finally, deploying our newly trained model is as easy as a few clicks. Simply hit “publish and deploy” to package the model artifact, stand up a new inference service, and expose the endpoint to the world.
Once this is live, we can immediately start making requests from the client side.
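For example, a plain HTTP call with the requests library might look like this; the endpoint URL, API key, and request body below are placeholders, not real values.

```python
import requests

response = requests.post(
    "https://<your-slai-endpoint-url>",                   # placeholder endpoint URL
    headers={"Authorization": "Bearer <your-api-key>"},   # placeholder API key
    json={"user_id": 1, "n": 5},
)
print(response.json())
```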
This returns a list of movie recommendations for our user.
You can view the sandbox for this tutorial here. We’re excited to see what you build!