Running a Web Scraper
Let’s build a simple web scraper that extracts headlines from The New York Times and uses a BERT model from Hugging Face to detect the sentiment of each headline.
Define the environment
First, we’ll define our environment. For this project, we’ll need the following libraries:
app.py
Starting the environment
Spin up the environment by running beam start <your app>.py
You’ll see the red beam text at the end of your shell path, which means you’ve entered the Beam environment!
Write scraping logic
Now, we’ll write logic to scrape the headlines from The New York Times. Create a new file; let’s call it scraper.py.
scraper.py
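A minimal sketch of the scraping step, assuming only the Python standard library so that it runs without extra packages. The names NYT_URL, HeadlineParser, extract_headlines and fetch_headlines are hypothetical, and treating every h2 and h3 element as a headline is an assumption about the page markup, which may differ in practice.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Assumed target page; swap in whichever section you want to scrape.
NYT_URL = "https://www.nytimes.com"


class HeadlineParser(HTMLParser):
    """Collects the text found inside each headline tag (assumed: h2 and h3)."""

    def __init__(self, tags=("h2", "h3")):
        super().__init__()
        self.tags = set(tags)
        self.depth = 0        # greater than 0 while inside a headline tag
        self.buffer = []      # text fragments of the headline being read
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.tags and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self.buffer).split())
                if text:
                    self.headlines.append(text)
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_headlines(html):
    """Return the headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


def fetch_headlines(url=NYT_URL):
    """Download a page and return its headlines."""
    # Some sites reject requests without a browser-like User-Agent header.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_headlines(html)
```

Calling fetch_headlines() returns a list of strings. Keeping the parsing in extract_headlines, separate from the download, makes it possible to check the parser against a saved HTML file before hitting the live site.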
Running the scraper
Now, we’re ready to run our code using Beam. In your terminal, run:
You should see the headlines and the detected sentiment!
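The sentiment step behind that output can be sketched as follows, assuming the transformers package from Hugging Face is installed in the environment. The function names load_classifier, label_headlines and format_report are hypothetical. With no model argument the pipeline falls back to a library default, which may not be the model this guide intends, and the label strings depend on the model chosen.

```python
def load_classifier():
    """Load a Hugging Face sentiment pipeline (downloads a model on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    # Assumption: the library's default sentiment model is acceptable here.
    return pipeline("sentiment-analysis")


def label_headlines(headlines, classifier):
    """Pair each headline with the label and score returned by classifier."""
    if not headlines:
        return []
    # The pipeline accepts a list of strings and returns one dict per string,
    # each with a "label" and a "score" key.
    predictions = classifier(list(headlines))
    return [
        (headline, prediction["label"], round(prediction["score"], 3))
        for headline, prediction in zip(headlines, predictions)
    ]


def format_report(labelled):
    """Render one line per headline: label, score, then the headline text."""
    return "\n".join(
        f"{label:<8} {score:.3f}  {headline}"
        for headline, label, score in labelled
    )
```

Passing the classifier in as an argument means the model is loaded once and reused for every headline, and the formatting can be tried out with a stand-in function before downloading any model.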