A step-by-step tutorial on how to deploy a PyTorch implementation of the AlexNet Image Classifier using FastAPI, Docker, and Cloud Run.
Welcome to another tutorial that aims to show you how to bring a trained AI model to life by deploying it into production.
In the previous tutorial, Deploy ML using Flask, we showed you a step-by-step guide on how to deploy a machine learning model using Flask, Docker, and Cloud Run. In this article we will go a step further and show you a better way.
This time we will deploy ML with FastAPI, specifically an AlexNet Image Classifier model.
The whole workflow is pretty straightforward. First, we will wrap the model in a web service, then create a Dockerfile, and finally host the service on GCP.
The final version of the code for this tutorial can be found here.
Before we deploy ML using FastAPI, install the following requirements:
torch==1.7.1
torchvision==0.8.2
fastapi==0.63.0
uvicorn==0.13.3
requests==2.18.4
💡 Explore: If you are interested in AI model deployment, please also check out the tutorial on how to deploy a YOLOv5 model.
In the first step, we need to create a web service that will handle all requests.
For this purpose, we will use FastAPI, a high-performance Python web framework that is easy to learn, fast to code, and ready for high-end production.
What I personally love about FastAPI is that it's built on Starlette and Pydantic, yet the syntax is very similar to Flask, so you can easily refactor your code to the new framework.
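To see how close the two frameworks are, here is a minimal side-by-side sketch of the same hypothetical endpoint (this comparison is mine, not part of the classifier repository, and it assumes you still have Flask installed from the previous tutorial):

# Flask version: query parameters are read manually from the request object
from flask import Flask, request

flask_app = Flask(__name__)

@flask_app.route("/predict")
def flask_predict():
    url = request.args.get("url")
    return {"url": url}

# FastAPI version: query parameters are declared as typed function arguments
from fastapi import FastAPI

fastapi_app = FastAPI()

@fastapi_app.get("/predict")
def fastapi_predict(url: str):
    return {"url": url}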
As a first step, download (or clone) the repository with the AlexNet Image Classifier written in the PyTorch framework. It takes an image URL as input and returns the class of the picture.
The structure of the repository should look as follows:
├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py
In the main directory, create a new file called app.py and paste the code below:
import json
import uvicorn
from fastapi import FastAPI
from syndicai import PythonPredictor

app = FastAPI()


@app.get("/")
def hello():
    """ Main page of the app. """
    return "Hello World!"


@app.get("/predict")
async def predict(url: str):
    """ Return JSON serializable output from the model """
    payload = {'url': url}
    classifier = PythonPredictor("")
    return classifier.predict(payload)


if __name__ == '__main__':
    uvicorn.run(app, host='0.0.0.0', port=8000)
You can see that at the beginning of the file we import json, uvicorn, and FastAPI, as well as the PythonPredictor class from the syndicai.py file.
Next, we create an instance of the FastAPI class. This app object is the main point of interaction for defining all the routes of our API (for more information, have a look at the FastAPI documentation).
Later, we created two HTTP endpoints at the paths / and /predict. Both paths handle GET operations (also known as HTTP methods).
In order to run the web service, use the python app.py command and paste http://localhost:8000 into the web browser to see "Hello World!".
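A nice side effect of declaring url as a typed function argument is that FastAPI treats it as a required query parameter and validates incoming requests for you. A minimal sketch of that behavior, assuming the service is running locally on port 8000:

import requests

# Calling /predict without the required `url` query parameter:
# FastAPI rejects the request with 422 Unprocessable Entity
# and returns a JSON body describing the validation error.
response = requests.get("http://localhost:8000/predict")
print(response.status_code)  # 422
print(response.json())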
What is great about FastAPI is that it generates the documentation on the go while you are developing the API, which is one of the most requested features among developers. Just open http://localhost:8000/docs or http://localhost:8000/redoc in the browser.
In order to run the model, paste https://i.imgur.com/PzXprwl.jpg into the parameter section of the /predict endpoint.
Keep in mind the first model run will take some time because the model will download missing files (weights)!
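You can also call the endpoint from code instead of the browser. Here is a minimal sketch using the requests library (already on our requirements list):

import requests

# Image to classify, passed as the `url` query parameter
params = {"url": "https://i.imgur.com/PzXprwl.jpg"}

response = requests.get("http://localhost:8000/predict", params=params)
print(response.status_code)  # 200 on success
print(response.json())       # predicted class of the image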
Congrats! You have just created a FastAPI app!
💡 Explore: Learn more about Web services for ML: FastAPI: The right replacement for Flask?, FastAPI new project boilerplate, Complete Backend (API) Development course
Next, we need to take our service with all its dependencies and wrap it using Docker so that it is reproducible.
For this purpose, we will use Docker, an open-source platform to create, manage, deploy, and replicate applications using containers.
Containers can be thought of as a package that houses dependencies required by the app to run at an operating system level. This means that each application deployed using Docker lives in an environment of its own and its requirements are handled separately.
Deploying a FastAPI app with Docker allows us to replicate the application across different servers with no reconfiguration.
In order to use Docker, you first need to create a Dockerfile in the main directory with the following code.
# Use python as base image
FROM python:3.6-stretch
# Use working directory /app/model
WORKDIR /app/model
# Copy and install required packages
COPY requirements.txt .
RUN pip install --trusted-host pypi.python.org -r requirements.txt
# Copy all the content of current directory to working directory
COPY . .
# Set env variables for Cloud Run
ENV PORT 80
ENV HOST 0.0.0.0
EXPOSE 80
# Run the FastAPI app with uvicorn
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80"]
The Dockerfile is a text document that contains the commands used to assemble the image, so just paste the code above inside. When creating a Dockerfile, it is worth keeping a couple of best practices in mind; for more detailed information, you can read the Dockerfile reference.
In addition to the Dockerfile, let's create a docker-compose.yaml that will help us test the app locally.
version: '3'
services:
  flask_classifier:
    build: .
    ports:
      - 80:80
Docker Compose is a tool for defining and running multi-container Docker applications using a YAML file. Compose works in all environments: production, staging, development, testing, as well as CI workflows, and it allows you to run a dockerized app very quickly.
In order to check if the app is working properly inside Docker, just run docker-compose up --build in the terminal. The service should be available under localhost (port 80).
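To quickly verify the containerized service from code, a minimal smoke test might look like this (note that the port changed from 8000 to 80):

import requests

# The dockerized service listens on port 80 instead of 8000
response = requests.get("http://localhost:80/")
print(response.status_code)  # 200
print(response.json())       # "Hello World!"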
💡 Explore: Please check out the following resources to learn more about Docker: Dockerfile reference, Docker Image vs Container, Intro guide to Dockerfile best practices
At this stage, the repository structure should look as follows.
.
├── app.py
├── docker-compose.yaml
├── Dockerfile
├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py
Previously, we exposed our app in the form of a web service and created a Dockerfile that lets us customize the runtime of our container. The last step of this tutorial is deploying the service so that it is available to the world. For that, we will use Cloud Run, a serverless platform for hosting Docker containers. It is fully managed, so you don't need to worry about the infrastructure.
In order to deploy the service, you need to run two commands. Remember to substitute PROJECT_ID with the name of your GCP project!
# Build a docker image
gcloud builds submit --tag gcr.io/PROJECT_ID/flask-classifier
# Deploy the docker image to Cloud Run
gcloud run deploy --image gcr.io/PROJECT_ID/flask-classifier \
--platform managed \
--port 80 \
--memory 1G
You can see the deployed service in the Cloud Run console.
In order to run your model, just copy the URL of the service and use the following request.
curl --request GET \
--url 'https://CLOUD_RUN_SERVICE_URL/predict?url=https://i.imgur.com/PzXprwl.jpg'
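The same call can be made from Python; a quick sketch, assuming you substitute CLOUD_RUN_SERVICE_URL with the URL of your deployed service:

import requests

# Substitute with the URL shown after `gcloud run deploy` finishes
SERVICE_URL = "https://CLOUD_RUN_SERVICE_URL"

params = {"url": "https://i.imgur.com/PzXprwl.jpg"}
response = requests.get(f"{SERVICE_URL}/predict", params=params)
print(response.json())  # predicted class of the image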
Wow! You have just deployed the Image Classifier, and the Deploy ML using FastAPI tutorial is complete!
To sum up, in this tutorial you had a chance to learn how to deploy a model using FastAPI, Docker, and Cloud Run.
As far as FastAPI is concerned, I would highly recommend using it when deploying a model. In short, it is very easy to set up, supports async out of the box, and automatically generates docs.