A step-by-step tutorial on how to deploy a PyTorch implementation of the AlexNet Image Classifier using FastAPI, Docker, and Cloud Run.


Welcome to another tutorial that aims to show you how to bring a trained AI model to life by deploying it into production.

In the previous tutorial, Deploy ML using Flask, we showed you step by step how to deploy a machine learning model using Flask, Docker, and Cloud Run. In this article we will go a step further and show you a better way.

This time we will deploy ML with FastAPI, specifically an AlexNet Image Classifier model.

The AlexNet model takes an image as input and outputs the predicted class

The whole workflow is pretty straightforward. First, we will wrap the model in a web service, then create a Dockerfile, and finally host the service on GCP.

Machine Learning model deployment workflow

The final version of the code for this tutorial can be found here.


Before we deploy ML using FastAPI, make sure you meet the following requirements:

  1. Basic understanding of Python, Docker, REST
  2. Access to Google Cloud Platform (if you want to host a model)
  3. Trained model from the repository.
  4. Installed docker, docker-compose, gcloud, python 3.6
  5. Installed the following python dependencies: torch==1.7.1 torchvision==0.8.2 fastapi==0.63.0 uvicorn==0.13.3 requests==2.18.4

💡 Explore: If you are interested in AI model deployment, please also check out the tutorial on how to deploy a yolov5 model.

Expose as a Webservice

In the first step, we need to create a webservice that will handle all requests.

For this purpose, we will use FastAPI, a Python web framework that is high-performance, easy to learn, fast to code, and ready for production.

What I personally love about FastAPI is that it's built on Starlette and Pydantic, yet its syntax is very similar to Flask's, so you can easily port your code to the new framework.

Code comparison between Flask (left) and FastAPI (right)

Download a model

As a first step, download (or copy) the repository with the AlexNet Image Classifier written in PyTorch. It takes an image URL as input and returns the class of the picture.

The structure of the repository should look as follows:

├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py

Create an app.py

In the main directory create a new file called app.py and paste the code below:

import json
import uvicorn
from fastapi import FastAPI
from syndicai import PythonPredictor

app = FastAPI()


@app.get("/")
def hello():
    """ Main page of the app. """
    return "Hello World!"


@app.get("/predict")
async def predict(url: str):
    """ Return JSON serializable output from the model """
    payload = {'url': url}
    classifier = PythonPredictor("")
    return classifier.predict(payload)


if __name__ == '__main__':
    # Bind to all interfaces so the app is also reachable inside a container
    uvicorn.run(app, host='0.0.0.0', port=8000)

At the beginning of the file, we import json, uvicorn, and FastAPI, as well as the PythonPredictor class from the syndicai.py file.

Next, we create an instance of the FastAPI class. This app object is the main entry point of the service: routes are registered on it, and uvicorn serves it (for more information, have a look at the FastAPI documentation).

Later, we define two path operations, / and /predict. Both handle GET requests (GET is one of the HTTP methods). The /predict operation reads the image address from the url query parameter and passes it to the model.
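
Note that the /predict handler above constructs a new PythonPredictor on every request. One possible refinement is to build the predictor once and reuse it. The sketch below illustrates the idea with functools.lru_cache from the standard library. StubPredictor and get_predictor are hypothetical names used only for illustration: StubPredictor stands in for the PythonPredictor class from syndicai.py, and whether that class is safe to share between requests is an assumption you should verify before adopting this pattern.

```python
from functools import lru_cache


class StubPredictor:
    """Hypothetical stand-in for PythonPredictor from syndicai.py."""

    instances = 0  # counts how many times the predictor was constructed

    def __init__(self, config):
        # A real predictor would typically prepare its model here.
        StubPredictor.instances += 1
        self.config = config

    def predict(self, payload):
        # Echo the URL back instead of running a real classifier.
        return {"url": payload["url"], "class": "placeholder"}


@lru_cache(maxsize=1)
def get_predictor():
    """Create the predictor on first use and reuse it afterwards."""
    return StubPredictor("")


def predict(url: str):
    """Same shape as the /predict handler, but with a cached predictor."""
    return get_predictor().predict({"url": url})
```

In app.py, the same idea would amount to calling get_predictor() inside the handler instead of constructing the class there.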

Run a service

To run the web service, use the python app.py command and open http://localhost:8000 in the web browser to see "Hello World!".

What is great about FastAPI is that it generates interactive API documentation automatically while you develop the API. Just open http://localhost:8000/docs (Swagger UI) or http://localhost:8000/redoc (ReDoc) in the browser.

To run the model, paste https://i.imgur.com/PzXprwl.jpg into the url parameter field of the /predict endpoint.

Run a model using Swagger generated by FastAPI: http://localhost:8000/docs

Keep in mind that the first model run will take some time because the model will download missing files (weights)!

Congrats! You have just created a FastAPI app!

💡 Explore: Learn more about Web services for ML: FastAPI: The right replacement for Flask?, FastAPI new project boilerplate, Complete Backend (API) Development course

Wrap with Docker

In the next step, we need to take our service with all its dependencies and wrap it in a Docker container so that it is reproducible.

For this purpose, we will use Docker, an open-source tool to create, manage, deploy, and replicate applications using containers.

Containers can be thought of as a package that houses dependencies required by the app to run at an operating system level. This means that each application deployed using Docker lives in an environment of its own and its requirements are handled separately.

Deploying a FastAPI app with Docker allows us to replicate the application across different servers with no reconfiguration.

Create a Dockerfile

In order to use docker, you need to first create a Dockerfile in the main directory with the following code.

# Use python as base image
FROM python:3.6-stretch

# Use working directory /app/model
WORKDIR /app/model

# Copy and install required packages
COPY requirements.txt .
RUN pip install --trusted-host pypi.python.org -r requirements.txt

# Copy all the content of current directory to working directory
COPY . .

# Document the port the service listens on
EXPOSE 80

# Run the FastAPI app with uvicorn, listening on all interfaces
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80"]

The Dockerfile is a text document that contains the commands used to assemble the image, so just paste the code above into it. When creating a Dockerfile, it is worth keeping a couple of best practices in mind:

  1. The layer that installs requirements.txt needs to come before the layer where you copy the model files. Thanks to that, each time you change something in your model, Docker will rebuild the image without installing the dependencies again.
  2. Use specific tags and dependency versions. This protects you from failures when new library updates are released.
  3. Set a dedicated working directory for your files; don't work in the root.
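
The second practice can be checked automatically. Below is a minimal sketch, using only the Python standard library, of a helper that lists requirement lines that are not pinned to an exact version with ==. The function name unpinned_requirements and the regular expression are illustrative assumptions, not part of the tutorial's repository, and the check deliberately ignores pip options and more exotic requirement syntax.

```python
import re

# Matches lines such as "torch==1.7.1" (name, optional extras, exact version).
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned with '=='."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems
```

For example, passing the contents of requirements.txt (read with pathlib.Path("requirements.txt").read_text()) returns an empty list when every dependency is pinned.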

For more detailed information you can read the Dockerfile reference.

Create a docker-compose

In addition to the Dockerfile, let's create a docker-compose.yaml that will help us test the app locally.

version: '3'

services:
  # The service name "app" is arbitrary
  app:
    build: .
    ports:
      - "80:80"

Docker Compose is a tool for defining and running multi-container Docker applications using a YAML file. Compose works in all environments (production, staging, development, testing, as well as CI workflows) and lets you run a dockerized app very quickly.

To check that the app works properly inside Docker, just run docker-compose up --build in the terminal. The service should be available on localhost (port 80).
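
Because the container may need a moment to start, a scripted check can be handy. The following is a minimal sketch, standard library only, of a helper that polls a URL until it answers with HTTP 200 or a timeout expires. The name wait_until_up and its default values are assumptions made for illustration.

```python
import http.client
import time
import urllib.request


def wait_until_up(url, timeout=30.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            pass  # service not reachable yet; retry after a short pause
        time.sleep(interval)
    return False
```

With the compose setup running, wait_until_up("http://localhost:80/") should return True once the main page responds.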

💡 Explore: Please check out the following resources to learn more about Docker: Dockerfile reference, Docker Image vs Container, Intro guide to Dockerfile best practices

Deploy on Cloud Run

At this stage, the repository structure should look as follows.

├── app.py
├── docker-compose.yaml
├── Dockerfile
├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py

Previously we exposed our app as a web service and created a Dockerfile that allows us to customize the runtime of our container. The last step of this tutorial is deploying the service so that it is available to the world. For that we will use Cloud Run, a serverless platform for hosting Docker containers. It is fully managed, so you don't need to worry about the infrastructure.

To deploy the service, you need to run two commands. Remember to substitute PROJECT_ID with the ID of your GCP project!

# Build a docker image 
gcloud builds submit --tag gcr.io/PROJECT_ID/flask-classifier

# deploy a docker image to cloud run
gcloud run deploy --image gcr.io/PROJECT_ID/flask-classifier \
--platform managed \
--port 80 \
--memory 1G

You can see the deployed service in the Cloud Run console.

Cloud Run dashboard with deployed model

To run your model, just copy the URL of the service and use the following request.

curl --request GET \
--url 'https://CLOUD_RUN_SERVICE_URL/predict?url=https://i.imgur.com/PzXprwl.jpg'
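
The same request can be sent from Python. The sketch below uses only the standard library and percent-encodes the image address so that it survives as a query parameter. The helper names build_predict_url and classify are hypothetical, CLOUD_RUN_SERVICE_URL remains a placeholder for your own service, and the shape of the returned JSON depends on what the model's predict method returns, which is not specified here.

```python
import json
import urllib.parse
import urllib.request


def build_predict_url(service_url, image_url):
    """Return the /predict URL with the image URL percent-encoded."""
    query = urllib.parse.urlencode({"url": image_url})
    return service_url.rstrip("/") + "/predict?" + query


def classify(service_url, image_url, timeout=60.0):
    """Send the GET request and decode the JSON response."""
    request_url = build_predict_url(service_url, image_url)
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        return json.load(response)
```

Calling classify("https://CLOUD_RUN_SERVICE_URL", "https://i.imgur.com/PzXprwl.jpg") with the real service address would then return the decoded prediction.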

Wow! You have just deployed the Image Classifier. The Deploy ML using FastAPI tutorial is complete!



To sum up, in this tutorial you had a chance to learn how to deploy a model using FastAPI, Docker, and Cloud Run.

As far as FastAPI is concerned, I would highly recommend using it when deploying a model. In short, it is very easy to set up, supports async handlers, and automatically generates docs.

* * *

If you found that material helpful, have some comments, or want to share some ideas for the next one - don't hesitate to drop us a line via slack or mail. We would love to hear your feedback!
