Deploy ML model using Flask, Docker, and Cloud Run

Marcin Laskowski

A step-by-step tutorial on how to deploy a PyTorch implementation of the AlexNet Image Classifier using Flask, Docker, and Cloud Run.

Introduction

Once you have a working model, the next step is to put it into production!

You have many options for doing this, and which one to choose mostly depends on the use case. If you want to explore all of them, please check out the article about AI model deployment.

In this article, we will deploy an ML model using Flask. Specifically, an AlexNet Image Classifier using THE most popular stack: Flask, Docker, and Cloud Run. The whole workflow is pretty straightforward: wrap the model in a web service, create a Dockerfile, and host it on GCP. The final version of the code for this tutorial can be found here.

Easy peasy, so let’s start!

Prerequisites

This tutorial requires the following:

  1. Basic understanding of Python, Docker, and REST
  2. Access to Google Cloud Platform (if you want to host the model)
  3. A trained model from the repository
  4. Installed docker, docker-compose, gcloud, and Python 3.6
  5. The following Python dependencies installed: torch==1.7.1, torchvision==0.8.2, flask==1.1.2, requests==2.18.4

💡 Explore
If you are interested in AI model deployment, please also check out the tutorial on how to deploy a YOLOv5 model.

Expose as a Webservice

Once you have a trained model, you need to create a web service able to handle requests.

For this purpose, we will use Flask, a lightweight WSGI web application framework. It provides various tools and libraries for building web apps, keeping the core simple but extensible.

As a first step, download (or clone) the repository with the AlexNet Image Classifier, written in the PyTorch framework. It takes an image URL as input and returns the class of the picture.

The structure of the repository should look as follows:

├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py

In the main directory create a new file called app.py and paste the code below:

import json
from flask import Flask, request
from syndicai import PythonPredictor

app = Flask(__name__)


@app.route('/')
def hello():
    """ Main page of the app. """
    return "Hello World!"


@app.route('/predict')
def predict():
    """ Return JSON serializable output from the model """
    payload = request.args
    classifier = PythonPredictor("")
    return classifier.predict(payload)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

At the beginning of the file we import json and Flask, as well as the PythonPredictor class from the syndicai.py file.

Next, we create an instance of the Flask class. The first argument is the name of the application’s module or package. This is needed so that Flask knows where to look for templates, static files, and so on. For more information, have a look at the Flask documentation.

We then use the route() decorator to tell Flask what URL should trigger our function.

The function is given a name which is also used to generate URLs for that particular function, and returns the message we want to display in the user’s browser.

Finally, start the web service with the python app.py command and open http://localhost:5000 in your web browser to see “Hello World!”.

In order to run the model, paste http://localhost:5000/predict?url=https://i.imgur.com/PzXprwl.jpg in the browser, or run the following REST request in your terminal. Keep in mind that the first run will take some time because the model downloads missing files (weights).

curl --request GET \
  --url 'http://localhost:5000/predict?url=https://i.imgur.com/PzXprwl.jpg'
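
If you prefer Python over curl, the same request can be made with the requests library. A small helper, assuming the app from app.py is running locally (the helper function names here are illustrative, not part of the repository):

```python
# Query the local /predict endpoint with the requests library.
# Assumes the Flask app from app.py is running on localhost:5000.
import requests


def build_predict_request(image_url, host="http://localhost:5000"):
    """Prepare a GET request to /predict with the image URL as a query param."""
    return requests.Request(
        "GET", f"{host}/predict", params={"url": image_url}
    ).prepare()


def predict(image_url, host="http://localhost:5000"):
    """Send the prediction request and return the response body."""
    with requests.Session() as session:
        response = session.send(build_predict_request(image_url, host))
        response.raise_for_status()
        return response.text

# Example (requires the server to be running):
# print(predict("https://i.imgur.com/PzXprwl.jpg"))
```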

Congrats! You have just created a Flask app!

💡 Explore
Learn more about Web services for ML.
REST in a nutshell
Flask Mega Tutorial by Miguel Grinberg
A curated list of awesome Flask resources and plugins
Organizing flask project

Wrap with a Docker

The next step is to wrap the Flask service with all its dependencies and make it reproducible.

For this purpose, we will use Docker. It is an open-source application to create, manage, deploy, and replicate applications using containers. Containers can be thought of as packages that house the dependencies required by an app to run at the operating-system level. This means that each application deployed using Docker lives in an environment of its own, and its requirements are handled separately.

Deploying a Flask app with Docker allows us to replicate the application across different servers with no reconfiguration.

In order to use Docker, you first need to create a Dockerfile in the main directory with the following code.

# Use python as base image
FROM python:3.6-stretch

# Use working directory /app/model
WORKDIR /app/model

# Copy and install required packages
COPY requirements.txt .
RUN pip install --trusted-host pypi.python.org -r requirements.txt

# Copy all the content of current directory to working directory
COPY . .

# Set env variables for Cloud Run
ENV PORT 5000
ENV HOST 0.0.0.0

# Open port 5000
EXPOSE 5000

# Run flask app
CMD ["python","app.py"]
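
One note on the ENV PORT line: on the fully managed platform, Cloud Run injects the port it wants the container to listen on via the PORT environment variable. A slightly more robust app.py therefore reads it at startup instead of hard-coding 5000. A minimal sketch (the helper name is illustrative):

```python
# Sketch: read the serving port from the PORT environment variable,
# which Cloud Run sets (the Dockerfile above defaults it to 5000).
import os


def get_port(default=5000):
    """Return the port the platform asked us to listen on, or a local default."""
    return int(os.environ.get("PORT", default))

# In app.py, the last line would then become:
# app.run(host="0.0.0.0", port=get_port())
```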

A Dockerfile is a text document that contains the commands used to assemble the image. When writing one, it’s worth keeping a couple of best practices in mind:

  1. The layer that installs requirements.txt needs to sit above the layer that copies the model files. Thanks to this ordering, each time you change something in your model, Docker rebuilds the image without installing the dependencies again.
  2. Use specific base-image tags and pinned dependency versions. This protects you from failures when new library releases come out.
  3. Change the working directory for your files; don’t work in the root.

For more detailed information you can read the Dockerfile reference.

In addition to the Dockerfile, let’s create a docker-compose.yaml that will help us test the app locally.

version: '3'
services:

  flask_classifier:
    build: .
    ports:
      - 5000:5000

docker-compose is a tool for defining and running multi-container Docker applications using a YAML file. Compose works in all environments (production, staging, development, testing, and CI workflows) and lets you spin up a dockerized app very quickly.

To check whether the app works properly inside Docker, just run docker-compose up --build in the terminal. The service should be available on port 5000.
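
That check can also be automated with a small, optional smoke test that polls the service until the container is up (the helper below is illustrative, assuming the compose service defined above):

```python
# Optional smoke test: poll the dockerized service until it responds.
# Assumes `docker-compose up --build` is running in another terminal.
import time

import requests


def wait_for_service(url="http://localhost:5000/", timeout=60):
    """Return the response body once the service answers, or raise TimeoutError."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            response = requests.get(url, timeout=2)
            if response.ok:
                return response.text
        except requests.RequestException:
            time.sleep(1)
    raise TimeoutError(f"service at {url} did not respond within {timeout}s")

# Example (with the container running): print(wait_for_service())
```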

💡 Explore
Please check out the following resources to learn more about Docker.
Dockerfile reference
Docker Image vs Container
Intro guide to Dockerfile best practices

Deploy on Cloud Run

We already exposed our app in the form of a web service and created a Dockerfile that allows us to customize the runtime of our container. At this stage, the repository structure should look as follows.

.
├── app.py
├── docker-compose.yaml
├── Dockerfile
├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py

The last step of this tutorial is deploying the service so that it is available to the world. We will use Cloud Run, a serverless platform for hosting Docker containers. It is fully managed, so you don’t need to worry about the infrastructure.

In order to deploy a service, you need to run two commands. Remember to substitute PROJECT_ID with the name of your GCP project!

# Build a docker image 
gcloud builds submit --tag gcr.io/PROJECT_ID/flask-classifier

# Deploy the docker image to Cloud Run
gcloud run deploy --image gcr.io/PROJECT_ID/flask-classifier \
  --platform managed \
  --port 5000 \
  --memory 1G

You can see the deployed service in the Cloud Run console.

In order to run your model, just copy the URL of the service and use the following request.

curl --request GET \
  --url 'https://CLOUD_RUN_SERVICE_URL/predict?url=https://i.imgur.com/PzXprwl.jpg'

Wow! You have just deployed the Image Classifier!


Conclusion

In this tutorial, you had a chance to explore one approach to AI model deployment, based on Flask, Docker, and Cloud Run.

The solution we’ve tried has a couple of pros and cons.

On one hand, Flask gives you a lot of freedom: it is easy to use, has integrated unit-testing support, and comes with extensive documentation. On the other hand, a lot of freedom means you need to configure many things yourself to make it work really well in production.

* * *

If you found this tutorial helpful, or have some questions, please let us know via email or join us on Slack. We would love to hear your feedback.

We would also like to inspire you via our AI models Showcase page and give you a warm invitation to try out our MLOps platform that speeds up the work of AI Teams 10x.