Deploy ML model using Flask, Docker, and Cloud Run

Marcin Laskowski
A step-by-step tutorial on how to deploy a PyTorch implementation of the AlexNet Image Classifier using Flask, Docker, and Cloud Run.
Once you have a working model, the next step is to put it into production!
You have many options for how to do it; which one to choose mostly depends on your use case. If you want to explore all of them, please check out the article about AI model deployment.
In this article, we will deploy an ML model using Flask. Specifically, an AlexNet Image Classifier using THE most popular stack: Flask, Docker, and Cloud Run. The whole workflow is pretty straightforward: wrap the model in a web service, create a Dockerfile, and host it on GCP. The final version of the code for this tutorial can be found here.
Easy peasy, so let’s start!
This tutorial requires the following:
- Basic understanding of Python, Docker, REST
- Access to Google Cloud Platform (if you want to host a model)
- Trained model from the repository.
- Installed docker, docker-compose, gcloud, Python 3.6
- Installed the following Python dependencies:

```
torch==1.7.1
torchvision==0.8.2
flask==1.1.2
requests==2.18.4
```
If you are interested in AI model deployment, please also check out the tutorial on how to deploy a YOLOv5 model.
Expose as a Webservice
Once you have a trained model, you need to create a web service able to handle requests.
For this purpose, we will use Flask, a lightweight WSGI web application framework. It provides various tools and libraries for building web apps while keeping the core simple but extensible.
As a first step, download (or clone) the repository with the AlexNet Image Classifier written in PyTorch. It takes an image in the form of a URL as input and returns the class of the picture.
The structure of the repository should look as follows:
```
├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py
```
In the main directory create a new file called app.py and paste the code below:
```python
import json

from flask import Flask, request
from syndicai import PythonPredictor

app = Flask(__name__)


@app.route('/')
def hello():
    """ Main page of the app. """
    return "Hello World!"


@app.route('/predict')
def predict():
    """ Return JSON serializable output from the model """
    payload = request.args
    classifier = PythonPredictor("")
    return classifier.predict(payload)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```
At the beginning of the file we import Flask as well as the PythonPredictor class from the syndicai.py file.
Next, we create an instance of the Flask class. The first argument is the name of the application’s module or package, which Flask needs in order to know where to look for templates, static files, and so on. For more information, have a look at the Flask documentation.
We then use the route() decorator to tell Flask what URL should trigger our function. The function is given a name, which is also used to generate URLs for that particular function, and it returns the message we want to display in the user’s browser.
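As a minimal sketch of how route() and request.args work together, here is a standalone app with a hypothetical /echo endpoint (not part of the repository) that reads a query parameter, the same mechanism the /predict endpoint relies on:

```python
from flask import Flask, request

app = Flask(__name__)


@app.route('/echo')
def echo():
    # Query parameters such as /echo?name=Ada arrive in request.args;
    # .get() returns a default when the parameter is missing
    name = request.args.get('name', 'World')
    return f"Hello, {name}!"
```

With the app running, visiting http://localhost:5000/echo?name=Ada would return “Hello, Ada!”.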
Finally, start the web service with the `python app.py` command and open http://localhost:5000 in the web browser to see “Hello World!”.
In order to run the model, paste http://localhost:5000/predict?url=https://i.imgur.com/PzXprwl.jpg into the browser or run the following REST API request in your terminal. Keep in mind that the first model run will take some time because the model needs to download missing files (weights).
```shell
curl --request GET \
  --url 'http://localhost:5000/predict?url=https://i.imgur.com/PzXprwl.jpg'
```
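Since requests is already among the dependencies, the same call can be made from Python. A minimal sketch (the classify helper is our name for illustration, not part of the repository):

```python
import requests


def classify(image_url, api_url="http://localhost:5000/predict"):
    """Send the image URL as the `url` query parameter and return the response body."""
    response = requests.get(api_url, params={"url": image_url}, timeout=120)
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.text


# Example (requires the service running locally):
# print(classify("https://i.imgur.com/PzXprwl.jpg"))
```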
Congrats! You have just created a Flask app!
Learn more about web services for ML:

- REST in a nutshell
- Flask Mega Tutorial by Miguel Grinberg
- A curated list of awesome Flask resources and plugins
- Organizing a Flask project
Wrap with Docker
The next step is to wrap the Flask service with all its dependencies and make it reproducible.
For this purpose, we will use Docker. It is an open-source tool to create, manage, deploy, and replicate applications using containers. Containers can be thought of as packages that house the dependencies an app requires to run at the operating-system level. This means that each application deployed using Docker lives in an environment of its own, and its requirements are handled separately.
Deploying a Flask app with Docker allows us to replicate the application across different servers with no reconfiguration.
In order to use Docker, you first need to create a Dockerfile in the main directory with the following content.
```dockerfile
# Use python as base image
FROM python:3.6-stretch

# Use working directory /app/model
WORKDIR /app/model

# Copy and install required packages
COPY requirements.txt .
RUN pip install --trusted-host pypi.python.org -r requirements.txt

# Copy all the content of the current directory to the working directory
COPY . .

# Set env variables for Cloud Run
ENV PORT 5000
ENV HOST 0.0.0.0

# Open port 5000
EXPOSE 5000

# Run flask app
CMD ["python", "app.py"]
```
The Dockerfile is a text document that contains the commands used to assemble the image. When writing a Dockerfile, it’s worth keeping a couple of best practices in mind:
- The layer installing requirements.txt needs to be above the layer that copies your model files. Thanks to that approach, each time you change something in your model, Docker rebuilds the image without installing the dependencies again.
- Use specific tags and pinned dependency versions. This prevents failures when new library versions are released.
- Set a dedicated working directory for your files; don’t work in the root.
For more detailed information you can read the Dockerfile reference.
In addition to the Dockerfile, let’s create a docker-compose.yaml that will help us test the app locally.
```yaml
version: '3'
services:
  flask_classifier:
    build: .
    ports:
      - 5000:5000
```
docker-compose is a tool for defining and running multi-container Docker applications using a YAML file. Compose works in all environments: production, staging, development, testing, as well as CI workflows, and it lets you spin up a dockerized app very quickly.
In order to check that the app works properly inside Docker, just run `docker-compose up --build` in the terminal. The service should be available on port 5000.
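Because the first request triggers the weight download, it can help to wait until the container actually answers before testing it. A small helper of our own (not part of the repository) that polls the root endpoint:

```python
import time

import requests


def wait_for_service(url="http://localhost:5000/", timeout=60):
    """Poll the URL until it returns HTTP 200 or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=2).status_code == 200:
                return True
        except requests.exceptions.ConnectionError:
            pass  # container not up yet; retry
        time.sleep(1)
    return False
```

Call `wait_for_service()` right after `docker-compose up --build` and only fire the /predict request once it returns True.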
Deploy on Cloud Run
We have already exposed our app as a web service and created a Dockerfile that lets us customize the runtime of our container. At this stage, the repository structure should look as follows.
```
.
├── app.py
├── docker-compose.yaml
├── Dockerfile
├── README.md
├── requirements.txt
├── sample_data
│   ├── input.jpg
│   └── sample_input.json
└── syndicai.py
```
The last step of this tutorial is deploying the service so that it is available to the world. We will use Cloud Run, a serverless platform for hosting Docker containers. It is fully managed, so you don’t need to worry about the infrastructure.
In order to deploy a service, you need to run two commands. Remember to substitute PROJECT_ID with the name of your GCP project!
```shell
# Build a docker image
gcloud builds submit --tag gcr.io/PROJECT_ID/flask-classifier

# Deploy the docker image to Cloud Run
gcloud run deploy --image gcr.io/PROJECT_ID/flask-classifier \
  --platform managed \
  --port 5000 \
  --memory 1G
```
You can see the deployed service in the Cloud Run console.
In order to run your model, just copy the URL of the service and use the following request.
```shell
curl --request GET \
  --url 'https://CLOUD_RUN_SERVICE_URL/predict?url=https://i.imgur.com/PzXprwl.jpg'
```
Wow! You have just deployed the Image Classifier!
In this tutorial, you had a chance to explore one approach to AI model deployment, based on Flask, Docker, and Cloud Run.
The solution we’ve tried has a couple of pros and cons.
On one hand, Flask gives you a lot of freedom: it is easy to use, has integrated unit-testing support, and offers extensive documentation. On the other hand, a lot of freedom means that you need to configure many things yourself to make it work really well in production.
* * *