Kubeflow is an open-source MLOps platform that makes it easy to deploy and manage a machine learning stack on Kubernetes. In this tutorial, I will demonstrate how to create an ML pipeline in Kubeflow. We will train and serve an image classification model using the MNIST dataset. The goal is a pipeline that fetches the data, preprocesses it, trains the model, and finally serves the model for inference.
Install Kubeflow
Make sure your system has at least 8 vCPUs and 16 GB of RAM. You also need to tune kernel settings so the node can support a large number of pods. See the Kubeflow installation guide for details.
Make sure to install all Kubeflow components, then wait several minutes. Eventually all the pods should be healthy:
$ kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
admission-webhook-deployment-67fd864794-hhlr7 1/1 Running 12 (100m ago) 29d
cache-server-5945b96448-ptgk6 2/2 Running 11 (100m ago) 29d
centraldashboard-5f49c896c7-v8nxz 2/2 Running 11 (100m ago) 29d
jupyter-web-app-deployment-555bd4c7f6-s7sq5 2/2 Running 11 (100m ago) 29d
katib-controller-5674c8b4d6-kbx6m 1/1 Running 15 (100m ago) 29d
katib-db-manager-85987474b8-7sfks 1/1 Running 12 (100m ago) 29d
katib-mysql-c688997bd-4mhgr 1/1 Running 12 (100m ago) 29d
katib-ui-585dc5766-s4xdl 2/2 Running 11 (100m ago) 29d
kserve-controller-manager-5fbbbcdd64-vjfrk 2/2 Running 24 (100m ago) 29d
kserve-localmodel-controller-manager-5fcbb75c44-gp5r2 2/2 Running 11 (100m ago) 29d
kserve-models-web-app-678949ffdd-lfhv2 2/2 Running 11 (100m ago) 29d
kubeflow-pipelines-profile-controller-699dc67f96-wxwgl 1/1 Running 12 (100m ago) 29d
metacontroller-0 1/1 Running 13 (100m ago) 29d
metadata-envoy-deployment-78dc9bd89-8x7j9 1/1 Running 12 (100m ago) 29d
metadata-grpc-deployment-6786fdf748-rwqx2 2/2 Running 21 (100m ago) 29d
metadata-writer-b74948545-l6hrd 2/2 Running 15 (100m ago) 29d
minio-6d486b66cd-4wb5x 2/2 Running 11 (100m ago) 29d
ml-pipeline-65ff55599d-gk4gp 2/2 Running 15 (100m ago) 29d
ml-pipeline-persistenceagent-c58647ff5-qmqsl 2/2 Running 11 (100m ago) 29d
ml-pipeline-scheduledworkflow-6d8dc9b889-x4kl6 2/2 Running 11 (100m ago) 29d
ml-pipeline-ui-5f96555b97-p82nn 2/2 Running 11 (100m ago) 29d
ml-pipeline-viewer-crd-5745b89f8f-4hvhb 2/2 Running 11 (100m ago) 29d
ml-pipeline-visualizationserver-64dbbb8d96-vxcwk 2/2 Running 11 (100m ago) 29d
mysql-6868b5b465-hfnws 2/2 Running 11 (100m ago) 29d
notebook-controller-deployment-c4f4fb986-vtqk9 2/2 Running 11 (100m ago) 29d
profiles-deployment-d675596d7-rpdfs 3/3 Running 22 (100m ago) 29d
pvcviewer-controller-manager-556b9c9586-zn7p2 3/3 Running 22 (100m ago) 29d
spark-operator-controller-6dfb845b84-4bk6v 1/1 Running 14 (100m ago) 29d
spark-operator-webhook-5746fc6666-h9chf 1/1 Running 13 (100m ago) 29d
tensorboard-controller-deployment-76d7f8f55-zp6cq 3/3 Running 22 (100m ago) 29d
tensorboards-web-app-deployment-66d4b74977-c78hr 2/2 Running 11 (100m ago) 29d
training-operator-7597cf8fcc-dkjjj 1/1 Running 17 (100m ago) 29d
volumes-web-app-deployment-5698b6c5c9-nwf2x 2/2 Running 11 (100m ago) 29d
workflow-controller-75b848f885-gl57x 2/2 Running 11 (100m ago) 29d
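With this many pods, it is easy to miss one that is not ready. Below is a minimal Python sketch that scans the text output of kubectl get pods and lists pods that are not fully up. The function name find_unhealthy_pods and the sample rows with bad states are made up for illustration; they are not from my cluster.

```python
from typing import List


def find_unhealthy_pods(kubectl_output: str) -> List[str]:
    """Return names of pods that are not Running with all containers ready."""
    unhealthy = []
    lines = kubectl_output.strip().splitlines()
    for line in lines[1:]:  # skip the NAME/READY/STATUS header row
        fields = line.split()
        if len(fields) < 3 or "/" not in fields[1]:
            continue  # not a pod row we know how to read
        name, ready, status = fields[0], fields[1], fields[2]
        if status == "Completed":
            continue  # pods of finished jobs are fine
        ready_count, total_count = ready.split("/")
        if status != "Running" or ready_count != total_count:
            unhealthy.append(name)
    return unhealthy


# Hypothetical sample: the second and third rows are invented bad states.
sample = """NAME READY STATUS RESTARTS AGE
minio-6d486b66cd-4wb5x 2/2 Running 11 (100m ago) 29d
ml-pipeline-65ff55599d-gk4gp 1/2 Running 15 (100m ago) 29d
mysql-6868b5b465-hfnws 0/2 CrashLoopBackOff 11 (100m ago) 29d
"""
print(find_unhealthy_pods(sample))
```

In practice you would feed it the real output, for example the stdout of subprocess.run(["kubectl", "get", "pods", "-n", "kubeflow"], capture_output=True, text=True).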
Explore Kubeflow
To access the Kubeflow dashboard, forward the ingress gateway port to your machine:
kubectl port-forward -n istio-system svc/istio-ingressgateway 8000:80
Visit localhost:8000. You should see the login form. The default email is user@example.com and the password is 12341234
Preparing notebook
Kubeflow Notebooks provides web-based development environments that run in Pods inside your Kubernetes cluster. Before we create a notebook in Kubeflow, we need to allow notebooks to access Kubeflow Pipelines. Apply the manifest below
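For orientation, here is a sketch of what such a manifest usually looks like: the PodDefault described in the Kubeflow Pipelines documentation for in-cluster SDK access, written as a Python dict and printed as JSON (kubectl accepts JSON as well as YAML). Treat every field value as an assumption on my part, including the desc text, which is what the notebook form displays, and compare it against the official docs before applying.

```python
import json

# Profile namespace used in this tutorial.
NAMESPACE = "kubeflow-user-example-com"
# Assumed mount location for the projected service account token.
TOKEN_DIR = "/var/run/secrets/kubeflow/pipelines"

# Assumed PodDefault, modeled on the Kubeflow Pipelines documentation.
pod_default = {
    "apiVersion": "kubeflow.org/v1alpha1",
    "kind": "PodDefault",
    "metadata": {"name": "access-ml-pipeline", "namespace": NAMESPACE},
    "spec": {
        "desc": "Allow access to Kubeflow Pipelines",
        # Pods carrying this label get the token volume injected.
        "selector": {"matchLabels": {"access-ml-pipeline": "true"}},
        "volumes": [
            {
                "name": "volume-kf-pipeline-token",
                "projected": {
                    "sources": [
                        {
                            "serviceAccountToken": {
                                "path": "token",
                                "expirationSeconds": 7200,
                                "audience": "pipelines.kubeflow.org",
                            }
                        }
                    ]
                },
            }
        ],
        "volumeMounts": [
            {
                "mountPath": TOKEN_DIR,
                "name": "volume-kf-pipeline-token",
                "readOnly": True,
            }
        ],
        # The KFP SDK reads this variable to locate the token.
        "env": [
            {"name": "KF_PIPELINES_SA_TOKEN_PATH", "value": TOKEN_DIR + "/token"}
        ],
    },
}

print(json.dumps(pod_default, indent=2))
```

Saving the printed JSON to a file and running kubectl apply -f on it would create the PodDefault in the profile namespace.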
Now, while you are in the kubeflow-user-example-com namespace, create the notebook from the dashboard: Notebooks > New Notebook
Name it anything and make sure to select the jupyter-tensorflow-full:v1.10.0 image
Select at least 2 CPUs and 4 GB of RAM
Under Advanced Options, make sure to select the Allow access to Kubeflow Pipeline configuration. It won’t show up unless you have applied the manifest above. With this configuration, the notebook can access Kubeflow Pipelines directly.
Now select Launch and wait a few minutes until the notebook is ready, then click Connect
Explore the notebook
In the notebook, open a terminal and clone the following repository
git clone https://github.com/k4mrul/kubeflow-mnist
cd kubeflow-mnist
Also, make sure to install the following packages
pip install minio==7.2.15
pip install kserve==0.15.2
We will use the minio package to upload the model to MinIO storage (which ships with the Kubeflow components) and the kserve package to test our model against test data.
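To give an idea of what the upload involves, here is a minimal sketch that walks an exported model directory and uploads each file while keeping relative paths. It only relies on the client's fput_object(bucket_name, object_name, file_path) method, which minio.Minio provides. The function name upload_model_dir, the DryRunClient stand-in, the bucket mlpipeline, and the prefix models/detect-digits/1 are all hypothetical; the repository's notebooks may organize things differently.

```python
import os
import tempfile
from typing import List


def upload_model_dir(client, local_dir: str, bucket: str, prefix: str) -> List[str]:
    """Upload every file under local_dir to bucket, keeping relative paths."""
    uploaded = []
    for root, dirs, files in os.walk(local_dir):
        dirs.sort()  # make the traversal order deterministic
        for filename in sorted(files):
            file_path = os.path.join(root, filename)
            rel_path = os.path.relpath(file_path, local_dir)
            # Object names use forward slashes regardless of the local OS.
            object_name = "/".join([prefix.strip("/")] + rel_path.split(os.sep))
            client.fput_object(bucket, object_name, file_path)
            uploaded.append(object_name)
    return uploaded


class DryRunClient:
    """Stand-in for minio.Minio that only prints what would be uploaded."""

    def fput_object(self, bucket_name, object_name, file_path):
        print(f"would upload {file_path} -> {bucket_name}/{object_name}")


# Demo on a throwaway directory shaped like a TensorFlow SavedModel export.
with tempfile.TemporaryDirectory() as export_dir:
    os.makedirs(os.path.join(export_dir, "variables"))
    for rel in ("saved_model.pb", os.path.join("variables", "variables.index")):
        open(os.path.join(export_dir, rel), "w").close()
    demo_objects = upload_model_dir(
        DryRunClient(), export_dir, "mlpipeline", "models/detect-digits/1"
    )
print(demo_objects)
```

In the notebook, you would pass a real client instead, created with something like Minio(endpoint, access_key=..., secret_key=..., secure=False), where the endpoint and credentials depend on your installation.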
Open digits_recognize.ipynb in Jupyter. The notebook includes the following steps:
- Importing the MNIST handwritten digits dataset
- Exploring and preparing the data
- Building and training a model to recognize digits
- Evaluating model accuracy and visualizing results with a confusion matrix
- Saving and exporting the trained model to MinIO storage for deployment
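If the confusion matrix in the evaluation step is unfamiliar, the following pure-Python sketch shows what it contains: row i, column j counts how often the true digit i was predicted as j. The notebook most likely uses a library function for this; the helper names and the labels below are made up for illustration.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int = 10
) -> List[List[int]]:
    """Count (actual, predicted) pairs: rows are actual, columns predicted."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Share of samples on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up labels: four of six predictions are right.
y_true = [5, 0, 4, 1, 9, 5]
y_pred = [5, 0, 4, 1, 4, 3]
cm = confusion_matrix(y_true, y_pred)
print("9 mistaken for 4:", cm[9][4])
print("accuracy:", accuracy(cm))
```

Off-diagonal cells are the interesting part: they show which digits the model confuses with each other.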
Running pipeline
When you are done exploring, open the digits_recognize_pipeline.ipynb notebook. This notebook builds and triggers a Kubeflow pipeline for digit recognition using the MNIST dataset. The pipeline loads the data and uploads it to MinIO storage, reshapes and normalizes it, trains a deep learning model, evaluates its performance, and then deploys the trained model for serving with KServe.
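To make the reshape and normalize step concrete, here is a minimal pure-Python sketch for a single image. It assumes the usual MNIST convention: 784 pixel values in the range 0 to 255, scaled to 0.0 to 1.0 and arranged as 28 x 28 x 1. The pipeline itself presumably does this on whole arrays; the function name normalize_and_reshape is hypothetical.

```python
from typing import List, Sequence


def normalize_and_reshape(
    flat_pixels: Sequence[int], height: int = 28, width: int = 28
) -> List[List[List[float]]]:
    """Scale 0-255 pixels to 0.0-1.0 and reshape to height x width x 1."""
    if len(flat_pixels) != height * width:
        raise ValueError(
            f"expected {height * width} pixels, got {len(flat_pixels)}"
        )
    image = []
    for row in range(height):
        start = row * width
        # Each pixel becomes a one-element list: the single grayscale channel.
        image.append([[p / 255.0] for p in flat_pixels[start:start + width]])
    return image


# Made-up image: all black except one white pixel in the bottom-right corner.
flat = [0] * 783 + [255]
image = normalize_and_reshape(flat)
print(len(image), len(image[0]), len(image[0][0]))
print(image[27][27])
```

The trailing dimension of size 1 is the channel axis that convolutional layers expect for grayscale input.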
(Note: if you get a 401 authorization error even after allowing the notebook to access Kubeflow Pipelines, apply this manifest.)
If you go to Pipeline > Runs, you should see that the pipeline has been triggered and is running
If you click on it, you should see that all the steps have completed successfully
Testing the model inference
Go to KServe Endpoints. You should see that the digits-recognizer inference service is ready and healthy
Now we will test the model. Open the kserve-test.ipynb notebook. This notebook tests the deployed KServe model for MNIST digit recognition. It sends an example image of the digit “5” to the model’s prediction endpoint, receives the predicted probabilities, and prints out the predicted digit along with its one-hot encoding.
Run the cell (make sure you have installed the kserve package)
You should see output like this
Actual Number: 5
One-hot: [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
Predicted digit: 5
Our trained model predicts the digit correctly.
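The request and response handling behind this test can be sketched in a few lines. This assumes the service speaks the KServe v1 inference protocol, where the request body is {"instances": [...]}, the response is {"predictions": [[...]]}, and the path has the form /v1/models/digits-recognizer:predict. The notebook's actual client code may differ, and the probabilities below are made up.

```python
import json
from typing import List, Tuple


def build_predict_request(image) -> str:
    """Serialize one image as a v1 protocol request body."""
    return json.dumps({"instances": [image]})


def decode_prediction(response_body: str) -> Tuple[int, List[float]]:
    """Return (predicted_digit, one_hot) from a v1 protocol response body."""
    probabilities = json.loads(response_body)["predictions"][0]
    # The predicted digit is the index of the highest probability.
    digit = max(range(len(probabilities)), key=probabilities.__getitem__)
    one_hot = [1.0 if i == digit else 0.0 for i in range(len(probabilities))]
    return digit, one_hot


# Made-up response for a single image; index 5 has the highest probability.
fake_response = json.dumps(
    {"predictions": [[0.01, 0.0, 0.0, 0.02, 0.0, 0.95, 0.0, 0.0, 0.02, 0.0]]}
)
digit, one_hot = decode_prediction(fake_response)
print("One-hot:", one_hot)
print("Predicted digit:", digit)
```

The one-hot list is just the probability vector rounded to its winner, which is why the output above shows a single 1.0 at index 5.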
And that’s it: we have successfully configured a Kubeflow pipeline for training and deploying a model.