# Ames housing value prediction using XGBoost on Kubeflow

In this example we demonstrate how to use Kubeflow with XGBoost, using the Kaggle Ames Housing Prices prediction competition. We will do a detailed walk-through of how to implement, train, and serve the model. You will be able to run the exact same workload on-prem and/or on any cloud provider. Here we use Google Kubernetes Engine to show how the end-to-end workflow runs on Google Cloud Platform.

## Prerequisites

To run this setup on Google Cloud Platform, make sure you have enabled Google Kubernetes Engine. You will also need to install Docker and gcloud. Note that this setup can run on-prem and on any cloud provider, but here we demonstrate the GCP option. Finally, follow the instructions to create a GKE cluster.

## Steps

### Kubeflow Setup

In this part you will set up Kubeflow on an existing Kubernetes cluster. Check out the Kubeflow getting started guide.

### Data Preparation

You can download the dataset from the Kaggle competition. For convenience, we have also uploaded the dataset to GCS:

```
gs://kubeflow-examples-data/ames_dataset/
```
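Before training, the raw CSV has to be turned into a purely numeric feature matrix, since XGBoost consumes numbers only (the served model below expects a 37-value tensor). A minimal preprocessing sketch with pandas, using a tiny invented sample rather than the real `train.csv` (the column names come from the Kaggle schema, but the rows are made up):

```python
import io

import pandas as pd

# Tiny stand-in for the Kaggle train.csv (invented rows).
csv = io.StringIO(
    "Id,LotArea,Neighborhood,SalePrice\n"
    "1,8450,CollgCr,208500\n"
    "2,9600,Veenker,181500\n"
)
df = pd.read_csv(csv)

# Split off the target, drop the Id, and one-hot encode categorical
# columns so every remaining feature is numeric.
y = df.pop("SalePrice")
X = pd.get_dummies(df.drop(columns=["Id"]))

print(X.columns.tolist())
# → ['LotArea', 'Neighborhood_CollgCr', 'Neighborhood_Veenker']
```

The actual training code in the repo applies the same idea across all of the dataset's categorical columns, which is how the model ends up with its fixed-width feature vector.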

### Dockerfile

This repo includes a Dockerfile which you can use to build a Docker image. We have also uploaded the image to gcr.io, so you can pull it directly instead of building it yourself.

```
IMAGE_NAME=ames-housing
VERSION=v1
```

Use the `gcloud` command to get the GCP project ID:

```
PROJECT_ID=`gcloud config get-value project`
```

Build a Docker image from the Dockerfile:

```
docker build -t gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${VERSION} .
```

Once the build succeeds, you should see the image on your local machine by running `docker images`. Next, upload the image to Google Container Registry:

```
gcloud auth configure-docker
docker push gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${VERSION}
```

A public copy is available at `gcr.io/kubeflow-examples/ames-housing:v1`.

### Model training on GKE

In this section we will run the above Docker container on Google Kubernetes Engine. Training involves the following steps:

- Create a GKE cluster.

- Create a Persistent Volume.

  - Follow the instructions here. You will need to run the following `kubectl create` commands to attach the claim to the pod:

    ```
    kubectl create -f py-volume.yaml
    kubectl create -f py-claim.yaml
    ```

- Run the Docker container on GKE.

  - Use `kubectl` to run the image on GKE:

    ```
    kubectl create -f py-pod.yaml
    ```

    Once the pod completes, you will have a trained XGBoost model on the Persistent Volume at `/mnt/xgboost/housing.dat`.
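The contract between this training step and the serving step below is simply a pickled model file on the shared volume. A minimal sketch of that hand-off, assuming the model is pickled; a hypothetical stand-in object is used so the sketch does not depend on XGBoost being installed, and a temp directory stands in for the mounted volume:

```python
import os
import pickle
import tempfile

class StubModel:
    """Hypothetical stand-in for the trained XGBoost regressor."""
    def predict(self, rows):
        return [float(sum(r)) for r in rows]

# Persist the model the way the training pod does, but to a temp dir
# instead of the volume path /mnt/xgboost/housing.dat.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "housing.dat")
with open(model_path, "wb") as f:
    pickle.dump(StubModel(), f)

# Any consumer with access to the volume can reload it and predict.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(restored.predict([[1.0, 2.0, 3.0]]))  # → [6.0]
```

Because the serving container only needs this one file, the trained model can be copied anywhere the serving code can read it.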

### Model Export

The model is exported to `/tmp/ames/housing.dat`. We will use Seldon Core to serve the model asset. To make the model servable, we have created `xgboost/seldon_serve` with the following assets:

- `HousingServe.py`
- `housing.dat`
- `requirements.txt`
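For orientation, the Seldon Python wrapper expects `HousingServe.py` to define a class of the same name exposing a `predict` method; the wrapper instantiates it once at startup and calls `predict` per request. A minimal sketch of that shape — the pickle-based loading, constructor and method signatures, and stub model here are assumptions for illustration; the actual `HousingServe.py` in the repo is authoritative:

```python
import os
import pickle
import tempfile

class StubModel:
    """Hypothetical stand-in for the trained XGBoost model."""
    def predict(self, X):
        return [[float(sum(row))] for row in X]

class HousingServe:
    def __init__(self, model_path="housing.dat"):
        # Loaded once when the microservice starts.
        with open(model_path, "rb") as f:
            self.model = pickle.load(f)

    def predict(self, X, feature_names=None):
        # Called for each request with a 2-D array of feature rows.
        return self.model.predict(X)

# Smoke test against a stub model file in a temp dir.
path = os.path.join(tempfile.mkdtemp(), "housing.dat")
with open(path, "wb") as f:
    pickle.dump(StubModel(), f)

serve = HousingServe(model_path=path)
print(serve.predict([[1.0, 2.0, 3.0]]))  # → [[6.0]]
```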

### Model Serving Locally

We are going to use Seldon Core to serve the model. `HousingServe.py` contains the code to serve it. Run the following command to wrap the model as a microservice:

```
docker run -v $(pwd):/seldon_serve seldonio/core-python-wrapper:0.7 /seldon_serve HousingServe 0.1 gcr.io --base-image=python:3.6 --image-name=${PROJECT_ID}/housingserve
```

Next, build the seldon-core microservice image. You can find Seldon Core model wrapping details here.

```
cd build
./build_image.sh
```

You should now see the image `gcr.io/${PROJECT_ID}/housingserve:0.1` locally, which can be run to serve the model. Before running it locally, push it to gcr.io:

```
gcloud auth configure-docker
docker push gcr.io/${PROJECT_ID}/housingserve:0.1
```

Now run the Docker image:

```
docker run -p 5000:5000 gcr.io/${PROJECT_ID}/housingserve:0.1
```

Now you are ready to send requests on `localhost:5000`:

```
curl -H "Content-Type: application/x-www-form-urlencoded" -d 'json={"data":{"tensor":{"shape":[1,37],"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]}}}' http://localhost:5000/predict
{
  "data": {
    "names": [
      "t:0",
      "t:1"
    ],
    "tensor": {
      "shape": [
        1,
        2
      ],
      "values": [
        97522.359375,
        97522.359375
      ]
    }
  }
}
```
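The request body is a form-encoded `json=` field wrapping a Seldon tensor message: a shape of `[1,37]` plus 37 feature values, with the prediction coming back in `data.tensor.values`. A small sketch using only the Python standard library that builds the same payload and pulls the predicted price out of a response shaped like the one above:

```python
import json
from urllib.parse import urlencode

# Build the same payload the curl command sends.
payload = {"data": {"tensor": {"shape": [1, 37],
                               "values": list(range(1, 38))}}}
body = urlencode({"json": json.dumps(payload)})

# Parse a response shaped like the sample output above.
raw = ('{"data": {"names": ["t:0", "t:1"], '
       '"tensor": {"shape": [1, 2], '
       '"values": [97522.359375, 97522.359375]}}}')
prediction = json.loads(raw)["data"]["tensor"]["values"][0]
print(prediction)  # → 97522.359375
```

The resulting `body` string can be POSTed to `http://localhost:5000/predict` with `urllib.request` or `requests` instead of curl.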

### Model serving on GKE

One of the great features of Kubernetes is that it runs anywhere: locally, on-prem, or in the cloud. We will show you how to run the workload on Google Kubernetes Engine. First, start a GKE cluster.

Deploy Seldon Core to your GKE cluster by following the instructions in the Deploy Seldon Core section here. Once everything is up, you can verify it using `kubectl get pods -n ${NAMESPACE}`:

```
NAME                                      READY     STATUS    RESTARTS   AGE
ambassador-849fb9c8c5-5kx6l               2/2       Running   0          16m
ambassador-849fb9c8c5-pww4j               2/2       Running   0          16m
ambassador-849fb9c8c5-zn6gl               2/2       Running   0          16m
redis-75c969d887-fjqt8                    1/1       Running   0          30s
seldon-cluster-manager-6c78b7d6c9-6qhtg   1/1       Running   0          30s
spartakus-volunteer-66cc8ccd5b-9f8tw      1/1       Running   0          16m
tf-hub-0                                  1/1       Running   0          16m
tf-job-dashboard-7b57c549c8-bfpp8         1/1       Running   0          16m
tf-job-operator-594d8c7ddd-lqn8r          1/1       Running   0          16m
```

#### Deploy the XGBoost model

```
ks generate seldon-serve-simple xgboost-ames   \
                                --name=xgboost-ames   \
                                --image=gcr.io/${PROJECT_ID}/housingserve:0.1   \
                                --namespace=${NAMESPACE}   \
                                --replicas=1

ks apply ${KF_ENV} -c xgboost-ames
```

#### Sample request and response

Seldon Core uses Ambassador to route its requests. To send requests to the model, you can port-forward the Ambassador container locally:

```
kubectl port-forward $(kubectl get pods -n ${NAMESPACE} -l service=ambassador -o jsonpath='{.items[0].metadata.name}') -n ${NAMESPACE} 8080:80
```

Now you are ready to send requests on `localhost:8080`:

```
curl -H "Content-Type: application/x-www-form-urlencoded" -d 'json={"data":{"tensor":{"shape":[1,37],"values":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]}}}' http://localhost:8080/predict
{
  "data": {
    "names": [
      "t:0",
      "t:1"
    ],
    "tensor": {
      "shape": [
        1,
        2
      ],
      "values": [
        97522.359375,
        97522.359375
      ]
    }
  }
}
```