This repository contains a RESTful API for predicting diabetes in Pima Indian patients from diagnostic measurements. The API is built with Flask and XGBoost. The project covers training and saving the best model, serving it through the API, and testing the API with the Python requests library.
Ensure that you have the following installed on your machine:
- Python 3.6 or later
- pandas
- scikit-learn
- numpy
- Flask
- requests
- xgboost
The dataset contains information about female patients at least 21 years old of Pima Indian heritage. The task is to predict whether a patient has diabetes based on various diagnostic measurements.
Dataset source: https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv
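As a rough illustration of how the data could be read, here is a minimal sketch using pandas. The CSV has no header row, so the column names below are an assumption based on the commonly used description of the dataset, and `load_pima` is a hypothetical helper, not a function from this repository. The two sample rows reuse the feature values from the request example in this README; their trailing label values are included only so the sketch runs offline.

```python
import io

import pandas as pd

# Assumed column names: the CSV ships without a header, so these are
# illustrative labels for its eight features and one class column.
COLUMNS = [
    "pregnancies", "glucose", "blood_pressure", "skin_thickness",
    "insulin", "bmi", "diabetes_pedigree", "age", "outcome",
]


def load_pima(source):
    """Read the header-less CSV and split it into features X and labels y."""
    df = pd.read_csv(source, header=None, names=COLUMNS)
    X = df.drop(columns="outcome")
    y = df["outcome"]
    return X, y


# Two rows in the dataset's format, kept in memory so no download is needed.
sample = io.StringIO(
    "6,148,72,35,0,33.6,0.627,50,1\n"
    "1,85,66,29,0,26.6,0.351,31,0\n"
)
X, y = load_pima(sample)
print(X.shape, y.tolist())  # (2, 8) [1, 0]
```

Because `pd.read_csv` also accepts URLs, passing the dataset source URL above to `load_pima` should load the full file in the same way.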
- train_and_save_xgb_model.py: Trains an XGBoost classifier on the Pima Indians Diabetes dataset, selects the best model using grid search with cross-validation, and saves it to a file.
- app.py: Flask application that loads the saved model and provides an API endpoint for making predictions.
- test_api.py: Python script that tests the API by sending a POST request with input data and printing the received predictions.
- Clone the repository:
git clone https://github.com/yihong1120/Diabetes-Prediction-API.git
cd Diabetes-Prediction-API
- Install the required packages:
pip install -r requirements.txt
- Train and save the XGBoost model:
python train_and_save_xgb_model.py
The chart below illustrates the SHAP values computed with the trained XGBoost model on the X_test data.
- Run the Flask application:
python app.py
- Test the API with sample input data:
python test_api.py
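To make the serving step more concrete, the following is a minimal sketch of a Flask prediction endpoint, exercised with Flask's built-in test client instead of a running server. The route name `/predict`, the `create_app` factory, the error messages, and the `ThresholdModel` stand-in are assumptions for illustration; the actual app.py loads the saved XGBoost model and may be organised differently.

```python
import numpy as np
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app that serves predictions from a fitted model."""
    app = Flask(__name__)

    # "/predict" is an assumed route name; this README does not state the path.
    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True)
        if not isinstance(payload, dict) or "data" not in payload:
            return jsonify({"error": "expected a JSON object with a 'data' key"}), 400
        try:
            features = np.asarray(payload["data"], dtype=float)
        except (ValueError, TypeError):
            return jsonify({"error": "'data' must be a numeric 2-D array"}), 400
        if features.ndim != 2 or features.shape[1] != 8:
            return jsonify({"error": "each row must hold 8 values"}), 400
        predictions = model.predict(features)
        # Convert NumPy integers to plain ints so they serialise as JSON.
        return jsonify({"predictions": [int(p) for p in predictions]})

    return app


class ThresholdModel:
    """Stand-in for the trained classifier: flags rows whose second value exceeds 120."""

    def predict(self, features):
        return (features[:, 1] > 120).astype(int)


client = create_app(ThresholdModel()).test_client()
response = client.post(
    "/predict",
    json={"data": [[6, 148, 72, 35, 0, 33.6, 0.627, 50],
                   [1, 85, 66, 29, 0, 26.6, 0.351, 31]]},
)
print(response.get_json())  # {'predictions': [1, 0]}
```

Passing the model into a factory keeps the endpoint testable without the saved model file; in a real deployment the stand-in would be replaced by the unpickled classifier.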
Endpoint for making predictions using the saved XGBoost model.
Request
- JSON payload containing an array of input data
{
"data": [
[6, 148, 72, 35, 0, 33.6, 0.627, 50],
[1, 85, 66, 29, 0, 26.6, 0.351, 31]
]
}
Response
- JSON object containing an array of predictions
{
"predictions": [1, 0]
}
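A client for the request and response shapes above could be sketched as follows. The URL is an assumption (Flask's default host and port plus a hypothetical `/predict` route), and `build_payload` and `fetch_predictions` are illustrative helpers, not functions taken from test_api.py.

```python
import requests

# Assumed address: Flask's default host and port with a hypothetical route name.
API_URL = "http://127.0.0.1:5000/predict"


def build_payload(rows):
    """Wrap feature rows in the {"data": [...]} shape shown in the request example."""
    for row in rows:
        if len(row) != 8:
            raise ValueError(f"expected 8 values per row, got {len(row)}")
    return {"data": [list(row) for row in rows]}


def fetch_predictions(rows, url=API_URL, timeout=10):
    """POST the rows to the API and return the list stored under "predictions"."""
    response = requests.post(url, json=build_payload(rows), timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()["predictions"]


if __name__ == "__main__":
    rows = [
        [6, 148, 72, 35, 0, 33.6, 0.627, 50],
        [1, 85, 66, 29, 0, 26.6, 0.351, 31],
    ]
    # Requires the Flask application to be running at API_URL.
    print(fetch_predictions(rows))
```

Checking the row length on the client side catches malformed input before a network round trip is made.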
This project is licensed under the MIT License - see the LICENSE file for details.