
klakathy/machine-learning-car-data-analysis-with-rmd


Prediction of risks of auction used cars

An analysis that predicts whether a car bought at auction has issues, by fitting multiple machine learning models in R.


The purpose of this project is to learn the philosophy behind used-car trading. It is also an exercise in applying machine learning methods.

Getting started

Data

The original data comes from the Kaggle competition Don't get kicked!; the csv files in the orinigal_data folder are the same data. The Kaggle competition has closed its evaluation access, so there is no leaderboard rank for this project.

Entity:

Setup

  • Environment: R (R Markdown), Tableau
  • Data source: Kaggle Don't get kicked!
  • Models: classification tree, Random Forest, XGBoost, support vector machine, neural network
  • Libraries: caret, randomForest, xgboost, nnet, pROC, etc.
  • Other: PCA knowledge is also needed.

  • Import all .rmd files and install libraries as needed.
    The files are currently set to knit to a Word document.
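
Before knitting, it can help to check which of the listed libraries are still missing. The following is a minimal sketch in base R; the helper name `missing_packages` is hypothetical, and `rmarkdown` is added as an assumption because the files are knitted (it is not named in the Libraries list).

```r
# Packages named in the Libraries list, plus rmarkdown for knitting
# (rmarkdown is an assumption; the README does not list it explicitly).
required_pkgs <- c("caret", "randomForest", "xgboost", "nnet", "pROC", "rmarkdown")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
# system.file() returns "" when a package cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  pkgs[!installed]
}

to_install <- missing_packages(required_pkgs)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install from CRAN:
  # install.packages(to_install)
}
```

The install call is left commented out so the check itself has no side effects.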


Content

In this project, the data is first cleaned with preprocessing and feature selection methods, then analyzed with multiple machine learning models, and finally the models are evaluated and compared using their ROC curves.

Preprocessing

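
The setup notes list PCA as prerequisite knowledge. As a rough illustration of how PCA can compress correlated numeric columns during preprocessing, here is a minimal sketch using base R's `prcomp`. The data is synthetic and the column names (`price_a`, `price_b`, `mileage`) are hypothetical; this is not the project's actual preprocessing code.

```r
set.seed(1)
n <- 200
shared <- rnorm(n)

# Synthetic features: two strongly correlated columns and one independent column.
features <- data.frame(
  price_a = shared + rnorm(n, sd = 0.1),
  price_b = shared + rnorm(n, sd = 0.1),
  mileage = rnorm(n)
)

# Center and scale so that no column dominates because of its units.
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Share of total variance explained by each principal component.
explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(cumsum(explained), 3))

# Keep the first two components as a reduced feature set.
reduced <- pca$x[, 1:2]
```

The cumulative explained variance is one common way to decide how many components to keep.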

Applying models

  • Tree model
    Five variables are chosen for the classification tree model according to the Gini index. After an additional cross-validation check, the model is further improved by adjusting its complexity to prune the tree.

  • Random forest

  • XGBoost

  • SVM

  • NNet
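
The tree model above selects variables by the Gini index. To make that criterion concrete, here is a minimal base R sketch of Gini impurity and of the impurity decrease produced by a candidate split. The function names and the toy values are made up for illustration; the project itself fits its tree with a library rather than with this code.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2).
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Decrease in weighted Gini impurity when `labels` is split by the
# logical vector `left` (TRUE goes to the left child, FALSE to the right).
gini_gain <- function(labels, left) {
  n <- length(labels)
  n_left <- sum(left)
  n_right <- n - n_left
  children <- (n_left / n) * gini_impurity(labels[left]) +
    (n_right / n) * gini_impurity(labels[!left])
  gini_impurity(labels) - children
}

# Toy data (illustrative only, not taken from the project).
is_bad_buy  <- c(1, 1, 1, 0, 0, 0, 0, 0)
vehicle_age <- c(9, 8, 7, 6, 3, 2, 2, 1)

gain_clean <- gini_gain(is_bad_buy, vehicle_age >= 7)  # separates the classes fully
gain_weak  <- gini_gain(is_bad_buy, vehicle_age >= 3)  # leaves a mixed left child
print(c(clean = gain_clean, weak = gain_weak))
```

A split with a larger gain leaves purer child nodes, which is why variables offering such splits rank higher under this criterion.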

Evaluation ROC


In terms of AUC and recall, the best model so far is nnet, with a recall of 59.48%. This figure is based on the cutoff value that maximizes the J statistic (Sensitivity + Specificity - 1).
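
The cutoff rule can be sketched in base R as follows. The function name `youden_cutoff` and the scores and labels are hypothetical, chosen only to show the calculation; they are not the project's predictions.

```r
# Choose the cutoff that maximizes J = sensitivity + specificity - 1.
# `scores` are predicted probabilities, `labels` are 1 (positive) or 0 (negative).
youden_cutoff <- function(scores, labels, cutoffs = sort(unique(scores))) {
  j <- vapply(cutoffs, function(cut) {
    predicted <- scores >= cut
    sensitivity <- sum(predicted & labels == 1) / sum(labels == 1)
    specificity <- sum(!predicted & labels == 0) / sum(labels == 0)
    sensitivity + specificity - 1
  }, numeric(1))
  best <- which.max(j)  # first maximum if there are ties
  list(cutoff = cutoffs[best], j = j[best])
}

# Toy predictions (illustrative only).
scores <- c(0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1)
labels <- c(1, 1, 0, 1, 0, 0, 0, 0)

res <- youden_cutoff(scores, labels)
print(res)
```

With the pROC library listed in the setup, the same cutoff can likely be obtained with `coords(roc_obj, "best", best.method = "youden")`, where `roc_obj` is an assumed name for a fitted ROC object.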

Contribution

This is a group project for a course. My teammates Wendy, Nina, Andi, and I worked together to complete it.

About

Analysis of the business risk in online used car sales
