
Classification of Data Using Decision Tree and Random Forest Based on Three Different Criteria

C++ implementation of Decision Trees and Random Forests for classification of an insurance dataset.

We build decision trees and random forests for an insurance dataset and evaluate them in a series of experiments. Dataset taken from: https://archive.ics.uci.edu/ml/datasets/Insurance+Company+Benchmark+%28COIL+2000%29

HOW TO RUN:

  1. Go to the folder:
     `cd Final`
  2. Compile the program with the following command:
     `g++ -o ID3 ID3.cpp`
  3. Run the executable with the following command:
     `./ID3 ticdata2000.txt experiment_no`

`ticdata2000.txt` contains the dataset used to build the tree; `experiment_no` selects one of the experiments listed below.

  4. Press Enter to print the output.

  5. Please refer to the Results and Conclusion file to see the final results of all the experiments.
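For example, assuming the experiments are numbered 1–4 in the order listed below, running the reduced-error-pruning experiment would look like:

   `./ID3 ticdata2000.txt 3`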

Experiments:

  1. Vary the "stopping criterion" that prevents further splitting of a node, and observe the resulting changes in accuracy and model complexity.
  2. Add noise to the dataset and evaluate the accuracy of the model along with the change in its complexity (number of nodes).
  3. Perform "Reduced Error Pruning" on the tree and measure the change in accuracy of the tree.
  4. Create a random forest using the "Feature Bagging" approach: select a random subset of features for each tree, build multiple trees, and take a majority vote for the final prediction (a sketch of this idea follows this list).
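The forest code itself lives in ID3.cpp and is not reproduced here. Below is a minimal, self-contained C++ sketch of the feature-bagging idea from experiment 4; the `DecisionTree` struct, `buildForest`, `forestPredict`, and the placeholder `classify` are hypothetical stand-ins, not the actual interfaces of this repository.

```cpp
// Minimal sketch of feature bagging with majority voting (experiment 4).
// All names below are illustrative; the real implementation is in ID3.cpp.
#include <algorithm>
#include <iostream>
#include <map>
#include <numeric>
#include <random>
#include <vector>

using Sample = std::vector<int>;  // one integer-coded row of ticdata2000.txt

// Hypothetical stand-in for an ID3 tree restricted to a subset of features.
struct DecisionTree {
    std::vector<int> features;  // indices of features this tree may split on
    int classify(const Sample& s) const {
        // Placeholder: a real tree would traverse its nodes using `features`.
        return s[features.front()] % 2;
    }
};

// Build `numTrees` trees, each restricted to a random subset of `subsetSize` features.
std::vector<DecisionTree> buildForest(int numFeatures, int numTrees, int subsetSize,
                                      std::mt19937& rng) {
    std::vector<int> all(numFeatures);
    std::iota(all.begin(), all.end(), 0);
    std::vector<DecisionTree> forest;
    for (int t = 0; t < numTrees; ++t) {
        std::shuffle(all.begin(), all.end(), rng);  // draw a random feature subset
        DecisionTree tree;
        tree.features.assign(all.begin(), all.begin() + subsetSize);
        // Here the tree would be grown on the training data using only tree.features.
        forest.push_back(tree);
    }
    return forest;
}

// Classify a sample by majority vote over all trees in the forest.
int forestPredict(const std::vector<DecisionTree>& forest, const Sample& s) {
    std::map<int, int> votes;
    for (const auto& tree : forest) ++votes[tree.classify(s)];
    return std::max_element(votes.begin(), votes.end(),
                            [](const auto& a, const auto& b) { return a.second < b.second; })
        ->first;
}

int main() {
    std::mt19937 rng(42);
    // ticdata2000.txt has 85 predictive attributes; tree count and subset size are arbitrary here.
    auto forest = buildForest(/*numFeatures=*/85, /*numTrees=*/10, /*subsetSize=*/20, rng);
    Sample example(85, 1);  // dummy sample, for demonstration only
    std::cout << "majority vote: " << forestPredict(forest, example) << '\n';
}
```

Compile with `g++ -std=c++14` (or later); the tree count and feature-subset size above are arbitrary illustration values, not the settings used in the experiments.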