
Short data science and big data scripts at your service


wadhwasahil/ML-bucket


Ml-bucket

Ever been implementing a research paper and wished for a few small scripts that would make your life easier? This repository aims to collect ML utilities (NLP, CV, etc.) in one place so that you can reuse them with minor tweaks whenever you need them. I have added some basic scripts and will add more over time.

  1. tf-idf.py implements the standard tf-idf (term frequency-inverse document frequency) algorithm using sklearn (TfidfVectorizer). You can use HashingVectorizer instead for better speed and scalability.

  2. SVM.py implements the Support Vector Machine algorithm on the data in train.csv. The code first removes all unnecessary features, converts the categorical/nominal features to numerical ones using one-hot encoding, and then trains the model using LibSVM.
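As a rough illustration of the first script's idea, the sketch below runs both vectorizers on a tiny corpus. It is not the contents of tf-idf.py: the corpus and the `n_features` value are made up for the example.

```python
from sklearn.feature_extraction.text import HashingVectorizer, TfidfVectorizer

# Toy corpus, invented for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# TfidfVectorizer learns a vocabulary and idf weights from the corpus,
# then returns one L2-normalized sparse row per document.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(docs)

# HashingVectorizer stores no vocabulary: tokens are hashed into a fixed
# number of columns, so it needs no fitting pass and its memory use does
# not grow with the corpus. It produces normalized term counts and applies
# no idf weighting on its own. n_features here is an arbitrary choice.
hasher = HashingVectorizer(n_features=2**10)
X_hashed = hasher.transform(docs)

print(X_tfidf.shape)   # (number of documents, vocabulary size)
print(X_hashed.shape)  # (number of documents, n_features)
```

The trade-off is that hashed columns cannot be mapped back to the original words, so TfidfVectorizer remains the better fit when you need to inspect which terms carry weight.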
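The three steps of the second script (drop features, one-hot encode, train) can be sketched as follows. This is only an approximation under stated assumptions: the data frame stands in for train.csv, every column name is hypothetical, and scikit-learn's `SVC` (which is built on LibSVM) stands in for whichever LibSVM interface SVM.py actually uses.

```python
import pandas as pd
from sklearn.svm import SVC

# Made-up stand-in for train.csv; all column names are hypothetical.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "colour": ["red", "blue", "red", "green", "blue", "green"],
    "size": [1.0, 2.5, 1.2, 3.1, 2.7, 3.3],
    "label": [0, 1, 0, 1, 1, 1],
})

# Step 1: drop features that carry no signal (here, a row identifier).
df = df.drop(columns=["id"])

# Step 2: one-hot encode the categorical column. Each category becomes
# its own 0/1 column; dtype=float keeps the whole matrix numeric.
features = pd.get_dummies(
    df.drop(columns=["label"]), columns=["colour"], dtype=float
)
target = df["label"]

# Step 3: train the SVM. Kernel and C are illustrative defaults,
# not values taken from the repository.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(features, target)

# Predicting on the training rows only shows the call shape; it says
# nothing about how well the model generalizes.
predictions = clf.predict(features)
print(list(features.columns))
print(predictions)
```

For real data, new rows must be encoded with the same columns as the training set, which is one reason to consider scikit-learn's `OneHotEncoder` inside a pipeline instead of `get_dummies`.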

Everyone is encouraged to contribute to this repository.
