🚀 Machine Learning and Deep Learning Projects Repository

📋 Overview

This repository is a collection of Machine Learning (ML) and Deep Learning (DL) projects focusing on data analysis, model building, and evaluation. The projects cover clustering, classification, regression, text generation, and sentiment analysis using various datasets and algorithms.

📁 Projects in the Repository

Census Solution: Clustering on Census Income Dataset
Deep Learning using Fashion MNIST
Lending Club Loan Data Analysis
Lyrics Generation using LSTM
Mercedes-Benz Greener Manufacturing
Movielens Case Study
Sentiment Detection on IMDB Dataset

🔎 Project Details

1️⃣ Census Solution: Clustering on Census Income Dataset

Objective: Analyze the Census Income dataset and perform clustering using KMeans to uncover patterns in demographic attributes.

Steps:

Data Cleaning and Preprocessing
Exploratory Data Analysis (EDA)
Feature Scaling and Encoding
Clustering with KMeans
Insights and Interpretation

Tools & Libraries: Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn
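
The scaling and clustering steps above can be illustrated with a small library-free sketch. This is not the notebook's code (the notebook uses Scikit-learn); the helper names `standardize`, `nearest`, and `kmeans` and the sample rows are invented for illustration, and the loop is plain Lloyd's algorithm with a random initial pick.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(n_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical (age, hours_per_week) rows, invented for this example.
rows = [[25, 20], [27, 22], [24, 18], [52, 45], [55, 50], [50, 48]]
labels, centroids = kmeans(standardize(rows), k=2)
print(labels)
```

Scaling comes first because KMeans relies on distances, so a feature with a large numeric range would otherwise dominate the cluster assignment.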

2️⃣ Deep Learning using Fashion MNIST

Objective: Build a Deep Neural Network (DNN) to classify clothing images from the Fashion MNIST dataset into 10 categories.

Steps:

Data Preprocessing: Normalization and Label Encoding
Model Development:
- Input Layer: Flattened 28x28 images
- Hidden Layers: ReLU Activation
- Output Layer: Softmax Activation
Performance Metrics: Accuracy, Confusion Matrix

Tools & Libraries: TensorFlow, Keras, Pandas, NumPy, Matplotlib
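
As a rough illustration of the preprocessing and activation functions listed above, here is a library-free sketch. It is not the Keras model from the notebook; the function names and the sample values are assumptions made up for this example.

```python
import math


def normalize_and_flatten(image):
    """Scale 0-255 pixel values to [0, 1] and flatten rows into one vector."""
    return [pixel / 255.0 for row in image for pixel in row]


def relu(values):
    """ReLU keeps positive activations and zeroes out the rest."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# A dummy 28x28 "image" with every pixel at full brightness.
image = [[255] * 28 for _ in range(28)]
flat = normalize_and_flatten(image)  # 784 values, the input layer size

# Made-up hidden-layer pre-activations passed through ReLU.
hidden = relu([-0.7, 1.3, 0.0])

# Ten made-up output scores, one per clothing category.
probs = softmax([-1.0, 0.5, 2.0, 0.0, 0.1, -0.3, 1.2, 0.0, 0.0, 0.4])
predicted_class = probs.index(max(probs))
```

The predicted class is simply the index of the largest softmax probability, which is how a 10-way output layer is usually read.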

3️⃣ Lending Club Loan Data Analysis

Objective: Analyze Lending Club loan data and predict loan repayment status using a Neural Network.

Steps:

Data Cleaning and Feature Engineering
Exploratory Data Analysis (EDA)
Neural Network Implementation
Model Training and Evaluation

Tools & Libraries: TensorFlow, Keras, Pandas, Seaborn, Matplotlib

4️⃣ Lyrics Generation using LSTM

Objective: Generate song lyrics using LSTM-based Recurrent Neural Networks (RNNs).

Steps:

Data Preprocessing: Cleaning and Tokenization
Text to Sequence Transformation
LSTM Model Implementation
Model Training and Prediction
Generating Lyrics from Seed Text

Tools & Libraries: TensorFlow, Keras, NumPy, Pandas, Matplotlib
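
One common way to do the text-to-sequence step is to turn each line into growing word prefixes, where the model learns to predict the last word from the words before it. The sketch below assumes that scheme; the notebook may use a different one (for example Keras tokenization utilities), and the function names and sample lines are invented for illustration.

```python
def build_vocab(lines):
    """Map each distinct word to an integer id, starting at 1 (0 is padding)."""
    vocab = {}
    for line in lines:
        for word in line.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def make_training_pairs(lines, vocab):
    """Turn each line into growing prefixes: (words so far) -> next word."""
    sequences = []
    for line in lines:
        ids = [vocab[w] for w in line.lower().split()]
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])
    max_len = max(len(s) for s in sequences)
    # Left-pad with zeros so every sequence has the same length.
    padded = [[0] * (max_len - len(s)) + s for s in sequences]
    inputs = [s[:-1] for s in padded]
    targets = [s[-1] for s in padded]
    return inputs, targets


# Two invented lines standing in for a lyrics corpus.
lines = ["hello darkness my old friend", "hello my friend"]
vocab = build_vocab(lines)
inputs, targets = make_training_pairs(lines, vocab)

for x, y in zip(inputs, targets):
    print(x, "->", y)
```

Generating lyrics from seed text then amounts to encoding the seed the same way, predicting the next word id, appending it, and repeating.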

5️⃣ Mercedes-Benz Greener Manufacturing

Objective: Predict the time a car spends on the test bench (the target variable y) using regression models, in support of Mercedes-Benz's goal of reducing testing time.

Steps:

Dimensionality Reduction with PCA (98% Variance Retention)
Regression Models:
- XGBoost Regressor
- Random Forest Regressor
Model Comparison and Performance Evaluation (MSE)

Tools & Libraries: Pandas, NumPy, Scikit-learn, XGBoost
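
To make the 98% variance threshold and the MSE comparison concrete, here is a small library-free sketch. It only shows how a retention threshold translates into a component count and how MSE ranks two sets of predictions; the function names and all numbers are made up for illustration, and Scikit-learn's `PCA` (which accepts a fraction as `n_components`) may handle the boundary case slightly differently.

```python
def components_for_variance(variances, threshold=0.98):
    """Return how many leading components reach the variance threshold."""
    total = sum(variances)
    cumulative = 0.0
    for count, variance in enumerate(sorted(variances, reverse=True), start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return count
    return len(variances)


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up per-component variances (eigenvalues of a covariance matrix).
n_components = components_for_variance([5.0, 3.0, 1.5, 0.4, 0.1])

# Made-up target values y and predictions from two hypothetical models.
y_true = [100.0, 90.0, 110.0]
mse_a = mean_squared_error(y_true, [98.0, 93.0, 110.0])
mse_b = mean_squared_error(y_true, [100.0, 80.0, 115.0])

print(n_components, round(mse_a, 3), round(mse_b, 3))
```

The model with the lower MSE is the better fit on that data, which is the basis for comparing the XGBoost and Random Forest regressors.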

6️⃣ Movielens Case Study

Objective: Analyze user ratings, identify top movies, and predict movie ratings using machine learning models.

Steps:

Merging and Cleaning User, Movie, and Ratings Data
Exploratory Data Analysis:
- User Age Distribution
- Top 25 Movies by Viewership
Feature Engineering:
- One-Hot Encoding for Genres
Models for Rating Prediction:
- Logistic Regression
- Decision Tree
- Random Forest
Model Performance Evaluation

Tools & Libraries: Pandas, Seaborn, Matplotlib, Scikit-learn
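
The genre one-hot encoding step can be sketched as follows. This assumes genres are stored as a single pipe-separated string per movie, as in the MovieLens movie files; the function name and the sample rows are invented for illustration. In pandas, `Series.str.get_dummies(sep="|")` produces a comparable result.

```python
def one_hot_genres(movies, separator="|"):
    """Expand a delimited genre string into one 0/1 column per genre."""
    genres = sorted({g for m in movies for g in m["genres"].split(separator)})
    encoded = []
    for movie in movies:
        present = set(movie["genres"].split(separator))
        row = {"title": movie["title"]}
        row.update({genre: int(genre in present) for genre in genres})
        encoded.append(row)
    return genres, encoded


# A few invented rows; the titles and genres are placeholders.
movies = [
    {"title": "Movie A", "genres": "Action|Comedy"},
    {"title": "Movie B", "genres": "Drama"},
    {"title": "Movie C", "genres": "Action|Drama"},
]
genres, encoded = one_hot_genres(movies)

for row in encoded:
    print(row)
```

Each movie can belong to several genres at once, so a row may contain more than one 1, unlike a one-hot encoding of a single categorical value.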

7️⃣ Sentiment Detection on IMDB Dataset

Objective: Perform sentiment analysis on IMDB movie reviews using Logistic Regression, Naive Bayes, and Decision Tree models.

Steps:

Text Preprocessing:
- Removing HTML Tags, Punctuation, and Stopwords
- Lemmatization
Vectorization:
- Bag-of-Words (BoW)
- TF-IDF
Hyperparameter Tuning:
- Logistic Regression: GridSearchCV
- Naive Bayes: Hyperparameter Optimization
- Decision Tree: GridSearchCV
Model Evaluation:
- Accuracy, Precision, Recall, and F1-Score
- Confusion Matrix for Results Visualization
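
The text preprocessing and vectorization steps above can be sketched without any library, as below. This is an illustration, not the notebook's code: the stopword list is a tiny stand-in for NLTK's, lemmatization is skipped because it needs NLTK's WordNet data, and the TF-IDF formula is the textbook one. Scikit-learn's `TfidfVectorizer` uses a smoothed IDF and normalizes each row, so its numbers differ.

```python
import math
import re
import string
from collections import Counter

# A tiny illustrative stopword list; NLTK ships a much longer one.
STOPWORDS = {"a", "an", "the", "is", "was", "this", "it", "and", "of"}


def clean_review(text):
    """Strip HTML tags and punctuation, lowercase, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [w for w in text.lower().split() if w not in STOPWORDS]


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def tf_idf(docs_tokens, vocabulary):
    """Textbook TF-IDF: term frequency times log(N / document frequency)."""
    n_docs = len(docs_tokens)
    doc_freq = {w: sum(w in doc for doc in docs_tokens) for w in vocabulary}
    rows = []
    for doc in docs_tokens:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        rows.append([
            (counts[w] / length) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ])
    return rows


# Two invented reviews standing in for the IMDB data.
reviews = [
    "This movie was <br /> great, great fun!",
    "The plot was dull and the acting was dull.",
]
tokens = [clean_review(r) for r in reviews]
vocabulary = sorted({w for doc in tokens for w in doc})
bow = [bag_of_words(doc, vocabulary) for doc in tokens]
weights = tf_idf(tokens, vocabulary)
```

Bag-of-Words keeps raw counts, while TF-IDF down-weights words that appear in many reviews, which is why the two vectorizers can give different accuracy for the same model.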

| Model | Accuracy (BoW) | Accuracy (TF-IDF) |
|---|---|---|
| Logistic Regression | 89.13% | 88.95% |
| Naive Bayes | 85.73% | 85.11% |
| Decision Tree Classifier | 73.40% | 73.40% |

Tools & Libraries: Pandas, NLTK, NumPy, Scikit-learn, Seaborn, Matplotlib
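
For reference, the evaluation metrics used in this project all derive from the four cells of a binary confusion matrix. The sketch below shows the arithmetic on invented labels; the function names are made up for illustration, and the notebook presumably relies on Scikit-learn's metric functions instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```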

🛠️ Technologies Used

Python: TensorFlow, Keras, Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn, NLTK, XGBoost

📂 Repository File Structure

.
|-- Census_Solution.ipynb                     # Clustering on Census Dataset
|-- Deep Learning using Fashion MNIST.ipynb   # Fashion MNIST Classification
|-- Lending Club Loan Data Analysis.ipynb     # Loan Status Prediction
|-- Lyrics_Generation.ipynb                   # LSTM-based Lyrics Generation
|-- Mercedes-Benz Greener Manufacturing.ipynb # Regression for Runtime Prediction
|-- Movielens Case Study .ipynb               # Movie Ratings Analysis & Prediction
|-- Sentiment_Detection_imdb.ipynb            # Sentiment Analysis on IMDB Dataset
|-- README.md                                 # Repository Documentation

⚙️ How to Run

Clone the Repository:

git clone https://github.com/your_username/ml-dl-projects.git
cd ml-dl-projects

Install Dependencies:

pip install tensorflow keras pandas numpy seaborn matplotlib scikit-learn nltk xgboost

Run the Notebooks:
- Launch Jupyter Notebook:
```
jupyter notebook
```
- Open the respective project file (e.g., Sentiment_Detection_imdb.ipynb) and execute the cells.
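
Before opening a notebook, it can help to confirm that the packages from the install command are importable. The snippet below is an optional convenience sketch, not part of the repository; `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Import names for the packages in the install command above.
# Note that scikit-learn is imported as "sklearn".
REQUIRED = ["tensorflow", "keras", "pandas", "numpy", "seaborn",
            "matplotlib", "sklearn", "nltk", "xgboost"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```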

🔮 Future Enhancements

Implement Transformer Models (e.g., BERT) for sentiment analysis.
Use Collaborative Filtering for improved movie recommendations.
Optimize LSTM performance for longer lyrics generation.
Add Ensemble Methods to combine models for better predictions.

👨‍💻 Contributors

Harmanan Kohli - Data Scientist and ML/DL Enthusiast

📄 License

This repository is licensed under the MIT License.
