This repository is a collection of example projects from students in General Assembly's part-time Data Science course. Its purpose is to give incoming students a sense of the variety and scope of past projects, and to spur their thinking for their own projects.
- NPR One: App Store Reviews Text Analysis
  - Can we use app store review data to identify the app features that are critical to its success?
- NPR One: Predicting User Behavior with N-Nearest Neighbor Stories
  - Can we predict which stories to play for an app user by measuring the similarity between stories?
- MetroMetric
  - How accurate are NextBus predictions of bus arrival time, and can predictions be improved using other data?
  - Other links: code, data
- Forecasting the All-NBA Teams
  - Can we predict the 2015 All-NBA Teams based on player statistics?
  - Other links: paper
- Billboard Top 40 Song Analysis
  - Can we predict the popularity, decade, and genre of hit songs from the past 54 years based on song characteristics?
- Social media and brand marketing in the hotel industry
- Predicting Kickstarter
  - Can we predict whether a Kickstarter project will be funded based on the project characteristics?
  - Other links: code
- Selling Ideas
  - Can we predict the amount of venture funding a startup will receive based on its patent applications?
  - Other links: paper, code, data
- Predicting Crime in DC
  - Can we predict whether a crime is violent or non-violent based on neighborhood characteristics, weather, and other factors?
  - Other links: paper, code, data
- Kaggle Allstate Purchase Prediction Challenge
  - Can we predict which car insurance options a customer will buy based on the quotes they reviewed?
  - Other links: paper, code
- Kaggle Driver Telematics Analysis
  - Can drivers be identified using their driving characteristics?
- Kaggle Seizure Prediction Challenge
  - Can impending seizures be predicted using brain activity data?
- Predicting Loan Defaults in Peer-to-Peer Lending Markets
  - Can we predict which Lending Club loans will default based on the loan and borrower characteristics?
  - Other links: paper
- Machine Technical Analysis
  - Can visual features be used for predictive modeling of security prices?
  - Other links: paper, code, data
- Predicting Treasury Yields
  - Can we predict US Treasury yields using macroeconomic data?
  - Other links: paper, code, data
- Predicting Dota 2 Matches Using Machine Learning
  - Can we predict the outcome of a Dota 2 match before it begins based on hero selection?
  - Other links: paper, code, data
- Predicting User Churn
  - Can we predict which LearnZillion users will turn into engaged users based on their actions?
- Twitter Music Recommendation Engine
  - Can we recommend music for a Twitter user based on their similarity to Twitter feeds of artists?
- Travel Recommendations Using Social Media Data
  - Can we recommend countries to visit based on a user's travel history as captured on social media?
  - Other links: paper
- Predicting Flight Delays
  - Can we predict flight delays using environmental factors and flight characteristics?
  - Other links: paper
- Bike Sharing in Mexico City
  - Can we predict demand for the Mexico City bikesharing system based on weather data and user characteristics?
- Data Athletics
  - Can we predict an athlete's future performance based on their race history and training data?
  - Other links: code, data
- Classifying PDFs as Likely Malicious or Likely Benign
  - Can we predict whether a PDF contains malware based on its characteristics?
- Stanford's Machine Learning course: Machine learning-focused student projects from Andrew Ng's course. Click on the project links at the very bottom of the page to access hundreds of project papers.