This project analyzes NYC vehicle collision data (2015-2016) to enhance road safety. We examined over 10,000 records to identify key patterns and factors affecting accident severity.
- Languages: Python
- Database: MongoDB
- Analysis Tools: PyMongo, Pandas, NumPy
- Visualization: Matplotlib, Seaborn
- IDEs: Jupyter Notebook, IntelliJ
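As a rough illustration of how the MongoDB side of this stack might be queried, the sketch below builds an aggregation pipeline that counts collisions and fatal collisions per borough. The field names `borough` and `persons_killed`, and the database and collection names in the usage comment, are assumptions for illustration and may not match the project's actual schema.

```python
def build_borough_pipeline(borough_field="borough", killed_field="persons_killed"):
    """Return a MongoDB aggregation pipeline counting collisions per borough.

    Both field names are hypothetical; adjust them to the stored documents.
    """
    return [
        # Skip documents where the borough is missing, null, or empty.
        {"$match": {borough_field: {"$nin": [None, ""]}}},
        {"$group": {
            "_id": f"${borough_field}",
            "collisions": {"$sum": 1},
            # Count a collision as fatal when at least one person was killed.
            "fatal_collisions": {
                "$sum": {"$cond": [{"$gt": [f"${killed_field}", 0]}, 1, 0]}
            },
        }},
        # Largest boroughs first.
        {"$sort": {"collisions": -1}},
    ]


def run_pipeline(collection, pipeline):
    """Run the pipeline on a PyMongo collection and return the rows as a list."""
    return list(collection.aggregate(pipeline))


# Example usage (requires a running MongoDB; names are placeholders):
# from pymongo import MongoClient
# collection = MongoClient("mongodb://localhost:27017")["nyc"]["collisions"]
# rows = run_pipeline(collection, build_borough_pipeline())
```

The resulting list of dictionaries can be passed straight to `pandas.DataFrame` for further analysis.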
Geospatial Distribution:
- Finding: Brooklyn and Queens have the highest number of accidents (over 40% of total incidents). Staten Island has fewer accidents but the highest fatality rate, accounting for 20% of fatal accidents.
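A borough comparison like this one could be computed along the following lines. This is a minimal sketch on toy data, not the project's actual notebook code; the column names `borough` and `persons_killed` are assumed. It keeps three quantities separate because they are easy to conflate: a borough's share of all collisions, its share of fatal collisions, and its fatal collisions per collision.

```python
import pandas as pd


def borough_summary(df):
    """Summarize collisions and fatal collisions per borough.

    Expects hypothetical columns 'borough' and 'persons_killed'.
    """
    fatal = df["persons_killed"] > 0
    out = pd.DataFrame({
        "collisions": df.groupby("borough").size(),
        "fatal_collisions": fatal.groupby(df["borough"]).sum(),
    })
    # Share of all collisions that happened in each borough.
    out["share_of_collisions"] = out["collisions"] / out["collisions"].sum()
    # Share of all fatal collisions that happened in each borough.
    out["share_of_fatal"] = out["fatal_collisions"] / out["fatal_collisions"].sum()
    # Fatal collisions per collision within each borough.
    out["fatal_per_collision"] = out["fatal_collisions"] / out["collisions"]
    return out.sort_values("collisions", ascending=False)


# Toy data for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "borough": ["BROOKLYN", "BROOKLYN", "QUEENS", "STATEN ISLAND", "BROOKLYN"],
    "persons_killed": [0, 0, 0, 1, 1],
})
summary = borough_summary(toy)
print(summary)
```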
Driver Distractions:
- Finding: Driver inattention/distraction is involved in 25% of accidents with injuries. Failure to yield right-of-way accounts for 30% of serious injuries and fatalities.
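Percentages of this kind depend on which subset of accidents is used as the denominator. The sketch below shows one way to compute each contributing factor's share within a chosen subset, here accidents with at least one injury. The column names and the factor labels in the toy data are assumptions.

```python
import pandas as pd


def factor_share(df, mask, factor_col="contributing_factor"):
    """Share of the rows selected by `mask` attributed to each factor.

    `factor_col` is a hypothetical column name.
    """
    return df.loc[mask, factor_col].value_counts(normalize=True)


# Toy data for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "contributing_factor": [
        "Driver Inattention/Distraction",
        "Failure to Yield Right-of-Way",
        "Driver Inattention/Distraction",
        "Unspecified",
        "Failure to Yield Right-of-Way",
    ],
    "persons_injured": [1, 2, 0, 1, 0],
    "persons_killed": [0, 0, 0, 0, 1],
})

# Denominator: accidents with at least one injured person.
injury_mask = toy["persons_injured"] > 0
injury_shares = factor_share(toy, injury_mask)
print(injury_shares)
```

Passing a different mask, such as `toy["persons_killed"] > 0`, gives the shares among fatal accidents instead. Shares computed over different subsets have different denominators, so they should not be added together.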
Vehicle Type Analysis:
- Finding: Passenger vehicles account for 50% of accidents. Motorcycles and bicycles have injury rates exceeding 50%, while large commercial vehicles are involved in 15% of fatal accidents.
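An injury rate per vehicle type can be read as the fraction of that type's accidents in which at least one person was injured. The following sketch computes it under that definition, which is an assumption about how the figures above were derived. The column names `vehicle_type` and `persons_injured` are also assumed.

```python
import pandas as pd


def injury_rate_by_vehicle(df):
    """Collision count and injury rate per vehicle type.

    Expects hypothetical columns 'vehicle_type' and 'persons_injured'.
    """
    injured = df["persons_injured"] > 0
    grouped = injured.groupby(df["vehicle_type"])
    out = pd.DataFrame({
        "collisions": grouped.size(),
        # The mean of a boolean series is the fraction of True values.
        "injury_rate": grouped.mean(),
    })
    return out.sort_values("injury_rate", ascending=False)


# Toy data for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "vehicle_type": [
        "PASSENGER VEHICLE", "PASSENGER VEHICLE", "PASSENGER VEHICLE",
        "PASSENGER VEHICLE", "MOTORCYCLE", "MOTORCYCLE", "BICYCLE", "BICYCLE",
    ],
    "persons_injured": [0, 0, 1, 0, 1, 1, 1, 0],
})
rates = injury_rate_by_vehicle(toy)
print(rates)
```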
Weather Impact:
- Finding:
  - Accidents: 35% occur in rainy or snowy conditions.
  - Injuries: Highest injury rates are observed in partially cloudy (25%) and rainy conditions (20%).
  - Fatalities: Partially cloudy and rainy conditions show 15% higher fatality rates compared to clear weather.
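Weather comparisons like these require each collision to be paired with a weather condition. How that pairing was done here is not stated, so the sketch below assumes a separate daily weather table joined on date. The `weather` table, its `condition` column, and the collision column names are all hypothetical.

```python
import pandas as pd


def rates_by_condition(collisions, weather):
    """Join collisions to daily weather and compute rates per condition.

    Assumes hypothetical columns: collisions['date', 'persons_injured',
    'persons_killed'] and weather['date', 'condition'].
    """
    merged = collisions.merge(weather, on="date", how="left")
    merged["any_injury"] = merged["persons_injured"] > 0
    merged["any_fatality"] = merged["persons_killed"] > 0
    # Rows with no matching weather record get a missing condition and are
    # dropped by groupby's default handling of missing keys.
    rates = merged.groupby("condition").agg(
        collisions=("date", "count"),
        injury_rate=("any_injury", "mean"),
        fatality_rate=("any_fatality", "mean"),
    )
    rates["share_of_collisions"] = rates["collisions"] / rates["collisions"].sum()
    return rates


# Toy data for illustration only; not taken from the dataset.
collisions = pd.DataFrame({
    "date": ["2016-01-01", "2016-01-01", "2016-01-02", "2016-01-02", "2016-01-03"],
    "persons_injured": [1, 0, 1, 1, 0],
    "persons_killed": [0, 0, 0, 1, 0],
})
weather = pd.DataFrame({
    "date": ["2016-01-01", "2016-01-02", "2016-01-03"],
    "condition": ["Clear", "Rain", "Clear"],
})
rates = rates_by_condition(collisions, weather)
print(rates)
```

Dividing each condition's `fatality_rate` by the value for clear weather gives a relative comparison, provided the clear-weather rate is nonzero.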
- Finding:
  - Geospatial Insights: Brooklyn and Queens have the most accidents; Staten Island shows a higher fatality rate despite fewer accidents.
  - Driver Distractions: Inattention and failure to yield are major factors. Inattention is involved in 25% of accidents with injuries, and failure to yield accounts for 30% of serious injuries and fatalities.
  - Vehicle Types: Two-wheelers face the highest injury rates (>50%); large commercial vehicles are involved in 15% of fatal accidents.
  - Weather Factors: Rain and partially cloudy conditions are associated with increased fatality rates, though the overall impact of weather on accident frequency is minimal.
The analysis provides actionable insights to improve road safety, focusing on high-risk areas, driver behavior, and vehicle types.
- NYC Vehicle Collisions Dataset on Kaggle
- Libraries: Matplotlib, Seaborn, PyMongo