
Scrapes the Today leaderboard from Strava and outputs a leaderboard for each segment and for the overall standings.


e-staats/strava-enduro-gc-scraper


What Is This?

This is a tool that uses an automated browser to collect all the times for a set of Strava segments completed today, then adds up the times and outputs the results. It is intended to simplify the scoring process for an event like an Enduro, where the goal is to set the lowest cumulative time over a variety of segments on the day.

How to Use the Tool

This is an automated browser that scrapes data off the Strava leaderboards, so please use it responsibly. It logs in to your own account, so what you do will be tied back to you. You also need to be a Strava subscriber to get access to the granular leaderboard data.

Usage

  1. Have Python installed on your system.
  2. Create a virtual environment, activate it, and install the packages listed in requirements.txt.
  3. Rename secrets_template.py to my_secrets.py
  4. Replace the placeholder email and password in the environment variable definitions in my_secrets.py with your own Strava email and password.
  5. Create a .txt file with any name you like and paste in Strava segment IDs, one per line. After each ID, add a # followed by the segment's name.
    1. You can find a segment's ID in the URL of its segment page. For example, the UW Arboretum Northbound segment is at https://www.strava.com/segments/704038, so its segment ID is 704038.
    2. For example, you could have a file called weekly_shop_ride_sprints.txt and it would look like this:
    29713412  # Rutland-Dunn Kicker
    7321846  # Sun Valley - Only Up
    10768746  # Krooked Tree Sprint
    704038  # UW Arboretum Northbound
    
  6. Run the tool with python cli.py [OPTIONS]. Run python cli.py --help to get help information about the options. Examples:
    1. python cli.py -s ./weekly_shop_ride_sprints.txt to run the scraper on a list of segments
  7. Check the /printouts folder for the output.
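
To make the segment file format from step 5 concrete, here is a minimal sketch of how a file like weekly_shop_ride_sprints.txt could be read. It is not the tool's own parser: the function names `parse_segment_lines` and `parse_segment_file` are hypothetical, and the sketch assumes that everything before the # is a numeric segment ID and everything after it is a free-form name.

```python
from pathlib import Path


def parse_segment_lines(lines):
    """Return (segment_id, name) tuples from lines shaped like 'ID  # name'.

    Hypothetical helper for illustration; the real tool may parse differently.
    """
    segments = []
    for raw in lines:
        # Everything before '#' should be the ID; everything after is the note.
        id_part, _, comment = raw.partition("#")
        id_part = id_part.strip()
        if not id_part:
            continue  # skip blank lines and comment-only lines
        if not id_part.isdigit():
            raise ValueError(f"Not a numeric segment ID: {id_part!r}")
        segments.append((int(id_part), comment.strip()))
    return segments


def parse_segment_file(path):
    """Read a segment list from disk (assumes UTF-8 text)."""
    return parse_segment_lines(Path(path).read_text(encoding="utf-8").splitlines())


# The example file contents from step 5.
example = """\
29713412  # Rutland-Dunn Kicker
7321846  # Sun Valley - Only Up
10768746  # Krooked Tree Sprint
704038  # UW Arboretum Northbound
"""

segments = parse_segment_lines(example.splitlines())
for segment_id, name in segments:
    print(f"https://www.strava.com/segments/{segment_id}  {name}")
```

Running a check like this on your own file before starting the scraper is a quick way to catch a mistyped ID or a missing #.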

Interpreting the output

The /printouts folder contains two files. raw_seconds.csv holds the unformatted times in seconds, which is useful for validating the math. by_total_time.csv ranks the riders by total time, with the times formatted as hh:mm:ss. If you're awarding prizes on each segment, you can sort this file by the individual segment columns to see the top times for each.
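
If you want to check the totals by hand, the scoring idea can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the names `format_hms` and `rank_by_total_time` are hypothetical, the riders and times are made up, and the sketch assumes that a rider who is missing any segment is left out of the overall ranking, which may differ from how the tool treats that case.

```python
def format_hms(total_seconds):
    """Format a number of seconds as hh:mm:ss."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def rank_by_total_time(times_by_rider, segment_ids):
    """Return (rider, total_seconds) pairs sorted from lowest to highest total.

    Assumption: riders without a time on every segment are excluded.
    """
    totals = []
    for rider, times in times_by_rider.items():
        if all(seg in times for seg in segment_ids):
            totals.append((rider, sum(times[seg] for seg in segment_ids)))
    return sorted(totals, key=lambda item: item[1])


# Made-up example data: seconds per segment, keyed by segment ID.
segment_ids = [704038, 7321846]
times_by_rider = {
    "Rider A": {704038: 95, 7321846: 312},
    "Rider B": {704038: 90, 7321846: 330},
    "Rider C": {704038: 88},  # missed a segment, so not ranked here
}

ranking = rank_by_total_time(times_by_rider, segment_ids)
for place, (rider, total) in enumerate(ranking, start=1):
    print(place, rider, format_hms(total))
```

Note that Rider B has the fastest time on one segment but still places second overall, which is why sorting by the individual segment columns matters for per-segment prizes.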

FAQ

It tries to log into Strava and fails

Check your credentials in my_secrets.py. If it fails repeatedly and your credentials are correct, please open an issue on GitHub with as many details as possible.

It's giving me the wrong data!

See if the output says it's using cached data. If it is, you can delete the .cache folder and it will fetch fresh data.

Automated browser scraping? Really?

Yeah... If Strava made this information accessible through the API, this would be much easier and better. Alternatively, they could support this type of feature natively on their website. Please give kudos to this suggestion!
