royourboat/lcbo-web-scraper

LCBO Web Scraper Bots

Description

My scraper collects three sets of data from the LCBO website: store information, product inventory, and product descriptions.

Designed by pch.vector / Freepik

The registered products number approximately 9,400 wine and 4,100 non-wine. "Non-wine" products can include beer, liquor, and reusable bags.

How does it work?

When a GitHub Actions workflow (see .github/workflows/) is triggered, it executes a bash script. The bash script contains a cURL command that returns the desired data as JSON. That's it!
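As a rough illustration of that step, here is a minimal sketch of what such a script could look like. It is not the script from this repository: the function name `fetch_json`, the example URL, and the output path are all hypothetical placeholders, and the real LCBO endpoint and request headers are covered in the blog posts.

```shell
#!/bin/sh
# Sketch only: the URL and output path used with this function are
# hypothetical placeholders, not the real LCBO endpoint or this
# repository's file layout.
set -eu

# fetch_json URL OUT_FILE: download a JSON response to OUT_FILE with cURL.
fetch_json() {
  # Create the output directory if it does not exist yet.
  mkdir -p "$(dirname "$2")"
  # --fail makes cURL exit non-zero on HTTP errors (4xx/5xx), so a failed
  # request fails the workflow step instead of saving an error page.
  curl --fail --silent --show-error --location \
    --header "Accept: application/json" \
    --output "$2" "$1"
}

# Example call (commented out so the sketch makes no request on its own):
# fetch_json "https://example.com/api/stores" "data/stores.json"
```

A workflow step would then call the function once per data set and commit the resulting JSON file back to the repository.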

For an in-depth guide, check out these blog posts:

  1. Scraping LCBO Data (Part 1: Store Information)
  2. Scraping LCBO Data (Part 2: Product Inventory)
  3. Scraping LCBO Data (Part 3: Product Descriptions)
  4. Making my own scraper bot!
  5. DIY Wine Database with postgreSQL

How can I use it?

  • Fork this repository!
  • In your fork, go to Settings > Actions > General > Workflow permissions and select "Read and write permissions".
  • Adjust the scraping frequency (the cron schedule) in the workflow files under .github/workflows/.
  • Please scrape gently. I purposely do not run simultaneous scraping jobs because (a) I am in no rush, (b) I don't want LCBO to get mad and change their setup, and (c) it is a waste of free CPU minutes.
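If you add your own requests, one way to keep them gentle is to run them strictly one after another with a pause in between. The sketch below is an assumption about how that could be done, not code from this repository: `scrape_gently`, `DELAY_SECONDS`, and the fetch command passed to it are hypothetical names.

```shell
#!/bin/sh
# Sketch only: DELAY_SECONDS and the fetch command are hypothetical,
# chosen for illustration.
set -eu

# Pause between requests, in seconds; override via the environment.
DELAY_SECONDS="${DELAY_SECONDS:-5}"

# scrape_gently FETCH_CMD URL...: call FETCH_CMD on each URL in turn,
# one request at a time, pausing between consecutive requests.
scrape_gently() {
  fetch_cmd="$1"
  shift
  first=1
  for url in "$@"; do
    # Sleep before every request except the first.
    if [ "$first" -eq 0 ]; then
      sleep "$DELAY_SECONDS"
    fi
    first=0
    "$fetch_cmd" "$url"
  done
}
```

Because the loop never starts a request before the previous one has finished, there is at most one open connection to the site at any time.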

Authors

Stephen Ro

Website

Inspirations

  • Simon Willison's GitHub Actions demo and YouTube video!
  • My experiences at LCBO stores

License

This project is licensed under the BSD 3-Clause License. See the LICENSE.md file for details.
