A scraper used to find new Kijiji posts and send a Discord message.

i-am-fyre/Scraper-Discord-Notification

🤖 Scraper w/ Discord Notification

A web scraper for Kijiji.ca that watches the searches you configure. You can define exclude terms so that unrelated items do not trigger a notification. All notifications are sent to the Discord channel of your choice via a webhook. Tested on Ubuntu 18.04.3 LTS.

A lot of credit goes to @CRutkowski, whose Kijiji-Scraper project (https://github.com/CRutkowski/Kijiji-Scraper) served as the base for this one.

Scraper - Notification - Embed

Installation

Make sure that your machine is up to date.

$ sudo apt update

$ sudo apt upgrade

Use git to clone the repository into the /github/ directory on your machine:

$ sudo git clone https://github.com/Fyre-Homelab/Scraper-Discord-Notification.git /github/Scraper-Discord-Notification

Install the required system packages:

$ sudo apt install python3-pip python3-bs4

Install the required Python packages:

$ sudo -H pip3 install -r /github/Scraper-Discord-Notification/requirements.txt

Setup

Give main.py executable permissions.

$ sudo chmod +x /github/Scraper-Discord-Notification/main.py

Give ads.json write permissions so that old ads can be stored.

$ sudo chmod a+w /github/Scraper-Discord-Notification/ads.json

Open config.yaml to adjust the settings.

$ sudo nano /github/Scraper-Discord-Notification/config.yaml

Get a Discord Webhook

Check out: https://support.discordapp.com/hc/en-us/articles/228383668-Intro-to-Webhooks
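To illustrate what a webhook does, here is a minimal sketch of posting a message to one from Python. It is not the code in main.py. The function names `build_payload` and `send_notification`, the message wording, and the `User-Agent` value are all assumptions made for this example. The `username` and `content` fields are standard fields of a Discord webhook request body.

```python
import json
import urllib.request


def build_payload(bot_name, title, url):
    """Return the JSON body for a Discord webhook message (hypothetical helper)."""
    return {
        "username": bot_name,  # overrides the webhook's default display name
        "content": f"New ad: {title}\n{url}",
    }


def send_notification(webhook_url, payload):
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: an explicit user agent, in case the default
            # urllib one is rejected by the server.
            "User-Agent": "scraper-sketch/0.1",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = build_payload("KijijiBot", "Free couch", "https://example.com/ad")
# send_notification("<your webhook url>", payload)  # uncomment with a real webhook URL
```

The network call is left commented out so the sketch can be run without a real webhook URL.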

config.yaml

webhook: insert the URL provided by Discord. Do not use quotation marks.

bot name: the name you want your webhook to use. Do not use quotation marks.

URLs to scrape: add one entry per search, written as '- url:' followed by a space and the URL of the Kijiji search you want scraped. Repeat for each search.

- url: https://www.kijiji.ca/b-free-stuff/manitoba/c17220001l9006
- url: https://www.kijiji.ca/b-cars-vehicles/manitoba/tesla/k0c27l9006

To get a search URL, go to http://www.kijiji.ca and use the search box for the item you're looking for.

Kijiji - Search

Use any of the filters on the left-hand side of the page to narrow down your search as much as you want (e.g. region, price, etc.).

Kijiji - URL

terms to exclude from scraping: under a search URL, list the terms that should keep an ad from being posted to Discord. Follow YAML indentation as in the example. Regular expressions are supported. Wrap each term in quotation marks.

- url: https://www.kijiji.ca/b-manitoba/lto/k0l9006
  exclude:
    - "LTO1"
    - "LTO2"
    - "LTO3"
    - "2[0-9][0-9]5"
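As a rough sketch of how exclude terms like these could be applied, the snippet below tests each pattern against an ad title with Python's `re.search`. This is an illustration, not the project's actual filtering code. It assumes that patterns are matched against the title only and that matching is case-sensitive. The helper name `is_excluded` and the sample titles are made up for the example.

```python
import re


def is_excluded(title, exclude_patterns):
    """Return True if any exclude pattern matches anywhere in the title."""
    return any(re.search(pattern, title) for pattern in exclude_patterns)


# Patterns taken from the config example above.
exclude = ["LTO1", "LTO2", "LTO3", "2[0-9][0-9]5"]

# Hypothetical ad titles.
titles = ["LTO5 tape drive", "LTO2 cartridges", "Server rack 2005 model"]

kept = [title for title in titles if not is_excluded(title, exclude)]
print(kept)  # ['LTO5 tape drive']
```

Note that the last pattern, "2[0-9][0-9]5", matches any four-character run of the form 2xx5, such as 2005 or 2015, which is why the third title is filtered out.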

Initial Start-Up (Optional)

The first time you run the script, it may find anywhere from ten to a few hundred search results, most of them older than you care to be notified about. To avoid a flood of notifications, run the script manually once in "silent mode" by appending -s, as in the following command:

$ sudo python3 /github/Scraper-Discord-Notification/main.py -s

This populates the ads.json file in the /github/Scraper-Discord-Notification/ directory with all current search results without sending any Discord notifications. Once you have set up a recurring check via crontab (described below), only new entries will trigger notifications.
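The idea behind the silent first run can be sketched as follows: ads that are already recorded in the JSON file are skipped, new ones are recorded, and notifications are sent only when silent mode is off. This is a simplified illustration under stated assumptions, not the logic of main.py. The layout of the file (a JSON object keyed by ad ID), the `title` field, and the names `load_seen` and `process_ads` are all hypothetical.

```python
import json
from pathlib import Path


def load_seen(path):
    """Load previously stored ads; a missing or empty file means none seen yet."""
    file = Path(path)
    if not file.exists() or file.stat().st_size == 0:
        return {}
    return json.loads(file.read_text(encoding="utf-8"))


def process_ads(scraped, path, silent=False, notify=print):
    """Record newly scraped ads and notify about them unless silent is set.

    `scraped` maps an ad ID to a dict with at least a "title" key
    (an assumed structure for this sketch).
    """
    seen = load_seen(path)
    new_ads = {ad_id: ad for ad_id, ad in scraped.items() if ad_id not in seen}
    if not silent:
        for ad in new_ads.values():
            notify(ad["title"])
    seen.update(new_ads)
    Path(path).write_text(json.dumps(seen, indent=2), encoding="utf-8")
    return new_ads
```

Calling `process_ads(results, "ads.json", silent=True)` once would store every current result without notifying. A later call with `silent=False` would then notify only about ads whose IDs are not yet in the file.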

How to set up a recurring check

$ crontab -e

*/15 * * * * /github/Scraper-Discord-Notification/main.py

For example, this entry runs the scraper every 15 minutes.

NOTE: Do not use 'sudo crontab -e', as the script will not run properly!
