A Reddit bot that summarizes news articles linked in posts on the /r/india subreddit.
Uses two summarization libraries:
1) Smrzr
2) Sumy
Uses Newspaper for scraping content.
Thanks to sr33 for the relevant news links idea and its implementation.
Find the bot on reddit here
The main files are samacharbot2.py, altsummary.py and blacklist.py. Ignore the rest; they are either config files for Heroku or files used for testing.
Help me identify websites that break the bot, i.e. sites for which it posts summaries like "Javascript is not enabled", "Email not sent" or other such messages.
To contribute, fork the repository, append the site's domain name to the blacklist file (the domain is shown in brackets next to the post title on Reddit) and send a pull request!
Thanks!
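As a rough illustration of the blacklist idea described above, here is a minimal sketch in Python. It assumes the blacklist is a plain list of domain strings; the actual layout of blacklist.py may differ. The names `BLACKLIST`, `BROKEN_MARKERS`, `domain_of`, `is_blacklisted` and `looks_broken` are hypothetical and are not taken from the bot's code.

```python
from urllib.parse import urlparse

# Hypothetical entries for illustration; the real list lives in blacklist.py.
BLACKLIST = ["example.com", "news.example.org"]

# Phrases that suggest scraping failed (taken from the examples quoted above,
# not an exhaustive list).
BROKEN_MARKERS = ("javascript is not enabled", "email not sent")


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_blacklisted(url, blacklist=BLACKLIST):
    """True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in blacklist)


def looks_broken(summary):
    """True if the summary contains one of the known failure phrases."""
    text = summary.lower()
    return any(marker in text for marker in BROKEN_MARKERS)
```

A bot loop could call `is_blacklisted(submission_url)` before scraping and `looks_broken(summary)` before commenting, skipping the post if either returns `True`. A summary flagged by `looks_broken` is also a good hint that the site's domain is a candidate for the blacklist.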
Currently working on the following subreddits:
- india
- TESTBOTTEST
- TILinIndia
- willis7737_news
- freesoftware
- parabola
- libreboot
- mumbai
- UpliftingKhabre
TODO:
- Fix duplicate comment bug
- Update to the latest version of PRAW
- Replace goose with newspaper to aggregate content.