This is my first project, written to learn Python. I built it to read the news faster and in a more aggregated form than browsing the site. It scrapes the news categories of the thepressproject.gr site.
Any contributions or ideas are welcome!
It has been tested with Python 3.10 on Windows 10.
Important note: Unfortunately, Avira antivirus flags the updater.exe bundled inside the one-file executable scraper_tpp_gui.exe as a threat, although it contains no malware.
The usage is pretty straightforward.
The GUI automatically loads all the news titles and their dates. To refresh the titles, use menu>renew titles.
If no news was loaded, try renewing the titles via menu>renew titles(bypass). This option requires Chrome and Chromedriver in order to bypass Cloudflare's bot protection. A Chrome window is launched off-screen to access the news, because headless mode gets detected by Cloudflare.
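The off-screen launch described above could look roughly like the sketch below. It is not the project's actual code: the function names, the coordinates, and the window size are assumptions chosen for illustration, and it assumes Selenium 4 is installed. `--window-position` and `--window-size` are standard Chrome command-line switches. Whether this gets past Cloudflare depends on the site's current protection.

```python
def build_offscreen_args(x=-2400, y=-2400, width=1280, height=800):
    """Return Chrome switches that place a normal (non-headless) window off-screen.

    The default coordinates are hypothetical; any position outside the
    visible desktop area should have the same effect.
    """
    return [
        f"--window-position={x},{y}",
        f"--window-size={width},{height}",
    ]


def fetch_page_source(url, driver_path=None):
    """Open url in an off-screen Chrome window and return the page HTML."""
    # Imported lazily so build_offscreen_args() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in build_offscreen_args():
        options.add_argument(arg)

    # driver_path is the location of the Chromedriver binary (assumption:
    # when omitted, Selenium is left to locate the driver itself).
    service = Service(executable_path=driver_path) if driver_path else Service()
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Keeping the argument list in its own function makes it easy to adjust the position if the window still shows up on a multi-monitor setup.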
There are 8 themes.
The default theme is Azure dark. Clicking the Azure theme again switches to Azure light, and vice versa.
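The Azure toggle can be expressed as a small pure function, sketched below. The function name and the lowercase theme keys are hypothetical, and the rule that switching to Azure from another theme starts at the dark variant is an assumption based on dark being the default.

```python
def resolve_theme(current, clicked):
    """Return the theme to apply after the user clicks a theme menu entry."""
    flip = {"azure dark": "azure light", "azure light": "azure dark"}
    if clicked == "azure":
        # Clicking Azure while an Azure variant is active flips dark/light.
        # Coming from any other theme falls back to the default dark variant.
        return flip.get(current, "azure dark")
    # Every other theme is applied as-is.
    return clicked
```

For example, `resolve_theme("azure dark", "azure")` gives `"azure light"`, and calling it again with that result returns to `"azure dark"`.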
The GUI:
The script can be converted to an .exe by running in your terminal:
cd {path/to/scrape_tpp_gui_folder}
py scrape_tpp_gui_pyinstaller.py
You should also convert updater.py to updater.exe to use the "check for updates" command in the menu. Currently, auto-updating does not work when running as a .py script.
cd {path/to/updater_folder}
py pyinstaller_updater.py
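For readers curious what such a build script might contain, here is a minimal sketch that drives PyInstaller from Python. It is not the content of the project's scripts: the helper names and the chosen options are assumptions. `--onefile`, `--windowed`, `--noconfirm` and `--name` are regular PyInstaller options, and `PyInstaller.__main__.run` accepts the same arguments as the command line.

```python
def build_pyinstaller_args(script, name=None, windowed=True):
    """Assemble the PyInstaller argument list for a one-file executable."""
    args = [script, "--onefile", "--noconfirm"]
    if windowed:
        # Suppress the console window, which suits a GUI application.
        args.append("--windowed")
    if name:
        args += ["--name", name]
    return args


def build_exe(script, **kwargs):
    """Run PyInstaller in-process with the assembled arguments."""
    # Imported lazily so the argument helper works without PyInstaller installed.
    import PyInstaller.__main__

    PyInstaller.__main__.run(build_pyinstaller_args(script, **kwargs))
```

Under these assumptions, `build_exe("updater.py", name="updater", windowed=False)` would produce a one-file updater executable in the `dist` folder.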
- SQLite database
- Save to db option in the menu
- Periodically autosave to db
- Let the user choose the frequency of autosaving
- Let the user choose whether to autosave or not
- Create a toplevel window containing all the news from the database
- Add a search option based on date (Greek format DD-MM-YY)
- Add advanced search (author, category, date)
- Code refactoring
  - Move the files from ./classes to ./source/classes dir and, consequently, fix the paths for the rest of the code
  - Move images to source
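The database items in the list above (saving news, searching by a Greek-format date, and the advanced search) could be sketched with the standard `sqlite3` module as follows. The table layout, column names, and function names are assumptions for illustration, not the project's actual schema. Dates are converted from DD-MM-YY to ISO format before storage so that they sort correctly.

```python
import sqlite3
from datetime import datetime


def greek_to_iso(date_text):
    """Convert a DD-MM-YY date string to ISO YYYY-MM-DD."""
    return datetime.strptime(date_text, "%d-%m-%y").date().isoformat()


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) news table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news ("
        "url TEXT PRIMARY KEY, title TEXT, author TEXT, category TEXT, date TEXT)"
    )
    return conn


def save_news(conn, items):
    """Insert (url, title, author, category, DD-MM-YY) tuples, skipping duplicates."""
    rows = [(u, t, a, c, greek_to_iso(d)) for u, t, a, c, d in items]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR IGNORE INTO news VALUES (?, ?, ?, ?, ?)", rows)


def search_news(conn, date=None, author=None, category=None):
    """Return titles matching every filter that was supplied."""
    clauses, params = [], []
    if date:
        clauses.append("date = ?")
        params.append(greek_to_iso(date))
    if author:
        clauses.append("author = ?")
        params.append(author)
    if category:
        clauses.append("category = ?")
        params.append(category)
    sql = "SELECT title FROM news"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY date, title"
    return [row[0] for row in conn.execute(sql, params)]
```

Using the URL as the primary key with `INSERT OR IGNORE` means a periodic autosave can run repeatedly without storing the same article twice.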
Thanks to all the maintainers of the 3rd-party packages and to the StackOverflow users.
Do not forget to donate monthly to the ThePressProject team. Recurring monthly donations are the only way for truly independent journalism to exist.
The ThePressProject trademark, name, and all of its content belong to the ThePressProject team. The 3rd-party packages have their own licenses. All the code written by me is released under the MIT license.