- Removing Chinese, German, French and Spanish language data to reduce filesize
- Updating readme with link to language data
- Updating vendorised pytesseract
- Updating mac tesseract dependencies
- Updated build script for tesseract for mac
- The above fixes #27, #28, #29
- Attempting to fix error where image does not exist
- Improved exception display to end user if processing fails unexpectedly, adding debug info
- Fixes #26, thanks @bwhurd for the bug report
- Other small fixes to support Anki 2.1.41 and beyond
- Dropped official support for Anki versions prior to 2.1.41, though the add-on should still work on them.
- Fix raising of KeyError when img src is not found, thanks @thiswillbeyourgithub for the fix!
- Fix error on some Linux environments, see #20, thanks to user thiswillbeyourgithub for the fix!
- Hotfix to include accidentally gitignored tesseract mac libs
- Added bundled tesseract for Mac, no longer any need to install it separately
- Split out `tessdata` to its own folder, allowing easier installation of new languages
- Change in the way note IDs are processed, no longer limited to 1000 cards
- Fixed issue causing a crash in Anki versions > 2.1.40
- Added some log text that will appear when invalid notes are encountered during a processing run
- Hotfix for config.json syntax error
- Add `num_threads` config option to allow manual setting of number of threads
- Add `use_batching` config option to allow disabling of batching for those for whom it causes performance issues
- Added more unit tests to the process of releasing new versions
- Fixed an issue where OCR text containing "::" would break clozes, now cleans duplicate colons in text
- Reduced batch_size default to 5 to improve the progress bar updating frequency and feel of speed
- Added total time readout to final message on completion
- Added ability to cancel during processing
- Major feature update: now multithreaded, for roughly a 10x speed improvement
- Complete refactor of code for readability and maintainability
- Addition of basic unit tests for OCR section of codebase
- Improved progress bar messaging
- Config setting for `text_output_location` is now read properly when starting OCR class
- More detailed exception readout when exception occurs during processing
- New method for storing the OCR text, now stores it in `title` attr of the img html tag
- Handle old versions of Anki not having different progressbar.update()
- Add alternate import method for Collection due to API changes in Anki
- Changed order of operations so that OCR is attempted before notes are modified to eliminate risk of database errors
- Updated path to tesseract executable for Mac and Linux
- HOTFIX for tesseract cmd path on Mac
- Removed the install file for Tesseract-OCR for Windows, now that the binaries themselves are included.
- Updated the initial message the user sees to notify re: the database change message Anki will show.
- HOTFIX for tesseract executable detection
- Now packaged with Windows binaries for Tesseract-OCR, no install necessary!
- Added flag in config.json to indicate valid tesseract exec
- Updates to README to reflect above changes
- Initial release