This project is a simple Optical Character Recognition (OCR) system that extracts text from images using the Tesseract OCR engine and OpenCV for image preprocessing.
This program provides basic OCR functionality by extracting text from images. It loads an image, preprocesses it with OpenCV (grayscale conversion plus optional preprocessing steps), and then runs the Tesseract OCR engine on the result. The extracted text is printed to the console.
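The flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `ocr_script.py`: the function name `extract_text` and the `binarize` flag are hypothetical, and Otsu thresholding is only one assumed example of an "optional preprocessing step".

```python
from pathlib import Path


def extract_text(image_path: str, binarize: bool = False) -> str:
    """Load an image, convert it to grayscale, and return the text Tesseract finds.

    Hypothetical helper for illustration; the real script may be organized differently.
    """
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image file at: {path}")

    # Third-party imports are local so the path check above fails fast
    import cv2
    import pytesseract

    image = cv2.imread(str(path))
    if image is None:
        # cv2.imread returns None when the file cannot be decoded as an image
        raise ValueError(f"Could not read image: {path}")

    # Grayscale conversion (OpenCV loads color images in BGR order)
    processed = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    if binarize:
        # Assumed optional step: Otsu thresholding to a black-and-white image
        _, processed = cv2.threshold(
            processed, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
        )

    return pytesseract.image_to_string(processed)
```

Calling `print(extract_text("scan.png"))` would then print the recognized text to the console, where `scan.png` stands in for any image file of your own.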
- Tesseract OCR
- OpenCV
- pytesseract
- Python 3.x
- Install Tesseract OCR. You can download it from the Tesseract GitHub repository.
- Install OpenCV using pip:
pip install opencv-python
- Install the pytesseract library:
pip install pytesseract
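After the installation steps above, a quick check along these lines may help confirm that everything can be found. It is a sketch under assumptions: `check_dependencies` is a hypothetical helper, and it assumes the Tesseract executable is named `tesseract` and is on your `PATH`, which is not always the case (for example on some Windows installs).

```python
import importlib.util
import shutil


def check_dependencies() -> dict:
    """Report which of the three dependencies can be found in this environment."""
    return {
        # The Tesseract engine is a separate executable, not a Python package
        "tesseract": shutil.which("tesseract") is not None,
        # The opencv-python package is imported under the name cv2
        "cv2": importlib.util.find_spec("cv2") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
    }
```

Running `print(check_dependencies())` shows a `True` or `False` entry for each dependency; any `False` entry points to the installation step worth revisiting.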
- Clone the repository or download the Python script.
- Ensure you have an image file (e.g., JPEG, PNG) containing text.
- Run the script:
python ocr_script.py
- When prompted in the terminal, paste the path to your image file.
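Pasted paths often arrive with stray whitespace or surrounding quotes, especially when a file is dragged into the terminal. The sketch below shows one way the prompt input could be tidied before use. The helper name `clean_pasted_path` is hypothetical, and the script itself may handle its input differently.

```python
import os


def clean_pasted_path(raw: str) -> str:
    """Normalize a path pasted at a terminal prompt (hypothetical helper)."""
    path = raw.strip()
    # Drop one pair of matching quotes, which some terminals add on drag-and-drop
    if len(path) >= 2 and path[0] == path[-1] and path[0] in "'\"":
        path = path[1:-1]
    # Expand a leading ~ to the user's home directory
    return os.path.expanduser(path)
```

It could be combined with the prompt as `clean_pasted_path(input("Path to image file: "))`, where the prompt wording is an assumption.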
- Andrew Reyes
- GitHub: @areyes42
- 0.1
- Initial Release
This project is licensed under the MIT License. See the LICENSE.md file for details.