pareddy113/image-word-finder
Most documents, images, and newspapers we encounter are paper based, and searching through thousands of words in an image, book, or paper for a single word is frustrating. Wouldn't it be great to search through them and find the location of every occurrence of the word in the image? What if the words detected in the image or newspaper were processed further to make them more accurate, with spell check and correction applied to the output?

This project addresses all of the above. It takes an image as input, processes it, and detects the text in it using the Tesseract OCR engine, then runs spell check and correction on the detected text to make the output more consistent. The spell check and correction is based on Peter Norvig's algorithm, which is simple yet reaches an accuracy of about 90% and processes around 10 words a second. You can then enter the words you want to search for: the program searches the entire image and reports not only whether each word is present but also the exact location of every occurrence.

Pre-requisites:
Python 3.5
OpenCV for Python 3
Numpy
Pillow 3.x
Tesseract & pytesseract

Package installation commands for Mac OS X:
pip3 install opencv-python → OpenCV lib for Python 3.x
pip3 install opencv-contrib-python
pip3 install pillow → PIL for Python 3.x, dependency of pytesseract
brew install tesseract → OCR engine
pip3 install pytesseract → Python 3.x wrapper around tesseract OCR

Run the program:
Keep the downloaded folder anywhere on your machine and run the following command from inside that folder:
python3 main.py

spell.py → used for spell check and correction using Peter Norvig's algorithm
corpus.txt → most-used words, for spell check and correction
Peter Norvig's algorithm: http://norvig.com/spell-correct.html

You can interact with the program and search for a word.
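The spell-correction step can be illustrated with a minimal sketch in the style of Norvig's essay linked above. This is not the project's spell.py: the tiny inline word list stands in for corpus.txt, and the names `SAMPLE_TEXT`, `WORDS`, `edits1`, `known`, and `correction` are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny stand-in corpus (assumption); the project reads its words from corpus.txt.
SAMPLE_TEXT = "the quick brown fox jumps over the lazy dog the spelling of a word"
WORDS = Counter(re.findall(r"[a-z]+", SAMPLE_TEXT.lower()))
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the candidates that appear in the corpus."""
    return {w for w in words if w in WORDS}


def correction(word):
    """Most frequent known candidate within two edits, else the word itself."""
    word = word.lower()
    candidates = (known([word])
                  or known(edits1(word))
                  or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
                  or {word})
    return max(candidates, key=lambda w: WORDS[w])
```

With this sample corpus, `correction("speling")` yields `"spelling"` and `correction("teh")` yields `"the"`. A larger word list would be needed for realistic OCR output.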
The search image is 4.png by default; you can change it to whichever image you want to search.
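One way to get word locations is sketched below. It is an illustration under stated assumptions, not the project's main.py. It relies on `pytesseract.image_to_data`, which returns per-word bounding boxes. The helper names `find_word` and `locate_in_image` are hypothetical.

```python
def find_word(ocr_data, query):
    """Return (left, top, width, height) for every OCR token equal to query.

    ocr_data is the dict produced by pytesseract.image_to_data with
    output_type=pytesseract.Output.DICT (parallel lists keyed by field name).
    """
    target = query.strip().lower()
    if not target:
        return []
    boxes = []
    for i, token in enumerate(ocr_data["text"]):
        # Strip surrounding punctuation so "word," still matches "word".
        cleaned = token.strip().strip(".,;:!?\"'()").lower()
        if cleaned == target:
            boxes.append((ocr_data["left"][i], ocr_data["top"][i],
                          ocr_data["width"][i], ocr_data["height"][i]))
    return boxes


def locate_in_image(image_path, query):
    """Run Tesseract on the image and return the boxes where query occurs."""
    # Imported here so find_word stays usable without the OCR stack installed.
    from PIL import Image
    import pytesseract

    data = pytesseract.image_to_data(Image.open(image_path),
                                     output_type=pytesseract.Output.DICT)
    return find_word(data, query)
```

With Tesseract installed, `locate_in_image("4.png", "word")` would return one box per occurrence. Keeping the matching logic in `find_word` lets it be checked against a hand-built dictionary, with no image needed.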
About
To search for words in an image using OCR, Spell check and correction using Peter Norvig's Algorithm