The shape mining pipeline extracts vessel profiles and their corresponding metadata from book scans. It performs the following steps:
- Extract figureID and pageID from the book scan
- Extract the vessel profiles from the book scan
- Extract only the characteristic shape of each vessel profile
- Use a kNN approach to find the most similar vessel shapes
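The kNN step can be illustrated with a minimal sketch. It assumes each characteristic shape has already been reduced to a fixed-length numeric descriptor and that Euclidean distance is the similarity measure; the pipeline's actual descriptor and metric are not specified here, so the function names, IDs, and values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length shape descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(
    query: Sequence[float],
    shapes: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k shape IDs closest to the query, nearest first."""
    scored = [(shape_id, euclidean(query, vec)) for shape_id, vec in shapes.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical descriptors (assumption): e.g. profile widths sampled at four heights.
shapes = {
    "fig_01": [0.10, 0.40, 0.80, 0.90],
    "fig_02": [0.12, 0.42, 0.79, 0.88],
    "fig_03": [0.90, 0.70, 0.30, 0.10],
}
query = [0.11, 0.41, 0.80, 0.90]

neighbours = k_nearest(query, shapes, k=2)
for shape_id, distance in neighbours:
    print(shape_id, round(distance, 4))
```

A brute-force search like this is adequate for a few thousand shapes; for larger collections an indexed nearest-neighbour structure would likely be preferable.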
There are two Docker containers: one for a GPU machine and one for a CPU machine. Install and run the Docker container using the start script. The container hosts a Jupyter server; the URL for accessing the server is shown in the terminal.
chmod +x start_docker.sh
./start_docker.sh
Alternatively, use docker-compose with your preferred configuration (CPU or GPU). Example for CPU:
docker-compose -f Docker_CPU/docker-compose.yml up
To access a shell inside the container, the container must be running:
docker ps    # look up the CONTAINER_ID
docker exec -it CONTAINER_ID bash
- Install Visual Studio Code with the following extensions:
- Open the devcontainer file and set the "dockerComposeFile" entry to the GPU or CPU container.
- Create the following directories if you use a Linux OS:
- vscode_remote/extensions
- vscode_remote/bashhistory
- vscode_remote/insiders
- In VS Code, press Ctrl+Shift+P and run the "Remote-Containers: Rebuild and Reopen in Container" command.
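The three vscode_remote directories needed on Linux can be created in one step. The following is a minimal sketch; the helper name is hypothetical, and it assumes the directories belong under the directory you pass in (for example the repository root), which is not stated explicitly above.

```python
from pathlib import Path
from typing import List

# Directory names taken from the list above.
SUBDIRS = ("extensions", "bashhistory", "insiders")


def create_vscode_remote_dirs(base: Path) -> List[Path]:
    """Create base/vscode_remote/<name> for each required name.

    The base location is an assumption; existing directories are left untouched.
    """
    created = []
    for name in SUBDIRS:
        path = base / "vscode_remote" / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_vscode_remote_dirs(Path.cwd())` from the intended parent directory creates all three folders and is safe to run repeatedly.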
Source code is located at /home/Code
The TensorFlow Object Detection API is located at /models/research/object_detection
Models for mining shapes can be downloaded at Mining Pages
Mount the correct volumes in the docker-compose file
Run the file mining_pages.py