Hi @agombert 👋,
Sounds more like a layout detection problem to me :) I think you would have more success using our contrib module (https://mindee.github.io/doctr/using_doctr/using_contrib_modules.html). The only limitation at the moment is that it works only with straight boxes; inference with oriented bounding boxes (OBB) is not supported yet.
Best,
Felix
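To illustrate the straight-box limitation mentioned above: if the annotations are rotated polygons, one option is to reduce each polygon to its enclosing axis-aligned box before training or inference. This is only a minimal sketch. The function name and the example polygon are hypothetical and are not part of docTR.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
StraightBox = Tuple[float, float, float, float]  # xmin, ymin, xmax, ymax


def polygon_to_straight_box(polygon: Iterable[Point]) -> StraightBox:
    """Return the axis-aligned box that encloses all points of a polygon."""
    xs, ys = zip(*polygon)
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical rotated text block, given as four corners in relative coordinates.
rotated_block = [(0.20, 0.10), (0.60, 0.15), (0.58, 0.30), (0.18, 0.25)]
straight_box = polygon_to_straight_box(rotated_block)
print(straight_box)
```

The enclosing box is always at least as large as the rotated region, so on strongly skewed pages it may take in some neighbouring content.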
Hey @felixdittrich92,
It is indeed. But I struggled to find code to fine-tune such a model (I'm more into NLP generally). By a low amount of data I mean between 100 and 1k pages only. Maybe fine-tuning YOLOv11 could be a solution, and then applying it with your pipeline. Or maybe the YOLOv11 script is the better way to go; I need to investigate. When I tried to fine-tune the detector, it did not learn anything, so maybe it needs more data, or it is not suited to this task, or I am not doing it correctly.
Best,
Arnault
Hey,
First, thank you very much for this library, which is really great and helpful.
My use case is handwritten data from archives, which is complex text. Before going into the OCR, I'd like to accurately detect the bounding boxes of the text.
I followed the code instructions with different classes, and then used this code to work on the data:
And I get some predictions for `text` and `margin`. I used the `--pretrained` parameter as I just want to work by block and I have a low amount of data.
What would be your recommendation on this task with `db_resnet50` regarding the amount of data needed to fine-tune? Would you consider block detection possible with your pipeline?
Best,
Arnault
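
For working "by block" with per-class predictions such as `text` and `margin`, a small post-processing step can help. The sketch below assumes that the detector returns, for each page, a mapping from class name to boxes in relative coordinates with a confidence score. That layout, the function name, and the example values are assumptions for illustration, so check what your predictor actually returns.

```python
from typing import Dict, List, Tuple

RelBox = Tuple[float, float, float, float, float]  # xmin, ymin, xmax, ymax, score
PixelBox = Tuple[int, int, int, int]               # xmin, ymin, xmax, ymax


def to_pixel_boxes(
    page_preds: Dict[str, List[RelBox]],
    width: int,
    height: int,
    min_score: float = 0.5,
) -> Dict[str, List[PixelBox]]:
    """Drop low-confidence boxes and scale the rest to pixel coordinates."""
    result: Dict[str, List[PixelBox]] = {}
    for class_name, boxes in page_preds.items():
        kept: List[PixelBox] = []
        for xmin, ymin, xmax, ymax, score in boxes:
            if score < min_score:
                continue  # skip detections below the confidence threshold
            kept.append(
                (
                    round(xmin * width),
                    round(ymin * height),
                    round(xmax * width),
                    round(ymax * height),
                )
            )
        result[class_name] = kept
    return result


# Hypothetical predictions for one page (assumed format, not real model output).
example_preds = {
    "text": [(0.10, 0.20, 0.90, 0.80, 0.95), (0.10, 0.85, 0.50, 0.90, 0.30)],
    "margin": [(0.00, 0.20, 0.08, 0.80, 0.70)],
}
pixel_boxes = to_pixel_boxes(example_preds, width=1000, height=2000)
print(pixel_boxes)
```

The resulting pixel boxes can be used to crop each block from the page image before passing it to recognition.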