Fix alpha shape #87

Open
wants to merge 87 commits into
base: master


bertsky
Collaborator

@bertsky bertsky commented Mar 12, 2022

Some early fixes for the recent #77 – sorry to get back to you so soon @finkf (and thanks for merging so fast).

BTW, if you contemplate making a new release, here is a list of things that have changed since 0.1.5:

Fixed:

* align: fix logging and `--dump-json` #82
* align: avoid superfluous TextEquiv #76
* binarize: traverse regions in reading-order (so derived images are, too)
* ocrolib.morph.spread_labels: fix when no labels exist
* ocrolib.morph.label: :fire: fix ncomps (+1)
* ocrolib.morph.select_regions: fix `dtype` for just 1 label
* binarize/denoise/deskew/dewarp/segment: skip zero-size segments to avoid numpy problems
* common.compute_hlines/separators: fix h/v kernel size
* common.compute_hlines/separators: early length filter must be softer than final criterion
* segment: fix reference before assignment when partitioning
* segment: set correct pageId for image files
* segment (region level): fix and speed up horizontal merging
* segment (page/table level): continue more gracefully when recursive XY-cut fails
* segment (page/table level): fix significance criterion for partitions' line labels
* segment (page level): prevent empty `ReadingOrder` group
* segment (page level): avoid adding existing regions to RO group unless they are immediate children
* resegment: skip empty line polygons
* resegment: prevent overflow in numpy slices due to rounding errors when cropping #79
* resegment: set correct pageId for image files #80
* resegment: use `set_points` to ensure invalidating existing line images
* polygon_for_parent: ensure path validity before checking consistency
* polygon_for_parent: ensure valid polygons for new coords
* segment/polygon_for_parent: skip segment if polygon cannot be made valid
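
For illustration only, the cropping-related fixes above (skipping zero-size segments, avoiding slice overflow caused by rounding) could be sketched roughly as follows. The helper `safe_crop` and its bounding-box convention are hypothetical and not taken from this repository; the sketch only shows rounding outwards, clamping to the array bounds, and signalling an empty result.

```python
import numpy as np

def safe_crop(image, bbox):
    """Crop `image` to bbox = (x0, y0, x1, y1), given as floats.

    Hypothetical helper (not part of ocrd_cis): coordinates are rounded
    outwards and clamped to the array bounds, so rounding errors cannot
    produce out-of-range or negative slices.
    Returns None when the clamped box is empty (zero-size segment).
    """
    height, width = image.shape[:2]
    x0, y0, x1, y1 = bbox
    left = max(0, int(np.floor(x0)))
    top = max(0, int(np.floor(y0)))
    right = min(width, int(np.ceil(x1)))
    bottom = min(height, int(np.ceil(y1)))
    if right <= left or bottom <= top:
        return None  # caller should skip this segment
    return image[top:bottom, left:right]

page = np.arange(20).reshape(4, 5)
crop = safe_crop(page, (-0.4, 1.2, 5.6, 3.9))   # box slightly outside the page
empty = safe_crop(page, (6.0, 0.0, 8.0, 2.0))   # box entirely outside the page
```

Returning `None` rather than an empty array makes it easy for the caller to skip the segment explicitly.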

Changed:

* clip: avoid suppressing overlapping components on both sides
* clip: require independence instead of `min_fraction` threshold
* deskew: delegate to OCR-D/core for reflection and rotation
* dewarp: expose `smoothness` parameter
* segment (region level): ignore separators and other existing regions
* segment: do not suppress neighbours if they cover the segment completely
* segment: do not suppress neighbours if already clipped
* segment (region level): annotate clipped images on region level, too
* segment (region level): improve horizontal merging (transitivity, don't cross separators, enlarge region mask, too)
* segment (page/table level): avoid grouping new text lines with existing regions in a XY-cut
* segment (page/table level): re-order grouped new and existing regions in a XY-cut
* segment (page/table level): avoid creating convex hulls for new regions if these would create additional overlaps with existing regions
* segment (page level): hmerge line labels (within each region) here, too
* segment: upgrade segmentation failures from warning to error
* ocrolib.morph: add `dist_labels` (distance transform of semantic segmentation)
* ocrolib.morph: for CC analysis, use 4-way instead of 8-way connectivity
* ocrolib.morph: new function `rb_reconstruction` based on repeated dilation and masking
* common.compute_images/hlines/separators: use that instead of `spread_labels`
* re/segment: before spreading lines, assign diacritics to seeds below
* resegment: :tada: complete rewrite (now polygonal and global):
  - polygonal calculus instead of pixel/morphology operations (for efficiency)
  - optimise assignments globally instead of locally (to avoid conflicting assignments)
  - add param `level-of-operation` with new level `page`, also as new default
  - suppress all non-text regions and non-text non-regions before text line segmentation
  - on page level, merge horizontally adjacent labels, but avoid creating new region conflicts in doing so
  - general algorithm: 
    * after line segmentation, find contours and polygonalize, then compare overlaps once
    * allow assigning multiple new labels to existing lines and combine them via a (slightly concave) hull polygon
    * assign existing lines to new lines such that among those candidates covering high fg (90%) and bg (60%) shares of the new line, the one with the largest fg and bg share of the existing line wins
    * bail out of resegmentation if the new polygon would lose a share of `threshold` fg or `threshold / 3` bg, or if some new, but unassigned line would be lost entirely
    * subtract matching lines from non-matching lines
* resegment: allow detection of colseps if some regions exist already
* resegment: :tada: compute true alpha shape instead of eroded convex hull
* resegment: :tada: implement alternative `method=ccomps`:
  - calculate connected component analysis
  - calculate distance transform of existing labels
  - find new line seeds by flattening existing labels (via maximum distance)
  - propagate line seeds across connected components (by majority in case of conflict)
  - spread ccomps labels against each other into background
  - for each line,
    * if enough background and foreground will be retained
    * find the hull polygon of the new line via alpha shape
    * annotate as new coordinates
* resegment: :tada: implement alternative `method=baseline`:
  - calculate connected component analysis
  - find new line seeds based on the existing baselines (by applying dilation above)
  - propagate line seeds across connected components (by majority in case of conflict)
  - spread ccomps labels against each other into the background
  - for each line,
    * if enough background and foreground will be retained
    * find the hull polygon of the new line via alpha shape
    * annotate as new coordinates
* segment (page/table level): :tada: improve splitting by separators:
  - when trying to partition slices by separators,
     * also treat pre-existing regions like separators, and
     * fix the condition on smallest allowed partitions (insignificant but complete lines)
  - fall back to (new) topological partitioning
    * when no cut or separator-split partition can be found for the current slice, then attempt to find another separator-split by grouping lines along their mutual horizontal neighbourship with fg separators
  - repeatedly allow both kinds of partitioning, if interspersed
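
Since several resegmentation entries above rely on an alpha shape for the hull polygon, here is a minimal generic sketch of that idea, assuming a Delaunay-based formulation: keep only triangles whose circumradius stays below a limit, then take the edges that belong to exactly one kept triangle as the boundary. The names `alpha_shape_edges` and `max_radius` are made up for this example, and this is not the implementation in this PR.

```python
import numpy as np
from scipy.spatial import Delaunay

def alpha_shape_edges(points, max_radius):
    """Boundary edges (pairs of point indices) of a simple alpha shape.

    Hypothetical sketch: a Delaunay triangle is kept if its circumradius
    does not exceed `max_radius`; an edge is on the boundary if exactly
    one kept triangle uses it.
    """
    points = np.asarray(points, dtype=float)
    edge_count = {}
    for ia, ib, ic in Delaunay(points).simplices:
        pa, pb, pc = points[ia], points[ib], points[ic]
        a = np.linalg.norm(pb - pc)
        b = np.linalg.norm(pa - pc)
        c = np.linalg.norm(pa - pb)
        # triangle area via the 2-D cross product
        area = 0.5 * abs((pb[0] - pa[0]) * (pc[1] - pa[1])
                         - (pb[1] - pa[1]) * (pc[0] - pa[0]))
        if area == 0.0:
            continue  # degenerate triangle
        if a * b * c / (4.0 * area) > max_radius:
            continue  # circumcircle too large: triangle spans a concavity or gap
        for i, j in ((ia, ib), (ib, ic), (ic, ia)):
            edge = (int(min(i, j)), int(max(i, j)))
            edge_count[edge] = edge_count.get(edge, 0) + 1
    return sorted(edge for edge, count in edge_count.items() if count == 1)

# two small triangles far apart: a tight radius keeps them separate
clusters = [(0, 0), (1, 0), (0, 1), (10, 0.5), (11, 0.5), (10, 1.5)]
tight = alpha_shape_edges(clusters, max_radius=1.0)
```

With `max_radius=np.inf` every triangle is kept and the result degenerates to the convex hull. Joining the edges into polygons (for example with a geometry library) is left out here.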

@lgtm-com

lgtm-com bot commented Mar 20, 2022

This pull request introduces 1 alert when merging 200313f into a30ce3b - view on LGTM.com

new alerts:

  • 1 for Unused import

@lgtm-com

lgtm-com bot commented Mar 24, 2022

This pull request introduces 1 alert when merging b64e66a into a30ce3b - view on LGTM.com

new alerts:

  • 1 for Unused import

@bertsky
Collaborator Author

bertsky commented Mar 25, 2022

dammit, 8841abc contains accidentally committed parts that crash

bertsky and others added 8 commits April 6, 2022 02:23
instead of detecting hlines and vlines independently,
and via costly horizontal/vertical morphology operations,
analyse image by medial axis transform (skeleton and distance
transform of all connected components);
then filter components that are too compact (inner vs outer size),
also filter by statistics of distance along the skeleton: filter
if too wide on average or too variant;
then apply morphological closing to reconnect broken segments, linking
only those components that roughly extend each other in the same direction;
finally, sort by size and filter components that are too small in inner
(skeleton length) or outer size (bbox diagonal), selecting only the topmost
candidates;
propagate from skeleton to full component and then spread a little into the
background
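
The filtering step in the commit message above (statistics of the distance along the skeleton) might look roughly like the following sketch. It assumes scikit-image's `medial_axis` and SciPy's `ndimage.label`; the function name `thin_component_labels` and both thresholds are invented for illustration and are not the values or code used in this PR. The compactness filter, closing, linking and size ranking are left out.

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import medial_axis

def thin_component_labels(binary, max_mean=3.0, max_std=1.5):
    """Labels of connected components that look like thin, even strokes.

    Hypothetical sketch: for each component, collect the distance-transform
    values along its skeleton (roughly half the local stroke width) and
    reject it if they are too large on average or vary too much.
    """
    labels, ncomps = ndimage.label(binary)
    skeleton, distance = medial_axis(binary, return_distance=True)
    keep = []
    for label in range(1, ncomps + 1):
        half_widths = distance[skeleton & (labels == label)]
        if half_widths.size == 0:
            continue
        if half_widths.mean() <= max_mean and half_widths.std() <= max_std:
            keep.append(label)
    return keep

image = np.zeros((40, 60), dtype=bool)
image[5:8, 5:55] = True      # long thin rule, 3 pixels thick (label 1)
image[15:35, 20:40] = True   # compact 20x20 blob (label 2)
separators = thin_component_labels(image)
```

In this toy image only the thin rule should survive, because the blob's skeleton runs through much larger distance values.
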
@bertsky
Collaborator Author

bertsky commented Apr 12, 2022

Here's some illustration of the recent improvements.

Resegmentation using method=baseline

before (image: kraken-poly), after (image: tmp_resegmented)

steps:

1. use existing baselines (dilated mask) as seed (image: tmpbavr88ig_baseline-seeds)
2. propagate to connected components by majority rule (image: tmpea7l9xb2_majority-propagated)
3. spread into background with full scale distance (image: tmpn7zvrbl3_scale-spread)
4. propagate to connected components again (now catching more fg, esp. diacritics) (image: tmpqskewmsn_propagated-again)
5. spread into background with only half scale (image: tmp1bkz9fwh_spread-again)
6. polygonize (image: tmp_resegmented)
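
Steps 1 and 2 above (seeds propagated to connected components by majority rule) can be pictured with a small sketch like this one. It assumes SciPy's `ndimage.label` with its default 4-way connectivity; `propagate_by_majority` is a hypothetical name and the loop is written for clarity, not speed, so it should not be read as the code in this PR.

```python
import numpy as np
from scipy import ndimage

def propagate_by_majority(foreground, seeds):
    """Give every foreground component the seed label covering most of it.

    Hypothetical sketch. `foreground` is a boolean array, `seeds` an
    integer array with 0 meaning "no seed". Components without any
    seeded pixel stay 0.
    """
    components, ncomps = ndimage.label(foreground)  # default: 4-way connectivity
    result = np.zeros_like(seeds)
    nlabels = int(seeds.max()) + 1
    for comp in range(1, ncomps + 1):
        inside = components == comp
        votes = np.bincount(seeds[inside], minlength=nlabels)
        votes[0] = 0  # unseeded pixels do not vote
        if votes.any():
            result[inside] = votes.argmax()
    return result

fg = np.array([[1, 1, 1, 0, 1],
               [0, 0, 0, 0, 1]], dtype=bool)
seeds = np.array([[1, 1, 2, 0, 0],
                  [0, 0, 0, 0, 0]])
propagated = propagate_by_majority(fg, seeds)
```

Here the left component is claimed by label 1 (two votes against one), and the unseeded right component keeps label 0 until a later spreading step reaches it.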

Resegmentation using method=lineest (also annotating baselines)

First example: before (image: tmp_gt-lines), after (image: tmp_resegmented)

steps:

1. existing line labels with overlaps (image: tmpbwx9o9xz_line_labels)
2. new line labels (image: tmptdrw_vtg_new_line_labels)
3. new baselines (image: tmp6c9g099v_baselines)
4. match+assign parts and polygonize (image: tmp_resegmented)

Second example: before (image: FILE_0002_BINSBBCROP IMG-CROP_pv), after (image: FILE_0002_BINSBBCROP-RESEG_pv)

steps:

1. existing line labels with overlaps (image: FILE_0002_BINSBBCROP-RESEG_linelabels)
2. new line labels (image: FILE_0002_BINSBBCROP-RESEG_newlinelabels)
3. new baselines (image: FILE_0002_BINSBBCROP-RESEG_baselines)
4. match+assign parts and polygonize (image: FILE_0002_BINSBBCROP-RESEG_pv)
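
To make the "match+assign" step a bit more concrete, here is a rough sketch of the selection rule from the changelog (candidates must cover 90% of the fg and 60% of the bg of the new line; the candidate of which the new line covers the largest share wins). Note the assumptions: it works on pixel masks instead of the polygonal calculus used in the PR, it ranks candidates by the sum of fg and bg shares (my guess at how the two are combined), and `assign_new_lines` is a hypothetical name.

```python
import numpy as np

def assign_new_lines(fg, old_masks, new_labels, min_fg=0.9, min_bg=0.6):
    """Map each new line label to the index of an existing line, or None.

    Hypothetical mask-based sketch of the rule described in the changelog.
    """
    assignment = {}
    for new in range(1, int(new_labels.max()) + 1):
        new_mask = new_labels == new
        new_fg = np.count_nonzero(new_mask & fg)
        new_bg = np.count_nonzero(new_mask & ~fg)
        best, best_score = None, 0.0
        for index, old_mask in enumerate(old_masks):
            both = new_mask & old_mask
            both_fg = np.count_nonzero(both & fg)
            both_bg = np.count_nonzero(both & ~fg)
            # candidate test: the existing line must cover most of the new line
            if new_fg == 0 or both_fg < min_fg * new_fg or both_bg < min_bg * new_bg:
                continue
            # ranking: share of the existing line that the new line covers
            old_fg = max(np.count_nonzero(old_mask & fg), 1)
            old_bg = max(np.count_nonzero(old_mask & ~fg), 1)
            score = both_fg / old_fg + both_bg / old_bg
            if score > best_score:
                best, best_score = index, score
        assignment[new] = best
    return assignment

fg = np.zeros((4, 6), dtype=bool)
fg[1, :] = fg[3, :] = True                      # two text lines, ink in rows 1 and 3
old_upper = np.zeros((4, 6), dtype=bool); old_upper[0:3, :] = True  # overlaps the lower line
old_lower = np.zeros((4, 6), dtype=bool); old_lower[2:4, :] = True
new_labels = np.repeat([1, 1, 2, 2], 6).reshape(4, 6)
assignment = assign_new_lines(fg, [old_upper, old_lower], new_labels)
```

In this toy case each new line is matched to the existing line it came from, even though the existing upper line bleeds into the lower one.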

Resegmentation using method=ccomps

before (image: FILE_0002_BINSBBCROP IMG-CROP_pv), after (image: FILE_0002_BINSBBCROP-RESEG_ccomps_pv)

steps:

1. use existing segmentation (flattened via maximum of distance transform) as seed (image: FILE_0002_BINSBBCROP-RESEG_ccomps_lineseeds)
2. propagate to connected components by majority rule (image: FILE_0002_BINSBBCROP-RESEG_ccomps_propagated)
3. spread into background with full scale distance (image: FILE_0002_BINSBBCROP-RESEG_ccomps_spread-full)
4. propagate to connected components again (now catching more fg, esp. diacritics) (image: FILE_0002_BINSBBCROP-RESEG_ccomps_propagated-again)
5. spread into background with only half scale (image: FILE_0002_BINSBBCROP-RESEG_ccomps_spread-again)
6. polygonize (image: FILE_0002_BINSBBCROP-RESEG_ccomps_pv)
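
Step 1 above (flattening overlapping line labels via the maximum of the distance transform) could be approximated like this. The sketch assumes SciPy's `ndimage.distance_transform_edt`; the function name `flatten_overlapping_lines` is hypothetical, and the actual `dist_labels` helper added in this PR may well differ.

```python
import numpy as np
from scipy import ndimage

def flatten_overlapping_lines(masks):
    """Turn possibly overlapping line masks into one label image.

    Hypothetical sketch: each pixel goes to the line in which it lies
    deepest, i.e. where its distance to the outside of that line's mask
    is largest. Pixels covered by no mask get label 0.
    Note: the image border is not treated as "outside" here.
    """
    depths = np.stack([ndimage.distance_transform_edt(mask) for mask in masks])
    labels = depths.argmax(axis=0) + 1
    labels[depths.max(axis=0) == 0] = 0
    return labels

first = np.array([[1, 1, 1, 1, 1, 0, 0, 0]], dtype=bool)
second = np.array([[0, 0, 0, 1, 1, 1, 1, 1]], dtype=bool)
flat = flatten_overlapping_lines([first, second])
```

In the overlap (columns 3 and 4) each pixel ends up with the line whose boundary is farther away, which splits the contested area roughly down the middle.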

Page segmentation with improved separator detection and partitioning

1. input image (image: filemax00005)
2. binarized (SBB) (image: BINSBB_0005 IMG-BIN)

non-text detection:

3. detect images (image: tmp1b0ol66t_images6_dilated)
4. detect separators: medial axis transform (image: tmpem1x1sm6_medial-axis)
5. connected component labels of skeleton (image: tmp2l69bqyc_skel-labels)
6. filter by compactness and distance statistics (image: tmph88tuzz6_seps-raw)
7. morphological closing of skeleton (image: tmp1cvgsp0s_seps-closed)
8. link newly connected labels if direction is consistent (image: tmp5l4uz2dj_seps-raw-linked)
9. sort and filter candidates by size (image: tmpebwv73tl_sep-top)
10. propagate from skeleton to full components (image: tmpede5vokh_seps-top-propagated)
11. spread separators into background (image: tmprqzhwmme_seps-top-spread)
12. polygonize and suppress images+separators (image: OCROREGIONSXYMASK-BINSBB_0005 IMG-CLIP)

whitespace separator detection:

13. vertical gradients (image: tmp8s3qpxlu_colwsseps2_grad-raw)
14. background (image: tmpx7vp4645_colwsseps1_thresh)
15. combined bg seps (image: tmpax4j6xsg_colwsseps3_seps)
16. combined separator mask (image: tmphk8548zo_sepmask)

textline detection:

17. horizontal gradient (image: tmpw93wdlk__gradmap)
18. filtered lineseeds (image: tmp3d637l99_lineseeds_filtered)
19. final ordered line labels (image: tmpi14wb8ix_llabels)
20. line labels spread against separators (image: tmp1m9raote_lineseeds_spread)

textregion detection:

21. final rlabels (image: tmp3gsso1nh_rlabels)
22. final result (image: OCROREGIONSXYMASK-BINSBB_0005 IMG-pv)

Second example, showing only step 1, input image (image: FILE_0002_ORIGINAL), and step 22, final (not optimal) result (image: FILE_0002_BINSBBCROP-OCRO2 IMG-pv)
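
Step 10 (propagating from the skeleton to the full components) is essentially a morphological reconstruction, which the changelog describes as "repeated dilation and masking" for the new `rb_reconstruction`. A minimal binary sketch of that general technique, assuming SciPy's `ndimage.binary_dilation`, might look as follows; `reconstruct_by_dilation` is a made-up name, and the function in this PR presumably operates on labels rather than on a single binary seed.

```python
import numpy as np
from scipy import ndimage

def reconstruct_by_dilation(seed, mask, max_iter=10000):
    """Grow `seed` inside `mask` until nothing changes any more.

    Hypothetical sketch of binary reconstruction: dilate by one pixel
    (4-way neighbourhood, SciPy's default), clip to the mask, repeat.
    """
    current = seed & mask
    for _ in range(max_iter):
        grown = ndimage.binary_dilation(current) & mask
        if np.array_equal(grown, current):
            break  # converged
        current = grown
    return current

mask = np.array([[1, 1, 1, 0, 1, 1]], dtype=bool)   # two components
seed = np.array([[0, 1, 0, 0, 0, 0]], dtype=bool)   # skeleton pixel in the left one
full = reconstruct_by_dilation(seed, mask)
```

Only the component touched by the seed is recovered. SciPy also offers `ndimage.binary_propagation(seed, mask=mask)`, which computes the same result without an explicit loop.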

@bertsky bertsky mentioned this pull request Jul 31, 2024
@bertsky bertsky linked an issue Jul 31, 2024 that may be closed by this pull request
4 participants