support ALTO 4.3 #30

Open
bertsky opened this issue Jun 10, 2022 · 0 comments

New features:

  1. Add BASEDIRECTION attribute defining base direction and line orientation to TextLine and BlockType.
  2. Add support for explicit reading order definitions with "ReadingOrder" element containing "UnorderedGroup"s, "OrderedGroup"s, and "ElementRef"s.

Regarding @BASEDIRECTION the docs state:

Describes the inline base direction and line orientation of a line or of all lines inside a text block.
The meaning of these terms is defined by the W3C writing modes document.
These values should correspond to the base direction set in the BiDi algorithm to the respective elements during Unicode encoding. A value of "ttb" (top-to-bottom) implies a base direction of left-to-right, a value of "btt" (bottom-to-top) a base direction of right-to-left.

  • ltr
  • rtl
  • ttb
  • btt
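
The rule quoted above (a vertical value implies a particular inline base direction) can be written down as a small lookup. This is only a sketch of the quoted sentence: the helper name `implied_base_direction` is hypothetical, and it says nothing about how the values relate to PAGE's @readingDirection, which is the open question.

```python
# Values listed for @BASEDIRECTION in the quoted ALTO 4.3 documentation.
BASEDIRECTION_VALUES = ("ltr", "rtl", "ttb", "btt")

# Inline base direction implied by each value, following the quoted text:
# "ttb" implies left-to-right, "btt" implies right-to-left.
_IMPLIED_BASE_DIRECTION = {
    "ltr": "ltr",
    "rtl": "rtl",
    "ttb": "ltr",
    "btt": "rtl",
}


def implied_base_direction(value):
    """Return "ltr" or "rtl" for a BASEDIRECTION value (hypothetical helper)."""
    try:
        return _IMPLIED_BASE_DIRECTION[value]
    except KeyError:
        raise ValueError(
            f"unknown BASEDIRECTION {value!r}; expected one of {BASEDIRECTION_VALUES}"
        ) from None
```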

It sounds a lot like @readingDirection in PAGE, but there is no mention of bidirectionality here. @chris1010010, can you help?

As to ReadingOrder, that has been adopted directly from PAGE, though with subtle differences:

  • In ALTO, ReadingOrder can contain any number of groups (with alternative semantics); in PAGE, it must contain exactly one.
  • The syntactic clutter is minimized: there are no Indexed variants and no explicit @index. OrderedGroup simply has sequence semantics and UnorderedGroup has set semantics; otherwise they are the same and appear in the same places.
  • @REF is explicitly allowed to point to sub-region elements. PAGE permits this syntactically but forbids it in its documentation (and I suppose PRImA's libraries won't tolerate such use).
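
To make the sequence vs. set semantics concrete, here is a minimal sketch that reads a ReadingOrder into nested Python values: a tuple for each OrderedGroup, a frozenset for each UnorderedGroup, and the @REF string for each ElementRef. The sample document, its IDs, the namespace URI and the function names are assumptions for illustration only; the sketch looks for ReadingOrder anywhere in the tree rather than assuming where the schema places it.

```python
import xml.etree.ElementTree as ET

# Assumed ALTO v4 namespace; adjust if the files at hand differ.
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v4#"

# Made-up sample, not taken from the ALTO documentation.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <ReadingOrder>
    <OrderedGroup ID="g1">
      <ElementRef ID="r1" REF="block_2"/>
      <UnorderedGroup ID="g2">
        <ElementRef ID="r2" REF="block_3"/>
        <ElementRef ID="r3" REF="block_1"/>
      </UnorderedGroup>
      <ElementRef ID="r4" REF="line_7"/>
    </OrderedGroup>
  </ReadingOrder>
</alto>"""

GROUP_TAGS = ("OrderedGroup", "UnorderedGroup")


def local_name(elem):
    """Strip the '{namespace}' prefix from an element tag."""
    return elem.tag.rsplit("}", 1)[-1]


def parse_member(elem):
    """Convert an ElementRef or group element into a nested Python value."""
    kind = local_name(elem)
    if kind == "ElementRef":
        return elem.get("REF")
    members = [
        parse_member(child)
        for child in elem
        if local_name(child) in GROUP_TAGS + ("ElementRef",)
    ]
    if kind == "OrderedGroup":
        return tuple(members)      # sequence semantics, no explicit index
    return frozenset(members)      # set semantics


def read_reading_order(xml_text):
    """Return one nested value per top-level group; ALTO allows any number."""
    root = ET.fromstring(xml_text)
    reading_order = next(root.iter(f"{{{ALTO_NS}}}ReadingOrder"), None)
    if reading_order is None:
        return []
    return [
        parse_member(group)
        for group in reading_order
        if local_name(group) in GROUP_TAGS
    ]
```

Calling `read_reading_order(SAMPLE)` yields a one-element list whose tuple holds `"block_2"`, a frozenset of `"block_1"` and `"block_3"`, and `"line_7"`. The last reference targets a line rather than a block, which is the sub-region use of @REF mentioned above.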