User guide documentation update #5

Open
wants to merge 6 commits into
base: master
33 changes: 32 additions & 1 deletion user_guide.adoc
@@ -2079,7 +2079,38 @@ beginner Kaitai Struct users.

=== Specifying size creates a substream

TODO
The following example script makes an erroneous attempt to parse
an input file that is 2000 bytes long:

[source,yaml]
----
seq:
  - id: body
    type: some_body_type
    size: 1000
types:
  some_body_type:
    seq:
      - id: payload
        size: 999
      - id: overflow
        size: 2
----

The parser can copy the required 999 bytes into `body.payload`
because the `body` substream has 1000 bytes available, which the
2000-byte root stream can supply.

The exception occurs when the parser tries to read the `overflow`
attribute from the `body` substream. Once `payload` has consumed 999
bytes, only 1 byte remains in the substream. The request for 2 bytes
exhausts the substream, so an exception is raised. It does not matter
that the root stream still has 1000 unread bytes: the `body` substream
is limited to the first 1000 bytes of the input file and can never
read beyond them.
Member

This is actually not a pitfall but legitimate behavior, and it is well explained in the previous section.

The "pitfall" I was thinking about in this section is the following: when a new substream is created, all parse instances with positions act within that substream by default.

So, this one works as expected:

seq:
  - id: skipped
    size: 1000
  - id: indexing
    type: file_index_entry
    # but adding "size: 24" here will ruin "file_body" instance,
    # although it looks legitimate at the first glance
types:
  file_index_entry:
    seq:
      - id: file_name
        type: str
        size: 16
      - id: file_pos
        type: u4
      - id: file_len
        type: u4
    instances:
      file_body:
        pos: file_pos
        size: file_len

To overcome that, one needs to use something like `io: _root._io` in `file_body`. Of course, the documentation warrants a somewhat better example and explanation.
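
For illustration, here is a minimal sketch of how that workaround might look once `size: 24` is added to `indexing`. The `meta` block (the id `substream_pitfall`, little-endian integers, ASCII names) is an assumption added only so the fragment is self-contained; it is not part of the snippet above.

```yaml
meta:
  id: substream_pitfall   # hypothetical name, for illustration only
  endian: le              # assumed byte order for the u4 fields
  encoding: ASCII         # assumed encoding for file_name
seq:
  - id: skipped
    size: 1000
  - id: indexing
    type: file_index_entry
    size: 24              # 16 + 4 + 4 bytes; this creates a 24-byte substream
types:
  file_index_entry:
    seq:
      - id: file_name
        type: str
        size: 16
      - id: file_pos
        type: u4
      - id: file_len
        type: u4
    instances:
      file_body:
        io: _root._io     # seek in the root stream, not in the 24-byte substream
        pos: file_pos
        size: file_len
```

Without the `io: _root._io` line, `pos: file_pos` would be resolved inside the 24-byte substream, so any `file_pos` that points elsewhere in the file would fall outside it.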

Contributor Author

Excellent. I didn't know about io: either, so that's a good one to document! Nice feature!


=== Applying `process` without a size
