Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix canonicalize path #2108

Closed

Conversation

digit-google
Copy link
Contributor

This patch fixes a few annoying bugs in the implementation of
CanonicalizePath():

  • It canonicalized "/" and "/foo/.." incorrectly to "",
    while the correct value should be "/". And this was
    enforced by unit-tests :-(

    This was the root cause for issue ninja thinks the root directory doesn't exist #2008.

  • In certain cases, it would copy the first byte after the
    path buffer into the destination (assuming in the comment
    that this is a terminating zero), though there is no guarantee
    that this is the case, or that this address is accessible.

    This is important because the function is being called in
    build.cc:xxx and :xxx with arguments coming from StringPiece
    values, which do not guarantee anything about the bytes/memory
    that follow the content they point to. It just happens to work
    due to the way the parser are currently implemented, but could
    break easily in the future if they are modified (e.g.
    to read directly from read-only memory-mapped files).

Performance comparisons with canonical_perfcompare shows this
implementation to be slightly faster as well, though benchmarking
ninja with a large Fuchsia build plan doesn't show any difference
overall.

Minimal times reported by canon_perftest (best of 5 runs each):

  compiler           BEFORE       AFTER

  g++-11 -O2          120ms    110ms
  g++-11 -O3          123ms    110ms
  clang++-13 -O2      115ms    114ms
  clang++-13 -O3      115ms    110ms

Note that the loop had to be unrolled for gcc -O3 due to some
surprising compiler optimization failure :-(

@digit-google digit-google force-pushed the fix-canonicalize-path branch 3 times, most recently from c710b9c to 4ed960b Compare March 22, 2022 13:06
// clang-13 -O2 114ms 114ms
// clang-13 -O3 110ms 110ms
// gcc-11 -O2 126ms 110ms
// gcc-11 -O3 275ms !! 110ms

This comment was marked as abuse.

- It canonicalized "/" and "/foo/.." incorrectly to "",
  while the correct value should be "/". And this was
  enforced by unit-tests :-(

  This was the root cause for issue ninja-build#2008.

- In certain cases, it would copy the first byte _after_ the
  path buffer into the destination (assuming in the comment
  that this is a terminating zero), though there is no guarantee
  that this is the case, or that this address is accessible.

  This is important because the function is being called in
  build.cc:xxx and :xxx with arguments coming from StringPiece
  values, which do not guarantee anything about the bytes/memory
  that follow the content they point to. It just happens to work
  due to the way the parser are currently implemented, but could
  break easily in the future if they are modified (e.g.
  to read directly from read-only memory-mapped files).

Performance comparisons with `canonical_perfcompare` shows this
implementation to be slightly faster as well, though benchmarking
ninja with a large Fuchsia build plan doesn't show any difference
overall.

Minimal times reported by canon_perftest (best of 5 runs each):

  compiler           BEFORE       AFTER

  g++-11 -O2          120ms    110ms
  g++-11 -O3          123ms    110ms
  clang++-13 -O2      115ms    114ms
  clang++-13 -O3      115ms    110ms

Note that the loop had to be unrolled for `gcc -O3` due to some
surprising compiler optimization failure :-(
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants