Speed up parsing #2519

Open: wants to merge 8 commits into master

Commits on Nov 4, 2024

  1. Reuse some vectors in ManifestParser.

    For a no-op build of Chromium (Linux, Zen 2),
    this reduces time spent from 5.76 to 5.48 seconds.
    Steinar H. Gunderson committed Nov 4, 2024
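
The idea of reusing vectors can be illustrated with a minimal sketch. It is not the code from this commit: `LineParser`, `ParseLine` and `tokens_` are hypothetical names, and the space-splitting logic is only a placeholder for real parsing. The point is that a member vector cleared with `clear()` keeps its capacity, so repeated calls stop reallocating the vector's storage.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parser, for illustration only (not ninja's ManifestParser).
class LineParser {
 public:
  // Splits `line` on single spaces and returns the number of tokens found.
  // tokens_ is a member rather than a local, so its buffer survives
  // between calls: clear() destroys the elements but keeps the capacity.
  std::size_t ParseLine(const std::string& line) {
    tokens_.clear();
    std::size_t start = 0;
    while (start < line.size()) {
      std::size_t end = line.find(' ', start);
      if (end == std::string::npos) end = line.size();
      if (end > start) tokens_.push_back(line.substr(start, end - start));
      start = end + 1;
    }
    return tokens_.size();
  }

  const std::vector<std::string>& tokens() const { return tokens_; }
  std::size_t capacity() const { return tokens_.capacity(); }

 private:
  std::vector<std::string> tokens_;  // reused scratch storage
};
```

With a local vector inside `ParseLine`, every call would start from zero capacity and grow again; the member version pays that cost only until the largest line has been seen.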
    Commit d85cfed

Commits on Nov 5, 2024

  1. Apply a short-vector optimization to EvalString.

    This very often holds only a single RAW token, so in the common
    case we do not need to heap-allocate std::vector elements for it.
    
    For a no-op build of Chromium (Linux, Zen 2),
    this reduces time spent from 5.48 to 5.14 seconds.
    
    Note that this opens the door to a potential optimization where
    EvalString::Evaluate() could just return a StringPiece, without
    making a std::string out of it (which requires allocation; this is
    about 5% of remaining runtime). However, this would also require
    that CanonicalizePath() somehow learn to work with StringPiece
    (presumably allocating a new StringPiece if and only if changes
    were needed).
    Steinar H. Gunderson committed Nov 5, 2024
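
A short-vector optimization of this kind might look roughly like the sketch below. This is an assumption-laden illustration, not the actual EvalString change: `TokenList`, `Add` and `spilled` are hypothetical names. The first token lives inline in the object, and the `std::vector` is touched only when a second token arrives.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container, for illustration only: one inline slot plus
// an overflow vector that stays empty in the single-token case.
class TokenList {
 public:
  enum TokenType { RAW, SPECIAL };
  typedef std::pair<std::string, TokenType> Token;

  void Add(const std::string& text, TokenType type) {
    if (!has_first_) {
      first_ = Token(text, type);  // stored inline, no vector allocation
      has_first_ = true;
    } else {
      rest_.push_back(Token(text, type));  // uncommon case: spill to the heap
    }
  }

  std::size_t size() const { return has_first_ ? 1 + rest_.size() : 0; }

  // Precondition: i < size().
  const Token& operator[](std::size_t i) const {
    return i == 0 ? first_ : rest_[i - 1];
  }

  // True once the overflow vector has been used.
  bool spilled() const { return !rest_.empty(); }

 private:
  Token first_;
  bool has_first_ = false;
  std::vector<Token> rest_;
};
```

The sketch only avoids the vector's own allocation; the `std::string` inside each token may still allocate for long text.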
    Commit 40efd00
  2. Switch hash tables to emhash8::HashMap.

    This is much faster than std::unordered_map, and also slightly faster
    than the phmap::flat_hash_map included in PR ninja-build#2468.
    It is MIT-licensed, and we just include the .h file wholesale.
    
    I haven't done a detailed test of all the various unordered_maps
    out there, but this is the overall highest-ranking contender on
    
      https://martin.ankerl.com/2022/08/27/hashmap-bench-01/
    
    except for ankerl::unordered_dense::map, which requires C++17.
    
    For a no-op build of Chromium (Linux, Zen 2),
    this reduces time spent from 5.14 to 4.62 seconds.
    Steinar H. Gunderson committed Nov 5, 2024
    Commit f1a2f42
  3. Fix emhash8 compilation for MinGW.

    Steinar H. Gunderson committed Nov 5, 2024
    Commit 22a0eba
  4. Fix some spelling errors in emhash8.

    Steinar H. Gunderson committed Nov 5, 2024
    Commit c3e3fb9
  5. Switch hash to rapidhash.

    This is currently the fastest hash that passes SMHasher and does not
    require special instructions (e.g. SIMD). Like emhash8, it is
    MIT-licensed, and we include the .h file directly.
    
    For a no-op build of Chromium (Linux, Zen 2),
    this reduces time spent from 4.62 to 4.22 seconds.
    (NOTE: This is a more difficult measurement than the previous ones,
    as it necessarily involves removing the entire build log and doing
    a clean build. However, switching only the HashMap hash already
    brings it to 4.47 seconds or so.)
    Steinar H. Gunderson committed Nov 5, 2024
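
Swapping the hash function used by a hash table usually means supplying a different hasher functor. The sketch below shows that mechanism only. It deliberately uses `std::unordered_map` and 64-bit FNV-1a as stand-ins, not emhash8 or rapidhash, and `Fnv1aHash`, `PathIds` and `LookupOrAssign` are hypothetical names; how the PR actually wires rapidhash in is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in hasher (64-bit FNV-1a). NOT rapidhash; it only demonstrates
// where a custom hash plugs into a hash table type.
struct Fnv1aHash {
  std::size_t operator()(const std::string& key) const {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a offset basis
    for (std::size_t i = 0; i < key.size(); ++i) {
      h ^= static_cast<unsigned char>(key[i]);
      h *= 1099511628211ULL;  // FNV-1a prime
    }
    return static_cast<std::size_t>(h);
  }
};

// Hypothetical table mapping a path to a small integer id.
typedef std::unordered_map<std::string, int, Fnv1aHash> PathIds;

// Returns the id for `path`, assigning the next free id on first sight.
int LookupOrAssign(PathIds* ids, const std::string& path) {
  PathIds::iterator it = ids->find(path);
  if (it != ids->end()) return it->second;
  int id = static_cast<int>(ids->size());
  (*ids)[path] = id;
  return id;
}
```

Because the hasher is a type parameter, replacing it touches one typedef rather than every call site.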
    Commit 74a8560
  6. Stop calling ftell() in a loop.

    ftell() must go ask the kernel for the file offset, in case
    someone knew the underlying file descriptor number and seeked it.
    Thus, we can save a couple hundred thousand syscalls by just
    caching the offset and maintaining it ourselves.
    
    This cuts another ~170 ms off a no-op Chromium build.
    Steinar H. Gunderson authored and sesse committed Nov 5, 2024
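
Maintaining the offset by hand can be sketched as follows. This is a hedged illustration rather than the code from this commit: `CountCompleteBytes` is a hypothetical helper and the fixed-size record format is an assumption made to keep the example short. `ftell()` is called once before the loop, and the offset is then advanced by the number of bytes each successful `fread()` consumed.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper, for illustration only. Reads fixed-size records
// from `f` and returns the file offset just past the last complete
// record, or -1 on error. record_size must be between 1 and 64.
long CountCompleteBytes(std::FILE* f, std::size_t record_size) {
  char buf[64];
  if (record_size == 0 || record_size > sizeof(buf)) return -1;

  long offset = std::ftell(f);  // single call, outside the loop
  if (offset < 0) return -1;

  // Each successful fread() consumed exactly record_size bytes, so the
  // offset can be tracked locally instead of asking ftell() every time.
  while (std::fread(buf, record_size, 1, f) == 1)
    offset += static_cast<long>(record_size);

  return offset;
}
```

The returned offset marks the end of the last complete record, which is the kind of value a log loader would need if it wanted to truncate a partially written tail.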
    Commit bbdfcaf
  7. Microoptimization in LoadDepsFromLog().

    This cuts off another ~100 ms, most likely because the compiler
    doesn't have smart enough alias analysis to do the same (trivial)
    transformation.
    Steinar H. Gunderson authored and sesse committed Nov 5, 2024
    Commit 3ce7eb0