Update run configurations for gemm test #204

Open
wants to merge 31 commits into
base: main

Commits on Sep 24, 2024

  1. Reorder iter args to match ordering of init args and outputs (iree-org#161)
    
    This PR modifies the insertion point for iter args to ensure that the
    iter args are in the same order as the init args and outputs. This
    simplifies the mapping between init args, iter args and outputs.
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Sep 24, 2024
    0327398
  2. [ExportedProgram] Add mutable attribute to buffer (iree-org#123)

    Fixes iree-org#85
    
    PR based on the work of @maxbartel 
    
    Requires changes in torch-mlir:
    [llvm/torch-mlir/#3688](llvm/torch-mlir#3688)
    
    Adds the mutable modifier to a global buffer and lifts said buffer to a
    global if there is a store-producer node associated with it.
    
    Signed-off-by: Christopher McGirr <[email protected]>
    Co-authored-by: Maximilian Bartel <[email protected]>
    chrsmcgrr and maxbartel authored Sep 24, 2024
    4e95351
  3. 65eb532
  4. [TKW] Fix indexing of Reduction and GetResult to enable post-tile op. (iree-org#162)
    
    This PR introduces changes to handle elementwise or general arithmetic
    operations that follow a tiled-loop-reduction ("Reduction") operation.
    
    The main problem with the current stack is that the indexing_dims
    information for Reduction relies on its user. This works if its
    user/consumer is tkw.write, but in other cases, such as BinaryPyOp or
    UnaryPyOp, that information is missing.
    
    To make matters worse, BinaryPyOp/UnaryPyOp depends on its src/producer
    for its indexing dims, while the Reduction op depends on its dst/consumer
    for its indexing dim information. This ends up causing an infinite loop
    between UnaryPyOp/BinaryPyOp <-> Reduction.
    
    This PR fixes the indexing dimension logic of Reduction and GetResult
    (required for expanded Reduction) to be based on its reduction axis (for
    Reduction) and its source/consumer information.
    
    ---------
    
    Signed-off-by: Stanley Winata <[email protected]>
    raikonenfnu authored Sep 24, 2024
    909411a

Commits on Sep 26, 2024

  1. Get GEMMs working without minimize_global_loads (iree-org#167)

    This PR removes the need for propagating indices using
    post expansion. The new approach propagates the MMA
    indices to the MMA dimensions of all tensors (rather
    than just MMA nodes) and then specializes them depending
    on whether they lie within the backward slices of the
    LHS and RHS or forward slices of the ACC.
    
    ---------
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Sep 26, 2024
    d37c6a4
  2. Add first draft of introduction (iree-org#168)

    This PR adds more documentation about tkw. Specifically, it provides a
    first draft of the introduction and adds a section on memory access
    patterns.
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Sep 26, 2024
    04a4ba5
  3. [TKW] igemm shared mem tests (iree-org#171)

    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Sep 26, 2024
    7686157

Commits on Sep 27, 2024

  1. [TKW] Implement support for multiple iter args on Reduction (iree-org#166)
    
    The main motivation behind this PR is to enable multiple induction
    variables/iterArgs on the same tiled "Reduction" loop. To enable this, we
    did a few things:
    
    1. Enable lowering/expansion of `operator.getitem` (the op that extracts
    multiple results in Python, i.e. `res0, res1 = fn`) by templating it on
    `GetResult(CustomOp)`, since they have the same args and interface and
    can reuse most of the indexing/expansion helpers.
    
    2. Introduce `res_idx`, a variable that represents which result index of
    an op we are referring to during expansion and in the context map. This
    is useful for ops that have more than one result/variable as outputs.
    
    3. Fix a bug in expand_reduction by hoisting the iteration over and
    expansion of `reduction.init_args` out of the loop that iterates over and
    expands the `yield`/`return_val` of the reduction loop. The size of
    `init_args` is expected to be the same as the size of
    `yield`/`return_val`. Hence, with N iter_args/yields, we ended up
    expanding the `init_args` N x N times instead of N times. We had not
    seen this so far because we had only been using 1 init_arg/iterArg,
    and 1x1 == 1.
    
    4. Introduce a canonicalization pattern to fold chains of GetResult,
    because GetResult by semantics/design is only expected to extract
    and have one result. Hence a chain of GetResult should just be replaced
    by itself. This helps clean up the IR.
    
    Item 4 also helps circumvent an issue where Reduction and GetResult are
    expanded completely by themselves, not following the DFS structure per
    dimension like the rest of the expansion code. This becomes especially
    problematic for multiple IterArgs, since Getitem does not expect its
    source value to be expanded without it.
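
The N x N behaviour described in item 3 can be illustrated with a minimal sketch. The function names `expand_nested` and `expand_hoisted` are hypothetical and only count how often an init arg would be expanded; they are not the actual `expand_reduction` code.

```python
def expand_nested(init_args, yields):
    """Shape of the bug: init_args are expanded inside the loop over yields."""
    expansions = 0
    for _ in yields:
        for _ in init_args:
            expansions += 1  # each init arg is expanded once per yield
    return expansions


def expand_hoisted(init_args, yields):
    """Shape of the fix: init_args are expanded once, outside the loop over yields."""
    expansions = 0
    for _ in init_args:
        expansions += 1  # each init arg is expanded exactly once
    for _ in yields:
        pass  # only the yielded value would be expanded here
    return expansions


one = (["acc"], ["acc_next"])
three = (["a", "b", "c"], ["x", "y", "z"])
print(expand_nested(*one), expand_hoisted(*one))
print(expand_nested(*three), expand_hoisted(*three))
```

With a single iter arg both variants do the same amount of work, which is why the problem only shows up once N is greater than 1.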
    
    ---------
    
    Signed-off-by: Stanley Winata <[email protected]>
    raikonenfnu authored Sep 27, 2024
    0e16d54
  2. 192a786

Commits on Sep 30, 2024

  1. [TKW] Rework vector mask generation (iree-org#172)

    Instead of generating individual element comparisons and inserting them
    with `vector.insertelement`, generate the whole mask using vector ops.
    
    Add support for vector codegen when generating MLIR IR from sympy
    expressions. Add the method `IndexingContext.iota` to generate special
    symbols which map to `(1,2 ... n-1)` vec expressions. `gen_sympy_index`
    will start to generate vector ops when encountering such symbols,
    inserting proper `splat`s between scalar vals when necessary.
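
As a rough illustration of the difference between the two strategies, the sketch below models vectors as plain Python lists. The helper names are hypothetical, the index vector is assumed to be 0-based, and nothing here reflects the actual MLIR that `gen_sympy_index` emits.

```python
def mask_elementwise(start, bound, n):
    """Old style: one scalar comparison and one element insert per lane."""
    mask = [False] * n
    for lane in range(n):
        mask[lane] = (start + lane) < bound
    return mask


def mask_vectorized(start, bound, n):
    """New style: build an iota vector, splat the scalars, compare whole vectors."""
    iota = list(range(n))            # assumed 0-based lane indices
    start_vec = [start] * n          # "splat" of the scalar start index
    bound_vec = [bound] * n          # "splat" of the scalar bound
    index_vec = [s + i for s, i in zip(start_vec, iota)]
    return [idx < b for idx, b in zip(index_vec, bound_vec)]


print(mask_elementwise(6, 8, 4))
print(mask_vectorized(6, 8, 4))
```

Both functions compute the same mask; the second one only uses whole-vector operations, which is the form the commit message describes.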
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Sep 30, 2024
    92ad900
  2. Enable import_symbolic_shape_expressions in the FxImporter. (iree-org#179)
    
    * Adds an option to `aot.export(import_symbolic_shape_expressions=True)`
    to enable emission of torch-mlir symbolic shape constraints. This is
    currently set to False until IREE is ready to ingest these by default.
    
    Rough sequence of work in IREE proper:
    
    * Custom lowering of `torch.symbolic_int` and
    `torch.bind_symbolic_shape` ops to IREE util "assume" ops. Note that we
    are only planning to lower "terminal" bindings (basically function
    arguments and a couple of other such categories).
    * Canonicalizations to ensure that assume equalities are == 0 (versus
    the native form from torch where they assume a non zero equality).
    * Fusion will clone corresponding bindings on dependent dims into
    dispatch regions.
    * Existing linalg shape analysis extended and queryable by codegen.
    
    ---------
    
    Signed-off-by: Stella Laurenzo <[email protected]>
    stellaraccident authored Sep 30, 2024
    621cbe1

Commits on Oct 1, 2024

  1. Add code to construct pipelined loop from schedule (iree-org#160)

    This PR adds code to construct the epilogue, kernel
    and prologue once we have computed a schedule. We
    simulate rotating registers in software and add
    visualization tools to show the pipelined graphs.
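
A minimal sketch of how a prologue, kernel and epilogue fall out of a schedule, assuming a simplified model in which each op belongs to exactly one stage and every stage takes one step. `build_pipeline` is a hypothetical helper, not the code added by this PR, and rotating registers are not modeled.

```python
def build_pipeline(stages, num_iters):
    """Split a software-pipelined loop into prologue, kernel and epilogue.

    stages[s] is the op scheduled in stage s. Each returned phase is a list
    of steps, and each step is a list of (op, iteration) pairs run together.
    """
    num_stages = len(stages)
    assert num_iters >= num_stages, "sketch assumes at least one full kernel step"

    def step(t):
        # At step t, stage s works on iteration t - s, if that iteration exists.
        return [
            (stages[s], t - s)
            for s in range(num_stages)
            if 0 <= t - s < num_iters
        ]

    prologue = [step(t) for t in range(num_stages - 1)]            # pipeline fills
    kernel = [step(t) for t in range(num_stages - 1, num_iters)]   # all stages busy
    epilogue = [step(t) for t in range(num_iters, num_iters + num_stages - 1)]  # drains
    return prologue, kernel, epilogue


prologue, kernel, epilogue = build_pipeline(["load", "mma", "store"], num_iters=4)
for name, phase in (("prologue", prologue), ("kernel", kernel), ("epilogue", epilogue)):
    print(name, phase)
```

In this model every (op, iteration) pair runs exactly once across the three phases, and only the kernel steps have all stages active at the same time.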
    
    ---------
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Oct 1, 2024
    84320ea
  2. Add support for dynamic dims (iree-org#178)

    This PR adds support for dynamic dimensions in the
    kernels. The user specifies the dynamic dimensions
    by
    - Not adding them to the hyperparameter dictionary
    - Explicitly specifying them with the dynamic_symbols kwarg
      and the dynamic_symbols_mapping kwarg to specify which
      values to use for the dynamic dims at runtime
    
    This PR does not modify the codegen and so incorrect or
    unsupported values for the dynamic dims will result
    in incorrect results. (garbage in -> garbage out)
    
    ---------
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Oct 1, 2024
    553e929

Commits on Oct 3, 2024

  1. [TKW] Fix sympy expr lowering and add some more igemm test shapes (iree-org#184)
    
    * Rework how we lower `rational` sympy expressions: instead of
    delayed materialization via lambdas, introduce a `_Rational` type and
    propagate `numerator/denominator` values independently. Division is
    only materialized on an explicit `sympy.floor/ceiling` op.
    * Rework how igemm test cases are generated and introduce a few real
    shapes.
    * Use custom pytest markers to separate perf/non-perf tests.
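
The idea behind the first bullet can be sketched as follows. The `Rational` class and the `floor`/`ceiling` helpers below are hypothetical stand-ins that operate on plain integers with positive denominators; the real `_Rational` type works on MLIR values and may differ in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rational:
    """Carries numerator and denominator separately; no division happens here."""
    num: int
    den: int = 1

    def __add__(self, other):
        return Rational(self.num * other.den + other.num * self.den,
                        self.den * other.den)

    def __mul__(self, other):
        return Rational(self.num * other.num, self.den * other.den)


def floor(r):
    """The only place a division is materialized (assumes r.den > 0)."""
    return r.num // r.den


def ceiling(r):
    """Ceiling via floor division on the negated numerator (assumes r.den > 0)."""
    return -((-r.num) // r.den)


x = Rational(7) * Rational(1, 2) + Rational(1, 3)  # 7/2 + 1/3 = 23/6
print(x, floor(x), ceiling(x))
```

Keeping the numerator and denominator apart until `floor` or `ceiling` is reached avoids emitting intermediate integer divisions that would lose precision.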
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 3, 2024
    9ed388a
  2. Add benchmark support for e2e tests (iree-org#183)

    Signed-off-by: erman-gurses <[email protected]>
    erman-gurses authored Oct 3, 2024
    0f00c6d
  3. [TKW] Thread Shape analysis (iree-org#186)

    The motivation of this pass is to generalize the register analysis pass,
    which is used to determine the thread shape of TKW.Register, to all
    other operations.
    
    One main use case is to allow reduction, and later on "broadcast", to
    use thread shape information from the kernel as opposed to relying on
    vector_shape, which may not always be valid.
    
    We generalize the register analysis method by finding a few anchor ops
    whose thread shape information is determined, and then propagating it to
    their successors and ancestors.
    
    In addition, we also implemented a few helper functions/attributes.
    
    1. Control_fn on BFS, ForwardSlice, BackwardSlice. This makes it
    easier for us to control/stop the search when we hit ops we do not want
    to explore. In this case, we do not want to explore/propagate onto other
    anchor ops and their children.
    
    2. Introduce parent_op to IterArg and the region of Reduction, for
    developer ergonomics.
    
    3. Move handling of IterArg and GetUser in BackwardSlice/BFS's get_input
    exploration phase so that they are handled individually, as opposed to
    being handled when their consumer is being explored. Previously, to
    explore/propagate IterArg/GetUser, we needed to explore its consumer;
    exploring just IterArg/GetUser was not handled correctly. This is useful
    for the case where we want to propagate/explore mma.acc (usually IterArg)
    directly.
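
The control function described in item 1 can be sketched with a generic breadth-first search. The `bfs` signature, the `neighbors` callback and the toy graph are assumptions for illustration and do not mirror the actual TKW helpers.

```python
from collections import deque


def bfs(start, neighbors, control_fn=lambda node: True):
    """Breadth-first search that skips any node rejected by control_fn.

    A rejected node is neither visited nor expanded, so its children are
    only reached if some other accepted node leads to them.
    """
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(node):
            if nxt in seen or not control_fn(nxt):
                continue
            seen.add(nxt)
            visited.append(nxt)
            queue.append(nxt)
    return visited


# Toy graph: "read" plays the role of another anchor op we must not cross.
graph = {"mma": ["add", "read"], "add": ["write"], "read": ["load"],
         "write": [], "load": []}
anchors = {"read"}

print(bfs("mma", graph.__getitem__))
print(bfs("mma", graph.__getitem__, control_fn=lambda n: n not in anchors))
```

Passing the predicate as a parameter keeps the traversal generic while letting each analysis decide where propagation has to stop.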
    
    ---------
    
    Signed-off-by: Stanley Winata <[email protected]>
    raikonenfnu authored Oct 3, 2024
    e0a8fdf
  4. Disable benchmarking on all e2e tests for now (iree-org#189)

    We would like this to be controlled with a flag.
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Oct 3, 2024
    d98e521
  5. Set fail-fast: false (iree-org#190)

    Our tests are flaky; `fail-fast: false` prevents a failing build from
    aborting the others.
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 3, 2024
    a04ea80

Commits on Oct 4, 2024

  1. [TKW] IGEMM Benchmarking (iree-org#187)

    Initial version of IGEMM benchmarking.
    
    * If `--runperf` pytest option is set, generate IREE ref code and run
    both TKW and ref code with `run_bench=True`
    * Add `--dump-perf-files-path` option to save perf info files into
    provided directory (filenames based on test name and params)
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 4, 2024
    207efd9
  2. [TKW] Update IR interpreter (iree-org#182)

    * Add `arith.andi`, `arith.cmpi`, `vector.maskedload`, `vector.gather`,
    `vector.constant_mask`, `vector.insertelement`, `vector.splat`, and support
    for non-splatted constants.
    * Add `interpret_ndrange` helper
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 4, 2024
    64b7d27
  3. [TKW] Implement broadcastOp class, lowering and insertion (iree-org#176)

    The motivation of this PR is to be able to codegen/lower broadcast
    properly. With that in mind, we implemented the following:
    
    1. BroadcastOp class, op and lowering, to represent and store
    broadcasting information, mostly so that we can query the target shape
    information and the source operand of a broadcast.
    2. Treat broadcast-add as an index conflict and handle it by emitting
    broadcastOp.
    
    ---------
    
    Signed-off-by: Stanley Winata <[email protected]>
    raikonenfnu authored Oct 4, 2024
    7617c94
  4. Add ability to dump intermediates (iree-org#194)

    This PR adds a flag to dump intermediates, which include .ll and .s
    files, to see what instructions were generated.
    
    ---------
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Oct 4, 2024
    83bbc40
  5. Split TK CI from main CI (iree-org#195)

    * Main CI is flaky; add a separate pipeline that tests only TK as a
    temporary solution
    * Make `pytest` output more verbose
    * Remove unnecessary steps from the perf pipeline
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 4, 2024
    39acab8
  6. Add parameterization for benchmark flag (iree-org#192)

    Signed-off-by: erman-gurses <[email protected]>
    erman-gurses authored Oct 4, 2024
    4fec47c
  7. da3436d

Commits on Oct 5, 2024

  1. Rename shark-turbine -> iree.turbine (iree-org#197)

    * Move files from `shark-turbine` to `iree/turbine`.
    * Update imports
    * Update `setup.py`
    * Make backward redirect `shark-turbine` -> `iree.turbine` (do we need
    this?)
    
    Progress on iree-org#28
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 5, 2024
    b0ef345
  2. 796f3a5

Commits on Oct 7, 2024

  1. [TKW] Test multiple igemm layouts (iree-org#201)

    * Test `nchw_fchw` and `nhwc_hwcf` igemm conv layouts.
    * Perf test will use `nhwc_hwcf` as IREE seems to produce the best
    result for it.
    
    ---------
    
    Signed-off-by: Ivan Butygin <[email protected]>
    Hardcode84 authored Oct 7, 2024
    c076126
  2. Update documentation (iree-org#199)

    This PR adds more information about the language.
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Oct 7, 2024
    5986c3c

Commits on Oct 8, 2024

  1. Add support for scheduling barriers (iree-org#185)

    This PR adds ops for scheduling barriers and
    scheduling group barriers. These are placed
    after every cycle in the kernel.
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod authored Oct 8, 2024
    f207ca5

Commits on Oct 9, 2024

  1. Update run configurations for gemm test

    Also add the ability to output schedule files
    and use user-modified schedule files.
    
    Signed-off-by: Harsh Menon <[email protected]>
    harsh-nod committed Oct 9, 2024
    0f76d54