[pull] main from vllm-project:main #32

Merged 60 commits on May 31, 2024

Commits on May 21, 2024

  1. Commit 757b62c
  2. [Bugfix] Fix flag name for max_seq_len_to_capture (#4935)

    Signed-off-by: kerthcet <[email protected]>
    kerthcet authored May 21, 2024
    Commit 14772ee
  3. Commit 99eff67

Commits on May 22, 2024

  1. Commit 9b9a10d
  2. Commit 5f6d10c
  3. Commit c74c913
  4. [Kernel] Fixup for CUTLASS kernels in CUDA graphs (#4954)

    Pass the CUDA stream into the CUTLASS GEMMs, to avoid future issues with CUDA graphs
    tlrmchlsmth authored May 22, 2024
    Commit 8674f98
  5. [Misc] Load FP8 kv-cache scaling factors from checkpoints (#4893)

    The 2nd PR for #4532.

    This PR supports loading FP8 kv-cache scaling factors from an FP8 checkpoint (with .kv_scale parameter).
    comaniac authored May 22, 2024
    Commit a3a73ab
  6. Commit 97b0300
  7. Commit eb6d3c2
  8. Commit a36de68
  9. Commit ee3eea0

Commits on May 23, 2024

  1. Commit 6066253
  2. Commit 2ba80be
  3. [Core][1/N] Support send/recv in PyNCCL Groups (#4988)

    Signed-off-by: Muralidhar Andoorveedu <[email protected]>
    andoorve authored May 23, 2024
    Commit 5eda2ea
  4. [Kernel] Initial Activation Quantization Support (#4525)

    Co-authored-by: Varun Sundar Rabindranath <[email protected]>
    Co-authored-by: Varun Sundar Rabindranath <[email protected]>
    3 people authored May 23, 2024
    Commit a124232
  5. Commit e3470f8
  6. [Doc] add ccache guide in doc (#5012)

    Co-authored-by: Michael Goin <[email protected]>
    youkaichao and mgoin authored May 23, 2024
    Commit 6a50f4c

Commits on May 24, 2024

  1. Commit 9197709
  2. Commit e64fde4

Commits on May 25, 2024

  1. [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)

    Co-authored-by: beagleski <[email protected]>
    Co-authored-by: bapatra <[email protected]>
    Co-authored-by: Barun Patra <[email protected]>
    Co-authored-by: Michael Goin <[email protected]>
    5 people authored May 25, 2024
    Commit 8e192ff
  2. Commit 325c119
  3. Commit d5a1697
  4. Commit f17a1a8

Commits on May 27, 2024

  1. [Bugfix / Core] Prefix Caching Guards (merged with main) (#4846)

    Co-authored-by: rsnm2 <[email protected]>
    Co-authored-by: Robert Shaw <[email protected]>
    3 people authored May 27, 2024
    Commit 1102bef
  2. Commit fbdb7b3
  3. Commit 890aa93

Commits on May 28, 2024

  1. [Core] Sliding window for block manager v2 (#4545)

    Co-authored-by: Ruth Evans <[email protected]>
    mmoskal and Ruth Evans authored May 28, 2024
    Commit d4f3985
  2. Commit 9ba4155
  3. [Kernel][ROCm][AMD] Add fused_moe Triton configs for MI300X (#4951)

    This PR adds Triton kernel configs for the MoE kernel for MI300X
    divakar-amd authored May 28, 2024
    Commit dd8de11
  4. Commit 290f4ad
  5. Commit 5ae5ed1

Commits on May 29, 2024

  1. Commit dfba529
  2. [Misc] add gpu_memory_utilization arg (#5079)

    Signed-off-by: pandyamarut <[email protected]>
    pandyamarut authored May 29, 2024
    Commit 616e600
  3. Commit 5bd3c65
  4. Commit 18c1f16
  5. Commit 594392d
  6. [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

    afeldman-nm authored May 29, 2024
    Commit 4238bc8
  7. Commit ae495c7
  8. Commit eecd864
  9. Commit eb6c50c
  10. Commit b1c2556
  11. Commit 7c3604f
  12. [Doc][Build] update after removing vllm-nccl (#5103)

    Co-authored-by: Roger Wang <[email protected]>
    youkaichao and ywang96 authored May 29, 2024
    Commit 4fbcb0f

Commits on May 30, 2024

  1. Commit 5bf185a
  2. [CI/Build] Docker cleanup functionality for amd servers (#5112)

    Co-authored-by: Alexey Kondratiev <[email protected]>
    Co-authored-by: Alexei-V-Ivanov-AMD <[email protected]>
    Co-authored-by: Alexei V. Ivanov <[email protected]>
    Co-authored-by: omkarkakarparthi <okakarpa>
    4 people authored May 30, 2024
    Commit e07aff9
  3. [BUGFIX] [FRONTEND] Correct chat logprobs (#5029)

    Co-authored-by: Breno Faria <[email protected]>
    br3no and br3no authored May 30, 2024
    Commit 87d41c8
  4. Commit d910816
  5. Commit f758505
  6. Commit d79d9ea
  7. Commit a9bcc7a
  8. add doc about serving option on dstack (#3074)

    Co-authored-by: Roger Wang <[email protected]>
    deep-diver and ywang96 authored May 30, 2024
    Commit 429d897
  9. Commit 87a658c
  10. Commit 45a1a69

Commits on May 31, 2024

  1. Commit b35be54
  2. [Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5) (#5136)

    alexm-neuralmagic authored May 31, 2024
    Commit 6d21fa1
  3. Commit 533c217
  4. [Model] Support MAP-NEO model (#5081)

    Co-authored-by: Zhuohan Li <[email protected]>
    xingweiqu and zhuohan123 authored May 31, 2024
    Commit a22dea5
  5. Revert "[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5)" (#5149)

    simon-mo authored May 31, 2024
    Commit e9d3aa0
  6. [Misc]: optimize eager mode host time (#4196)

    Co-authored-by: xuhao <[email protected]>
    FuncSherl and xuhao authored May 31, 2024
    Commit a377f0b