[pull] main from vllm-project:main #32
Commits on May 21, 2024
- 757b62c
- 14772ee [Bugfix] Fix flag name for max_seq_len_to_capture (#4935)
  Signed-off-by: kerthcet
- 99eff67
Commits on May 22, 2024
- 9b9a10d
- 5f6d10c
- c74c913
- 8674f98 [Kernel] Fixup for CUTLASS kernels in CUDA graphs (#4954)
  Pass the CUDA stream into the CUTLASS GEMMs, to avoid future issues with CUDA graphs
- a3a73ab [Misc] Load FP8 kv-cache scaling factors from checkpoints (#4893)
  The 2nd PR for #4532. This PR supports loading FP8 kv-cache scaling factors from an FP8 checkpoint (with .kv_scale parameter).
- 97b0300
- eb6d3c2
- a36de68
- ee3eea0
Commits on May 23, 2024
- 6066253
- 2ba80be
- 5eda2ea [Core][1/N] Support send/recv in PyNCCL Groups (#4988)
  Signed-off-by: Muralidhar Andoorveedu
- a124232 [Kernel] Initial Activation Quantization Support (#4525)
  Co-authored-by: Varun Sundar Rabindranath
- e3470f8 [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985)
  Co-authored-by: Elisei Smirnov
- 6a50f4c [Doc] add ccache guide in doc (#5012)
  Co-authored-by: Michael Goin
Commits on May 24, 2024
- 9197709 [Bugfix] Fix Mistral v0.3 Weight Loading (#5005)
  Co-authored-by: Cody Yu
- e64fde4 [Core][Bugfix]: fix prefix caching for blockv2 (#4764)
  Co-authored-by: Lei Wen
Commits on May 25, 2024
- 8e192ff [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)
  Co-authored-by: beagleski
  Co-authored-by: bapatra
  Co-authored-by: Barun Patra
  Co-authored-by: Michael Goin
- 325c119
- d5a1697
- f17a1a8
Commits on May 27, 2024
- 1102bef [Bugfix / Core] Prefix Caching Guards (merged with main) (#4846)
  Co-authored-by: rsnm2
  Co-authored-by: Robert Shaw
- fbdb7b3
- 890aa93
Commits on May 28, 2024
- d4f3985 [Core] Sliding window for block manager v2 (#4545)
  Co-authored-by: Ruth Evans
- 9ba4155
- dd8de11 [Kernel][ROCm][AMD] Add fused_moe Triton configs for MI300X (#4951)
  This PR adds Triton kernel configs for the MoE kernel for MI300X
- 290f4ad
- 5ae5ed1 [Core] Consolidate prompt arguments to LLM engines (#4328)
  Co-authored-by: Roger Wang
Commits on May 29, 2024
- dfba529
- 616e600 [Misc] add gpu_memory_utilization arg (#5079)
  Signed-off-by: pandyamarut
- 5bd3c65
- 18c1f16
- 594392d
- 4238bc8 [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)
- ae495c7
- eecd864
- eb6c50c
- b1c2556
- 7c3604f
- 4fbcb0f [Doc][Build] update after removing vllm-nccl (#5103)
  Co-authored-by: Roger Wang
Commits on May 30, 2024
- 5bf185a
- e07aff9 [CI/Build] Docker cleanup functionality for amd servers (#5112)
  Co-authored-by: Alexey Kondratiev
  Co-authored-by: Alexei-V-Ivanov-AMD
  Co-authored-by: Alexei V. Ivanov
  Co-authored-by: omkarkakarparthi <okakarpa>
- 87d41c8 [BUGFIX] [FRONTEND] Correct chat logprobs (#5029)
  Co-authored-by: Breno Faria
- d910816
- f758505
- d79d9ea
- a9bcc7a
- 429d897 add doc about serving option on dstack (#3074)
  Co-authored-by: Roger Wang
- 87a658c
- 45a1a69
Commits on May 31, 2024
- b35be54
- 6d21fa1 [Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5) (#5136)
- 533c217
- a22dea5 [Model] Support MAP-NEO model (#5081)
  Co-authored-by: Zhuohan Li
- e9d3aa0 Revert "[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5)" (#5149)
- a377f0b [Misc]: optimize eager mode host time (#4196)
  Co-authored-by: xuhao