Pernekhan Utemuratov (Pernekhan)

vllm-project/vllm (Public)

A high-throughput and memory-efficient inference and serving engine for LLMs

Python, 28.8k stars, 4.3k forks

NVIDIA/TensorRT-LLM (Public)

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++, 8.5k stars, 957 forks

triton-inference-server/tensorrtllm_backend (Public)

The Triton TensorRT-LLM Backend

Python, 687 stars, 101 forks

templates (Public)

Vim Script