v0.0.0beta32
What's Changed
- Emit token count metrics to Datadog StatsD by @seanshi-scale in #458
- Downgrade sse-starlette version by @squeakymouse in #478
- Return 400 for botocore client errors by @yunfeng-scale in #479
- Increase Kaniko memory by @saiatmakuri in #481
- Batch job metrics by @yunfeng-scale in #480
- Use base model name as metric tag by @yunfeng-scale in #483
- Change LLM Engine base path from global var by @squeakymouse in #482
- Remove fine-tune limit for internal users by @squeakymouse in #484
- Parallel Python execution for tool completion by @yunfeng-scale in #470
- Allow JSONL for fine-tuning datasets by @squeakymouse in #486
- Fix throughput_benchmarks ITL calculation and add option to use a JSON file of prompts by @seanshi-scale in #485
- Add Model.update() to Python client by @squeakymouse in #490
- Bump idna from 3.4 to 3.7 in /clients/python by @dependabot in #491
- Bump idna from 3.4 to 3.7 in /model-engine by @dependabot in #492
- Properly add Mixtral 8x22B by @yunfeng-scale in #493
- Support Mixtral 8x22B Instruct by @saiatmakuri in #495
- Fix return_token_log_probs on vLLM > 0.3.3 endpoints by @saiatmakuri in #498
- Package update + more docs on dev setup by @dmchoiboi in #500
- Add Llama 3 models by @yunfeng-scale in #501
- Enforce model checkpoints existing for endpoint/bundle creation by @dmchoiboi in #503
- Add guided decoding with grammar by @saiatmakuri in #488
- Catch AsyncEngineDeadError by @ian-scale in #504
- Default include_stop_str_in_output to None by @squeakymouse in #506
- Get latest inference framework tag from ConfigMap by @saiatmakuri in #505
- Add integration tests for completions by @saiatmakuri in #507
- Patch service config identifier by @saiatmakuri in #509
- Require safetensors for LLM endpoint creation by @saiatmakuri in #510
- Add py.typed for proper typechecking support on clients by @dmchoiboi in #513
- Fix package name mapping in setup.py by @dmchoiboi in #514
New Contributors
- @dmchoiboi made their first contribution in #500
Full Changelog: v0.0.0beta28...v0.0.0beta32