
Fix InferenceEndpointsLLM not using cached token #690

Merged
merged 4 commits into develop from fix-inference-endpoints-llm-cached-token
Jun 3, 2024

Conversation

gabrielmbmb
Member

Description

`InferenceEndpointsLLM` was not automatically using the cached Hugging Face token (the one cached by `huggingface-cli login`).
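A minimal sketch of the token resolution order such a fix implies, assuming a hypothetical `resolve_hf_token` helper (the actual argument and attribute names in distilabel may differ); the cached token is read with `huggingface_hub.get_token()`:

```python
import os

from huggingface_hub import get_token


def resolve_hf_token(explicit_token: str | None = None) -> str | None:
    """Resolve the Hugging Face token used to call Inference Endpoints.

    Falls back from an explicitly passed token, to the HF_TOKEN environment
    variable, and finally to the token cached by `huggingface-cli login`.
    """
    if explicit_token is not None:
        return explicit_token
    env_token = os.getenv("HF_TOKEN")
    if env_token:
        return env_token
    # `get_token()` returns the token stored by `huggingface-cli login`,
    # or None if the user never logged in.
    return get_token()
```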

In addition, this PR updates `AsyncLLM.__del__` to only close the async event loop if the loop was newly created by the `AsyncLLM`. Otherwise, calling `close` could raise a `RuntimeError`, because the loop is managed by another library that might still be using it (such as `pytest-asyncio`).
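A minimal sketch of the loop-ownership pattern described above, assuming a simplified `AsyncLLM` with hypothetical attribute names (`_event_loop`, `_new_event_loop`):

```python
import asyncio


class AsyncLLM:
    """Illustrative only: close the event loop on deletion only if this
    instance created it; a loop owned by another library (e.g.
    pytest-asyncio) may still be in use and closing it can raise."""

    def __init__(self) -> None:
        try:
            # Reuse the already-running loop managed elsewhere.
            self._event_loop = asyncio.get_running_loop()
            self._new_event_loop = False
        except RuntimeError:
            # No running loop: create one and remember that we own it.
            self._event_loop = asyncio.new_event_loop()
            self._new_event_loop = True

    def __del__(self) -> None:
        # Only close the loop this instance created; closing a loop managed
        # by another library could raise a RuntimeError.
        if self._new_event_loop and not self._event_loop.is_closed():
            self._event_loop.close()
```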

@gabrielmbmb gabrielmbmb added the fix label Jun 3, 2024
@gabrielmbmb gabrielmbmb added this to the 1.2.0 milestone Jun 3, 2024
@gabrielmbmb gabrielmbmb self-assigned this Jun 3, 2024

codspeed-hq bot commented Jun 3, 2024

CodSpeed Performance Report

Merging #690 will not alter performance

Comparing fix-inference-endpoints-llm-cached-token (27241da) with develop (918c19f)

Summary

✅ 1 untouched benchmark

@gabrielmbmb gabrielmbmb merged commit e61b598 into develop Jun 3, 2024
7 checks passed
@gabrielmbmb gabrielmbmb deleted the fix-inference-endpoints-llm-cached-token branch June 3, 2024 11:17