
[Refactoring] Enhancing Flexibility in LiteLLMClient Initialization #19

Open
brunocapelao opened this issue Jan 6, 2024 · 0 comments
Labels: enhancement (New feature or request)

Detailed Description:

The LiteLLMClient class currently accepts only one configuration option at initialization: the LiteLLM model to use. This prevents users from tuning the client's behavior to their specific requirements. We propose refactoring LiteLLMClient to accept a broader range of configuration parameters at initialization, giving users more control over its behavior and output.

The following optional fields should be added to the LiteLLMClient initialization to allow users to customize the behavior of the client (a sketch of a possible initializer follows the list):

  • temperature (number or null): Specifies the sampling temperature, ranging from 0 to 2. Higher values like 0.8 produce more random outputs, while lower values like 0.2 make outputs more focused and deterministic.
  • top_p (number or null): An alternative to temperature sampling; the model considers only the tokens comprising the top_p probability mass. For example, 0.1 means only tokens in the top 10% of probability mass are considered.
  • n (integer or null): Determines the number of chat completion choices to generate for each input message.
  • stream (boolean or null): If set to true, it sends partial message deltas, allowing tokens to be sent as they become available. The stream is terminated by a [DONE] message.
  • stop (string/array/null): Specifies up to 4 sequences where the API will stop generating further tokens.
  • max_tokens (integer): Sets the maximum number of tokens to generate in the chat completion.
  • presence_penalty (number or null): Number between -2.0 and 2.0; positive values penalize new tokens based on whether they appear in the text so far, encouraging the model to cover new topics.
  • response_format (object): Specifies the format the model must output, enabling JSON mode with { "type": "json_object" }.
  • seed (integer or null): In beta, specifies a seed for deterministic sampling, making repeated requests with the same seed and parameters return the same result.
  • tools (array): Lists tools the model may call; currently only functions are supported. Each tool object contains:
      • type (string): The type of the tool (currently only "function" is supported).
      • function (object): Required; describes the function the model may call.
  • tool_choice (string/object): Controls which function, if any, is called by the model (options: "none", "auto", or a specific function).
  • frequency_penalty (number or null): Number between -2.0 and 2.0; positive values penalize new tokens based on their existing frequency in the text so far, reducing verbatim repetition.
  • logit_bias (map): Modifies the probability of specific tokens appearing in the completion.
  • user (string): A unique identifier representing your end-user.
  • timeout (int): Timeout in seconds for completion requests (defaults to 600 seconds).
  • logprobs (bool): Determines whether to return log probabilities of the output tokens.
  • top_logprobs (int): Specifies the number of most likely tokens to return at each token position if logprobs is set to true (ranges from 0 to 5).
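
As a concrete starting point, here is a minimal sketch of what the refactored initializer could look like. The class internals and the `complete` method name are assumptions (the current implementation is not shown here); the parameter names and the 600-second timeout default follow the LiteLLM completion input docs linked below.

```python
import litellm


class LiteLLMClient:
    """Sketch of the refactored client; internals here are assumptions."""

    def __init__(
        self,
        model: str,
        temperature: float | None = None,
        top_p: float | None = None,
        n: int | None = None,
        stream: bool | None = None,
        stop: str | list[str] | None = None,
        max_tokens: int | None = None,
        presence_penalty: float | None = None,
        frequency_penalty: float | None = None,
        response_format: dict | None = None,
        seed: int | None = None,
        tools: list[dict] | None = None,
        tool_choice: str | dict | None = None,
        logit_bias: dict | None = None,
        user: str | None = None,
        timeout: int = 600,
        logprobs: bool | None = None,
        top_logprobs: int | None = None,
    ):
        self.model = model
        options = {
            "temperature": temperature,
            "top_p": top_p,
            "n": n,
            "stream": stream,
            "stop": stop,
            "max_tokens": max_tokens,
            "presence_penalty": presence_penalty,
            "frequency_penalty": frequency_penalty,
            "response_format": response_format,
            "seed": seed,
            "tools": tools,
            "tool_choice": tool_choice,
            "logit_bias": logit_bias,
            "user": user,
            "logprobs": logprobs,
            "top_logprobs": top_logprobs,
        }
        # Keep only the options the caller actually set, so omitted fields
        # are never sent to the API (preserves today's model-only behavior).
        self.completion_kwargs = {k: v for k, v in options.items() if v is not None}
        self.completion_kwargs["timeout"] = timeout

    def complete(self, messages: list[dict]):
        # litellm.completion accepts these OpenAI-compatible keyword arguments.
        return litellm.completion(
            model=self.model, messages=messages, **self.completion_kwargs
        )
```

Filtering out unset options keeps the request payload identical to today's for callers that pass only a model, which is what the backward-compatibility criterion below requires.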

By incorporating these optional fields into the LiteLLMClient initialization, we aim to provide users with a more flexible and customizable experience when working with the LiteLLMClient.
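
With a signature like the sketch above, existing single-argument callers keep working, and new callers can opt in to, for example, more deterministic and bounded output (the model name is illustrative):

```python
client = LiteLLMClient(
    model="gpt-3.5-turbo",
    temperature=0.2,
    max_tokens=256,
    seed=42,
)
response = client.complete(
    [{"role": "user", "content": "Summarize LiteLLM in one sentence."}]
)
print(response.choices[0].message.content)
```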

Acceptance Criteria:

  • LiteLLMClient should be able to initialize with the optional fields mentioned above.
  • API calls should correctly reflect the settings passed during initialization.
  • The refactoring should not break compatibility with existing implementations that use LiteLLMClient.
  • Updated documentation should accurately describe the new configuration options and their usage.
  • Unit tests should be implemented to validate the new functionality and ensure there are no regressions (see the test sketch below).
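
To make the last two criteria concrete, here is a rough sketch of tests that intercept litellm.completion with unittest.mock; it assumes the hypothetical LiteLLMClient sketched above:

```python
from unittest.mock import patch

# LiteLLMClient here refers to the hypothetical class sketched above.


def test_init_options_reach_the_api():
    client = LiteLLMClient(model="gpt-3.5-turbo", temperature=0.2, seed=42)
    with patch("litellm.completion") as mock_completion:
        client.complete([{"role": "user", "content": "hi"}])
    _, kwargs = mock_completion.call_args
    assert kwargs["temperature"] == 0.2
    assert kwargs["seed"] == 42
    # Options the caller never set must not be sent at all.
    assert "top_p" not in kwargs


def test_model_only_init_still_works():
    # Existing callers pass only a model name; this must keep working.
    client = LiteLLMClient(model="gpt-3.5-turbo")
    with patch("litellm.completion") as mock_completion:
        client.complete([{"role": "user", "content": "hi"}])
    _, kwargs = mock_completion.call_args
    assert kwargs["model"] == "gpt-3.5-turbo"
```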

https://docs.litellm.ai/docs/completion/input

@brunocapelao added the enhancement label on Jan 6, 2024