Add `StructuredGeneration` task and support for `grammar` in `InferenceEndpointsLLM` #680 (Merged)
Conversation
- Now the `generate` method in the `LLM` can receive either a chat or a tuple with the chat and the grammar for that chat
- `grammar` is an arg at the `LLM` level
- The `grammar` can be specified per row via the `StructuredGeneration` task, while a global `grammar` can be set through the `grammar` arg within the `LLM` and used via the `TextGeneration` task instead
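As a rough illustration of the new input typing, here is a minimal sketch; the model id, the chat contents, and the grammar dict are assumptions for illustration, not taken from this PR:

```python
from distilabel.llms import InferenceEndpointsLLM

# Hypothetical model id; a valid Hugging Face token is also required to call the API.
llm = InferenceEndpointsLLM(model_id="mistralai/Mistral-7B-Instruct-v0.2")
llm.load()

chat = [{"role": "user", "content": "Return the capital of France as JSON."}]
grammar = {
    "type": "json",
    "value": {
        "type": "object",
        "properties": {"capital": {"type": "string"}},
        "required": ["capital"],
    },
}

# `generate` now accepts either a plain chat or a `(chat, grammar)` tuple per input,
# so a grammar can be attached to a single chat instead of to the whole LLM.
outputs = llm.generate(inputs=[chat, (chat, grammar)])
```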
CodSpeed Performance Report: merging #680 will not alter performance.

plaguss approved these changes on Jun 3, 2024.
Description
This PR adds the `StructuredGeneration` task, similar to the `TextGeneration` one, but also expecting the input `grammar` and producing both the chat-like input and the `grammar` within `format_input`. In order to achieve that, the typing has been updated in `Task.process` and also in `LLM.generate`, so that the received inputs contain either only the chat (for most cases) or both the chat and the `grammar` (for the `StructuredGeneration` case).

Note
This is still a work in progress and subject to change, but at the moment this seems like the most straightforward and intuitive way to do it.
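To make the typing change concrete, here is a minimal sketch of the two input shapes that `Task.process` can now hand over to `LLM.generate`; the chat contents, the regex, and the exact grammar dict layout are illustrative assumptions:

```python
# A standard input: just the chat, as produced e.g. by `TextGeneration.format_input`.
standard_input = [
    {"role": "user", "content": "What's the weather like today in Seattle?"},
]

# A structured input: the chat plus its grammar, as produced by
# `StructuredGeneration.format_input` (assumed `{"type": ..., "value": ...}` dict).
structured_input = (
    [{"role": "user", "content": "What's the weather like today in Seattle?"}],
    {"type": "regex", "value": r"(sunny|cloudy|rainy)"},
)
```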
Additionally, this PR adds the `grammar` arg within `InferenceEndpointsLLM`, so that it can be provided via the init or via a runtime parameter.

Note
The main difference between using the `grammar` arg and using the `StructuredGeneration` task is that the `grammar` arg is intended to be used with any task whenever we want the output to match a certain format, e.g. in `UltraFeedback` we may want the output to match a certain regex to avoid output parsing issues. The `StructuredGeneration` task, on the other hand, is intended for when we have a different grammar per row and want to generate an output for the given instruction based on that row's grammar, e.g. a function calling scenario where we want each generation for a given instruction to match a certain function schema.

Examples
`grammar` at `LLM`-level (same `grammar` for every generation)
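A minimal sketch of what this could look like, with the same `grammar` set on the `LLM` and applied through a regular `TextGeneration` task; the pipeline wiring, model id, data, and regex below are assumptions for illustration:

```python
from distilabel.llms import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="llm-level-grammar") as pipeline:
    load_data = LoadDataFromDicts(
        data=[
            {"instruction": "Classify the sentiment of: 'I loved this movie!'"},
            {"instruction": "Classify the sentiment of: 'The plot was dull.'"},
        ],
    )
    text_generation = TextGeneration(
        llm=InferenceEndpointsLLM(
            model_id="mistralai/Mistral-7B-Instruct-v0.2",
            # The same grammar constrains every generation made through this LLM.
            grammar={"type": "regex", "value": r"(positive|negative|neutral)"},
        ),
    )
    load_data >> text_generation

if __name__ == "__main__":
    distiset = pipeline.run()
```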
`grammar` via `StructuredGeneration` (one `grammar` per row)
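And a minimal sketch of the per-row case, where each input row carries its own `grammar` and the `StructuredGeneration` task forwards it together with the chat; the column names, model id, and grammars are again illustrative assumptions:

```python
from distilabel.llms import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import StructuredGeneration

with Pipeline(name="per-row-grammar") as pipeline:
    load_data = LoadDataFromDicts(
        data=[
            {
                "instruction": "Extract the name and age from: 'Alice is 30 years old.'",
                # Each row carries its own grammar, here a JSON schema.
                "grammar": {
                    "type": "json",
                    "value": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "age": {"type": "integer"},
                        },
                        "required": ["name", "age"],
                    },
                },
            },
            {
                "instruction": "Answer with a single word: is the sky blue?",
                "grammar": {"type": "regex", "value": r"(yes|no)"},
            },
        ],
    )
    structured_generation = StructuredGeneration(
        llm=InferenceEndpointsLLM(model_id="mistralai/Mistral-7B-Instruct-v0.2"),
    )
    load_data >> structured_generation

if __name__ == "__main__":
    distiset = pipeline.run()
```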