Move all LLMs to distilabel.models (#1045)
plaguss authored Oct 25, 2024
1 parent 1f75593 commit 7c8976b
Showing 162 changed files with 608 additions and 437 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/docs-pr-close.yml
@@ -8,6 +8,10 @@ concurrency:
  group: distilabel-docs
  cancel-in-progress: false

+permissions:
+  contents: write
+  pull-requests: write
+
jobs:
  cleanup:
    runs-on: ubuntu-latest
4 changes: 4 additions & 0 deletions .github/workflows/docs-pr.yml
@@ -10,6 +10,10 @@ concurrency:
  group: distilabel-docs
  cancel-in-progress: false

+permissions:
+  contents: write
+  pull-requests: write
+
jobs:
  publish:
    runs-on: ubuntu-latest
4 changes: 4 additions & 0 deletions .github/workflows/docs.yml
@@ -12,6 +12,10 @@ concurrency:
  group: distilabel-docs
  cancel-in-progress: false

+permissions:
+  contents: write
+  pull-requests: write
+
jobs:
  publish:
    runs-on: ubuntu-latest
2 changes: 1 addition & 1 deletion README.md
@@ -118,7 +118,7 @@ pip install "distilabel[hf-inference-endpoints]" --upgrade
Then run:

```python
-from distilabel.llms import InferenceEndpointsLLM
+from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
8 changes: 0 additions & 8 deletions docs/api/embedding/embedding_gallery.md

This file was deleted.

7 changes: 0 additions & 7 deletions docs/api/llm/index.md

This file was deleted.

10 changes: 0 additions & 10 deletions docs/api/llm/llm_gallery.md

This file was deleted.

8 changes: 8 additions & 0 deletions docs/api/models/embedding/embedding_gallery.md
@@ -0,0 +1,8 @@
+# Embedding Gallery
+
+This section contains the existing [`Embeddings`][distilabel.models.embeddings] subclasses implemented in `distilabel`.
+
+::: distilabel.models.embeddings
+    options:
+        filters:
+            - "!^Embeddings$"
@@ -4,4 +4,4 @@ This section contains the API reference for the `distilabel` embeddings.

For more information on how the [`Embeddings`][distilabel.models.embeddings] work, see the examples in this section.

-::: distilabel.embeddings.base
+::: distilabel.models.embeddings.base
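
As a usage sketch of the `Embeddings` API documented here (the concrete subclass and model name are illustrative choices, not part of this diff):

```python
from distilabel.models.embeddings import SentenceTransformerEmbeddings

embeddings = SentenceTransformerEmbeddings(
    model="mixedbread-ai/mxbai-embed-large-v1",  # illustrative model choice
)
embeddings.load()

# `encode` returns one embedding vector per input string
results = embeddings.encode(inputs=["distilabel is awesome!"])
```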
7 changes: 7 additions & 0 deletions docs/api/models/llm/index.md
@@ -0,0 +1,7 @@
+# LLM
+
+This section contains the API reference for the `distilabel` LLMs, both for the [`LLM`][distilabel.models.llms.LLM] synchronous implementation, and for the [`AsyncLLM`][distilabel.models.llms.AsyncLLM] asynchronous one.
+
+For more information and examples on how to use existing LLMs or create custom ones, please refer to [Tutorial - LLM](../../../sections/how_to_guides/basic/llm/index.md).
+
+::: distilabel.models.llms.base
10 changes: 10 additions & 0 deletions docs/api/models/llm/llm_gallery.md
@@ -0,0 +1,10 @@
+# LLM Gallery
+
+This section contains the existing [`LLM`][distilabel.models.llms] subclasses implemented in `distilabel`.
+
+::: distilabel.models.llms
+    options:
+        filters:
+            - "!^LLM$"
+            - "!^AsyncLLM$"
+            - "!typing"
8 changes: 4 additions & 4 deletions docs/sections/getting_started/faq.md
@@ -44,13 +44,13 @@ hide:
    You can serve the LLM using a solution like TGI or vLLM, and then connect to it using an `AsyncLLM` client like `InferenceEndpointsLLM` or `OpenAILLM`. Please refer to the [Serving LLMs guide](../how_to_guides/advanced/serving_an_llm_for_reuse.md) for more information.

??? faq "Can `distilabel` be used with [OpenAI Batch API](https://platform.openai.com/docs/guides/batch)?"
-    Yes, `distilabel` is integrated with OpenAI Batch API via [OpenAILLM][distilabel.llms.openai.OpenAILLM]. Check [LLMs - Offline Batch Generation](../how_to_guides/basic/llm/index.md#offline-batch-generation) for a small example on how to use it and [Advanced - Offline Batch Generation](../how_to_guides/advanced/offline_batch_generation.md) for a more detailed guide.
+    Yes, `distilabel` is integrated with OpenAI Batch API via [OpenAILLM][distilabel.models.llms.openai.OpenAILLM]. Check [LLMs - Offline Batch Generation](../how_to_guides/basic/llm/index.md#offline-batch-generation) for a small example on how to use it and [Advanced - Offline Batch Generation](../how_to_guides/advanced/offline_batch_generation.md) for a more detailed guide.

??? faq "Prevent overloads on [Free Serverless Endpoints][distilabel.llms.huggingface.InferenceEndpointsLLM]"
When running a task using the [InferenceEndpointsLLM][distilabel.llms.huggingface.InferenceEndpointsLLM] with Free Serverless Endpoints, you may be facing some errors such as `Model is overloaded` if you let the batch size to the default (set at 50). To fix the issue, lower the value or even better set `input_batch_size=1` in your task. It may take a longer time to finish, but please remember this is a free service.
??? faq "Prevent overloads on [Free Serverless Endpoints][distilabel.models.llms.huggingface.InferenceEndpointsLLM]"
When running a task using the [InferenceEndpointsLLM][distilabel.models.llms.huggingface.InferenceEndpointsLLM] with Free Serverless Endpoints, you may be facing some errors such as `Model is overloaded` if you let the batch size to the default (set at 50). To fix the issue, lower the value or even better set `input_batch_size=1` in your task. It may take a longer time to finish, but please remember this is a free service.

    ```python
-    from distilabel.llms.huggingface import InferenceEndpointsLLM
+    from distilabel.models import InferenceEndpointsLLM
    from distilabel.steps.tasks import TextGeneration

    TextGeneration(
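
The snippet above is truncated in the diff; a minimal complete sketch of the recommended workaround (the model id is a placeholder):

```python
from distilabel.models import InferenceEndpointsLLM
from distilabel.steps.tasks import TextGeneration

# Send one row at a time to the free endpoint to avoid
# "Model is overloaded" errors (the default batch size is 50).
text_generation = TextGeneration(
    llm=InferenceEndpointsLLM(
        model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",  # placeholder model
    ),
    input_batch_size=1,
)
```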
2 changes: 1 addition & 1 deletion docs/sections/getting_started/installation.md
@@ -75,7 +75,7 @@ Additionally, as part of `distilabel` some extra dependencies are available

## Recommendations / Notes

-The [`mistralai`](https://github.com/mistralai/client-python) dependency requires Python 3.9 or higher, so if you want to use the `distilabel.llms.MistralLLM` implementation, you will need Python 3.9 or higher.
+The [`mistralai`](https://github.com/mistralai/client-python) dependency requires Python 3.9 or higher, so if you want to use the `distilabel.models.llms.MistralLLM` implementation, you will need Python 3.9 or higher.

In some cases like [`transformers`](https://github.com/huggingface/transformers) and [`vllm`](https://github.com/vllm-project/vllm), the installation of [`flash-attn`](https://github.com/Dao-AILab/flash-attention) is recommended if you are using a GPU accelerator since it will speed up the inference process, but the installation needs to be done separately, as it's not included in the `distilabel` dependencies.

8 changes: 4 additions & 4 deletions docs/sections/getting_started/quickstart.md
@@ -30,12 +30,12 @@ pip install distilabel[hf-inference-endpoints] --upgrade

## Define a pipeline

-In this guide we will walk you through the process of creating a simple pipeline that uses the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class to generate text. The [`Pipeline`][distilabel.pipeline.Pipeline] will load a dataset that contains a column named `prompt` from the Hugging Face Hub via the step [`LoadDataFromHub`][distilabel.steps.LoadDataFromHub] and then use the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class to generate text based on the dataset using the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task.
+In this guide we will walk you through the process of creating a simple pipeline that uses the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class to generate text. The [`Pipeline`][distilabel.pipeline.Pipeline] will load a dataset that contains a column named `prompt` from the Hugging Face Hub via the step [`LoadDataFromHub`][distilabel.steps.LoadDataFromHub] and then use the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class to generate text based on the dataset using the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task.

> You can check the available models in the [Hugging Face Model Hub](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending) and filter by `Inference status`.
```python
-from distilabel.llms import InferenceEndpointsLLM
+from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
@@ -85,9 +85,9 @@ if __name__ == "__main__":

3. We define a [`LoadDataFromHub`][distilabel.steps.LoadDataFromHub] step named `load_dataset` that will load a dataset from the Hugging Face Hub, as provided via runtime parameters in the `pipeline.run` method below, but it can also be defined within the class instance via the arg `repo_id=...`. This step will produce output batches with the rows from the dataset, and the column `prompt` will be mapped to the `instruction` field.

-4. We define a [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task named `text_generation` that will generate text based on the `instruction` field from the dataset. This task will use the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct`.
+4. We define a [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task named `text_generation` that will generate text based on the `instruction` field from the dataset. This task will use the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct`.

-5. We define the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct` that will be used by the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task. In this case, since the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] is used, we assume that the `HF_TOKEN` environment variable is set.
+5. We define the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct` that will be used by the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task. In this case, since the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] is used, we assume that the `HF_TOKEN` environment variable is set.

6. Both `system_prompt` and `template` are optional fields. The `template` must be provided as a string following the [Jinja2](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) template format, and the fields that appear in it ("instruction" in this case, which corresponds to the default) must be listed in the `columns` attribute. The component gallery for [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) has examples to get you started.

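Putting the notes above together, the pipeline sketched in this guide looks roughly as follows (the `repo_id` passed at runtime is a placeholder for any dataset with a `prompt` column; `HF_TOKEN` is assumed to be set):

```python
from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="simple-text-generation") as pipeline:
    # Rename the dataset's `prompt` column to the `instruction` input (note 3)
    load_dataset = LoadDataFromHub(output_mappings={"prompt": "instruction"})

    # Generate text for each `instruction` (notes 4-5)
    text_generation = TextGeneration(
        llm=InferenceEndpointsLLM(
            model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
        ),
    )

    load_dataset >> text_generation

if __name__ == "__main__":
    distiset = pipeline.run(
        parameters={
            load_dataset.name: {
                "repo_id": "<user>/<dataset>",  # placeholder repo with a `prompt` column
                "split": "train",
            },
        },
    )
```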
4 changes: 2 additions & 2 deletions docs/sections/how_to_guides/advanced/argilla.md
@@ -23,7 +23,7 @@ The dataset will be pushed with the following configuration:
The [`TextGenerationToArgilla`][distilabel.steps.TextGenerationToArgilla] step will only work as is if the [`Pipeline`][distilabel.pipeline.Pipeline] contains one or multiple [`TextGeneration`][distilabel.steps.tasks.TextGeneration] steps, or if the columns `instruction` and `generation` are available within the batch data. Otherwise, the variable `input_mappings` will need to be set so that either both or one of `instruction` and `generation` are mapped to one of the existing columns in the batch data.

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.steps import LoadDataFromDicts, TextGenerationToArgilla
from distilabel.steps.tasks import TextGeneration

@@ -74,7 +74,7 @@ The dataset will be pushed with the following configuration:
Additionally, if the [`Pipeline`][distilabel.pipeline.Pipeline] contains an [`UltraFeedback`][distilabel.steps.tasks.UltraFeedback] step, the `ratings` and `rationales` will also be available and be automatically injected as suggestions to the existing dataset.

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.steps import LoadDataFromDicts, PreferenceToArgilla
from distilabel.steps.tasks import TextGeneration

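
The examples above are truncated in the diff; as a rough sketch of the `input_mappings` remark, in case the batch columns don't match the expected names (all dataset and column names here are hypothetical):

```python
from distilabel.steps import TextGenerationToArgilla

# Map the step's expected `instruction`/`generation` inputs to the
# columns actually present in the batch data (hypothetical names).
to_argilla = TextGenerationToArgilla(
    dataset_name="my-text-generations",
    dataset_workspace="admin",
    input_mappings={"instruction": "prompt", "generation": "completion"},
)
```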
@@ -4,7 +4,7 @@ When dealing with complex pipelines that get executed in a distributed environment

```python
from distilabel.pipeline import Pipeline
-from distilabel.llms import vLLM
+from distilabel.models import vLLM
from distilabel.steps import StepResources
from distilabel.steps.tasks import PrometheusEval

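
A sketch of what this section builds towards, following the imports in the hunk above (the `PrometheusEval` arguments besides `resources` are assumptions based on its documented usage):

```python
from distilabel.models import vLLM
from distilabel.steps import StepResources
from distilabel.steps.tasks import PrometheusEval

# Request two replicas of this task, each scheduled with one CPU and one GPU.
prometheus = PrometheusEval(
    llm=vLLM(model="prometheus-eval/prometheus-7b-v2.0"),
    mode="absolute",
    rubric="factual-validity",
    resources=StepResources(replicas=2, cpus=1, gpus=1),
)
```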
@@ -14,7 +14,7 @@ The [offline batch generation](../basic/llm/index.md#offline-batch-generation)
## Example pipeline using `OpenAILLM` with offline batch generation

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
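
The key piece of the example is the flag that switches the LLM to the Batch API; roughly (the model name is a placeholder):

```python
from distilabel.models import OpenAILLM

# `use_offline_batch_generation=True` routes requests through the
# OpenAI Batch API instead of the regular chat completions endpoint.
llm = OpenAILLM(
    model="gpt-4o-mini",  # placeholder model
    use_offline_batch_generation=True,
)
```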
2 changes: 1 addition & 1 deletion docs/sections/how_to_guides/advanced/scaling_with_ray.md
@@ -41,7 +41,7 @@ pip install distilabel[ray]
For the purpose of explaining how to execute a pipeline with Ray, we'll use the following pipeline throughout the examples:

```python
-from distilabel.llms import vLLM
+from distilabel.models import vLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
@@ -21,7 +21,7 @@ docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
And then we can use `InferenceEndpointsLLM` with `base_url=http://localhost:8080` (pointing to our `TGI` local deployment):

```python
-from distilabel.llms import InferenceEndpointsLLM
+from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration, UltraFeedback
@@ -66,7 +66,7 @@ docker run --gpus all \
And then we can use `OpenAILLM` with `base_url=http://localhost:8000` (pointing to our `vLLM` local deployment):

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration, UltraFeedback
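
Both snippets reduce to pointing a client at the local server; a sketch (the ports match the `docker run` commands above, and the dummy `api_key` for the OpenAI-compatible vLLM server is an assumption):

```python
from distilabel.models import InferenceEndpointsLLM, OpenAILLM

# TGI deployment from the first example, listening on port 8080
tgi_llm = InferenceEndpointsLLM(base_url="http://localhost:8080")

# vLLM deployment from the second example, exposing an OpenAI-compatible
# API on port 8000; the model name must match the served model
vllm_llm = OpenAILLM(
    base_url="http://localhost:8000",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    api_key="dummy",  # assumption: a local server accepts any key
)
```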
14 changes: 7 additions & 7 deletions docs/sections/how_to_guides/advanced/structured_generation.md
@@ -1,12 +1,12 @@
# Structured data generation

-`Distilabel` has integrations with relevant libraries to generate structured text, i.e. to guide the [`LLM`][distilabel.llms.LLM] towards the generation of structured outputs following a JSON schema, a regex, etc.
+`Distilabel` has integrations with relevant libraries to generate structured text, i.e. to guide the [`LLM`][distilabel.models.llms.LLM] towards the generation of structured outputs following a JSON schema, a regex, etc.

## Outlines

-`Distilabel` integrates [`outlines`](https://outlines-dev.github.io/outlines/welcome/) within some [`LLM`][distilabel.llms.LLM] subclasses. At the moment, the following LLMs integrated with `outlines` are supported in `distilabel`: [`TransformersLLM`][distilabel.llms.TransformersLLM], [`vLLM`][distilabel.llms.vLLM] or [`LlamaCppLLM`][distilabel.llms.LlamaCppLLM], so that anyone can generate structured outputs in the form of *JSON* or a parseable *regex*.
+`Distilabel` integrates [`outlines`](https://outlines-dev.github.io/outlines/welcome/) within some [`LLM`][distilabel.models.llms.LLM] subclasses. At the moment, the following LLMs integrated with `outlines` are supported in `distilabel`: [`TransformersLLM`][distilabel.models.llms.TransformersLLM], [`vLLM`][distilabel.models.llms.vLLM] or [`LlamaCppLLM`][distilabel.models.llms.LlamaCppLLM], so that anyone can generate structured outputs in the form of *JSON* or a parseable *regex*.

-The [`LLM`][distilabel.llms.LLM] has an argument named `structured_output`[^1] that determines how we can generate structured outputs with it; let's see an example using [`LlamaCppLLM`][distilabel.llms.LlamaCppLLM].
+The [`LLM`][distilabel.models.llms.LLM] has an argument named `structured_output`[^1] that determines how we can generate structured outputs with it; let's see an example using [`LlamaCppLLM`][distilabel.models.llms.LlamaCppLLM].

!!! Note

@@ -36,7 +36,7 @@ class User(BaseModel):
And then we provide that schema to the `structured_output` argument of the LLM.

```python
-from distilabel.llms import LlamaCppLLM
+from distilabel.models import LlamaCppLLM

llm = LlamaCppLLM(
    model_path="./openhermes-2.5-mistral-7b.Q4_K_M.gguf" # (1)
@@ -129,7 +129,7 @@ These were some simple examples, but one can see the options this opens.

## Instructor

-For other LLM providers behind APIs, there's no direct way of accessing the internal logit processor like `outlines` does, but thanks to [`instructor`](https://python.useinstructor.com/) we can generate structured output from LLM providers based on `pydantic.BaseModel` objects. We have integrated `instructor` to deal with the [`AsyncLLM`][distilabel.llms.AsyncLLM].
+For other LLM providers behind APIs, there's no direct way of accessing the internal logit processor like `outlines` does, but thanks to [`instructor`](https://python.useinstructor.com/) we can generate structured output from LLM providers based on `pydantic.BaseModel` objects. We have integrated `instructor` to deal with the [`AsyncLLM`][distilabel.models.llms.AsyncLLM].

!!! Note
For `instructor` integration to work you may need to install the corresponding dependencies:
@@ -159,7 +159,7 @@ And then we provide that schema to the `structured_output` argument of the LLM:
In this example we are using *Meta Llama 3.1 8B Instruct*; keep in mind that not all models support structured outputs.

```python
-from distilabel.llms import MistralLLM
+from distilabel.models import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
@@ -204,7 +204,7 @@ Contrary to what we have via `outlines`, JSON mode will not guarantee the output
Other than the reference to generating JSON, to ensure the model generates parseable JSON we can pass the argument `response_format="json"`[^3]:

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
llm = OpenAILLM(model="gpt-4-turbo", api_key="api.key")
llm.generate(..., response_format="json")
```
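
Recapping the two integrations covered in this page, a sketch (the `structured_output` dictionaries follow the shapes shown above; model names and paths are placeholders):

```python
from pydantic import BaseModel

from distilabel.models import LlamaCppLLM, MistralLLM

class User(BaseModel):
    name: str
    last_name: str
    id: int

# outlines-backed (local models): a `format` plus a `schema`
local_llm = LlamaCppLLM(
    model_path="./openhermes-2.5-mistral-7b.Q4_K_M.gguf",  # placeholder path
    structured_output={"format": "json", "schema": User},
)

# instructor-backed (API models via `AsyncLLM`): just the `schema`
api_llm = MistralLLM(
    model="mistral-large-latest",  # placeholder model name
    structured_output={"schema": User},
)
```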