Move all LLMs to distilabel.models (#1045)
plaguss authored Oct 25, 2024
1 parent 1f75593 commit 7c8976b
Showing 162 changed files with 608 additions and 437 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/docs-pr-close.yml
@@ -8,6 +8,10 @@ concurrency:
  group: distilabel-docs
  cancel-in-progress: false

+permissions:
+  contents: write
+  pull-requests: write
+
jobs:
  cleanup:
    runs-on: ubuntu-latest
4 changes: 4 additions & 0 deletions .github/workflows/docs-pr.yml
@@ -10,6 +10,10 @@ concurrency:
  group: distilabel-docs
  cancel-in-progress: false

+permissions:
+  contents: write
+  pull-requests: write
+
jobs:
  publish:
    runs-on: ubuntu-latest
4 changes: 4 additions & 0 deletions .github/workflows/docs.yml
@@ -12,6 +12,10 @@ concurrency:
  group: distilabel-docs
  cancel-in-progress: false

+permissions:
+  contents: write
+  pull-requests: write
+
jobs:
  publish:
    runs-on: ubuntu-latest
2 changes: 1 addition & 1 deletion README.md
@@ -118,7 +118,7 @@ pip install "distilabel[hf-inference-endpoints]" --upgrade
Then run:

```python
-from distilabel.llms import InferenceEndpointsLLM
+from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
8 changes: 0 additions & 8 deletions docs/api/embedding/embedding_gallery.md

This file was deleted.

7 changes: 0 additions & 7 deletions docs/api/llm/index.md

This file was deleted.

10 changes: 0 additions & 10 deletions docs/api/llm/llm_gallery.md

This file was deleted.

8 changes: 8 additions & 0 deletions docs/api/models/embedding/embedding_gallery.md
@@ -0,0 +1,8 @@
+# Embedding Gallery
+
+This section contains the existing [`Embeddings`][distilabel.models.embeddings] subclasses implemented in `distilabel`.
+
+::: distilabel.models.embeddings
+    options:
+        filters:
+            - "!^Embeddings$"
@@ -4,4 +4,4 @@ This section contains the API reference for the `distilabel` embeddings.

For more information on how the [`Embeddings`][distilabel.models.embeddings] work, see the examples in this section.

-::: distilabel.embeddings.base
+::: distilabel.models.embeddings.base
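
As a usage sketch of the `Embeddings` API documented here (the concrete subclass and model name are illustrative choices, not part of this diff):

```python
from distilabel.models.embeddings import SentenceTransformerEmbeddings

embeddings = SentenceTransformerEmbeddings(
    model="mixedbread-ai/mxbai-embed-large-v1",  # illustrative model choice
)
embeddings.load()

# `encode` returns one embedding vector per input string
results = embeddings.encode(inputs=["distilabel is awesome!"])
```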
7 changes: 7 additions & 0 deletions docs/api/models/llm/index.md
@@ -0,0 +1,7 @@
+# LLM
+
+This section contains the API reference for the `distilabel` LLMs, both for the [`LLM`][distilabel.models.llms.LLM] synchronous implementation, and for the [`AsyncLLM`][distilabel.models.llms.AsyncLLM] asynchronous one.
+
+For more information and examples on how to use existing LLMs or create custom ones, please refer to [Tutorial - LLM](../../../sections/how_to_guides/basic/llm/index.md).
+
+::: distilabel.models.llms.base
10 changes: 10 additions & 0 deletions docs/api/models/llm/llm_gallery.md
@@ -0,0 +1,10 @@
+# LLM Gallery
+
+This section contains the existing [`LLM`][distilabel.models.llms] subclasses implemented in `distilabel`.
+
+::: distilabel.models.llms
+    options:
+        filters:
+            - "!^LLM$"
+            - "!^AsyncLLM$"
+            - "!typing"
8 changes: 4 additions & 4 deletions docs/sections/getting_started/faq.md
@@ -44,13 +44,13 @@ hide:
    You can serve the LLM using a solution like TGI or vLLM, and then connect to it using an `AsyncLLM` client like `InferenceEndpointsLLM` or `OpenAILLM`. Please refer to the [Serving LLMs guide](../how_to_guides/advanced/serving_an_llm_for_reuse.md) for more information.

??? faq "Can `distilabel` be used with [OpenAI Batch API](https://platform.openai.com/docs/guides/batch)?"
-    Yes, `distilabel` is integrated with OpenAI Batch API via [OpenAILLM][distilabel.llms.openai.OpenAILLM]. Check [LLMs - Offline Batch Generation](../how_to_guides/basic/llm/index.md#offline-batch-generation) for a small example on how to use it and [Advanced - Offline Batch Generation](../how_to_guides/advanced/offline_batch_generation.md) for a more detailed guide.
+    Yes, `distilabel` is integrated with OpenAI Batch API via [OpenAILLM][distilabel.models.llms.openai.OpenAILLM]. Check [LLMs - Offline Batch Generation](../how_to_guides/basic/llm/index.md#offline-batch-generation) for a small example on how to use it and [Advanced - Offline Batch Generation](../how_to_guides/advanced/offline_batch_generation.md) for a more detailed guide.

??? faq "Prevent overloads on [Free Serverless Endpoints][distilabel.llms.huggingface.InferenceEndpointsLLM]"
When running a task using the [InferenceEndpointsLLM][distilabel.llms.huggingface.InferenceEndpointsLLM] with Free Serverless Endpoints, you may be facing some errors such as `Model is overloaded` if you let the batch size to the default (set at 50). To fix the issue, lower the value or even better set `input_batch_size=1` in your task. It may take a longer time to finish, but please remember this is a free service.
??? faq "Prevent overloads on [Free Serverless Endpoints][distilabel.models.llms.huggingface.InferenceEndpointsLLM]"
When running a task using the [InferenceEndpointsLLM][distilabel.models.llms.huggingface.InferenceEndpointsLLM] with Free Serverless Endpoints, you may be facing some errors such as `Model is overloaded` if you let the batch size to the default (set at 50). To fix the issue, lower the value or even better set `input_batch_size=1` in your task. It may take a longer time to finish, but please remember this is a free service.

    ```python
-    from distilabel.llms.huggingface import InferenceEndpointsLLM
+    from distilabel.models import InferenceEndpointsLLM
    from distilabel.steps.tasks import TextGeneration

    TextGeneration(
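
The snippet above is truncated in the diff; a minimal complete sketch of the recommended workaround (the model id is a placeholder):

```python
from distilabel.models import InferenceEndpointsLLM
from distilabel.steps.tasks import TextGeneration

# Send one row at a time to the free endpoint to avoid
# "Model is overloaded" errors (the default batch size is 50).
text_generation = TextGeneration(
    llm=InferenceEndpointsLLM(
        model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",  # placeholder model
    ),
    input_batch_size=1,
)
```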
2 changes: 1 addition & 1 deletion docs/sections/getting_started/installation.md
@@ -75,7 +75,7 @@ Additionally, as part of `distilabel` some extra dependencies are available

## Recommendations / Notes

-The [`mistralai`](https://github.com/mistralai/client-python) dependency requires Python 3.9 or higher, so if you want to use the `distilabel.llms.MistralLLM` implementation, you will need Python 3.9 or higher.
+The [`mistralai`](https://github.com/mistralai/client-python) dependency requires Python 3.9 or higher, so if you want to use the `distilabel.models.llms.MistralLLM` implementation, you will need Python 3.9 or higher.

In some cases like [`transformers`](https://github.com/huggingface/transformers) and [`vllm`](https://github.com/vllm-project/vllm), the installation of [`flash-attn`](https://github.com/Dao-AILab/flash-attention) is recommended if you are using a GPU accelerator since it will speed up the inference process, but the installation needs to be done separately, as it's not included in the `distilabel` dependencies.

8 changes: 4 additions & 4 deletions docs/sections/getting_started/quickstart.md
@@ -30,12 +30,12 @@ pip install distilabel[hf-inference-endpoints] --upgrade

## Define a pipeline

-In this guide we will walk you through the process of creating a simple pipeline that uses the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class to generate text. The [`Pipeline`][distilabel.pipeline.Pipeline] will load a dataset that contains a column named `prompt` from the Hugging Face Hub via the step [`LoadDataFromHub`][distilabel.steps.LoadDataFromHub] and then use the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class to generate text based on the dataset using the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task.
+In this guide we will walk you through the process of creating a simple pipeline that uses the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class to generate text. The [`Pipeline`][distilabel.pipeline.Pipeline] will load a dataset that contains a column named `prompt` from the Hugging Face Hub via the step [`LoadDataFromHub`][distilabel.steps.LoadDataFromHub] and then use the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class to generate text based on the dataset using the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task.

> You can check the available models in the [Hugging Face Model Hub](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending) and filter by `Inference status`.
```python
-from distilabel.llms import InferenceEndpointsLLM
+from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
@@ -85,9 +85,9 @@ if __name__ == "__main__":

3. We define a [`LoadDataFromHub`][distilabel.steps.LoadDataFromHub] step named `load_dataset` that will load a dataset from the Hugging Face Hub, as provided via runtime parameters in the `pipeline.run` method below, but it can also be defined within the class instance via the arg `repo_id=...`. This step will produce output batches with the rows from the dataset, and the column `prompt` will be mapped to the `instruction` field.

-4. We define a [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task named `text_generation` that will generate text based on the `instruction` field from the dataset. This task will use the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct`.
+4. We define a [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task named `text_generation` that will generate text based on the `instruction` field from the dataset. This task will use the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct`.

-5. We define the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct` that will be used by the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task. In this case, since the [`InferenceEndpointsLLM`][distilabel.llms.InferenceEndpointsLLM] is used, we assume that the `HF_TOKEN` environment variable is set.
+5. We define the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] class with the model `Meta-Llama-3.1-8B-Instruct` that will be used by the [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) task. In this case, since the [`InferenceEndpointsLLM`][distilabel.models.llms.InferenceEndpointsLLM] is used, we assume that the `HF_TOKEN` environment variable is set.

6. Both `system_prompt` and `template` are optional fields. The `template` must be provided as a string following the [Jinja2](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) template format, and the fields that appear in it ("instruction" in this case, which corresponds to the default) must be listed in the `columns` attribute. The component gallery for [`TextGeneration`](https://distilabel.argilla.io/dev/components-gallery/tasks/textgeneration/) has examples to get you started.

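Putting the notes above together, the pipeline sketched in this guide looks roughly as follows (the `repo_id` passed at runtime is a placeholder for any dataset with a `prompt` column; `HF_TOKEN` is assumed to be set):

```python
from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="simple-text-generation") as pipeline:
    # Rename the dataset's `prompt` column to the `instruction` input (note 3)
    load_dataset = LoadDataFromHub(output_mappings={"prompt": "instruction"})

    # Generate text for each `instruction` (notes 4-5)
    text_generation = TextGeneration(
        llm=InferenceEndpointsLLM(
            model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
        ),
    )

    load_dataset >> text_generation

if __name__ == "__main__":
    distiset = pipeline.run(
        parameters={
            load_dataset.name: {
                "repo_id": "<user>/<dataset>",  # placeholder repo with a `prompt` column
                "split": "train",
            },
        },
    )
```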
4 changes: 2 additions & 2 deletions docs/sections/how_to_guides/advanced/argilla.md
@@ -23,7 +23,7 @@ The dataset will be pushed with the following configuration:
The [`TextGenerationToArgilla`][distilabel.steps.TextGenerationToArgilla] step will only work as is if the [`Pipeline`][distilabel.pipeline.Pipeline] contains one or multiple [`TextGeneration`][distilabel.steps.tasks.TextGeneration] steps, or if the columns `instruction` and `generation` are available within the batch data. Otherwise, the variable `input_mappings` will need to be set so that either both or one of `instruction` and `generation` are mapped to one of the existing columns in the batch data.

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.steps import LoadDataFromDicts, TextGenerationToArgilla
from distilabel.steps.tasks import TextGeneration

@@ -74,7 +74,7 @@ The dataset will be pushed with the following configuration:
Additionally, if the [`Pipeline`][distilabel.pipeline.Pipeline] contains an [`UltraFeedback`][distilabel.steps.tasks.UltraFeedback] step, the `ratings` and `rationales` will also be available and be automatically injected as suggestions to the existing dataset.

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.steps import LoadDataFromDicts, PreferenceToArgilla
from distilabel.steps.tasks import TextGeneration

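
The examples above are truncated in the diff; as a rough sketch of the `input_mappings` remark, in case the batch columns don't match the expected names (all dataset and column names here are hypothetical):

```python
from distilabel.steps import TextGenerationToArgilla

# Map the step's expected `instruction`/`generation` inputs to the
# columns actually present in the batch data (hypothetical names).
to_argilla = TextGenerationToArgilla(
    dataset_name="my-text-generations",
    dataset_workspace="admin",
    input_mappings={"instruction": "prompt", "generation": "completion"},
)
```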
@@ -4,7 +4,7 @@ When dealing with complex pipelines that get executed in a distributed environment

```python
from distilabel.pipeline import Pipeline
-from distilabel.llms import vLLM
+from distilabel.models import vLLM
from distilabel.steps import StepResources
from distilabel.steps.tasks import PrometheusEval

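
A sketch of what this section builds towards, following the imports in the hunk above (the `PrometheusEval` arguments besides `resources` are assumptions based on its documented usage):

```python
from distilabel.models import vLLM
from distilabel.steps import StepResources
from distilabel.steps.tasks import PrometheusEval

# Request two replicas of this task, each scheduled with one CPU and one GPU.
prometheus = PrometheusEval(
    llm=vLLM(model="prometheus-eval/prometheus-7b-v2.0"),
    mode="absolute",
    rubric="factual-validity",
    resources=StepResources(replicas=2, cpus=1, gpus=1),
)
```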
@@ -14,7 +14,7 @@ The [offline batch generation](../basic/llm/index.md#offline-batch-generation)
## Example pipeline using `OpenAILLM` with offline batch generation

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
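
The key piece of the example is the flag that switches the LLM to the Batch API; roughly (the model name is a placeholder):

```python
from distilabel.models import OpenAILLM

# `use_offline_batch_generation=True` routes requests through the
# OpenAI Batch API instead of the regular chat completions endpoint.
llm = OpenAILLM(
    model="gpt-4o-mini",  # placeholder model
    use_offline_batch_generation=True,
)
```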
2 changes: 1 addition & 1 deletion docs/sections/how_to_guides/advanced/scaling_with_ray.md
@@ -41,7 +41,7 @@ pip install distilabel[ray]
For the purpose of explaining how to execute a pipeline with Ray, we'll use the following pipeline throughout the examples:

```python
-from distilabel.llms import vLLM
+from distilabel.models import vLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub
from distilabel.steps.tasks import TextGeneration
@@ -21,7 +21,7 @@ docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
And then we can use `InferenceEndpointsLLM` with `base_url=http://localhost:8080` (pointing to our `TGI` local deployment):

```python
-from distilabel.llms import InferenceEndpointsLLM
+from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration, UltraFeedback
@@ -66,7 +66,7 @@ docker run --gpus all \
And then we can use `OpenAILLM` with `base_url=http://localhost:8000` (pointing to our `vLLM` local deployment):

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import TextGeneration, UltraFeedback
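
Both snippets reduce to pointing a client at the local server; a sketch (the ports match the `docker run` commands above, and the dummy `api_key` for the OpenAI-compatible vLLM server is an assumption):

```python
from distilabel.models import InferenceEndpointsLLM, OpenAILLM

# TGI deployment from the first example, listening on port 8080
tgi_llm = InferenceEndpointsLLM(base_url="http://localhost:8080")

# vLLM deployment from the second example, exposing an OpenAI-compatible
# API on port 8000; the model name must match the served model
vllm_llm = OpenAILLM(
    base_url="http://localhost:8000",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    api_key="dummy",  # assumption: a local server accepts any key
)
```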
14 changes: 7 additions & 7 deletions docs/sections/how_to_guides/advanced/structured_generation.md
@@ -1,12 +1,12 @@
# Structured data generation

-`Distilabel` has integrations with relevant libraries to generate structured text, i.e. to guide the [`LLM`][distilabel.llms.LLM] towards the generation of structured outputs following a JSON schema, a regex, etc.
+`Distilabel` has integrations with relevant libraries to generate structured text, i.e. to guide the [`LLM`][distilabel.models.llms.LLM] towards the generation of structured outputs following a JSON schema, a regex, etc.

## Outlines

-`Distilabel` integrates [`outlines`](https://outlines-dev.github.io/outlines/welcome/) within some [`LLM`][distilabel.llms.LLM] subclasses. At the moment, the following LLMs integrated with `outlines` are supported in `distilabel`: [`TransformersLLM`][distilabel.llms.TransformersLLM], [`vLLM`][distilabel.llms.vLLM] or [`LlamaCppLLM`][distilabel.llms.LlamaCppLLM], so that anyone can generate structured outputs in the form of *JSON* or a parseable *regex*.
+`Distilabel` integrates [`outlines`](https://outlines-dev.github.io/outlines/welcome/) within some [`LLM`][distilabel.models.llms.LLM] subclasses. At the moment, the following LLMs integrated with `outlines` are supported in `distilabel`: [`TransformersLLM`][distilabel.models.llms.TransformersLLM], [`vLLM`][distilabel.models.llms.vLLM] or [`LlamaCppLLM`][distilabel.models.llms.LlamaCppLLM], so that anyone can generate structured outputs in the form of *JSON* or a parseable *regex*.

-The [`LLM`][distilabel.llms.LLM] has an argument named `structured_output`[^1] that determines how we can generate structured outputs with it; let's see an example using [`LlamaCppLLM`][distilabel.llms.LlamaCppLLM].
+The [`LLM`][distilabel.models.llms.LLM] has an argument named `structured_output`[^1] that determines how we can generate structured outputs with it; let's see an example using [`LlamaCppLLM`][distilabel.models.llms.LlamaCppLLM].

!!! Note

@@ -36,7 +36,7 @@ class User(BaseModel):
And then we provide that schema to the `structured_output` argument of the LLM.

```python
-from distilabel.llms import LlamaCppLLM
+from distilabel.models import LlamaCppLLM

llm = LlamaCppLLM(
    model_path="./openhermes-2.5-mistral-7b.Q4_K_M.gguf" # (1)
@@ -129,7 +129,7 @@ These were some simple examples, but one can see the options this opens.

## Instructor

-For other LLM providers behind APIs, there's no direct way of accessing the internal logit processor like `outlines` does, but thanks to [`instructor`](https://python.useinstructor.com/) we can generate structured output from LLM providers based on `pydantic.BaseModel` objects. We have integrated `instructor` to deal with the [`AsyncLLM`][distilabel.llms.AsyncLLM].
+For other LLM providers behind APIs, there's no direct way of accessing the internal logit processor like `outlines` does, but thanks to [`instructor`](https://python.useinstructor.com/) we can generate structured output from LLM providers based on `pydantic.BaseModel` objects. We have integrated `instructor` to deal with the [`AsyncLLM`][distilabel.models.llms.AsyncLLM].

!!! Note
For `instructor` integration to work you may need to install the corresponding dependencies:
@@ -159,7 +159,7 @@ And then we provide that schema to the `structured_output` argument of the LLM:
In this example we are using *Meta Llama 3.1 8B Instruct*; keep in mind that not all models support structured outputs.

```python
-from distilabel.llms import MistralLLM
+from distilabel.models import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
@@ -204,7 +204,7 @@ Contrary to what we have via `outlines`, JSON mode will not guarantee the output
Other than the reference to generating JSON, to ensure the model generates parseable JSON we can pass the argument `response_format="json"`[^3]:

```python
-from distilabel.llms import OpenAILLM
+from distilabel.models import OpenAILLM
llm = OpenAILLM(model="gpt-4-turbo", api_key="api.key")
llm.generate(..., response_format="json")
```
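
Recapping the two integrations covered in this page, a sketch (the `structured_output` dictionaries follow the shapes shown above; model names and paths are placeholders):

```python
from pydantic import BaseModel

from distilabel.models import LlamaCppLLM, MistralLLM

class User(BaseModel):
    name: str
    last_name: str
    id: int

# outlines-backed (local models): a `format` plus a `schema`
local_llm = LlamaCppLLM(
    model_path="./openhermes-2.5-mistral-7b.Q4_K_M.gguf",  # placeholder path
    structured_output={"format": "json", "schema": User},
)

# instructor-backed (API models via `AsyncLLM`): just the `schema`
api_llm = MistralLLM(
    model="mistral-large-latest",  # placeholder model name
    structured_output={"schema": User},
)
```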