Add StepResources docs
gabrielmbmb committed Jun 26, 2024
1 parent 5c685e9 commit 849a806
Showing 5 changed files with 37 additions and 2 deletions.
3 changes: 3 additions & 0 deletions docs/api/step/resources.md
@@ -0,0 +1,3 @@
# StepResources

::: distilabel.steps.base.StepResources
@@ -0,0 +1,30 @@
# Assigning resources to a `Step`

When dealing with complex pipelines that get executed in a distributed environment with abundant resources (CPUs and GPUs), sometimes it's necessary to allocate these resources judiciously among the `Step`s. This is why `distilabel` allows you to specify the number of `replicas`, `cpus` and `gpus` for each `Step`. Let's see that with an example:

```python
from distilabel.pipeline import Pipeline
from distilabel.llms import vLLM
from distilabel.steps import StepResources
from distilabel.steps.tasks import PrometheusEval


with Pipeline(name="resources") as pipeline:
    ...

    prometheus = PrometheusEval(
        llm=vLLM(
            model="prometheus-eval/prometheus-7b-v2.0",
            chat_template="[INST] {{ messages[0]['content'] }}\\n{{ messages[1]['content'] }}[/INST]",
        ),
        # 2 replicas of the task, each one requiring 1 CPU and 1 GPU
        resources=StepResources(replicas=2, cpus=1, gpus=1),
        mode="absolute",
        rubric="factual-validity",
        reference=False,
        num_generations=1,
        group_generations=False,
    )
```

In the example above, we're creating a `PrometheusEval` task (remember that `Task`s are `Step`s) that will use `vLLM` to serve the `prometheus-eval/prometheus-7b-v2.0` model. This task is resource intensive as it requires an LLM, which in turn requires a GPU to run fast. With that in mind, we have specified the `resources` required for the task using the [`StepResources`][distilabel.steps.base.StepResources] class, and we have defined that we need `1` GPU and `1` CPU per replica of the task. In addition, we have defined that we need `2` replicas, i.e. we will run two instances of the task so the computation for the whole dataset runs faster. When running the pipeline, `distilabel` will create the tasks in nodes that have the specified resources available.
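
Since `cpus` and `gpus` are requested per replica, the totals for a step grow with the number of replicas. The snippet below is only a rough sketch of that arithmetic: `ResourceRequest` and `fits` are hypothetical names made up for illustration (they are not part of `distilabel`), and the check treats the free resources as one aggregate pool, which is a simplification of placing replicas on individual nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    """Hypothetical stand-in for a step's resource settings (not distilabel's class)."""

    replicas: int = 1
    cpus: int = 0
    gpus: int = 0

    def total_cpus(self) -> int:
        # Every replica asks for its own CPUs, so the totals multiply.
        return self.replicas * self.cpus

    def total_gpus(self) -> int:
        # Same reasoning for GPUs.
        return self.replicas * self.gpus


def fits(request: ResourceRequest, free_cpus: int, free_gpus: int) -> bool:
    """Return True if all replicas could be placed in an aggregate pool of free resources."""
    return request.total_cpus() <= free_cpus and request.total_gpus() <= free_gpus


# Same numbers as in the `PrometheusEval` example above.
request = ResourceRequest(replicas=2, cpus=1, gpus=1)
print(request.total_cpus(), request.total_gpus())  # 2 2
print(fits(request, free_cpus=8, free_gpus=1))  # False: two GPUs are needed
```

With the values from the example, the step needs two CPUs and two GPUs in total, so an environment with a single free GPU could not host both replicas.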

@@ -111,7 +111,7 @@ These were some simple examples, but one can see the options this opens.

!!! Tip
A full pipeline example can be seen in the following script:
- [`examples/structured_generation_with_outlines.py`](../../pipeline_samples/examples/#llama-cpp-with-outlines)
+ [`examples/structured_generation_with_outlines.py`](../../pipeline_samples/examples/index.md#llamacpp-with-outlines)

[^1]:
You can check the variable type by importing it from:
2 changes: 1 addition & 1 deletion docs/sections/pipeline_samples/examples/index.md
@@ -2,7 +2,7 @@

This section contains example pipelines that showcase different tasks; maybe you can take inspiration from them.

- ### [llama.cpp with `outlines`](#llama-cpp-with-outlines)
+ ### [llama.cpp with `outlines`](#llamacpp-with-outlines)

Generate RPG characters following a `pydantic.BaseModel` with `outlines` in `distilabel`.

2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -160,6 +160,7 @@ nav:
- Using a file system to pass data of batches between steps: "sections/how_to_guides/advanced/fs_to_pass_data.md"
- Using CLI to explore and re-run existing Pipelines: "sections/how_to_guides/advanced/cli/index.md"
- Cache and recover pipeline executions: "sections/how_to_guides/advanced/caching.md"
- Assigning resources to a step: "sections/how_to_guides/advanced/assigning_resources_to_step.md"
- Structured data generation: "sections/how_to_guides/advanced/structured_generation.md"
- Serving an LLM for sharing it between several tasks: "sections/how_to_guides/advanced/serving_an_llm_for_reuse.md"
- Pipeline Samples:
@@ -176,6 +177,7 @@ nav:
- GeneratorStep: "api/step/generator_step.md"
- GlobalStep: "api/step/global_step.md"
- "@step": "api/step/decorator.md"
- StepResources: "api/step/resources.md"
- Step Gallery:
- Argilla: "api/step_gallery/argilla.md"
- Hugging Face: "api/step_gallery/hugging_face.md"