Released in #324
Problem
Hatchet steps often call external services, such as OpenAI, that impose their own rate limits on requests per second (RPS) or requests per minute (RPM). Currently, Hatchet lacks a built-in mechanism to enforce these limits across workflow runs, so steps can exceed a provider's quota and be throttled or fail.
To address these problems, we propose a feature that allows users to define rate limits at the tenant level and consume them in steps, ensuring compliance with external service limits and providing better control over workflow execution.
Objectives
Proposed Solution
Rate Limit Definition
Introduce a new top-level object called `rateLimits`, which can be declared via the Hatchet client. This will be an object where the keys represent the rate limit identifiers and the values are the rate limit configurations. Example usage:
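A minimal sketch of what such a declaration could look like. The field names (`limit`, `duration`, `key`) and the config shape are illustrative assumptions for this proposal, not a confirmed Hatchet API:

```typescript
// Hypothetical shape for the proposed top-level rateLimits object.
// Keys are rate limit identifiers; values are rate limit configurations.
type RateLimitConfig = {
  limit: number;                     // max units per window (assumed field)
  duration: "second" | "minute";     // window size (assumed field)
  key?: (input: any) => string;      // optional per-key partitioning function
};

const rateLimits: Record<string, RateLimitConfig> = {
  // No `key` function: a single global limit shared across all workflow runs.
  openaiLimit: { limit: 100, duration: "minute" },
  // With a `key` function: one independent limit per key value (here, per user).
  twitterUserLimit: {
    limit: 5,
    duration: "minute",
    key: (input) => input.userId,
  },
};
```
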
If the `key` function is not specified for a rate limit, it will be treated as a global rate limit shared across all workflow runs. Otherwise, the limit will only apply to requests that resolve to the same key.
Rate Limit Consumption
Introduce a new field called `rateLimitConsumers` in the step definition object. This field will be an object where the keys represent the rate limit policy IDs and the values specify the number of units consumed by the step. Example usage:
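A sketch reconstructing the example the text below refers to. The step-definition shape is an illustrative assumption; only the `rateLimitConsumers` field is part of the proposal:

```typescript
// Hypothetical step definitions showing the proposed rateLimitConsumers field.
type StepDef = {
  name: string;
  rateLimitConsumers?: Record<string, number>; // policy ID -> units consumed
};

const steps: StepDef[] = [
  {
    name: "step1",
    // consumes 1 unit of the openaiLimit policy per run
    rateLimitConsumers: { openaiLimit: 1 },
  },
  {
    name: "step2",
    // consumes 2 units of openaiLimit and 1 unit of twitterUserLimit per run
    rateLimitConsumers: { openaiLimit: 2, twitterUserLimit: 1 },
  },
];
```
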
In this example, `step1` consumes 1 unit of the `openaiLimit` rate limit policy, while `step2` consumes 2 units of the `openaiLimit` policy and 1 unit of the `twitterUserLimit` policy.

By using an object instead of an array, we allow steps to specify the exact number of units they consume for each rate limit policy. This provides more granular control over rate limit consumption and enables the Hatchet engine to enforce the limits accurately.
The `rateLimitConsumers` field is optional, allowing steps to opt in to rate limit consumption as needed. If a step does not specify the `rateLimitConsumers` field or does not include a specific rate limit policy ID, it will not consume any units from that policy.

It's important to note that the rate limit policy IDs used in the `rateLimitConsumers` field should match the IDs defined in the rate limit configuration. The Hatchet engine will use these IDs to look up the corresponding rate limit policies and enforce the limits based on the specified consumption units.

Workflow Gating
Workflows can be rate limited by specifying `rateLimitConsumers` on the first step in the workflow DAG, which gates the entire run behind that step's limits.
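A sketch of this gating pattern, under the same assumed step shape as above. Only the entry step declares `rateLimitConsumers`, so every run of the workflow is throttled before any downstream work starts:

```typescript
// Hypothetical workflow definition; the workflow/step shapes are illustrative.
type GatedStep = {
  name: string;
  parents?: string[];                          // DAG edges
  rateLimitConsumers?: Record<string, number>; // policy ID -> units consumed
};

const workflow: { name: string; steps: GatedStep[] } = {
  name: "tweet-summarizer",
  steps: [
    // The first step in the DAG consumes the limit, gating the whole run.
    { name: "fetchTweets", rateLimitConsumers: { twitterUserLimit: 1 } },
    // Downstream steps consume nothing; they run once the entry step is admitted.
    { name: "summarize", parents: ["fetchTweets"] },
  ],
};
```
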
Rate Limit Enforcement
The Hatchet engine will enforce the rate limits based on the tenant-level rate limit definitions and the step-level consumption declarations.
For each rate limit consumed by a step:

- If a `key` function is specified, the engine will maintain a per-key rate limiter based on the key value returned by the function.
- If no `key` function is specified, the engine will use a global rate limiter shared across all workflow runs.

When a step is ready to be executed, the engine will check the rate limiters associated with the step's consumed rate limits. If any of the rate limiters indicate that the limit has been reached, the engine will delay the execution of the step until the rate limit allows for the next request.
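A minimal sketch of this check, assuming one usage counter per (policy, key) pair within the current window; the names and structure are illustrative, not Hatchet engine internals:

```typescript
// Units consumed in the current window, keyed by "policyId:key".
// Per-key limits use the value returned by the `key` function;
// global limits use the empty string as their key.
const windowUsage = new Map<string, number>();

function tryConsume(
  policyId: string,
  key: string,    // "" for global limits (no key function)
  units: number,  // units the step declares in rateLimitConsumers
  limit: number,  // max units allowed in the window
): boolean {
  const bucket = `${policyId}:${key}`;
  const used = windowUsage.get(bucket) ?? 0;
  if (used + units > limit) return false; // limit reached: delay the step
  windowUsage.set(bucket, used + units);
  return true; // step may execute now
}
```

Note that per-key limits naturally isolate tenants' keys from each other: exhausting one key's bucket leaves other keys unaffected.
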
Rate Limit Monitoring and Logging
Hatchet will provide monitoring and logging capabilities for rate limit usage and violations.
Risks and Considerations