Workflow status update delayed by 60 seconds due to dual locks when task is failed with lock enabled #295

Open
pugazhenthi-elangovan-E3338 opened this issue Oct 22, 2024 · 0 comments


Describe the bug
Whenever a task fails, the workflow status has to be updated to failed. When workflowExecutionLockEnabled is set to true, this flow tries to acquire a new lock on top of the existing lock without releasing it first. As a result, the workflow's transition to the failed status takes lockLeaseTime (default 60 seconds), which is not the intended behavior.
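
To illustrate the suspected mechanism, here is a minimal toy model in Python. It is not Conductor's code: `LeaseLock`, `fail_workflow_delay`, and the `wf-1` key are hypothetical names, and the model simply assumes a non-reentrant lock whose second acquisition has to wait until the first lease expires. Time is simulated, so nothing actually sleeps.

```python
from dataclasses import dataclass, field

LEASE_SECONDS = 60.0  # assumed to mirror the default lockLeaseTime mentioned above


@dataclass
class LeaseLock:
    """Toy non-reentrant lease lock keyed by workflow id (hypothetical, not Conductor's lock)."""
    lease_seconds: float = LEASE_SECONDS
    _expiry: dict = field(default_factory=dict)  # lock key -> simulated time the lease expires

    def acquire(self, key: str, now: float) -> float:
        """Return the simulated time at which the lock is granted."""
        # An unexpired lease forces the caller to wait until that lease lapses.
        granted_at = max(now, self._expiry.get(key, now))
        self._expiry[key] = granted_at + self.lease_seconds
        return granted_at

    def release(self, key: str) -> None:
        """Drop the lease so the next acquire is granted immediately."""
        self._expiry.pop(key, None)


def fail_workflow_delay(release_first: bool) -> float:
    """Seconds between the task failure and the workflow status update getting the lock."""
    lock = LeaseLock()
    task_failed_at = 0.0
    lock.acquire("wf-1", now=task_failed_at)  # lock taken while the failed task is processed
    if release_first:
        lock.release("wf-1")  # expected flow: release before acquiring again
    granted_at = lock.acquire("wf-1", now=task_failed_at)  # second acquisition for the status update
    return granted_at - task_failed_at


print(fail_workflow_delay(release_first=False))  # 60.0
print(fail_workflow_delay(release_first=True))   # 0.0
```

Under these assumptions, acquiring without releasing reproduces the 60-second delay, while releasing first lets the status update proceed immediately.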

Details
Conductor version: v3.19.0
Persistence implementation: Postgres
Queue implementation: Postgres
Lock: Postgres
Workflow definition:

{
  "createTime": 1729257249830,
  "accessPolicy": {},
  "name": "google_error_hit",
  "description": "Edit or extend this sample workflow. Set the workflow name to get started",
  "version": 1,
  "tasks": [
    {
      "name": "call_remote_api",
      "taskReferenceName": "call_remote_api",
      "inputParameters": {
        "http_request": {
          "uri": "https://google-wrong.com",
          "method": "GET"
        }
      },
      "type": "HTTP",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false,
      "permissive": false
    }
  ],
  "inputParameters": [],
  "outputParameters": {
    "data": "${call_remote_api.output.response.body.data}"
  },
  "schemaVersion": 1,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "[email protected]",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {}
}

To Reproduce
Steps to reproduce the behavior:

  1. Start Conductor with the app property conductor.app.workflowExecutionLockEnabled set to true.
  2. Save the provided workflow in 'Workflow Definitions'.
  3. Click on 'Workbench' and execute the google_error_hit workflow.
  4. Click on the new execution created at the top right.
  5. See the task error and failed status.
  6. The workflow is marked as failed only 60 seconds after the task failure.

Expected behavior
The workflow should be marked as failed immediately after the task failure, as per the workflow execution flow handled in the code.
