-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend workflowcheck tool to detect reads from (WorkflowExecution).RunId and suggest alternatives. #1385
Comments
Can you clarify the scenario? Workflow retry starts a whole new workflow so the fact that RunID changes should not lead to non determinism |
@Quinn-With-Two-Ns It leads to non-determinism if you use the value of RunID as an input to an activity or child workflow. For example, we were rendering the URL to the Temporal UI before kicking off a child workflow that creates a pull request, linking our developers back to the workflow that was created for them. In my scenario, the workflow had previously passed this step successfully so during replay the input value to the child workflow had changed. Edit: In fact this journey led me this field, which documents itself as a more viable alternative. |
Activities and Child workflows are not carried over retries. Non determinism only applies to a single workflow execution. A retried workflow can take a totally different path then the parent. It sounds like your trying to use RunID as an idempotency token across workflow retries, which is an error in your code, but is not non determinism. The workflow check is designed to catch non determinism errors not all possible programming errors, |
Note: in general we do not recommend workflow retries, we recommend handling errors in your workflow. |
I don't follow what you're saying here. The input values need to match during the execution of a workflow task, or you get a non-determinism error when you reach the point in the workflow task when it attempts to run the activity or child workflow, right? In my workflow I execute a child workflow at a point in the event stream, and during the retry the input value is not the same as it was during the previous workflow task because it refers to a Run ID, which has changed due to the retry. To be clear, I don't expect (or desire) that the child workflow is actually run again - we're talking about replaying the event stream to determine the next action for the workflow to take.
In my case I had a workflow fail near the end of it's run due to a program defect. I've corrected the defect and would like to replay the workflow from the point of failure. Are you suggesting that the workflow retry command is not intended for me to do that? What alternative methods are suggested for resuming my terminated, failed workflow execution so that I can complete the intended actions? Edit: Wait, I now see the confusion in the second scenario - I've been saying "retry", I mean workflow "reset". That's what I'm trying to do 😓. I am not using workflow retries at all. |
Ah OK, reset makes more sense
No, input is not checked when the SDK is validating a workflow is deterministic. |
Oh? If that's so that I actually am back at square one and don't understand this determinism error! Will keep digging. |
Only the |
@Quinn-With-Two-Ns You're definitely pointing me in the right direction here, I updated my code to use OriginalRunId and I'm still seeing this error:
That ID is the original Run ID, plus the Event ID where the child was spawned. My current thesis is that since I'm not setting a workflow ID on the child workflow execution, it's comparing a newly generated child workflow ID that does not match that value. |
I think the interesting piece of code is: https://github.com/temporalio/sdk-go/blob/master/internal/internal_event_handlers.go#L538-L540. func (wc *workflowEnvironmentImpl) ExecuteChildWorkflow(
params ExecuteWorkflowParams, callback ResultHandler, startedHandler func(r WorkflowExecution, e error),
) {
if params.WorkflowID == "" {
params.WorkflowID = wc.workflowInfo.WorkflowExecution.RunID + "_" + wc.GenerateSequenceID()
} If I understand this correctly, any child workflow execution without a deterministically set Workflow ID will make the parent execution not "reset safe". |
Ah if your using child workflow without an ID then this is a know issue #723. I would recommend setting your own ID for now since as the linked issue says the SDK generated ID needs to not conflict with future server work that is not currently prioritized. |
Awesome, thanks @Quinn-With-Two-Ns! I'll update our internal guidelines to always set child workflow IDs in the meantime. I wonder if that is a useful check for |
Is your feature request related to a problem? Please describe.
I've recently learned that using the value of RunId within workflow task code is non-deterministic under workflow
retryreset scenarios.Describe the solution you'd like
I'd love for the workflowcheck tool to detect reads of this particular property within workflow functions and suggest use of the WorkflowInfo.OriginalRunId (or FirstRunId) field instead.
Describe alternatives you've considered
N/A.
Additional context
N/A.
The text was updated successfully, but these errors were encountered: