Telemetry: Add option for a separate trace for every job attempt/start #2942
Comments
In your use case, do you mean that you always have one active job per user?
Basically, a job in this queue exists as long as the user exists; it runs every 5 minutes and is never moved to completed. An illustrative example of our worker that should help explain:

const worker = new Worker(
  "pollMessagesForUser",
  async (job: Job, token?: string) => {
    const userId = job.id;
    await messagesService.fetchAndStoreUserMessages(userId);
    // Re-schedule this same job 5 minutes from now instead of letting it complete.
    await job.moveToDelayed(Date.now() + 300_000, token);
    // Tell the worker the job was delayed rather than completed or failed.
    throw new DelayedError();
  },
  { connection }
);

In the meantime, I was able to work around the issue by patching dist/cjs/utils.js:

diff --git a/dist/cjs/utils.js b/dist/cjs/utils.js
index 6caa7bae682a971ffac6f585aca1b54b9aaa3267..b40cd744c05ca81f8eecdca93b7086fed3c1568a 100644
--- a/dist/cjs/utils.js
+++ b/dist/cjs/utils.js
@@ -238,7 +238,7 @@ async function trace(telemetry, spanKind, queueName, operation, destination, cal
     const { tracer, contextManager } = telemetry;
     const currentContext = contextManager.active();
     let parentContext;
-    if (srcPropagationMetadata) {
+    if (false) {
         parentContext = contextManager.fromMetadata(currentContext, srcPropagationMetadata);
     }
     const spanName = destination ? `${operation} ${destination}` : operation;
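
The patch above hard-codes the check to `false`, so the stored propagation metadata is never used as a parent. A minimal sketch of the same decision made configurable might look like the following. Note that `TraceLinkOptions`, `linkToProducer`, `resolveParentContext`, and `toyManager` are hypothetical names used only for illustration and are not part of BullMQ's API; the `ContextManager` interface mirrors just the two calls visible in the diff.

```typescript
// Hypothetical, simplified model of the parent-context decision in the patched helper.
// Only `active()` and `fromMetadata()` come from the diff; everything else is assumed.
interface ContextManager<C> {
  active(): C;
  fromMetadata(activeContext: C, metadata: string): C;
}

interface TraceLinkOptions {
  // Hypothetical flag: when false, producer metadata is ignored.
  linkToProducer: boolean;
}

function resolveParentContext<C>(
  contextManager: ContextManager<C>,
  srcPropagationMetadata: string | undefined,
  options: TraceLinkOptions,
): C | undefined {
  if (options.linkToProducer && srcPropagationMetadata) {
    return contextManager.fromMetadata(contextManager.active(), srcPropagationMetadata);
  }
  // No explicit parent is set, which is what the patch forces for every job start.
  return undefined;
}

// Toy context manager for demonstration: a "context" is just a string.
const toyManager: ContextManager<string> = {
  active: () => "ambient",
  fromMetadata: (_active, metadata) => `trace:${metadata}`,
};

const linked = resolveParentContext(toyManager, "abc", { linkToProducer: true });
const detached = resolveParentContext(toyManager, "abc", { linkToProducer: false });
console.log(linked, detached);
```

With the flag off, the function behaves like the patched code; with it on, it behaves like the unpatched code. Whether such a switch belongs on the queue, the worker, or the job is exactly what the rest of this thread discusses.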
@lovre-nc what about having an extra option in Queue.add that lets you specify whether the added job should propagate the trace context to the consumer or not?
@manast In our case specifically, I can't think of a scenario where we would want to change this at the job level. In our system, different types of jobs always have their own separate queues. Not sure if this is related enough, but what could be very useful is being able to supply a parent trace or context (not sure about the terminology) manually when adding a job. For example, when a few different jobs are part of a larger operation, this would allow us to have all these related jobs and other non-bullmq spans together in one trace.
(Image: example trace)
@lovre-nc yes, if the service that started the whole process is on a different machine than the service that is adding the job, then I guess you would need to be able to specify the telemetry context so that you can get a trace spanning all these processes. In our tutorial we cover the case where an Express server takes HTTP requests that result in a single trace for the whole process: https://blog.taskforce.sh/how-to-integrate-bullmqs-telemetry-on-a-newsletters-subscription-application-2/
@lovre-nc actually, it is already possible to specify the context metadata when adding a job. We use this option internally precisely to keep all the spans in the same trace.
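
To illustrate what such context metadata could carry, here is a small sketch that assumes a W3C Trace Context `traceparent` string. This thread does not state which format BullMQ's telemetry metadata actually uses, so treat the format as an assumption; `SpanContextLike`, `formatTraceparent`, and `parseTraceparent` are hypothetical helpers, not BullMQ or OpenTelemetry APIs.

```typescript
// Assumed shape of a span context; field names are illustrative only.
interface SpanContextLike {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  sampled: boolean;
}

// Serialize as a W3C `traceparent` value: version-traceId-parentId-flags.
function formatTraceparent(ctx: SpanContextLike): string {
  const flags = ctx.sampled ? "01" : "00";
  return `00-${ctx.traceId}-${ctx.spanId}-${flags}`;
}

// Parse a version-00 `traceparent` value; returns undefined if it is malformed.
function parseTraceparent(header: string): SpanContextLike | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return undefined;
  const traceId = match[1] ?? "";
  const spanId = match[2] ?? "";
  const flags = match[3] ?? "00";
  // All-zero trace or span ids are invalid in the W3C format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// A parent span from the larger operation; jobs that receive this same
// metadata would report spans under the same trace id.
const parent: SpanContextLike = {
  traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
  spanId: "00f067aa0ba902b7",
  sampled: true,
};
const metadata = formatTraceparent(parent);
const parsed = parseTraceparent(metadata);
console.log(metadata, parsed?.traceId === parent.traceId);
```

The point of the sketch is only that the metadata is a serialized parent reference: several jobs created with the same value would share one trace ID, which is what grouping them with other spans in a single trace requires.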
@manast that's perfect, sorry I missed that. The
Is your feature request related to a problem? Please describe.
Hi, we've attempted to migrate from @appsignal/opentelemetry-instrumentation-bullmq to the bullmq-otel package.
It seems that a single trace is re-used for the whole lifespan of a job. This is a problem for us because we have a queue where individual jobs exist and repeat forever. We have one job per user and it runs every 5 minutes by delaying it at the end of the worker instead of letting it complete. That means that these jobs have thousands or millions of starts.
When using the official bullmq telemetry this seems to result in one endlessly long trace with thousands or millions of spans inside it.
This behavior may be useful for queues with one-off jobs, where it lets you see multiple attempts of a job together, but it does not work at all for this kind of repeating job.
Describe the solution you'd like
Ideally this behavior would be configurable per queue.
We'd like to have one trace per job attempt/start instead of one trace per job.
Describe alternatives you've considered
Staying with @appsignal/opentelemetry-instrumentation-bullmq, but we have some issues with it so that's not ideal.
Additional context
None