Fail on send telemetry on "warm start" and "shutdown" after timeout #195

Open
Vladikamira opened this issue Feb 19, 2024 · 9 comments

@Vladikamira

Context:
datadog-lambda-extension version 44
The Lambda function and the Datadog site are both located in the EU.

The problem:
It looks like the dd-extension starts a telemetry flush that does not finish before the invocation ends and Lambda enters the IDLE state. We see this as warning and error messages when the function is invoked again after a warm start, or when the shutdown event arrives.
The error/warning appears when more than WaitTimeout (5 seconds) has passed since the previous invocation (I didn't dive deep into the code, but that looks like the relevant threshold).
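
The suspected timing can be illustrated with a minimal sketch. It assumes (this is not confirmed from the extension's code) that a flush request is still in flight when the sandbox is frozen, and that its 5-second timeout keeps counting in wall-clock time while the sandbox is idle. The names `WAIT_TIMEOUT_SECONDS` and `flush_times_out_after_thaw` are hypothetical and exist only for this illustration.

```python
# Threshold observed in the report; assumed here to act as the request deadline.
WAIT_TIMEOUT_SECONDS = 5.0


def flush_times_out_after_thaw(flush_started_at: float, resumed_at: float,
                               timeout: float = WAIT_TIMEOUT_SECONDS) -> bool:
    """Return True if a flush started before the freeze is past its deadline on resume.

    Hypothetical model: the deadline is fixed when the flush starts, and the
    time the sandbox spends frozen (IDLE) counts against it.
    """
    deadline = flush_started_at + timeout
    return resumed_at > deadline


# Next invocation arrives 2 s after the flush started: still within the deadline.
print(flush_times_out_after_thaw(flush_started_at=100.0, resumed_at=102.0))  # False
# Next invocation (or shutdown) arrives 30 s later: the deadline has already passed.
print(flush_times_out_after_thaw(flush_started_at=100.0, resumed_at=130.0))  # True
```

Under this model, any gap between invocations longer than the timeout would surface as a "context deadline exceeded" warning at the start of the next invocation or at shutdown, which matches the pattern described above.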

Example of the WARN:

DD_EXTENSION | WARN | Could not send payload: Post "https://http-intake.logs.datadoghq.eu/api/v2/logs": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Example of the ERROR:

 | UTC | DD_EXTENSION | ERROR | Exporting failed. No more retries left. Dropping data.

Here are some examples in screenshots:

WARN on subsequent invocation:

[Screenshot: WARN on invocation]

[Screenshot: WARN (with debug) on invocation]

WARN on shutdown event:

[Screenshot: WARN on shutdown]

[Screenshot: WARN on shutdown with debug]

ERROR on shutdown (this one seems to be from the OTLP part of the agent):

[Screenshot: ERROR and shutdown event]

@Vladikamira
Author

The problem also reproduces on datadog-lambda-extension version 53 (I cannot check 55 because the image is not published).

@zARODz11z
Contributor

Hi @Vladikamira, I'm able to find the v55 tag here: https://hub.docker.com/r/datadog/lambda-extension/tags. Could you clarify whether we are on the same page about this:

cannot check 55 as the image is not published

@Vladikamira
Author

Vladikamira commented Feb 23, 2024

Hi @Vladikamira, I'm able to find the v55 tag here: https://hub.docker.com/r/datadog/lambda-extension/tags. Could you clarify whether we are on the same page about this:

cannot check 55 as the image is not published

Indeed, I will try the one from Docker Hub, thanks! 👍
We are using the AWS one: https://gallery.ecr.aws/datadog/lambda-extension; the latest version there is 53.

@Vladikamira
Author

Version 55 has the same problem.

[Screenshot 2024-02-23 at 10 41 06]

@hghotra

hghotra commented Sep 10, 2024

Hi @Vladikamira, can you try the latest version of the extension and report back? This was fixed in a recent version of the extension.

@DylanLovesCoffee
Contributor

Upgrading to v59+ should pick up a change that improves the flushing logic for sending logs. Could you give that a try and keep us updated?

@Vladikamira
Author

Thanks! Yup, I can no longer reproduce the issue on v64 🎉

But I now get new WARN messages 😅

2024-09-11 15:09:38 UTC | DD_EXTENSION | WARN | config key flare_stripped_keys is unknown
2024-09-11 15:09:38 UTC | DD_EXTENSION | WARN | failed to get configuration value for key "flare_stripped_keys": unable to cast <nil> of type <nil> to []string
2024-09-11 15:09:38 UTC | DD_EXTENSION | WARN | config key scrubber.additional_keys is unknown
2024-09-11 15:09:38 UTC | DD_EXTENSION | WARN | failed to get configuration value for key "scrubber.additional_keys": unable to cast <nil> of type <nil> to []string

@Vladikamira
Author

Anyway, this issue is fixed, so I'm closing it, thanks! 🙏
I'll open another one for the new issue 😄

@Vladikamira
Author

Sorry, we no longer get these messages after the upgrade to v64:

DD_EXTENSION | WARN | Could not send payload: Post "https://http-intake.logs.datadoghq.eu/api/v2/logs": EOF (Client.Timeout exceeded while awaiting headers)

But we still get these:

 DD_EXTENSION | WARN | SyncForwarder.sendHTTPTransactions failed to send: error while sending transaction, rescheduling it: Post "https://7-55-3-app.agent.datadoghq.eu/api/v1/series": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Screenshot 2024-09-12 at 14 00 09
