
[MLOB-1561] LLM Observability SDK API #4773

Merged 36 commits into sabrenner/llmobs-sdk-release on Oct 24, 2024

Conversation

sabrenner
Collaborator

@sabrenner sabrenner commented Oct 11, 2024

What does this PR do?

Adds an LLM Observability SDK API onto the tracer.

Important Call-Outs

ML Observability reviewers:

  • index.d.ts contains the TypeScript type definitions that constitute our API. I think this is most relevant to what we have been solidifying across our SDKs.
  • sdk.js is the main file for the SDK; it handles the different values passed into the API functions and verifies that the required data is present.

APM Reviewers:

  • packages/dd-trace/src/llmobs/index.js houses the enablement of the LLMObs module. The SDK is always initialized, even when LLMObs isn't enabled (it acts in a no-op state, although it is not the no-op SDK), so writers and channel subscribers live in the module instead. The module is only enabled when a) enable(config) is called from the tracer proxy because LLMObs was enabled during init, or b) llmobs.enable(llmobsConfig) is called after init. This way, no writers, periodic flushing, or span processing/injection subscribers are registered unless LLMObs is explicitly enabled.
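The lazy-enablement pattern described above can be sketched roughly as follows. This is an illustrative sketch, not the actual dd-trace internals: the class name `LLMObsSDK`, the `_writers` array, and the `startSpan` method are hypothetical stand-ins for the real SDK surface.

```javascript
// Sketch: the SDK object always exists, but writers and channel subscribers
// are only wired up when enable() runs, so a disabled SDK pays no
// periodic-flush or subscription cost while still behaving as a no-op.
class LLMObsSDK {
  constructor (config) {
    this._config = config
    this._enabled = false
    this._writers = []
  }

  // called either from the tracer proxy during init (LLMObs enabled up front),
  // or later via llmobs.enable(llmobsConfig)
  enable (llmobsConfig) {
    if (this._enabled) return
    this._enabled = true
    // writers and span-processing subscribers would be registered here
    this._writers.push({ name: 'span-writer', flush () {} })
  }

  startSpan (name) {
    // acts as a no-op when not enabled, without being a separate no-op class
    if (!this._enabled) return null
    return { name }
  }
}

const sdk = new LLMObsSDK({})
console.log(sdk.startSpan('workflow')) // null while disabled
sdk.enable({ mlApp: 'my-app' })
console.log(sdk.startSpan('workflow')) // { name: 'workflow' }
```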

Motivation

Part of the LLM Observability SDK release. This is the last PR for the main SDK. A follow-up PR will be for an OpenAI LLM Obs plugin.


github-actions bot commented Oct 11, 2024

Overall package size

Self size: 7.7 MB
Deduped: 62.44 MB
No deduping: 62.72 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.1.1 | 18.67 MB | 18.68 MB |
| @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB |
| @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.59 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| lru-cache | 7.14.0 | 74.95 kB | 74.95 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@pr-commenter

pr-commenter bot commented Oct 11, 2024

Benchmarks

Benchmark execution time: 2024-10-17 23:55:17

Comparing candidate commit 7c7a328 in PR branch sabrenner/llmobs-sdk-sdk with baseline commit b6452ad in branch sabrenner/llmobs-sdk-release.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.

@sergey-mamulchenko

Hi, we're interested in trying out monitoring for the LLMs used in our Node application. Do you have an idea of when the JS SDK is planned to ship?

@sabrenner sabrenner marked this pull request as ready for review October 16, 2024 17:04
@sabrenner sabrenner requested a review from a team as a code owner October 16, 2024 17:04
@sabrenner
Collaborator Author

Hi @sergey-mamulchenko! We're looking to release this SDK by the end of the month. Feel free to follow along with #4742, as that will be the PR that lands containing the SDK 😄


@lievan lievan left a comment


No blocking comments, ty!

Contributor

@Yun-Kim Yun-Kim left a comment


Other than a small comment about exposing active() and some clarification questions, no major blocking comments from the MLObs team perspective! Great work @sabrenner 🎉


```js
if (result && typeof result.then === 'function') {
  return result.then(value => {
    if (value && kind !== 'retrieval' && !LLMObsTagger.tagMap.get(span)?.[OUTPUT_VALUE]) {
      // ...
    }
  })
}
```
Contributor


will this still attempt to auto-annotate the output for an llm-kind span? Asking because LLM span outputs are in message format, so we should avoid auto-annotating here entirely.

Collaborator Author


ahhh yeah, i actually missed that our _model_decorators do not auto-annotate at all. i'll add a check here for llm, plus an input annotation check for llm and embedding, and write a couple of small regression tests!
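The fix described above could look roughly like the following sketch. The helper names (`shouldAutoAnnotateOutput`, `shouldAutoAnnotateInput`) and the shape of the tag object are hypothetical, not the actual dd-trace code; the point is the kind checks discussed in this thread.

```javascript
// Sketch: skip output auto-annotation for 'llm' and 'retrieval' span kinds
// (llm outputs are in message format, retrieval outputs are documents),
// and skip it when the user has already annotated an output value.
function shouldAutoAnnotateOutput (kind, existingTags) {
  const skipKinds = ['llm', 'retrieval']
  return !skipKinds.includes(kind) && !existingTags?.outputValue
}

// Input auto-annotation is skipped for 'llm' and 'embedding' kinds, whose
// inputs use structured (message/document) formats.
function shouldAutoAnnotateInput (kind, existingTags) {
  const skipKinds = ['llm', 'embedding']
  return !skipKinds.includes(kind) && !existingTags?.inputValue
}
```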

```js
// extracts the argument names from a function string
function parseArgumentNames (str) {
  // ...
}
```
Contributor


This is parsing the argument and function signature from the entire serialized function? 💀

Collaborator Author


yeah 💀 but i memoized it with a weak reference to the function, so if the same function is invoked a million times, it'll reuse the argument names parsed the first time and map them over the arguments accordingly. tbh this might need some iteration, because i tried to anticipate user cases when writing tests but i'm sure i didn't think of everything. it's wrapped in a try/catch so we will never crash, and we'll iterate on it from user reports.

Collaborator Author


would be nice if JS had some built-in parsers/helpers like Python, but these are all pretty basic string operations, so it shouldn't be too time consuming; it's all linear.
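The memoization described in this thread can be sketched with a `WeakMap` keyed by the function object, so repeated invocations of the same function skip re-parsing its serialized source. This is an illustrative sketch, not the dd-trace implementation: the cache name and the crude paren-scan parser are stand-ins, and the real parser has to handle more edge cases (destructuring, defaults containing parens, etc.).

```javascript
// Sketch: memoized argument-name parsing via Function.prototype.toString().
// WeakMap entries are garbage-collected with the function, so the cache
// never leaks.
const argNamesCache = new WeakMap()

function parseArgumentNames (fn) {
  if (argNamesCache.has(fn)) return argNamesCache.get(fn)

  let names = []
  try {
    // crude linear scan over the serialized function: take the text between
    // the first '(' and the next ')', then split on commas
    const src = fn.toString()
    const start = src.indexOf('(')
    const end = src.indexOf(')', start)
    names = src
      .slice(start + 1, end)
      .split(',')
      .map(s => s.trim().split('=')[0].trim()) // drop default values
      .filter(Boolean)
  } catch {
    // never crash the traced application over argument-name extraction
  }

  argNamesCache.set(fn, names)
  return names
}

console.log(parseArgumentNames(function add (a, b = 1) { return a + b })) // [ 'a', 'b' ]
```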

Co-authored-by: Yun Kim <[email protected]>
Member

@Kyle-Verhoog Kyle-Verhoog left a comment


can't speak too much to the llm-specific stuff since i'm still ramping up on everything. Main concern is the soft fails instead of hard ones. Feel free to merge and address in the bigger PR.

@sabrenner
Collaborator Author

Will merge this PR into the feature branch. It looks like a serverless benchmark is failing, but I will resolve that in the final PR if needed.

@sabrenner sabrenner merged commit 7e8e0f7 into sabrenner/llmobs-sdk-release Oct 24, 2024
200 of 202 checks passed
@sabrenner sabrenner deleted the sabrenner/llmobs-sdk-sdk branch October 24, 2024 15:34
6 participants