[Donation Proposal]: compile time instrumentation for golang #2344

Open
ralf0131 opened this issue Sep 11, 2024 · 2 comments
Labels
triage:deciding This issue needs more discussion or consideration.

Comments

ralf0131 commented Sep 11, 2024

Description

This is the formal donation proposal based on the previous discussion: #1961

Alibaba Cloud would like to donate its compile time instrumentation for Golang to the OpenTelemetry community. The project is a compile time instrumentation solution designed for Go applications. It lets users harness the capabilities of OpenTelemetry for enhanced observability without any manual code modifications.

The core features of this project are:

  • No user-side code modifications are required for the instrumentation to work.
  • Supports asynchronous context propagation, so users do not need to ensure that the context is passed correctly.
  • If users have previously added custom spans using the OTel SDK in their code, the approach integrates these custom spans into the generated trace data, which significantly reduces the effort of migrating from manual instrumentation to compile time instrumentation.
  • Minor performance impact: 5% CPU overhead, negligible memory footprint, and < 1 ms response time overhead.
  • Inspired by the OpenTelemetry Java Agent, comprehensive tests, such as muzzle checks, have been done to make sure the code works as expected.
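
To illustrate the asynchronous context propagation point above, here is a minimal sketch, using only the standard library, of the manual burden that the feature is meant to remove. The `traceKey` type and the `withTraceID`, `traceIDFrom`, `runAsync`, and `demo` helpers are hypothetical names made up for this illustration; they are not part of the donated project or of the OTel SDK.

```go
package main

import (
	"context"
	"fmt"
)

// traceKey is a hypothetical context key used only in this sketch.
type traceKey struct{}

// withTraceID stores a trace ID in the context.
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceKey{}, id)
}

// traceIDFrom reads the trace ID back, or returns "" if none is set.
func traceIDFrom(ctx context.Context) string {
	id, _ := ctx.Value(traceKey{}).(string)
	return id
}

// runAsync starts a goroutine and reports the trace ID that goroutine sees.
// With manual instrumentation, the caller must remember to hand ctx over;
// if a fresh context is used instead, the trace linkage is silently lost.
func runAsync(ctx context.Context, passContext bool) string {
	child := context.Background()
	if passContext {
		child = ctx
	}
	result := make(chan string, 1)
	go func() { result <- traceIDFrom(child) }()
	return <-result
}

// demo returns what the goroutine observes with and without manual passing.
func demo() (passed, dropped string) {
	ctx := withTraceID(context.Background(), "trace-123")
	return runAsync(ctx, true), runAsync(ctx, false)
}

func main() {
	passed, dropped := demo()
	fmt.Printf("context passed:  %q\n", passed)
	fmt.Printf("context dropped: %q\n", dropped)
}
```

The second call shows the failure mode: nothing breaks at compile time or run time, the goroutine simply loses its trace ID. How the donated project avoids this is not described in this proposal, so the sketch only shows the problem, not the solution.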

The side effects of this solution are:

  • Increased final binary size.
  • Requires changes to the compile command.
  • Extra compile time for applications.

Benefits to the OpenTelemetry community

Currently there are several approaches to collect observability data for Golang applications:

  1. Manual instrumentation, which adds custom instrumentation code through the OTel SDK
  2. Compile time instrumentation, which injects instrumentation code at compile time by leveraging the go build toolexec capability
  3. eBPF based auto instrumentation, which instruments the application by injecting eBPF code into the kernel
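
For readers unfamiliar with the toolexec capability mentioned in item 2, the following is a minimal sketch of a wrapper program, assuming only the documented behavior of `go build -toolexec`: the go command invokes the wrapper with the real tool path as the first argument, followed by that tool's arguments. The helper names `isCompile` and `packagePath` are hypothetical, and this is not the donated project's implementation. It only logs each compiled package and then runs the real tool unchanged; an actual instrumentation tool would rewrite or add source files at the marked point.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// isCompile reports whether the wrapped tool is the Go compiler.
func isCompile(toolPath string) bool {
	base := filepath.Base(toolPath)
	return strings.TrimSuffix(base, ".exe") == "compile"
}

// packagePath returns the value of the compiler's -p flag (the import path
// of the package being compiled), or "" if the flag is absent.
func packagePath(args []string) string {
	for i, a := range args {
		if a == "-p" && i+1 < len(args) {
			return args[i+1]
		}
		if strings.HasPrefix(a, "-p=") {
			return strings.TrimPrefix(a, "-p=")
		}
	}
	return ""
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: go build -toolexec=/abs/path/to/wrapper ./...")
		return
	}
	tool, args := os.Args[1], os.Args[2:]

	if pkg := packagePath(args); isCompile(tool) && pkg != "" {
		// Assumption: this is where an instrumentation tool would inject code.
		fmt.Fprintf(os.Stderr, "toolexec: compiling %s\n", pkg)
	}

	// Run the real tool with its original arguments and streams.
	cmd := exec.Command(tool, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			os.Exit(exitErr.ExitCode())
		}
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Once built, such a wrapper would be used by passing its absolute path to `go build -toolexec=...`, which is the kind of compile command change listed among the side effects.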

For the OpenTelemetry community, it is important to make different approaches available to users, each of which has its own advantages and disadvantages, and to let users choose the approach that best fits their scenario.
We found that compile time instrumentation has not gained enough awareness among users and should be treated as an equal alternative.

Due to a lack of activity and maintenance, the existing compile time instrumentation, Instrgen, does not provide a production ready solution for users. The donation of Alibaba Cloud's compile time instrumentation will help the OpenTelemetry project fill the gap in compile time instrumentation for Golang. OpenTelemetry users will gain another feature rich and production ready solution for Golang, with first class support alongside manual and eBPF based instrumentation.

In addition, with this donation, the OpenTelemetry community would benefit from having a team of observability engineers who work across different client instrumentations to (co-)maintain the Golang instrumentation and advance OpenTelemetry's influence, especially in the APAC area.

Reasons for donation

Alibaba Cloud is actively embracing OpenTelemetry and its ecosystem, and would like to help expand the influence of the OTel community in the APAC area. We have completely rebuilt our existing instrumentation on top of OpenTelemetry, published the OpenTelemetry distribution for the Java Agent, and open sourced the Golang compile time instrumentation, supporting OpenTelemetry natively in our cloud services and contributing these back to the community.

Repository

https://github.com/alibaba/opentelemetry-go-auto-instrumentation

Existing usage

The project has been open source since January 2024, and the first release was published in September. The commercial version, which is built on the same core, was published in June and is used by customers, some of whom run it in production.

Maintenance

In case of a successful donation, the Alibaba Cloud observability team would like to (co-)maintain this project, and maintainers of the existing approach are welcome to join in order to achieve vendor neutral governance of the project.

Licenses

The project is licensed under the Apache 2.0 license.

Trademarks

The project has been open sourced with the aim to donate to OpenTelemetry, therefore it does not have any trademark other than OpenTelemetry.

Other notes

Roadmap of the project during/after the donation:

  1. Support more libraries, such as gRPC / Gin / Kratos / Apache RocketMQ / Apache Dubbo.
  2. Support runtime monitoring for Golang, such as garbage collection.
  3. Support the logging signal.
  4. Support the profiling signal, including cpu and memory profiling, and support the latest OTel profiling data format.
  5. Support trace and profiling correlation.
  6. Extend the instrumentation rules to users to support customized instrumentation.
  7. Support the GenAI semantic convention.
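
As a rough illustration of the kind of data roadmap item 2 refers to, the sketch below reads garbage collection statistics from the Go standard library's `runtime.ReadMemStats`. It is only an assumption about which values such monitoring might report; the `gcSnapshot` type and `readGCStats` function are hypothetical names, and the proposal does not say how the project would collect or export these metrics.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// gcSnapshot is a hypothetical container for a few GC related values.
type gcSnapshot struct {
	NumGC          uint32        // completed GC cycles
	PauseTotal     time.Duration // cumulative stop-the-world pause time
	HeapAllocBytes uint64        // bytes of allocated heap objects
}

// readGCStats takes a snapshot using runtime.ReadMemStats.
func readGCStats() gcSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return gcSnapshot{
		NumGC:          m.NumGC,
		PauseTotal:     time.Duration(m.PauseTotalNs),
		HeapAllocBytes: m.HeapAlloc,
	}
}

func main() {
	before := readGCStats()
	runtime.GC() // force one collection so the counters visibly change
	after := readGCStats()

	fmt.Printf("GC cycles: %d -> %d\n", before.NumGC, after.NumGC)
	fmt.Printf("total pause: %v\n", after.PauseTotal)
	fmt.Printf("heap in use: %d bytes\n", after.HeapAllocBytes)
}
```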
trask added the triage:deciding label Sep 17, 2024
Contributor

tedsuo commented Sep 26, 2024

@open-telemetry/go-maintainers @open-telemetry/go-instrumentation-maintainers

Hello @ralf0131, and thank you! We are excited to have Alibaba as part of the OpenTelemetry community. ❤️

We agree that manual, compile time, and eBPF instrumentation are all useful for different reasons, and we believe that it is worthwhile to maintain all three approaches. We also believe that it is important for these three approaches to be coordinated and well documented, so that we can tell our users a clear and coherent story about how to instrument their applications.

Since this donation covers the same area as existing work within OpenTelemetry (Instrgen) and that work is managed out of an existing SIG (go-instrumentation), we would like for this donation to be coordinated through that group. I know that you are already in discussion with @pdelewski about how to merge the projects, so this feels like the natural way to continue.

Currently, Instrgen is maintained as part of go-contrib. Either the new work can get merged there, or a new repository could be created if the go maintainers feel that this project is large enough. Either way, we would like code review for this donation to be handled by the Go Instrumentation SIG, not by the TC, as it is very specialized work.

@ralf0131
Author

@tedsuo Thanks for the update. I have watched the video recording of the GC meeting and would like to respond to some of your questions:

  1. Alibaba Cloud will continue to work on the project if it is donated and will keep maintaining it in the future. We want to make sure the core code is open source and that the commercial version is built on the open source project, whether or not it gets donated.
  2. You mentioned communication channels. The Slack channel is fine with us, and we would be really happy to have an APAC friendly meeting with the Go auto-instrumentation SIG. Please introduce the folks from the SIG so we can have further discussions, either in the Slack channel or in a scheduled meeting.
  3. Regarding the code base to use as the starting point, this is a key point that we would like to discuss with the Go auto-instrumentation SIG maintainers, as well as the maintainer of Instrgen, @pdelewski. From our point of view, the Alibaba code base is a better starting point, since it has more active developers, originates from the commercial version that has been running in production, supports more plugins, and provides better documentation and code quality in terms of code coverage and various tests. The unique features of Instrgen can be merged in, and eventually a dedicated repository should be created instead of placing the project in the go-contrib directory.

3 participants