[Proposal] opentelemetry-go-auto-instrumentation #1961
Comments
In addition to the eBPF solution there is also a code generation solution called instrgen: cc @pdelewski |
Before diving deeper, may I kindly ask if this would be similar in implementation to:
Edit: I have great respect for every project and idea, but based on previous observations, their biggest challenge often lies in the lack of a sufficient number of contributors to sustain active development. However, regardless of that, I am extremely grateful for the approach of auto-instrumentation in Go applications and I hope it can be sustained. |
The idea described by this proposal is generally the same as both projects mentioned above by @jiekun. We should rather focus on what we have so far, build a larger community around it, and improve it. I would be very happy to see more people contributing. |
Thank you all for the comments. Sorry, I didn't notice instrgen before. I'll take some time to learn about it and see how we can improve it together. @jiekun As @pdelewski said, the code-rewriting principle of the projects you mentioned should be similar. |
Thank you @D-D-H. A Go-auto-instrumentation effort has already been started, with two approaches led by @edeNFed and @pdelewski. I would like to see you join the Go-auto-instrumentation group, which still has a meeting on the OpenTelemetry calendar on Tuesdays at 9:30am PST, to discuss this proposal. If this group does not have critical mass, we recommend moving this sort of topic into the Go SIG meeting on Thursdays at 10am PST. |
@jmacd |
@D-D-H Feel free to join any time you want. |
Hi @jmacd @pdelewski @edeNFed, I am an engineer working on observability at Alibaba Cloud. To the best of our knowledge, I have summarized the main differences between our approach, instrgen, and the eBPF solution; please correct me if I am wrong.
Comparison with Instrgen
InstrGen leverages Golang's
Similar to InstrGen, our approach leverages compiler injection to insert instrumentation code. This approach offers several key advantages for users:
Comparison with the opentelemetry-go-instrumentation (eBPF solution)
The opentelemetry-go-instrumentation project leverages eBPF uprobes for non-intrusive instrumentation of Go applications. Currently supported libraries include net/http, gRPC, Kafka, and SQL, among others. The benefits of this approach are:
Actually, we tried the eBPF approach for a while. The reasons we did not adopt it are:
Please feel free to comment if you have any questions. We would like to have more discussion with the community in the upcoming SIG meeting. However, the meeting time is not very friendly for us. Is there an Asia-Pacific-friendly time for the meeting? |
@ralf0131 From
Currently the most important problem we are struggling with is getting more people to contribute to |
@pdelewski Please see my comments in line.
I agree that the two approaches share the basic idea; however, the implementations may vary. Maybe we can discuss how to combine them, but given the differences between the two approaches I am not quite sure how to do that.
That will be good. How would you like to do that?
Based on my understanding, assuming the user has a function called There is another difference regarding how the instrumented code is injected. InstrGen analyzes code from AutotelEntryPoint, builds the call graph, and injects instrumentation code into function bodies.
Could you elaborate more on how InstrGen will do that?
I agree with that. :) |
@ralf0131 Please see my comment below.
Your understanding is based on what we have so far on the main branch; however, there is PR open-telemetry/opentelemetry-go-contrib#4058 (opened almost a year ago) which is more or less what you described above, so both tools follow the same techniques. Of course there might still be some implementation differences, but they are rather small from a high-level view. |
Hi @pdelewski , You're right. My apologies, I hadn't noticed your pull request before and my understanding is based solely on the main branch. I've taken a look at your implementation, and I agree that from a high-level view they share the same idea. We're actually planning to open-source the latest version of our code sometime in June. This would allow for a more in-depth discussion about combining the two approaches once we can see each other's work in detail. What do you think? |
Hi @ralf0131 , That's a great idea. Also, from a timeframe perspective, June sounds good to me. |
I saw a great idea and comments from maintainers here. As we know, providing auto-instrumentation for Go programs is difficult, but that is not the reason the previous implementation became less active. I've talked to some developers before, and they hadn't heard about instrgen. I assume they will also not be aware of the potential implementation discussed here. So, there are a couple of things we need to consider:
It would be great to see new projects/impls, and please also consider how we could keep them active for a long period of time. |
That's right, I think we will try to add some documentation and articles to introduce the compile time instrumentation. We also submitted a proposal to KubeCon China 2024 and hopefully we can be there to present :). |
Let me share my thoughts. It seems that most people focus on eBPF right now, even though it is harder from a development perspective and has the tradeoffs described above. Having said that, it has one advantage that might be important for some group of people, e.g., no need for recompilation or access to source code. Another thing is that it's unfortunate that I also tried to advertise it during KubeCon 2023 in Chicago, however that's not enough. Blog posts, articles, and more people involved might help change the situation. |
Just curious, may I ask why
|
I don't think moving |
@edeNFed Thanks for the clarification. Perhaps we should at least let users know, in the documentation, that there are two approaches to instrumentation: one is the eBPF-based auto instrumentation and the other is the compile time instrumentation?
Looking at the Java documentation, there is also a Spring Boot page under the automatic folder, which is not fully runtime instrumentation. To my understanding, if a Java application is compiled into a native image with GraalVM, the process is essentially compile time instrumentation. My second question about the eBPF-based instrumentation: eBPF could offer an approach to instrumentation that is independent of any specific programming language, so why has it not been applied to languages other than Go? |
just fyi |
From my perspective as a user, I don't think so. Developers do not need to care about how Java's auto-instrumentation works. All they need to know is that they don't need to modify code. In short, I expect the solutions provided by the |
The only thing shared between the eBPF instrumentation and the compile time instrumentation is that they both target Go applications. Everything else is different: the programming language the instrumentation is written in, the contributors working on the project, tests, CI, etc. I agree with @ralf0131: if the goal is to get more visibility for compile time instrumentation, making it more visible in the documentation is preferable, in my opinion, to mixing two unrelated projects into a single repository. |
(sort of a side conversation, but just wanted to mention that it's not quite this clear of a distinction for the Java repos at least, where the primary differentiator is that opentelemetry-java-instrumentation modules are maintained by the repo maintainers while opentelemetry-java-contrib is a distributed ownership model where individual components are maintained by component owners) |
Hi all, Update: we recently released the first version of our compile time instrumentation on the commercial side. We are now working on open-sourcing our solution. It is a bit later than expected, but we are working on it. |
I'd like to note that @DataDog is up to extremely similar work in https://github.com/DataDog/orchestrion. This currently injects instrumentation for @DataDog's "proprietary" SDK (https://github.com/DataDog/dd-trace-go)... That said, the code rewriting is configuration-driven, and configuration targeting the OTel SDK instead could absolutely be made. We (at @DataDog) would most certainly welcome contributions in this direction. |
Hi all, Update: we have almost finished the initial version of the compile time instrumentation approach and are heading towards the first release. We are planning to introduce the approach in the upcoming tag-observability meeting (Tuesday, Aug 13 · 18:00 – 19:00 (Time zone: America/Los_Angeles), Wednesday, Aug 14 · 9:00 – 10:00 (Time zone: UTC+8)). Anyone who is interested, please join the discussion. The meeting information can be found here. |
Hi @ralf0131 any feedback from the tag-observability meeting you can share here? |
@danielgblanco Not yet. Due to some technical issues, the meeting was not held as expected :(. We are working on scheduling a new meeting to discuss it. We really need help reaching more people in the community who are interested. Any suggestions will be appreciated.
|
This auto instrumentation approach looks good, though I have a few concerns:
|
Hi @XSAM,
We prepared a document for this, please refer to https://github.com/alibaba/opentelemetry-go-auto-instrumentation/blob/main/docs/how-to-debug.md for more details.
We have ensured that all the inserted code is on a single line and have retained the code file after the insertion. In practice, troubleshooting the issue is not difficult. |
Hi @y1yang0, thanks for the explanation! It is good to have a way to know the modified code.
Even a single line number offset can make a difference, not to mention that a Go file can contain multiple methods, which means a multiple-line offset is possible. The offset is not constant and can scale with the users' code style. Furthermore, I tried the https://github.com/alibaba/opentelemetry-go-auto-instrumentation/blob/main/example/log/main.go example. Even if it does not insert any code inside methods, it will replace the import path to include OTel go dependencies and generated rules. This creates a line number offset of 8. This basically means users need to keep the |
Yes, so we retained the code file after the insertion; users can refer to it to know exactly what line 123 is.
Even if we don't use -debug, we will keep the modified files under |
My point is the line number reported by Go runtime is not the same as the source code as long as the
Take this source code as an example, it does not change after running

    // Copyright (c) 2024 Alibaba Group Holding Ltd.
    //
    // Licensed under the Apache License, Version 2.0 (the "License");
    // you may not use this file except in compliance with the License.
    // You may obtain a copy of the License at
    //
    // http://www.apache.org/licenses/LICENSE-2.0
    //
    // Unless required by applicable law or agreed to in writing, software
    // distributed under the License is distributed on an "AS IS" BASIS,
    // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    // See the License for the specific language governing permissions and
    // limitations under the License.
    package main
    import (
        "go.opentelemetry.io/otel/sdk/trace"
        "go.uber.org/zap"
        "net/http"
    )
    func main() {
        http.HandleFunc("/log", func(w http.ResponseWriter, r *http.Request) {
            logger := zap.NewExample()
            logger.Debug("this is debug message")
            logger.Info("this is info message")
            logger.Info("this is info message with fileds",
                zap.Int("age", 37),
                zap.String("agender", "man"),
            )
            logger.Warn("this is warn message")
            logger.Error("this is error message")
        })
        http.HandleFunc("/logwithtrace", func(w http.ResponseWriter, r *http.Request) {
            logger := zap.NewExample()
            // GetTraceAndSpanId will be added while using otelbuild, users must use otelbuild to build the module
            traceId, spanId := trace.GetTraceAndSpanId()
            logger.Info("this is info message with fileds",
                zap.String("traceId", traceId),
                zap.String("spanId", spanId),
                zap.Stack("stack"),
            )
        })
        http.ListenAndServe(":9999", nil)
    }

If I run
It refers to line number 51, which does not exist in the source code. The total line of the source code is 47. And, this is the modified source code after using the

    // Copyright (c) 2024 Alibaba Group Holding Ltd.
    //
    // Licensed under the Apache License, Version 2.0 (the "License");
    // you may not use this file except in compliance with the License.
    // You may obtain a copy of the License at
    //
    // http://www.apache.org/licenses/LICENSE-2.0
    //
    // Unless required by applicable law or agreed to in writing, software
    // distributed under the License is distributed on an "AS IS" BASIS,
    // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    // See the License for the specific language governing permissions and
    // limitations under the License.
    package main
    import _ "test/otel_rules"
    import _ "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
    import _ "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    import _ "go.opentelemetry.io/otel/exporters/otlp/otlptrace"
    import _ "go.opentelemetry.io/otel/sdk"
    import _ "go.opentelemetry.io/otel"
    import (
        "go.opentelemetry.io/otel/sdk/trace"
        "go.uber.org/zap"
        "net/http"
    )
    func main() {
        http.HandleFunc("/log", func(w http.ResponseWriter, r *http.Request) {
            logger := zap.NewExample()
            logger.Debug("this is debug message")
            logger.Info("this is info message")
            logger.Info("this is info message with fileds",
                zap.Int("age", 37),
                zap.String("agender", "man"),
            )
            logger.Warn("this is warn message")
            logger.Error("this is error message")
        })
        http.HandleFunc("/logwithtrace", func(w http.ResponseWriter, r *http.Request) {
            logger := zap.NewExample()
            // GetTraceAndSpanId will be added while using otelbuild, users must use otelbuild to build the module
            traceId, spanId := trace.GetTraceAndSpanId()
            logger.Info("this is info message with fileds",
                zap.String("traceId", traceId),
                zap.String("spanId", spanId),
                zap.Stack("stack"),
            )
        })
        http.ListenAndServe(":9999", nil)
    }

This reflects the correct code number that the log is referring to, as the line 51 is
Summary
Users have to keep the |
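For readers following the line-offset concern, here is a small sketch of the effect and of one possible mitigation. It is only an illustration and not a description of how the project behaves: Go's `//line file:line` directive tells the toolchain to report the following line under a given position, and the standard `go/parser` honors it. The import paths `example.com/otel_rules` and `example.com/otel_sdk` and the helper `lineOfMain` are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// original is the user's file: func main is on line 3.
const original = `package main
import "net/http"
func main() { http.ListenAndServe(":9999", nil) }
`

// shifted has two injected imports, pushing func main down to line 5.
const shifted = `package main
import _ "example.com/otel_rules"
import _ "example.com/otel_sdk"
import "net/http"
func main() { http.ListenAndServe(":9999", nil) }
`

// remapped adds a //line directive after the injected imports so that the
// following lines are reported with their original line numbers.
const remapped = `package main
import _ "example.com/otel_rules"
import _ "example.com/otel_sdk"
//line main.go:2
import "net/http"
func main() { http.ListenAndServe(":9999", nil) }
`

// lineOfMain parses src and returns the line reported for func main,
// after any //line directives have been applied.
func lineOfMain(src string) int {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "main.go", src, parser.ParseComments)
	if err != nil {
		panic(err)
	}
	for _, d := range f.Decls {
		if fn, ok := d.(*ast.FuncDecl); ok && fn.Name.Name == "main" {
			return fset.Position(fn.Pos()).Line
		}
	}
	return -1
}

func main() {
	fmt.Println("original:", lineOfMain(original))
	fmt.Println("shifted: ", lineOfMain(shifted))
	fmt.Println("remapped:", lineOfMain(remapped))
}
```

The same directive is understood by the Go compiler, so positions in stack traces could in principle be kept aligned with the user's file; whether that fits this tool's rewriting pipeline is an open question for the maintainers.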
Sorry, this is a bug. We only retained the modified files of the third-party library, but we didn't retain the modified files in the current project. I will fix this issue soon. |
@ralf0131 Sorry for the delay. I did a deeper dive into the code, and as mentioned above, there are some differences in the details. However, both projects are generally based on the same ideas. Since your project now seems to be superior (has more supported libraries), replacing the instrgen source code could be a valid option. One feature that instrgen also had was building a static call graph and injecting it into selected user functions via a simple UI (something we can discuss later). For now, there are a few open questions: Should this be part of opentelemetry-contrib or its own repository? The answer to this question will also determine some of the further development decisions and processes. |
Hi @pdelewski ,
Sure, let's discuss it in the meeting.
I would suggest a dedicated repository instead of under the
There is already a discussion in the |
I'll check the discussion (I haven't been there for a while). Monday should work for me. BTW. I'm based in Europe (Poland). PT - pacific time? Seems to be 2am at my side. After consideration, that might be hard for me. |
Linking the slack thread for others here. My thoughts on the slack thread were that there isn't any opposition from the Go-auto SIG to your proposals, as long as @pdelewski, @ralf0131, and your teams can work out a way that there aren't 2 compile-time solutions (either by combining them or deprecating one).
Yup
We are already spread pretty far across time zones, so picking a time that works for everyone is going to be tough. @MrAlias would you prefer to be on the call? In that case we would need early morning Pacific, which should be afternoon Europe and (unfortunately) late China. Otherwise, we could do early Eastern time and I could join, which might work better for Europe and China time zones. |
@damemi It looks like in this thread we have reached a basic consensus to replace Instrgen with the Alibaba approach and merge the unique features of Instrgen into the Alibaba approach. @damemi @MrAlias @pdelewski |
@damemi @MrAlias @pdelewski If there is no objection, shall we set up the meeting? Normally a Zoom meeting will be fine, as long as it allows joining without login (there are some issues for us with organizing or joining a Zoom meeting that requires login). |
Flight delays are making it seem like I will not be able to make a meeting early Pacific time on Monday. Can we do next week given we don't have something scheduled yet? |
@damemi @MrAlias @pdelewski I am fine with rescheduling the meeting to next Monday.
|
I'm also fine with rescheduling the meeting |
Talked about this on the eBPF sig call yesterday and Monday 10/21 at 8AM PT/17:00 Poland/23:00 China works for us |
@MrAlias will add the call to the OTel google calendar |
@damemi @MrAlias @pdelewski Double-checking for today's meeting: we are looking forward to discussing with you today :) |
@ralf0131 yup! All set for 11am ET. Here is a link to the zoom call if anyone needs it: https://zoom.us/j/91802290946?pwd=ODl6YmNCTWtTNzUzTGlFcjRtWmhqdz09 |
Notes from the meeting today:
AIs:
Please let me know if I missed anything |
I guess the new project would also have the functionality of instrgen, so it basically provides two modes for instrumentation:
Is that correct? @damemi |
@XSAM my understanding on the call was more that we agreed to deprecate instrgen in favor of the new project. @ralf0131 or @pdelewski can correct me |
Thanks for all your efforts. This proposal looks promising. I'm looking forward to cooperating and providing more options of instrumenting Go applications. |
I have only used the golang instrumentation with ebpf, but recently we have been facing some problems due to kernel requirements. I am not aware of the |
Description
The opentelemetry-go-auto-instrumentation project is an auto-instrumentation solution designed for Go applications. It empowers users to harness the capabilities of OpenTelemetry for enhanced observability without any manual modifications.
Like the opentelemetry-java-instrumentation project, this solution automatically modifies code; the difference is that this all happens during the build process.
The current implementation reuses the existing instrumentation for Go packages and depends on the package dave/dst to rewrite Go source code.
The side effect of this solution is similar to the impact one would expect from manual code modifications:
Benefits to the OpenTelemetry community
This project significantly lowers the barrier for Go applications to adopt OpenTelemetry.
While there is an existing auto-instrumentation solution based on eBPF, it comes with certain limitations.
Auto-instrumentation based on code rewriting can achieve the same effect as manual instrumentation in most scenarios and is easier to use in production.
Reasons for New Project
Drawing inspiration from the Java language, users generally prefer non-intrusive solutions (those that don't require manual code modifications). Therefore, we believe that for Go applications, this approach is likely to gain widespread acceptance among users. Making it an OpenTelemetry project not only ensures better maintenance but also extends the benefits to a broader user base.
Repository of Our Prototype
https://github.com/alibaba/opentelemetry-go-auto-instrumentation
Existing usage
This project is under development and has some simple demos.
Maintenance
The original contributors to this repository will continue to be involved in the project.
Our current roadmap is as follows:
Licenses
Apache License 2.0
Trademarks
No Trademarks
Other notes
No response