Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/printer: Comment-LineBreak-LineBreak-SelectorExpr-Comment AST modification issue #70978

Open
TACIXAT opened this issue Dec 23, 2024 · 8 comments · May be fixed by #70982 or #70983
Open

go/printer: Comment-LineBreak-LineBreak-SelectorExpr-Comment AST modification issue #70978

TACIXAT opened this issue Dec 23, 2024 · 8 comments · May be fixed by #70982 or #70983

Comments

@TACIXAT
Copy link

TACIXAT commented Dec 23, 2024

Go version

go version go1.23.4 windows/amd64

Output of go env in your module/workspace:

set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\user\AppData\Local\go-build
set GOENV=C:\Users\user\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\user\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\user\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=auto
set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.23.4
set GODEBUG=
set GOTELEMETRY=local
set GOTELEMETRYDIR=C:\Users\user\AppData\Roaming\go\telemetry
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=0
set GOMOD=NUL
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\user\AppData\Local\Temp\go-build3559419192=/tmp/go-build -gno-record-gcc-switches

What did you do?

While refactoring a codebase of ours programmatically to rename a poorly named package we ran into this issue. The bug is triggered if you modify the X component of a SelectorExpr to be longer than where a following comment starts, and the selector expression has a comment + double newline preceding it.

package main

import (
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
	"log"
	"bytes"
)

func init() {
	log.SetFlags(log.LstdFlags|log.Lshortfile)
}

var errCase string = `
package main

import (
	"log"
)

func main() {
	// Comment with line break after it

	log.Println() // Comment
}
`

func main() {
	fset := token.NewFileSet()
	root, err := parser.ParseFile(fset, "error_case.go", errCase,
		parser.SkipObjectResolution|parser.ParseComments)
	if err != nil {
		log.Fatal(err)
	}

	ast.Inspect(root, func (n ast.Node) bool {
		switch n.(type) {
		case *ast.SelectorExpr:
			se := n.(*ast.SelectorExpr)
			switch se.X.(type) {
			case *ast.Ident:
				ident := se.X.(*ast.Ident)
				ident.Name = "123456789012345"
			}
		}

		return true
	})

	ast.Print(fset, root)

	buf := &bytes.Buffer{}
	err = printer.Fprint(buf, fset, root)
	if err != nil {
		log.Fatal(err)
	}

	log.Println(string(buf.Bytes()))
}

What did you see happen?

If there is a selector expression with a comment + double newline preceding it (the second newline is required), and a comment following it, and you modify the selectorExpr.X to be longer than where the following comment starts, go/printer will intersperse the following comment in the middle of the selector expression.

This occurs due to this block in go/printer - https://github.com/golang/go/blob/master/src/go/printer/printer.go#L1018

package main

import (
        "log"
)

func main() {
        // Comment with line break after it

        123456789012345 // Comment
        .Println()
}

There is a variable p.impliedSemi (p = printer) to indicate that a newline implies a semi colon. In this function (print) there is also a local variable impliedSemi that p.impliedSemi is set to at the end of the function.

Before a token is printed, we backtrack and flush is called, flush calls intersperseComments to print any comments preceding that token for comments whose commentOffset > next.Offset and if p.impliedSemi is false (ignoring other parts of the conditional for the purpose of this bug report).

When the IDENT (se.X) token is encountered we set the local variable impliedSemi to true. The function then calls flush where we print the preceding comment. Now, the linked block will print any newlines (up to two) after that comment, before our token. I am not sure why, but this block now overrides impliedSemi and sets it to false. The function finishes and p.impliedSemi is set to false. Now the conditions for our bug are set.

Aside: I am not 100% on the intention of why this block sets impliedSemi to false. It has printed newlines, but those newlines come before the token (IDENT) that affected impliedSemi previously, and we have not yet updated p.impliedSemi.

So, on the next token (the .) we enter flush once again, which enters intersperseComments. Now, the comment following the selector expression meets the conditional that its start comes before the next token (.) and p.impliedSemi is false. So we print the comment before printing the rest of the selector expression.

What did you expect to see?

package main

import (
        "log"
)

func main() {
        // Comment with line break after it

        123456789012345.Println() // Comment
}

If there is not the very specific case of a comment + double newline preceding the selector expression, this works as intended. If there is a statement, or no newline after the comment, the selector expression does not get split up.

Depending on the intention behind setting impliedSemi in the linked block, some solutions may be -

  • Do not set impliedSemi in this block
  • Do not set impliedSemi in this block if the newlines are the result of backtracking for comments

Thank you!

@ianlancetaylor
Copy link
Member

CC @griesemer

@mateusz834
Copy link
Member

FYI, this is not a valid Go identifier:

ident.Name = "123456789012345"

spec:

An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter.

But the issue is still present if we make it a valid identifier.

@mateusz834
Copy link
Member

I would say that the root cause of this issue, is that ast.SelectorExpr does not store the position of the ..

go/src/go/ast/ast.go

Lines 343 to 346 in 772f024

SelectorExpr struct {
X Expr // expression
Sel *Ident // field selector
}

So we are not able to set the correct position here (p.setPos(x.Dot)):

go/src/go/printer/nodes.go

Lines 1157 to 1159 in 772f024

func (p *printer) selectorExpr(x *ast.SelectorExpr, depth int, isMethod bool) bool {
p.expr1(x.X, token.HighestPrec, depth)
p.print(token.PERIOD)

Thus we are using the "default one", from writeString which is based on the previous string length, being written:

p.pos.Offset += len(s)
if nlines > 0 {
p.pos.Line += nlines
p.out.Line += nlines
c := len(s) - li
p.pos.Column = c
p.out.Column = c
} else {
p.pos.Column += len(s)
p.out.Column += len(s)
}

mateusz834 added a commit to mateusz834/go that referenced this issue Dec 24, 2024
Workarorund for issue golang#70978, see the same issue for more details.

Fixes golang#70978

Change-Id: I7934d51af0679ac6fc10128d77001b0092bd7392
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/638636 mentions this issue: go/printer: predict the position of the '.' in SelectorExpr

@mateusz834
Copy link
Member

Mailed a workaround: CL 638636 for this issue.

@mateusz834
Copy link
Member

mateusz834 commented Dec 24, 2024

I also do not understand why we are setting impliedSemi to false, here:

go/src/go/printer/printer.go

Lines 1003 to 1021 in 772f024

// intersperse extra newlines if present in the source and
// if they don't cause extra semicolons (don't do this in
// flush as it will cause extra newlines at the end of a file)
if !p.impliedSemi {
n := nlimit(next.Line - p.pos.Line)
// don't exceed maxNewlines if we already wrote one
if wroteNewline && n == maxNewlines {
n = maxNewlines - 1
}
if n > 0 {
ch := byte('\n')
if droppedFF {
ch = '\f' // use formfeed since we dropped one before
}
p.writeByte(ch, n)
impliedSemi = false
}
}

and then setting the p.impliedSemi with that false:

go/src/go/printer/printer.go

Lines 1028 to 1029 in 772f024

p.writeString(next, data, isLit)
p.impliedSemi = impliedSemi

Removing the impliedSemi = false also seems to fix the issue.

@griesemer ?

EDIT: this is basically what was described in the directly in bug report #70978 (comment), i just had to manually come to the conclusion to understand the issue :).

mateusz834 added a commit to mateusz834/go that referenced this issue Dec 24, 2024
…wlines

Fixes golang#70978

Change-Id: I0cc54bd11bb4170235bb5a6d5eb6d8438beada58
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/638637 mentions this issue: go/printer: do not set impliedSemi to false in print, while writing newlines

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants