gnovm/pkg/gnolang: ReadFile and MustReadFile which are used by the VM's parse should not need to incur byteslice->string conversions when passing in a string that won't be modified to ParseFile #2962

Open
odeke-em opened this issue Oct 16, 2024 · 1 comment

odeke-em commented Oct 16, 2024

While auditing gnovm as the fuzzer was running, I noticed that we've got this code:

func ReadFile(path string) (*FileNode, error) {
	bz, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	return ParseFile(path, string(bz))
}

If profiling confirms that this code is hot (a preliminary reading suggests it is a commonly used entry point for parsing files into code), we can avoid the byteslice->string copy, and the allocation that comes with it, by using unsafe.String like this:

diff --git a/gnovm/pkg/gnolang/go2gno.go b/gnovm/pkg/gnolang/go2gno.go
index efdfecf0..693dc74b 100644
--- a/gnovm/pkg/gnolang/go2gno.go
+++ b/gnovm/pkg/gnolang/go2gno.go
@@ -42,6 +42,7 @@ import (
 	"reflect"
 	"strconv"
 	"strings"
+	"unsafe"
 
 	"github.com/davecgh/go-spew/spew"
 	"github.com/gnolang/gno/tm2/pkg/errors"
@@ -70,7 +71,11 @@ func ReadFile(path string) (*FileNode, error) {
 	if err != nil {
 		return nil, err
 	}
-	return ParseFile(path, string(bz))
+
+	// ParseFile does not modify its input and bz is not used again
+	// here, so it is safe to view bz as a string without copying.
+	// unsafe.SliceData avoids the &bz[0] panic on an empty file.
+	return ParseFile(path, unsafe.String(unsafe.SliceData(bz), len(bz)))
 }
 
 func ParseExpr(expr string) (retx Expr, err error) {
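
A minimal standalone sketch of the zero-copy conversion, assuming Go 1.20 or later (where unsafe.String and unsafe.SliceData were introduced). The helper name bytesToString is hypothetical and is not part of gnovm. It is only meant to show that taking the data pointer with unsafe.SliceData also handles an empty or nil slice, a case in which indexing &bz[0] would panic.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString is a hypothetical helper (not part of gnovm) that views b as
// a string without copying. The caller must not modify b afterwards, because
// the returned string shares b's backing array.
func bytesToString(b []byte) string {
	// unsafe.SliceData does not index into the slice, so a zero-length
	// slice yields "" rather than an index-out-of-range panic.
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	fmt.Println(bytesToString([]byte("package main")) == "package main")
	fmt.Println(bytesToString(nil) == "")
	fmt.Println(bytesToString([]byte{}) == "")
}
```

The trade-off is the usual one for unsafe: the compiler can no longer guarantee immutability of the string, so this is only appropriate when the byte slice is never written to again.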

You can add this benchmark:

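import (
	"runtime"
	"testing"
)

// thisFile is the path of this test file; it serves as a real source input.
var _, thisFile, _, _ = runtime.Caller(0)
var pathsToRead = []string{thisFile, "non-existent.bad"}

// sink keeps the result alive so the compiler cannot optimize the call away.
var sink any = nil

func BenchmarkReadFile(b *testing.B) {
	b.ReportAllocs()

	for i := 0; i < b.N; i++ {
		for _, path := range pathsToRead {
			sink, _ = ReadFile(path)
		}
	}
	if sink == nil {
		b.Fatal("Benchmark did not run")
	}
	sink = nil
}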

This shows a reduction in allocations. The majority of the time and allocations are dominated by AST parsing, but we still see a small reduction, which is a win:

 benchstat before.txt after.txt 
name        old time/op    new time/op    delta
ReadFile-8    2.25ms ± 2%    2.27ms ± 2%    ~     (p=0.497 n=10+9)

name        old alloc/op   new alloc/op   delta
ReadFile-8     561kB ± 0%     547kB ± 0%  -2.55%  (p=0.000 n=10+10)

name        old allocs/op  new allocs/op  delta
ReadFile-8     8.07k ± 0%     8.06k ± 0%  -0.01%  (p=0.000 n=10+10)

For an ambitious project like this, it is worth taking as many wins as can be added without spending a ton of time.
Kindly cc-ing @jaekwon

@kristovatlas kristovatlas self-assigned this Oct 29, 2024
@kristovatlas kristovatlas added the security Security-sensitive issue label Oct 29, 2024
@kristovatlas
Contributor

Thanks for the report, @odeke-em. We're looking into it.
