Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] string.well_known_regex cannot handle binary HTTP header values #263

Open
ash2k opened this issue Oct 14, 2024 · 2 comments
Open

[BUG] string.well_known_regex cannot handle binary HTTP header values #263

ash2k opened this issue Oct 14, 2024 · 2 comments
Labels
Bug Something isn't working

Comments

@ash2k
Copy link

ash2k commented Oct 14, 2024

Description

well_known_regex is on a string type. string must be a valid UTF-8 encoded Unicode code point sequence. But HTTP header values don't have to be. I.e. valid HTTP header values cannot be represented as a proto string. Yet the validation rule is on the string type.

Steps to Reproduce

message HeaderValues {
  string val = 1 [(buf.validate.field).string.well_known_regex = KNOWN_REGEX_HTTP_HEADER_VALUE];
}
func TestAbc(t *testing.T) {
	val := []byte("\xff")
	valid := utf8.Valid(val)
	assert.True(t, valid, "Invalid UTF-8")

	msg := &HeaderValues{
		Val: "\xff",
	}
	_, err := proto.Marshal(msg)
	assert.NoError(t, err, "proto.Marshal() failed")

	v, err := protovalidate.New()
	require.NoError(t, err)
	err = v.Validate(msg)
	assert.NoError(t, err)
}

Prints:

=== RUN   TestAbc
    prototool_test.go:16: 
        	Error Trace:	file_test.go:16
        	Error:      	Should be true
        	Test:       	TestAbc
        	Messages:   	Invalid UTF-8
    prototool_test.go:27: 
        	Error Trace:	file_test.go:27
        	Error:      	Received unexpected error:
        	            	string field contains invalid UTF-8
        	Test:       	TestAbc
        	Messages:   	proto.Marshal() failed
--- FAIL: TestAbc (0.02s)

FAIL

Process finished with the exit code 1

Expected Behavior

Validation rule should be on bytes. At least there too, not just on string.

Actual Behavior

Validation rule on string that is problematic.

Possible Solution

Additional Context

HTTP header value spec: https://datatracker.ietf.org/doc/html/rfc9110#name-field-values

@ash2k ash2k added the Bug Something isn't working label Oct 14, 2024
@rodaine
Copy link
Member

rodaine commented Oct 16, 2024

See original provenance of these regex here: bufbuild/protoc-gen-validate#297

I'd actually rather see these removed from protovalidate, and users can provide predefined constraints that match the semantics they want (according to the thread above, the RFC recommends only using ASCII).

@ash2k
Copy link
Author

ash2k commented Oct 17, 2024

As I explained in the description, I think the regexes the way they are right now cannot be used. So I agree something needs to change. To be clear, this is not a problem for me, I just wanted to report the issue I noticed.

@ash2k ash2k changed the title [BUG] string.well_known_regex is cannot handle binary HTTP header values [BUG] string.well_known_regex cannot handle binary HTTP header values Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants