Reduce the size of type information in durable coroutine state #114

Merged: 6 commits, Nov 21, 2023

@chriso chriso commented Nov 21, 2023

Now that we can inspect durable coroutine state (#113), a few optimizations are clear.

Firstly, package names are currently duplicated across types. This PR updates the serialization layer to intern strings so that there's no duplication.
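As a rough illustration of the interning idea, here is a minimal sketch. The `stringTable` type and its methods are hypothetical names, not the PR's actual code. It assumes unsigned IDs that start at one, with zero reserved as the nil sentinel, as a commit in this PR describes.

```go
package main

// stringTable is a hypothetical interning table: each distinct string is
// stored once and referred to everywhere else by a small unsigned ID.
type stringTable struct {
	ids     map[string]uint32 // string -> ID
	strings []string          // strings[id-1] is the string for id
}

func newStringTable() *stringTable {
	return &stringTable{ids: make(map[string]uint32)}
}

// intern returns the ID for s, assigning a new one the first time s is seen.
// IDs start at one; zero is reserved as the nil sentinel.
func (t *stringTable) intern(s string) uint32 {
	if id, ok := t.ids[s]; ok {
		return id
	}
	t.strings = append(t.strings, s)
	id := uint32(len(t.strings))
	t.ids[s] = id
	return id
}

// lookup resolves an ID back to its string. The nil ID (zero) and IDs that
// were never assigned report false.
func (t *stringTable) lookup(id uint32) (string, bool) {
	if id == 0 || int(id) > len(t.strings) {
		return "", false
	}
	return t.strings[id-1], true
}
```

With a table like this, two types in the same package would both carry the same small ID rather than two copies of the package path.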

Secondly, the serialization layer currently stores type information for all referenced types (recursively). When a custom serializer is registered for a type, this type information is not used; we just need a reference to the custom serialization routines. An offline process analyzing the state doesn't need this information either, because the custom serializer might emit entirely different objects/types. Since #112, type information is available for the output of custom serializers, so the input type info isn't necessary. This PR updates the serialization layer to no longer store type information for types with custom serializers. All that's stored now is the interned package name and type name, along with a custom flag which indicates that the type is opaque. When stacked with #104, the reduction in the size of type information can be drastic.
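To make the second optimization concrete, here is a hedged sketch of how a type description might stop recursing once it reaches a type with a custom serializer. `typeRecord`, `customSet`, `describe` and the sample `inner`/`outer` types are all hypothetical, and the `intern` callback stands in for whatever string interning the serialization layer uses. For brevity the sketch only descends into struct fields; the real layer handles many more kinds.

```go
package main

import "reflect"

// typeRecord is a hypothetical stand-in for stored type information.
type typeRecord struct {
	PackageID uint32       // interned package name
	NameID    uint32       // interned type name
	Custom    bool         // true: opaque, handled by a custom serializer
	Fields    []typeRecord // only populated for non-custom struct types
}

// customSet is a hypothetical registry of types with custom serializers.
type customSet map[reflect.Type]bool

func (s customSet) register(v interface{}) { s[reflect.TypeOf(v)] = true }

// describe builds a record for t. For custom types it keeps only the
// interned names and the flag, and does not visit referenced types.
func describe(t reflect.Type, custom customSet, intern func(string) uint32) typeRecord {
	rec := typeRecord{
		PackageID: intern(t.PkgPath()),
		NameID:    intern(t.Name()),
	}
	if custom[t] {
		rec.Custom = true
		return rec
	}
	if t.Kind() == reflect.Struct {
		for i := 0; i < t.NumField(); i++ {
			rec.Fields = append(rec.Fields, describe(t.Field(i).Type, custom, intern))
		}
	}
	return rec
}

func describeValue(v interface{}, custom customSet, intern func(string) uint32) typeRecord {
	return describe(reflect.TypeOf(v), custom, intern)
}

// Sample types for illustration only.
type inner struct{ A, B int }
type outer struct {
	In inner
	N  int
}
```

Registering `inner` in the `customSet` means the record for `outer` keeps a name-only, flagged entry for its `In` field instead of a full description of `inner`'s fields.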

@chriso chriso changed the title from "Introspection (part 7)" to "Reduce the size of type information in durable coroutine state" Nov 21, 2023
proto/coroutine/v1/type.proto (outdated, resolved)
In all cases, IDs start at one; zero is used as
the nil sentinel, so there's no need for signed ints.
Although these are likely unique, interning them
with all other strings means that we get more accurate
stats when looking at the size of different parts
of the durable coroutine state.
@chriso chriso merged commit 00b9eb6 into main Nov 21, 2023
2 checks passed
@chriso chriso deleted the proto6 branch November 21, 2023 06:02
chriso added a commit that referenced this pull request Nov 21, 2023
This fixes bugs introduced by
24b7af8.

Since #114, we no longer store type information for types with custom
serializers registered. We allow custom serializers to be registered for
interface types, and since #104 these custom routines are used for all
implementations of those interfaces.

We need to store enough information to be able to recreate the original
type in the program that generated the state. Since no stable identifier
is available for types, we currently use the offset to a known anchor in
the program (see
[1](https://github.com/stealthrocket/coroutine/blob/00b9eb66a6472279b8dc9a4994b2694ff3f06c17/types/unsafe.go#L66-L84)
and
[2](https://github.com/stealthrocket/coroutine/blob/00b9eb66a6472279b8dc9a4994b2694ff3f06c17/types/types.go#L57-L59)).
The first bug introduced by #114 is that it wasn't continuing to use
this facility when a custom serializer had been registered, so the
serialization layer was erroneously returning the interface type when
deserializing all implementations of those interface types. The second
bug was more subtle; when `*A` implements the interface `T` but `A` is
the named type for which an offset is available, we need to retain this
information so that we can correctly create `*A` rather than `T` when
deserializing objects.
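The second bug can be illustrated with a small sketch. Everything here is hypothetical: `T`, `A`, `typeRef`, `refOf` and `newValue` are illustration-only names, and a `reflect.Type` stands in for the offset-based identifier the library actually uses. The point is only that the stored reference must remember both the named type `A` and the fact that the value was a pointer to it, so that `*A` (not `T`, and not `A`) is recreated.

```go
package main

import "reflect"

// T stands in for an interface type with a custom serializer registered.
type T interface{ Name() string }

// A is the named type; only *A implements T.
type A struct{ name string }

func (a *A) Name() string { return a.name }

// typeRef is a hypothetical stored reference: the named type plus the number
// of pointer indirections wrapped around it.
type typeRef struct {
	named    reflect.Type // stand-in for the anchor offset of the named type
	pointers int
}

// refOf records the named type under any pointers. v must be non-nil.
func refOf(v interface{}) typeRef {
	t := reflect.TypeOf(v)
	n := 0
	for t.Kind() == reflect.Pointer {
		t = t.Elem()
		n++
	}
	return typeRef{named: t, pointers: n}
}

// newValue recreates a value of the original type, re-applying the recorded
// pointer indirections around a fresh zero value of the named type.
func (r typeRef) newValue() interface{} {
	v := reflect.New(r.named) // non-nil pointer to a zero named value
	if r.pointers == 0 {
		return v.Elem().Interface()
	}
	for i := 1; i < r.pointers; i++ {
		p := reflect.New(v.Type())
		p.Elem().Set(v)
		v = p
	}
	return v.Interface()
}
```

In this sketch, `refOf(&A{})` records `A` with one level of indirection, and `newValue` hands back a `*A` that still satisfies `T`. Dropping the `pointers` count would yield a plain `A`, which does not implement the interface.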