-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestion to avoid exra layer of indirection by using a more compact representation of Cid #3
Comments
@kevina Yeah, this is a pretty good point. I'm not opposed to this. |
Okay, this is something I will measure then. I will also measure the benefits (and cons) of using a string for a It may be a while before I can get to it as I can't just change the Cid form a pointer to a value type without lots of code change so I will need to isolate the benchmark code to have a few dependencies as possible before I can measure anything. |
I did some informal test and determined that the overhead for the benchmark in ipfs/kubo#3376 was around 17ms. Most of the time for that benchmark was in the base32 decoding (see whyrusleeping/base32#1). This might still be useful but nothing urgent. |
So, I opened a competing issue (#25) because having to call |
@Stebalien as I am updating GC code that relies heavily on the indexing by Cid having the Cid stored as string possibly could improve it quite a bit. |
Taking a look at my heap profile, it looks like the |
@Stebalien making multihash a string is actually a different issue in a different repo, multiformats/go-multihash#29. @whyrusleeping was concerned about the cost of converting the string to a |
Ah. I see. Either that or just storing CIDs in as byte strings (fully serialized) would work. |
@Stebalien so if I understand you correctly, what you want is for the official CID representation to be a serialized string. That would be the most efficient in memory representative however they may be a non-trivial cost then if we need information on the Cid itself. If there is any interest this is something I could look into. |
My main concern is that I want it to be usable as a key in a map. We can achieve this by:
|
@Stebalien ah, now I see what you are after. In most of these cases we should stop passing around CIDs as pointers (see #38). I am not sure if the Cid type necessary should be directly storable as map keys, but we should definitely aim to reduce the cost of KeyString. |
We are going with a string representation. |
The conversion from plain
Key
toCid
introduced an extra layer of indirection that can likely be avoided.Currently Cid takes up the width of 5*64 bits. With it's current definition of:
May I suggest
As 32 bits should be more than enough for version and codec which I don't think will ever get very large (even 16 bits should be enough). If combined with multiformats/go-multihash#29 the size will now be 3*64 bits, which is the same size an array slice.
As the same size of a slice it becomes practical to pass it around by value which avoids an extra layer of indirection. Even though it is passed by value, the internals can still be kept hidden, as is done with many other go datatypes.
Just a suggestion. It might make a difference when you have an array of 1000's of Cids (which I might in some of my maintenance commands for the filestore, ipfs/kubo#2634).
The text was updated successfully, but these errors were encountered: