-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iteration over loaded values #303
Comments
Hi @turbolent What is the use case for iterating over just the loaded elements? It'll be helpful to know more context (why needed & how used) because some edge cases come to mind:
|
@fxamacker Great questions!
This idea / feature request is related to onflow/cadence#2460. For example, imagine a large container (array or dictionary) is loaded from storage, and only one or a few elements are accessed. When the container is moved, we need to potentially perform some operations for elements of the container (in particular, if references where taken to elements, those need to be invalidated).
If a storage ID storable (a storable that points to another storable stored in another slab) is encountered, it should only be followed if the targeted slab was previously loaded. For example, imagine a Cadence array of resources. The array is backed by an atree array, and the resources in it are backed by atree ordered maps, the array is only directly storing storage ID storables.
As far as I understand, the same logic applies for iteration/loading of internal slabs: I think only the loaded data slabs have to be considered, because by definition, a load of an element that is stored by a data slab in the "gap" should have caused a load of that data slab (and its parents in the tree). |
One moment when reading I got hope for lazy arrays and dictionaries. ( wishful reading ) But then I thought why not have that as it also covers this case. I spent very little time on atree but maybe this gives some ideas so I am sharing. ( It can be totally stupid, in that case sorry in advance to take your time ) What if what we have something like We will be able to navigate all object tree with dehydrated values, hydrate something if we need. In this use case Cadence can iterate and check if value is hydrated. |
@bluesign Great points and explanation! atree's data structures (arrays and ordered maps) are basically already lazy. Loading an array or map from storage only loads their root node, and only once elements are accessed, those accessed elements are loaded, and nothing else. The "dehydrated" elements are The "problem" is that the current iteration functionality operates on high-level/"hydrated" Offering iteration functionality on |
@bluesign Thanks for great points! I think Bastian's reply covered everything. @turbolent Thanks for explanation and links to Cadence issue+PR for more context.
I think this feature is possible but we might need a different approach than suggested to iterate loaded elements. Currently iterator traverses data slabs using next storage ID to avoid loading more metadata slabs. So it doesn't handle "gaps" of unloaded data slabs. We need a different iterator to traverse data slabs from parent slabs. To help me prioritize, when do we need this feature? |
Nice!
It is not urgent, but would be "needed" for the Stable Cadence release, as it features reference invalidation on resource invalidation (move/deletion; see linked Cadence PR). @SupunS is looking into alternative solutions, but it feels like this would be the "most efficient", in terms of avoiding the addition of a new bookkeeping mechanism to Cadence for this purpose. |
Thanks @fxamacker for looking into this! Yeah, like Bastian mentioned, I've been looking into alternatives, but apart from the one in the original PR onflow/cadence#2460 (which is inefficient), other alternatives are turning out to be mostly dead-ends. And yes, it is not a blocker for the moment, but would be a blocker for releasing Stable Cadence. The existing PR (onflow/cadence#2460) would only need a very little change once this is available I believe (meaning that there are no big changes waiting on this feature). Let me know if you need more context on the use-case/how this would be used, I can maybe set up a quick call to walk through an example. |
Quick update. I told @j1010001 this should take about 3-5 days total (most of that for writing tests). Will try to open PR this week (most likely tomorrow if there are no surprises or emergencies because its going much faster than expected). |
Sounds great @fxamacker! This isn't urgent BTW, we only need it for releasing Stable Cadence, which is several months out |
Issue To Be Solved
In Cadence, it would be useful to have a mechanism to iterate over an atree array or ordered map, which only considers slabs and values which have been previously loaded from storage.
For example, if a large array or map is loaded from storage, only its root slab is loaded, and none of the elements (child slabs) are loaded.
If only one element is accessed, then only the slabs in the tree leading to the element is loaded into memory from storage, other slabs and elements are not.
Suggested Solution
I do not have a full solution in mind, but investigated a bit how this could be achieved.
For example, in the array iterator:
atree/array.go
Line 2202 in bc0184e
SlabStorage
, or an additional parameter for theRetrieve
function to indicate if the value should be retrieved if it has not been previously retrieved (default).atree/array.go
Line 2218 in bc0184e
Storable
, or an additional parameter for theStoredValue
(like forRetrieve
).Would it be possible to add such a feature? Would such an approach work?
The text was updated successfully, but these errors were encountered: