Our subgraph has a lot of complexity: many entities and many, many filter options that we will never use.
For example:
We have entities that no one ever asked for and for which it is hard to see a use case (practically every event gets its own entity; see the sketch after this list)
We have an enormous number of never-needed filter options (which are probably automagically generated by the subgraph tooling)
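To illustrate the first point, a per-event entity in the subgraph's schema.graphql typically looks something like the sketch below. The entity name and fields here are made up for illustration, not taken from our actual schema:

```graphql
# Hypothetical one-to-one mapping of a contract event to an entity.
# Nobody queries this directly; the same data also ends up reflected
# in the domain-level entities.
type NewContributionProposalEvent @entity {
  id: ID!                  # typically txHash + log index
  txHash: Bytes!
  contract: Bytes!
  avatar: Bytes!
  proposalId: Bytes!
  beneficiary: Bytes!
  descriptionHash: String!
}
```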
This is not an immediate problem, but this kind of bloat does not come for free and may be problematic later:
Insertion will be slower. Each entity needs its own insertion logic (obvious), which takes some time; in addition, each filter option represents a PostgreSQL index, which takes time to maintain as well (see the query sketch after this list).
This slow insertion has a direct performance effect on Alchemy: if indexing a block takes 3 seconds longer, it takes 3 seconds longer for Alchemy to show the latest data.
This stuff (both the extra entities and the indexes) takes up disk space (ok, who cares, space is a commodity, but still :-)
This complexity takes up head space. This is important. I find myself constantly filtering through lots of noise just to find which entities to query for my info.
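To make the filter-option point concrete: for every scalar field of an entity, the generated query API exposes a whole family of `where` arguments (`_not`, `_in`, `_gt`/`_lt` for ordered fields, `_contains` for strings, and so on), most of which nobody ever uses. A sketch of a query against the hypothetical entity above (field values are placeholders):

```graphql
# Each commented-out filter below is auto-generated by the subgraph
# tooling whether or not anyone ever queries with it.
{
  newContributionProposalEvents(
    first: 10
    orderBy: proposalId
    where: {
      beneficiary: "0x0000000000000000000000000000000000000000"
      # beneficiary_not: ...
      # beneficiary_in: [...]
      # beneficiary_not_in: [...]
      # proposalId_gt: ...
      # descriptionHash_contains: ...
    }
  ) {
    id
    proposalId
  }
}
```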
SO: can we clean this up?
(As a footnote: why index all events at all? If we want to listen to events, listening for them directly on Ethereum is faster; while if we want data aggregation (for nice historical graphs or something), we'd probably need a specifically defined data model rather than the primitive events.)
yes, this is a known thing. we already discussed that in the past.
we might consider removing unneeded entities in order to gain some time and space optimisation.
In parallel, or before that, we should run benchmark tests and collect numbers re timing and space. These benchmark numbers will also give us some idea of where we are at and what we saved by implementing any specific optimisation.
The issue is basically that we started by indexing all events and then moved to using the domain layer. I think we initially planned to make the domain layer a simplification based on the first layer, but it eventually came out as kind of a separate layer. I'm also in favor of removing the indexing of the "old" layer, which is where most of the unnecessary stuff is. I'm just not sure it should be done now, as it's not urgent unlike other stuff, and it would first require identifying which entities are unnecessary and which are not.
opened by @orenyodfat on behalf of @jellegerbrandy