Our subgraph has a lot of complexity: many entities and many, many filter options that we will never use.
For example:
We have entities that no one ever asked for and for which it is hard to see a use case (practically every event gets its own entity; see the sketch after this list)
We have an enormous number of never-needed filter options (which are probably automagically generated by the subgraph tooling)
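To illustrate the first point, a per-event entity in the subgraph's schema.graphql typically looks something like the sketch below. The entity name and fields here are made up for illustration, not taken from our actual schema:

```graphql
# Hypothetical one-to-one mapping of a contract event to an entity.
# Nobody queries this directly; the same data also ends up reflected
# in the domain-level entities.
type NewContributionProposalEvent @entity {
  id: ID!                  # typically txHash + log index
  txHash: Bytes!
  contract: Bytes!
  avatar: Bytes!
  proposalId: Bytes!
  beneficiary: Bytes!
  descriptionHash: String!
}
```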
This is not an immediate problem, but this kind of bloat does not come for free and may be problematic later:
Insertion will be slower. Each entity needs its own insertion logic (obvious), which takes some time; in addition, each filter option represents a PostgreSQL index, which takes time to maintain as well (see the query sketch after this list).
This slow insertion has a direct performance effect on Alchemy: if indexing a block takes 3 seconds longer, it takes 3 seconds longer for Alchemy to show the latest data.
This stuff (both the extra entities and the indexes) takes up disk space (ok, who cares, space is a commodity, but still :-)
This complexity takes up head space. This is important. I find myself constantly filtering through lots of noise just to find which entities to query for my info.
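To make the filter-option point concrete: for every scalar field of an entity, the generated query API exposes a whole family of `where` arguments (`_not`, `_in`, `_gt`/`_lt` for ordered fields, `_contains` for strings, and so on), most of which nobody ever uses. A sketch of a query against the hypothetical entity above (field values are placeholders):

```graphql
# Each commented-out filter below is auto-generated by the subgraph
# tooling whether or not anyone ever queries with it.
{
  newContributionProposalEvents(
    first: 10
    orderBy: proposalId
    where: {
      beneficiary: "0x0000000000000000000000000000000000000000"
      # beneficiary_not: ...
      # beneficiary_in: [...]
      # beneficiary_not_in: [...]
      # proposalId_gt: ...
      # descriptionHash_contains: ...
    }
  ) {
    id
    proposalId
  }
}
```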
SO: can we clean this up?
(As a footnote: why index all events at all? If we want to listen to events, listening for them directly on Ethereum is faster; while if we want data aggregation (for nice historical graphs or something), we'd probably need a specifically defined data model rather than the primitive events.)
yes, this is a known thing. we already discussed that in the past.
we might consider removing unneeded entities in order to gain some time and space optimisation.
In parallel, or before that, we should run benchmark tests and collect numbers re timing and space. These benchmark numbers will also give us some idea of where we are at and what we saved by implementing any specific optimisation.
The issue is basically that we started by indexing all events and then moved to using the domain layer. I think we initially planned to make the domain layer a simplification based on the first layer, but it eventually came out as kind of a separate layer. I'm also in favor of removing the indexing of the "old" layer, which is where most of the unnecessary stuff is. I'm just not sure it should be done now, as it's not urgent unlike other stuff, and it would first require identifying which entities are unnecessary and which are not.
opened by @orenyodfat on behalf of @jellegerbrandy