Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Primary key / foreign key in event_tags #592

Open
patriknw opened this issue Sep 6, 2021 · 1 comment
Open

Primary key / foreign key in event_tags #592

patriknw opened this issue Sep 6, 2021 · 1 comment

Comments

@patriknw
Copy link
Member

patriknw commented Sep 6, 2021

I find it strange that we don't used the primary key in the foreign key so I tried that:

CREATE TABLE IF NOT EXISTS public.event_journal2(
  ordering BIGSERIAL,
  persistence_id VARCHAR(255) NOT NULL,
  sequence_number BIGINT NOT NULL,
  event_payload VARCHAR(255) NOT NULL,

  PRIMARY KEY(persistence_id, sequence_number)
);

CREATE UNIQUE INDEX event_journal2_ordering_idx ON public.event_journal2(ordering);


CREATE TABLE IF NOT EXISTS public.event_tag2(
    persistence_id VARCHAR(255) NOT NULL,
    sequence_number BIGINT NOT NULL,
    tag VARCHAR(256),
    PRIMARY KEY(persistence_id, sequence_number, tag),
    CONSTRAINT fk_event_journal2
      FOREIGN KEY(persistence_id, sequence_number)
      REFERENCES event_journal2(persistence_id, sequence_number)
      ON DELETE CASCADE
);

INSERT INTO event_journal2 SELECT n, 'pid-' || n, 1,  'evt-1' FROM generate_series(1, 1000000) as n;
INSERT INTO event_journal2 SELECT 1000000 + n, 'pid-' || n, 2,  'evt-2' FROM generate_series(1, 1000000) as n;
INSERT INTO event_journal2 SELECT 2000000 + n, 'pid-' || n, 3,  'evt-3' FROM generate_series(1, 1000000) as n;

INSERT INTO event_tag2 SELECT 'pid-' || n, 1, mod(n, 256)::bit(8) FROM generate_series(1, 1000000) as n;
INSERT INTO event_tag2 SELECT 'pid-' || n, 2, mod(n, 256)::bit(8) FROM generate_series(1, 1000000) as n;
INSERT INTO event_tag2 SELECT 'pid-' || n, 3, mod(n, 256)::bit(8) FROM generate_series(1, 1000000) as n;

Query:

explain analyze
select x2."ordering",
    x2."persistence_id",
    x2."sequence_number",
    x2."event_payload"
from "event_journal2" x2, "event_tag2" x3
where (
        (x3."tag" = '00000110')
           and (
                   (x2."ordering" > 2999000)
                   and (x2."ordering" <= 3000000))
          )
    and (x2."persistence_id" = x3."persistence_id")
    and (x2."sequence_number" = x3."sequence_number")
order by x2."ordering" limit 500;


 Limit  (cost=0.86..4645.03 rows=4 width=32) (actual time=8.538..46.870 rows=4 loops=1)
   ->  Nested Loop  (cost=0.86..4645.03 rows=4 width=32) (actual time=8.520..46.711 rows=4 loops=1)
         ->  Index Scan using event_journal2_ordering_idx on event_journal2 x2  (cost=0.43..45.45 rows=1051 width=32) (actual time=0.019..9.512 rows=1000 loops=1)
               Index Cond: ((ordering > 2999000) AND (ordering <= 3000000))
         ->  Index Only Scan using event_tag2_pkey on event_tag2 x3  (cost=0.43..4.38 rows=1 width=18) (actual time=0.018..0.018 rows=0 loops=1000)
               Index Cond: ((persistence_id = (x2.persistence_id)::text) AND (sequence_number = x2.sequence_number) AND (tag = '00000110'::text))
               Heap Fetches: 0
 Planning Time: 0.409 ms
 Execution Time: 46.957 ms

That can be compared with the original tables:

CREATE TABLE IF NOT EXISTS public.event_journal(
  ordering BIGSERIAL,
  persistence_id VARCHAR(255) NOT NULL,
  sequence_number BIGINT NOT NULL,
  event_payload VARCHAR(255) NOT NULL,

  PRIMARY KEY(persistence_id, sequence_number)
);

CREATE UNIQUE INDEX event_journal_ordering_idx ON public.event_journal(ordering);

CREATE TABLE IF NOT EXISTS public.event_tag(
    event_id BIGINT,
    tag VARCHAR(256),
    PRIMARY KEY(event_id, tag),
    CONSTRAINT fk_event_journal
      FOREIGN KEY(event_id)
      REFERENCES event_journal(ordering)
      ON DELETE CASCADE
);

INSERT INTO event_journal SELECT n, 'pid-' || n, 1,  'evt-1' FROM generate_series(1, 1000000) as n;
INSERT INTO event_journal SELECT 1000000 + n, 'pid-' || n, 2,  'evt-2' FROM generate_series(1, 1000000) as n;
INSERT INTO event_journal SELECT 2000000 + n, 'pid-' || n, 3,  'evt-3' FROM generate_series(1, 1000000) as n;

INSERT INTO event_tag SELECT n, mod(n, 256)::bit(8) FROM generate_series(1, 1000000) as n;
INSERT INTO event_tag SELECT 1000000 + n, mod(n, 256)::bit(8) FROM generate_series(1, 1000000) as n;
INSERT INTO event_tag SELECT 2000000 + n, mod(n, 256)::bit(8) FROM generate_series(1, 1000000) as n;

The original query:

explain analyze
select x2."ordering",
    x2."persistence_id",
    x2."sequence_number",
    x2."event_payload"
from "event_journal" x2, "event_tag" x3
where (
        (x3."tag" = '00000110')
           and (
                   (x2."ordering" > 2999000)
                   and (x2."ordering" <= 3000000))
          )
    and (x2."ordering" = x3."event_id")
order by x2."ordering" limit 500;

Limit  (cost=45935.14..45935.38 rows=2 width=32) (actual time=88.859..90.649 rows=4 loops=1)
   ->  Gather Merge  (cost=45935.14..45935.38 rows=2 width=32) (actual time=88.816..90.512 rows=4 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=44935.12..44935.13 rows=1 width=32) (actual time=84.768..84.821 rows=1 loops=3)
               Sort Key: x2.ordering
               Sort Method: quicksort  Memory: 25kB
               Worker 0:  Sort Method: quicksort  Memory: 25kB
               Worker 1:  Sort Method: quicksort  Memory: 25kB
               ->  Nested Loop  (cost=0.43..44935.11 rows=1 width=32) (actual time=78.235..84.761 rows=1 loops=3)
                     ->  Parallel Seq Scan on event_journal x2  (cost=0.00..43748.00 rows=272 width=32) (actual time=74.138..76.791 rows=333 loops=3)
                           Filter: ((ordering > 2999000) AND (ordering <= 3000000))
                           Rows Removed by Filter: 999667
                     ->  Index Only Scan using event_tag_pkey on event_tag x3  (cost=0.43..4.36 rows=1 width=8) (actual time=0.008..0.008 rows=0 loops=1000)
                           Index Cond: ((event_id = x2.ordering) AND (tag = '00000110'::text))
                           Heap Fetches: 0
 Planning Time: 0.207 ms
 Execution Time: 90.734 ms

Originally posted by @patriknw in #560 (comment)

@patriknw
Copy link
Member Author

patriknw commented Sep 6, 2021

Note that this also means that the ordering doesn't have to be fetched in the inserts to event_journal, and I would guess that can be a huge performance difference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant