Add option to preserve sequence number generators #34

JQuezada0 · 2023-04-18T01:59:14Z

No description provided.

allenap · 2023-04-18T07:00:13Z

src/main.rs

+    fn current_sequence_value_query(&self, sequence: &Sequence) -> String {
+        format!(
+            r#"SELECT COALESCE(MAX("{column_name}"), 1)::text FROM {schema}."{table_name}""#,
+            schema = &self.schema,
+            column_name = sequence.column_name,
+            table_name = &self.name,
+        )
+    }
+    fn restore_sequences_query(&self, sequence: &Sequence, current_sequence_value: u64) -> String {
+        format!(
+            r#"SELECT SETVAL('{schema}."{sequence_name}"', {current_sequence_value}) FROM {schema}."{table_name}""#,
+            schema = &self.schema,
+            sequence_name = sequence.sequence_name,
+            current_sequence_value = current_sequence_value,
+            table_name = &self.name,
+        )
+    }


Use the SqlString trait here to escape, plus a couple of other things:

Suggested change

fn current_sequence_value_query(&self, sequence: &Sequence) -> String {

format!(

r#"SELECT COALESCE(MAX("{column_name}"), 1)::text FROM {schema}."{table_name}""#,

schema = &self.schema,

column_name = sequence.column_name,

table_name = &self.name,

)

}

fn restore_sequences_query(&self, sequence: &Sequence, current_sequence_value: u64) -> String {

format!(

r#"SELECT SETVAL('{schema}."{sequence_name}"', {current_sequence_value}) FROM {schema}."{table_name}""#,

schema = &self.schema,

sequence_name = sequence.sequence_name,

current_sequence_value = current_sequence_value,

table_name = &self.name,

)

}

fn current_sequence_value_query(&self, sequence: &Sequence) -> String {

format!(

r#"SELECT COALESCE(MAX({column_name}), 1)::text FROM {qualified_table_name}"#,

column_name = sequence.column_name.sql_identifier(),

qualified_table_name = self.sql_identifier(),

)

}

fn restore_sequences_query(&self, sequence: &Sequence, current_sequence_value: u64) -> String {

format!(

r#"SELECT SETVAL('{schema}.{sequence_name}', {current_sequence_value}) FROM {qualified_table_name}"#,

schema = &self.schema.sql_identifier(),

sequence_name = sequence.sequence_name.sql_identifier(),

qualified_table_name = self.sql_identifier(),

)

}

It's possible to get a sequence's last value by querying it like a table. Also, we can use sql_value to construct the argument to SETVAL.

Suggested change

fn current_sequence_value_query(&self, sequence: &Sequence) -> String {

format!(

r#"SELECT COALESCE(MAX("{column_name}"), 1)::text FROM {schema}."{table_name}""#,

schema = &self.schema,

column_name = sequence.column_name,

table_name = &self.name,

)

}

fn restore_sequences_query(&self, sequence: &Sequence, current_sequence_value: u64) -> String {

format!(

r#"SELECT SETVAL('{schema}."{sequence_name}"', {current_sequence_value}) FROM {schema}."{table_name}""#,

schema = &self.schema,

sequence_name = sequence.sequence_name,

current_sequence_value = current_sequence_value,

table_name = &self.name,

)

}

fn current_sequence_value_query(&self, sequence: &Sequence) -> String {

let qualified_sequence_name = format!(

"{}.{}",

self.schema.sql_identifier(),

sequence.sequence_name.sql_identifier()

);

format!(r#"SELECT last_value::text FROM {qualified_sequence_name}"#,)

}

fn restore_sequences_query(&self, sequence: &Sequence, current_sequence_value: u64) -> String {

let qualified_sequence_name = format!(

"{}.{}",

self.schema.sql_identifier(),

sequence.sequence_name.sql_identifier()

);

format!(

r#"SELECT SETVAL({qualified_sequence_name_as_value}, {current_sequence_value}) FROM {qualified_table_name}"#,

qualified_sequence_name_as_value = qualified_sequence_name.sql_value(),

qualified_table_name = self.sql_identifier(),

)

}

https://www.postgresql.org/docs/12/functions-sequence.html

src/main.rs

allenap · 2023-04-18T07:40:41Z

src/main.rs

+                from pg_class sequences
+                join pg_depend ON pg_depend.objid = sequences.oid
+            where sequences.relkind = 'S'
+        ) seqs ON seqs.tbl_id = pg_class.oid AND seqs.col_id = pg_attribute.attnum


PostgreSQL sequences don't have to be associated with a column. They're all free-standing, but can be "owned" by a table, but I think that only means that they're dropped when the table is dropped (see OWNED BY in the docs for CREATE SEQUENCE).

I think sequences should be considered as a separate top-level object, and not grouped in with tables. They exist in the same namespace, so the same include/exclude rules can be applied to select which sequences to restore.

This kinda ties into why I didn't use last_value. I originally ran it against a few sequences and found that the last_value doesn't always line up with the max(col); sequences can be shared, can be depended upon by columns, and as you mentioned they can be owned by tables or not, so their scope may or may not be tied to a single table.

For both figuring out what the current value of the sequence should be, and which sequences actually even get their current value preserved, I stuck to the table as the source of truth with the understanding that what pg_parcel primarily slices data by is tables, and so to the extent that we care to copy a slice of table data then we have to care about restoring the value of sequences that each table depends on, but not necessarily other sequences not owned by or depended on by the targeted tables, or the values of the sequences that aren't pertinent to the table.

i.e. what happens if we intentionally skip a table, but then don't explicitly also skip any sequences that the table depends on? If we always restore the sequence value, then the sequence value would be current but the table would be empty, which might be unexpected? Should you always have to explicitly skip both at the same time?

But, that was an assumption I made, would be happy to change both of these with the understanding that pg_parcel restores all sequence values regardless of what particular slice of data is being dumped/restored (if passing the keep flag), it's separately configurable, and that is a deliberate choice.

Would it be fair to say that if we are dumping a table, we should implicitly also dump any associated sequences?

I think that maybe the ability to dump sequences that aren't connected to any (requested) tables should be considered a separate feature.

So how do we feel about two options, one called --keep-table-sequences which does what this PR adds, and then another option called --keep-all-sequences which does all of them?

Or, just one option called --sequences=(none | tables | all) which defaults to tables?

(conceptually; naming can be whatever)

From the README, pg_parcel is:

A very minimal subset of pg_dump --data-only ...

From the pg_dump man page (emphasis mine):

--data-only

Dump only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped.

Ergo: pg_parcel should dump sequence values (and large objects, but that's not in scope here).

To elaborate, sequences are first-class entities within a schema, on a level with tables. They can be associated with a table but they don't belong to a table. A sequence's value is its data, just as rows are a table's data. I think it's conceptually sound to dump values for all sequences in the schema. In fact I think it's a bug or at least a missing feature that we didn't do that before, by default, and pg_parcel should do it now without any more optional command-line flags.

What are the consequences if we just do that? What are we trying to prevent by not doing it?

add option to keep sequences

f0669df

JQuezada0 requested review from jelder and allenap April 18, 2023 02:01

allenap reviewed Apr 18, 2023

View reviewed changes

use sql_string trait methods + unwrap_or_default

9748bc3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add option to preserve sequence number generators #34

Add option to preserve sequence number generators #34

JQuezada0 commented Apr 18, 2023

allenap Apr 18, 2023

allenap Apr 18, 2023

allenap Apr 18, 2023

JQuezada0 Apr 23, 2023 •

edited

Loading

jelder Apr 27, 2023 •

edited

Loading

JQuezada0 Apr 27, 2023 •

edited

Loading

allenap May 2, 2023

Add option to preserve sequence number generators #34

Are you sure you want to change the base?

Add option to preserve sequence number generators #34

Conversation

JQuezada0 commented Apr 18, 2023

allenap Apr 18, 2023

Choose a reason for hiding this comment

allenap Apr 18, 2023

Choose a reason for hiding this comment

allenap Apr 18, 2023

Choose a reason for hiding this comment

JQuezada0 Apr 23, 2023 • edited Loading

Choose a reason for hiding this comment

jelder Apr 27, 2023 • edited Loading

Choose a reason for hiding this comment

JQuezada0 Apr 27, 2023 • edited Loading

Choose a reason for hiding this comment

allenap May 2, 2023

Choose a reason for hiding this comment

JQuezada0 Apr 23, 2023 •

edited

Loading

jelder Apr 27, 2023 •

edited

Loading

JQuezada0 Apr 27, 2023 •

edited

Loading