`pgbench` can be used with PGAdapter, but with some limitations. Follow these steps to initialize and run benchmarks with `pgbench` through PGAdapter.
The default data model that is generated by `pgbench` does not include primary keys for the tables. Cloud Spanner requires all tables to have a primary key. Execute the following command to manually create the data model for `pgbench`:
```shell
psql -h /tmp -p 5432 -d my-database \
  -c "START BATCH DDL;
      CREATE TABLE pgbench_accounts (
        aid      integer primary key NOT NULL,
        bid      integer NULL,
        abalance integer NULL,
        filler   varchar(84) NULL
      );
      CREATE TABLE pgbench_branches (
        bid      integer primary key NOT NULL,
        bbalance integer NULL,
        filler   varchar(88) NULL
      );
      CREATE TABLE pgbench_history (
        tid    integer NOT NULL DEFAULT -1,
        bid    integer NOT NULL DEFAULT -1,
        aid    integer NOT NULL DEFAULT -1,
        delta  integer NULL,
        mtime  timestamptz NULL,
        filler varchar(22) NULL,
        primary key (tid, bid, aid)
      );
      CREATE TABLE pgbench_tellers (
        tid      integer primary key NOT NULL,
        bid      integer NULL,
        tbalance integer NULL,
        filler   varchar(84) NULL
      );
      RUN BATCH;"
```
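Optionally, verify that the tables were created before loading data. This check assumes that Cloud Spanner's `information_schema` returns the new tables through PGAdapter; the query itself is plain SQL:

```shell
psql -h /tmp -p 5432 -d my-database \
  -c "SELECT table_name FROM information_schema.tables
      WHERE table_name LIKE 'pgbench%';"
```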
`pgbench` deletes and inserts data into PostgreSQL using a combination of `truncate`, `insert`, and `copy` statements. These statements all run in a single transaction. The amount of data that is modified during this transaction will exceed the transaction mutation limits of Cloud Spanner. This can be worked around by adding the following options to the `pgbench` initialization command:
```shell
pgbench "host=/tmp port=5432 dbname=my-database \
         options='-c spanner.force_autocommit=on -c spanner.autocommit_dml_mode=\'partitioned_non_atomic\''" \
  -i -Ig \
  --scale=100
```
These additional options do the following:

- `spanner.force_autocommit=on`: This instructs PGAdapter to ignore any transaction statements and execute all statements in autocommit mode. This prevents the initialization from being executed as a single, large transaction.
- `spanner.autocommit_dml_mode='partitioned_non_atomic'`: This instructs PGAdapter to use Partitioned DML for (large) update statements. This ensures that a single statement succeeds even if it would exceed the transaction limits of Cloud Spanner, including large `copy` operations. Both settings can also be enabled per connection, as shown in the sketch after this list.
- `-i` activates the initialization mode of `pgbench`.
- `-Ig` instructs `pgbench` to generate the test data client side. PGAdapter does not support generating test data server side.
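A minimal sketch of enabling the same settings interactively instead of in the connection string, assuming PGAdapter accepts them as regular session settings (they are passed as session-level `-c` options above, so `SET` should behave equivalently):

```shell
psql -h /tmp -p 5432 -d my-database \
  -c "SET spanner.force_autocommit = on;
      SET spanner.autocommit_dml_mode = 'partitioned_non_atomic';
      SHOW spanner.autocommit_dml_mode;"
```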
You can run different benchmarks after finishing the steps above. Run a default benchmark to verify that everything works as expected. It is recommended to always use either `--protocol=extended` or `--protocol=prepared` for Cloud Spanner. The default `--protocol=simple` mode that is used by `pgbench` includes all query parameters as literals in the query string, which causes additional query parsing time on Cloud Spanner.
```shell
pgbench "host=/tmp port=5432 dbname=my-database" --protocol=extended
```
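`pgbench` also supports the prepared protocol that is mentioned above; each query is then parsed once per client session and reused, which is a straightforward swap of the protocol flag:

```shell
pgbench "host=/tmp port=5432 dbname=my-database" --protocol=prepared
```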
Increase the number of clients and threads to run more transactions in parallel.
```shell
pgbench "host=/tmp port=5432 dbname=my-database" \
  --client=100 --jobs=100 \
  --progress=10 \
  --protocol=extended
```
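By default, `pgbench` runs a fixed number of transactions per client. For a longer, steadier measurement you can run for a fixed duration instead with the standard `--time` flag; the 300 seconds below is only an illustrative value:

```shell
pgbench "host=/tmp port=5432 dbname=my-database" \
  --client=100 --jobs=100 \
  --progress=10 \
  --time=300 \
  --protocol=extended
```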
Note that PGAdapter by default creates an internal Cloud Spanner session pool containing at most 400 sessions. Running `pgbench` with more than 400 clients requires more sessions.
Starting PGAdapter using Java with a larger session pool:
```shell
java -jar pgadapter.jar -p my-project -i my-instance -d my-database \
  -r="minSessions=800;maxSessions=800;numChannels=16"
```
Starting PGAdapter using Docker with a larger session pool:
```shell
docker run -p 5432:5432 \
  gcr.io/cloud-spanner-pg-adapter/pgadapter \
  -p my-project -i my-instance -x \
  -r="minSessions=800;maxSessions=800;numChannels=16"
```
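When PGAdapter runs in Docker, the Unix domain socket in `/tmp` lives inside the container. A hedged sketch of two common workarounds, reusing the placeholders above: share the socket directory with the host, or connect over the published TCP port:

```shell
# Option 1: map the host's /tmp into the container so the Unix domain
# socket is reachable from the host (assumes /tmp can be shared).
docker run -p 5432:5432 -v /tmp:/tmp \
  gcr.io/cloud-spanner-pg-adapter/pgadapter \
  -p my-project -i my-instance -x \
  -r="minSessions=800;maxSessions=800;numChannels=16"

# Option 2: let pgbench connect over TCP through the published port.
pgbench "host=localhost port=5432 dbname=my-database" --protocol=extended
```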
```shell
pgbench "host=/tmp port=5432 dbname=my-database" \
  --client=800 --jobs=100 \
  --progress=10 \
  --protocol=extended
```
Execute the following command to remove the `pgbench` tables from your database if you no longer need them.
```shell
psql -h /tmp -p 5432 -d my-database \
  -c "START BATCH DDL;
      DROP TABLE pgbench_history;
      DROP TABLE pgbench_tellers;
      DROP TABLE pgbench_branches;
      DROP TABLE pgbench_accounts;
      RUN BATCH;"
```