pglogical crashes AWS Aurora during replication with Segmentation fault #458

Open
JulienAndonov opened this issue Jan 25, 2024 · 2 comments

Comments

@JulienAndonov

Hey guys, I have the following issue.
During normal operation, pglogical 2.4.2 crashes on the destination side, which is RDS Aurora PostgreSQL 14.8:

#Destination side
2024-01-25 13:17:09 UTC::@:[537]:LOG: background worker "pglogical apply 131082:4047160452" (PID 6709) was terminated by signal 11: Segmentation fault
2024-01-25 13:17:09 UTC::@:[537]:LOG: terminating any other active server processes
2024-01-25 13:17:09 UTC::@:[537]:FATAL: Can't handle storage runtime process crash
2024-01-25 13:17:09 UTC::@:[537]:LOG: database system is shutess crash
2024-01-25 13:17:09 UTC::@:[537]:LOG: database system is shut down

After this initial error, the cluster enters a continuous crash-and-reboot loop, consuming significant CPU and other resources.

On the source side, some queries ran a couple of seconds before the crash, but they don't seem to be the cause: after re-creating the environment and re-executing the same queries, the problem doesn't occur.

On the source cluster we see these errors right after the initial error on the destination:
2024-01-25 13:17:09 UTC:(63772):user@database_name:[26536]:LOG: could not receive data from client: Connection reset by peer
2024-01-25 13:17:09 UTC:(63772):user@database_name:[26536]:STATEMENT: START_REPLICATION SLOT "replication_slot_name" LOGICAL 12/28C9A430 (expected_encoding 'UTF8', min_proto_version '1', max_proto_version '1', startup_params_format '1', "binary.want_internal_basetypes" '1', "binary.want_binary_basetypes" '1', "binary.basetypes_major_version" '1400', "binary.sizeof_datum" '8', "binary.sizeof_int" '4', "binary.sizeof_long" '8', "binary.bigendian" '0', "binary.float4_byval" '0', "binary.float8_byval" '1', "binary.integer_datetimes" '0', "hooks.setup_function" 'pglogical.pglogical_hooks_setup', "pglogical.forward_origins" '"all"', "pglogical.replication_set_names" 'tenant_service', "relmeta_cache_size" '-1', pg_version '140008', pglogical_version '2.4.2', pglogical_version_num '20402', pglogical_apply_pid '6709')
2024-01-25 13:17:09 UTC:*(63772):user@database_name:[26536]:LOG: unexpected EOF on standby connection

Source and Destination:
RDS Aurora PostgreSQL 14.8
pglogical: 2.4.2

Source:
1 Writer
1 Reader

Destination:
1 Writer

@Karthik-Colligence

Hey, any update on this issue? I ran into the same problem when trying pglogical in a similar scenario. Please let us know if there are any updates on this "Segmentation fault" issue.

@andonovj

Yes, the problem was related to a virtual column. Check whether any of the tables you are trying to migrate has a virtual column. If so, you have to remove it from the replication and add it on the destination. That worked for me :-)
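
For anyone who wants to try the same workaround, here is a rough sketch of one way it might be scripted. It rests on assumptions the thread does not confirm: that "virtual column" means a PostgreSQL generated column (`GENERATED ALWAYS AS ... STORED`), and that "remove it from the replication" can be done by re-adding the table to the replication set with an explicit column list. The table `public.orders` and its columns are hypothetical; `tenant_service` is the replication set name from the log above. The script only builds SQL strings and does not connect to any database.

```python
# Sketch only. Assumes "virtual column" means a PostgreSQL generated column
# (GENERATED ALWAYS AS ... STORED); the thread does not confirm this.

# Run on the source to list generated columns in user schemas.
FIND_GENERATED_COLUMNS_SQL = """
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE is_generated = 'ALWAYS'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_schema, table_name, ordinal_position;
"""


def quote_literal(value):
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def readd_without_generated(set_name, table, all_columns, generated_columns):
    """Build SQL that re-adds `table` to `set_name` without its generated columns."""
    keep = [c for c in all_columns if c not in generated_columns]
    if not keep:
        raise ValueError("no columns left to replicate")
    column_array = "ARRAY[" + ", ".join(quote_literal(c) for c in keep) + "]"
    return [
        "SELECT pglogical.replication_set_remove_table("
        f"{quote_literal(set_name)}, {quote_literal(table)});",
        "SELECT pglogical.replication_set_add_table("
        f"set_name := {quote_literal(set_name)}, "
        f"relation := {quote_literal(table)}, "
        "synchronize_data := false, "
        f"columns := {column_array});",
    ]


if __name__ == "__main__":
    # Hypothetical table and columns, for illustration only.
    for stmt in readd_without_generated(
        "tenant_service", "public.orders",
        ["id", "price", "qty", "total"], {"total"},
    ):
        print(stmt)
```

The generated statements would be run on the source, and the generated column would then be defined on the destination table so that it is computed locally. As far as I know, the column list passed to pglogical has to include the primary key (replica identity) columns, so check that before applying this. `synchronize_data := false` is a choice made for this sketch; whether a resync is needed depends on your setup.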
