Docker images later than 19.09 fail to initialize/migrate database without modification (with workaround(s)) #575

Open
pmpcosta opened this issue Nov 30, 2020 · 10 comments

Comments

@pmpcosta

pmpcosta commented Nov 30, 2020

I have not been able to do a clean install of any 20+ version of the Galaxy Docker image: the initial setup invariably fails when setting up or upgrading the database. Upgrading an already running 19+ version does work, with a manual upgrade of the db.

The problem may be related to the migration to Python 3 (compatibility issues with sqlalchemy / sqlalchemy-migrate?), as the previously clean db upgrade process is now strewn with complaints about duplicate tables, columns, keys, etc.

The duplication complaints made me suspect competing concurrent processes, and I found that starting the container with:
-e "GALAXY_HANDLER_NUMPROCS=1" will "cure" the problem. However, even with this workaround, the database migration apparently restarts several times and errors are still logged, so this looks more like a band-aid than a solution. I'm afraid this is as far as I can go. Suggestions?
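
For readers who want to try the workaround, here is a minimal sketch of assembling the full `docker run` invocation around that flag. Only the `GALAXY_HANDLER_NUMPROCS=1` setting comes from this report; the image name and tag (`bgruening/galaxy-stable:20.09`), the `-d` flag, and the `8080:80` port mapping are assumptions to adjust for your setup. The helper `build_run_command` is hypothetical, and the script only prints the command rather than running it.

```python
import shlex


def build_run_command(image, env):
    """Build a `docker run` argument list with one -e flag per variable."""
    # Assumed defaults: detached mode and host port 8080 mapped to port 80.
    cmd = ["docker", "run", "-d", "-p", "8080:80"]
    for key, value in env.items():
        cmd += ["-e", f"{key}={value}"]
    cmd.append(image)
    return cmd


# The image name and tag are placeholders; the variable is from the report.
cmd = build_run_command(
    "bgruening/galaxy-stable:20.09",
    {"GALAXY_HANDLER_NUMPROCS": "1"},
)

# Print a shell-safe command line for inspection (shlex.join needs Python 3.8+).
print(shlex.join(cmd))
```

Printing instead of executing makes it easy to check the flags before starting a container.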

[Image: screengrab of failed install log (uwsgi log-failed)]

[Image: screengrab of successful install log (uwsgi log-successful)]

PS: tests were done on a Xen virtual machine with plenty of memory and cores, running Ubuntu 20.04 + Docker 19.3.

@pmpcosta
Author

I also tried -e "UWSGI_MASTER=true"; it worked the first time, but not any more (I don't even know if that is a recognized variable). This could be an intermittent problem...

@bartns

bartns commented Dec 8, 2020

I am having the same issue. Starting the Docker container with defaults, the startup script fails.

@dannon

dannon commented Dec 8, 2020

I spent a little bit of time debugging this, too, in a different context. It has to do with the recent full migrate changes (so, instead of incremental migrations, on a fresh database we just go 0->~160 at once). I'll try to see if I can build a test case and fix this upstream.

@eschen42

Once upon a time I was able to circumvent this issue by initializing with docker-galaxy-stable 19.09 and then upgrading to 20.05, but no more: sh manage_db.sh upgrade aborts along the way, with messages from the script about the schema lacking expected columns, etc.

@chambm
Contributor

chambm commented May 6, 2021

> Once upon a time I was able to circumvent this issue by initializing with docker-galaxy-stable 19.09, then upgrading to 20.05 but no more; sh manage_db.sh upgrade aborts along the way, with messages from the script about the schema not having expected columns, etc.

Ditto. I'm trying to launch 20.09. Not only did it not start migrating the DB automatically, but when I try to run the upgrade manually, I get:

# sh manage_db.sh upgrade
Activating virtualenv at /galaxy_venv
INFO:migrate.versioning.api:22 -> 23...
22 -> 23...

Migration script to add columns for tracking whether pages are deleted and
publicly accessible.

Traceback (most recent call last):
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1284, in _execute_context
    cursor, statement, parameters, context
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.DuplicateColumn: column "published" of relation "page" already exists


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./scripts/manage_db.py", line 28, in <module>
    invoke_migrate_main()
  File "./scripts/manage_db.py", line 24, in invoke_migrate_main
    main(repository=repo, url=db_url)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/versioning/shell.py", line 209, in main
    ret = command_func(**kwargs)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/versioning/api.py", line 186, in upgrade
    return _migrate(url, repository, version, upgrade=True, err=err, **opts)
  File "<decorator-gen-15>", line 2, in _migrate
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/versioning/util/__init__.py", line 167, in with_engine
    return f(*a, **kw)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/versioning/api.py", line 366, in _migrate
    schema.runchange(ver, change, changeset.step)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/versioning/schema.py", line 93, in runchange
    change.run(self.engine, step)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/versioning/script/py.py", line 154, in run
    script_func(engine)
  File "/galaxy-central/lib/galaxy/model/orm/../../../galaxy/model/migrate/versions/0023_page_published_and_deleted_columns.py", line 23, in upgrade
    c.create(Page_table, index_name='ix_page_published')
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/changeset/schema.py", line 591, in create
    _run_visitor(engine, visitorcallable, self, connection, **kwargs)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/changeset/schema.py", line 180, in _run_visitor
    conn.dialect, conn, **kwargs).traverse_single(element)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/changeset/ansisql.py", line 56, in traverse_single
    ret = super(AlterTableVisitor, self).traverse_single(elem)
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 144, in traverse_single
    return meth(obj, **kw)
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/changeset/ansisql.py", line 104, in visit_column
    self.execute()
  File "/galaxy_venv/lib/python3.7/site-packages/migrate/changeset/ansisql.py", line 44, in execute
    return self.connection.execute(self.buffer.getvalue())
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1012, in execute
    return self._execute_text(object_, multiparams, params)
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1187, in _execute_text
    parameters,
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1324, in _execute_context
    e, statement, parameters, cursor, context
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1518, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
    raise exception
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1284, in _execute_context
    cursor, statement, parameters, context
  File "/galaxy_venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DuplicateColumn) column "published" of relation "page" already exists

[SQL:
ALTER TABLE page ADD published BOOLEAN]
(Background on this error at: http://sqlalche.me/e/f405)
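
The traceback shows migration step 22 -> 23 issuing `ALTER TABLE page ADD published BOOLEAN` against a table that already has that column. As an illustration only, the sketch below shows one way a migration step can be made idempotent by checking for the column first. It is not Galaxy's migration code: it uses SQLite from the Python standard library in place of PostgreSQL, and the helpers `column_exists` and `add_column_if_missing` are hypothetical names. On PostgreSQL 9.6 and later, `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` gives a similar effect in SQL.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` already has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    # The table name is interpolated directly, so it must be a trusted value.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_column_if_missing(conn, table, column, sql_type):
    """Add the column only when it is absent; return True if it was added."""
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")
    return True


# Stand-in schema: a "page" table that lacks the "published" column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY)")

# Running the same step twice no longer raises a duplicate-column error.
first = add_column_if_missing(conn, "page", "published", "BOOLEAN")
second = add_column_if_missing(conn, "page", "published", "BOOLEAN")
print(first, second)
```

A guard like this only hides the symptom. If the recorded schema version and the actual schema disagree, as they appear to here, that mismatch still needs to be resolved.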

@chambm
Contributor

chambm commented Sep 21, 2021

@dannon Any progress on this?

@dannon

dannon commented Sep 21, 2021

@chambm It looks like it didn't get xref'd here, but I think the root issue was resolved -- I will try to look for it (so we can make sure that fix is included in the appropriate branches for this).

@dannon

dannon commented Sep 28, 2021

@chambm galaxyproject/galaxy#11753 is what I was thinking of; on startup, create_db.sh should now initialize the database correctly. If the latest version of docker-galaxy-stable still has this issue, it may need to use create_db.sh as well for the fast table creation, or there may be another bug here to track down.

@chambm
Contributor

chambm commented Sep 28, 2021

Is create_db.sh actually relevant? startup.sh calls manage_db.sh which calls scripts/manage_db.py. Maybe startup.sh should call create_db.sh instead if the database path doesn't exist? That sounds like an easy change.
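
The suggested change can be sketched as a small decision function. This is a hedged illustration, not the actual startup.sh logic: the function name `pick_db_command`, the bare `sh create_db.sh` invocation, and the use of a temporary directory as a stand-in database path are all assumptions.

```python
import os
import tempfile


def pick_db_command(db_path):
    """Choose which script to run, based on whether the database path exists."""
    if os.path.exists(db_path):
        # Existing database: run the incremental upgrade, as startup.sh does now.
        return ["sh", "manage_db.sh", "upgrade"]
    # No database yet: create the schema in one step instead of migrating.
    return ["sh", "create_db.sh"]


# Demonstrate both branches using a temporary directory as the database path.
with tempfile.TemporaryDirectory() as existing_path:
    existing_cmd = pick_db_command(existing_path)
    missing_cmd = pick_db_command(os.path.join(existing_path, "not_created"))

print(existing_cmd)
print(missing_cmd)
```

A path check suits a file-backed or container-local database. For a remote PostgreSQL server, the existence test would have to query the server instead.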

@dannon

dannon commented Sep 29, 2021

@chambm Yeah, should be an easy tweak and I would guess it's worth a try to get these older releases working correctly. Regarding relevance, I did see that the ansible-galaxy roles were updated to drop the create_db.sh usage, here: galaxyproject/galaxy#9787, so I assume the intended path forward is towards ultimately dropping it.
