libpll not being included in raxml-ng-mpi.so #91

Open
tardigradus opened this issue May 20, 2020 · 14 comments

Comments

@tardigradus

I have built the shared library via

-DUSE_MPI=ON -DUSE_TERRAPHAST=OFF -DBUILD_AS_LIBRARY=ON

but when the library is used by ParGenes the following error is produced:

Error: /trinity/shared/easybuild/software/RAxML-NG/0.9.0-foss-2018b-OpenMPI-3.1.3/lib/raxml-ng-mpi.so: undefined symbol: pll_hardware

Doing ldd on the shared library shows that it is not linked to libpll. My understanding of src/CMakeLists.txt was that if USE_LIBPLL_CMAKE is not set, then the prebuilt files for libpll in localdeps will be used. Is that not the case?

@tardigradus tardigradus changed the title libpll not being includes in raxml-ng-mpi.so libpll not being included in raxml-ng-mpi.so May 20, 2020
@amkozlov
Owner

Hi @tardigradus,

before we go into details, may I ask what your goal is here? If you want to build ParGenes, it is sufficient to recursively (!) clone the GitHub repo and run install.sh:

https://github.com/BenoitMorel/ParGenes

ParGenes comes with a bundled version of RAxML-NG, so you do not need to compile raxml-ng-mpi.so separately.

@tardigradus
Author

I am installing these programs as an administrator on an HPC cluster. Thus, we may have users who want to use RAxML on its own and others who want to run ParGenes. We manage our software using EasyBuild. The basic idea is that each piece of software can be loaded as a so-called module. Modules which depend on other modules just load them as dependencies. This way all the software can be optimized for our architecture. This idea is sort of orthogonal to the approach whereby a program bundles everything it depends on to get people up and running as quickly and painlessly as possible.

So ideally I would want to build PLL as a stand-alone module and then add it as a dependency to the RAxML and ModelTest modules. I appreciate that this is extra work and not what most users may need. However, if people are going to be using MPI, they are probably going to be doing this on clusters, on which such bundling may be less convenient.

Do you see a good way to go forward here?

@BenoitMorel
Collaborator

Dear Tardigradus,

Can you tell me which raxml-ng release (or branch and commit) you are using?

Benoit

@tardigradus
Author

I'm using version 0.9.0.

@BenoitMorel
Collaborator

BenoitMorel commented May 21, 2020

So ideally I would want to build PLL as a stand-alone module and then add it as a dependency to the RAxML and ModelTest modules

I also wanted to have the same PLL "module" for building both modeltest and raxml-ng in ParGenes. The issue with this approach is that RAxML and ModelTest releases/tags do not always require the same PLL version. Although I don't like rebuilding PLL several times, it ended up being the least unsatisfying solution...

Doing ldd on the shared library shows that it is not linked to libpll

Libpll is statically built and included in raxml-ng(-mpi).so, which is why you don't see it with ldd.
The issue you have (with pll_hardware) reminds me of something, but I don't remember exactly what was happening. I tried to compile raxml 0.9.0 and to call it with "--raxml-binary", and it seemed to work on my machine. How can I reproduce your setup? Can you send me the exact command lines you use for installing the different components, and the command line for running pargenes?

Although I do not recommend having PLL as a stand-alone module, using an existing raxml-ng module with pargenes is something that sounds realistic to me.
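Since ldd cannot show a statically included libpll, one way to check whether pll_hardware is actually defined inside the shared object is to inspect its dynamic symbol table with nm -D. The following is only a sketch: the helper name classify_symbol and the library path in the usage comment are illustrative, and it assumes the usual nm output layout in which the symbol name is the last column and the type letter precedes it.

```shell
#!/bin/sh
# Classify how a symbol appears in `nm -D` output read from stdin.
# Prints "defined", "undefined", or "absent".
# classify_symbol is a hypothetical helper, not part of raxml-ng or ParGenes.
classify_symbol() {
  awk -v sym="$1" '
    $NF == sym {
      # The column before the name is the symbol type; "U" means undefined.
      print (($(NF-1) == "U") ? "undefined" : "defined")
      found = 1
      exit
    }
    END { if (!found) print "absent" }
  '
}

# Usage (the path is a placeholder):
#   nm -D /path/to/raxml-ng-mpi.so | classify_symbol pll_hardware
```

A result of "undefined" would mean the library only references the symbol and expects another object to provide it. Note that in that case the name still appears as a string inside the file, so finding the string alone does not show that the symbol is defined.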

@tardigradus
Author

tardigradus commented May 22, 2020 via email

@BenoitMorel
Collaborator

My understanding is that raxml-ng should always be compiled with the exact PLL version it points to. As long as this is respected, I don't see any problem. But then I am not sure that there is a real interest. For this specific question maybe @amkozlov knows better than I.

I still can't reproduce the ParGenes issue.

  • did the user try running their command without specifying the raxml-binary (in which case, pargenes would use the raxml binary built during its own installation)? I know that it's not what you want, but I am interested in the result.
  • did the user install ParGenes with the same MPI module loaded as the one you used to install raxml-ng?
  • this is a dirty question, but does /trinity/shared/easybuild/software/RAxML-NG/0.9.0-foss-2018b-OpenMPI-3.1.3/lib/raxml-ng-mpi.so contain the string "pll_hardware"?
  • can the user try running the same command, but replacing pargenes-hpc.py with pargenes-hpc-debug.py, and replacing the argument of --raxml-binary with the raxml-ng executable (instead of the library)?

@tardigradus
Author

One of the ideas of EasyBuild is to ensure reproducibility of builds. Thus, a given module ties a specific version of RAxML-NG to a specific version of PLL. This is already done for OpenMPI, which is also reflected in the name of the module.

In answer to the third question: Yes, the library does contain the string "pll_hardware". I'll contact the user regarding the other questions.

@tardigradus
Author

Here are the answers to the other questions:

  1. Yes, I have run ParGenes successfully (as long as I use only one node) with the raxml binary built during its own installation.
  2. I do not think so, these are the modules I used for installation of ParGenes:
    module load GCC/7.3.0-2.30
    module load CMake/3.10.2-GCCcore-7.3.0
    module load impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30
  3. (actually question 4) The debug version using the executable rather than the library works.

I think the problem is that whereas RAxML-NG was compiled with OpenMPI, the user loaded the Intel MPI module (impi). That is probably not a good idea. I'll ask the user to rebuild with OpenMPI.
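To compare the MPI runtime a library was linked against with the MPI module loaded for ParGenes, the ldd output can be filtered for the MPI library entries. This is a rough sketch: list_mpi_libs is a made-up helper name, the path in the usage comment is a placeholder, and it assumes the MPI runtime shows up in ldd output under a name starting with libmpi.

```shell
#!/bin/sh
# Print the MPI runtime libraries listed in `ldd` output read from stdin,
# together with the path each one resolves to.
# list_mpi_libs is a hypothetical helper, not part of raxml-ng or ParGenes.
list_mpi_libs() {
  awk '$1 ~ /^libmpi/ {
    # ldd prints "name => path (address)" or "name => not found".
    if ($3 == "not") print $1, "not found"
    else print $1, $3
  }'
}

# Usage (the path is a placeholder):
#   ldd /path/to/raxml-ng-mpi.so | list_mpi_libs
```

The resolved path usually indicates which MPI installation the loader picks up in the current environment, so running this once with each module set loaded may make a mismatch visible.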

@amkozlov
Owner

Hi @tardigradus,

ok I see your point. Still, I believe that at least for the standalone RAxML-NG, linking LIBPLL statically is the best choice, as opposed to having it as a module in EasyBuild. The few benefits of the latter (small, if any, disk space/memory savings and a "cleaner" setup) simply do not justify the descent into dependency hell, even if tools like EasyBuild can manage it to some degree. Of course, the situation is different for general-purpose libs like OpenMPI. So if one day we have dozens of programs using LIBPLL, then the overhead of dependency tracking might pay off, and we can reconsider proper versioning, packaging and dynamic linking of LIBPLL.

For ParGenes, however, it sounds more reasonable to use existing ModelTest-ng and RAxML-NG installations. And conceptually this should be possible, even though both ModelTest-ng and RAxML-NG will have their own statically linked version of LIBPLL. So I hope you can figure out the solution together with @BenoitMorel.
Best,
Alexey

@tardigradus
Author

OK, the build instructions here:

https://github.com/amkozlov/raxml-ng/wiki/installation#mpi-enabled-version

don't say anything about setting up the stuff under localdeps. The original error seen by the user was

raxml-ng-mpi.so: undefined symbol: pll_hardware

so does that imply that the static PLL library is not being compiled into the final shared object?

Could you maybe point me to a simple test case which I can use to verify the completeness of the build?

@BenoitMorel
Collaborator

I just added a test script https://github.com/BenoitMorel/ParGenes/blob/master/tests/test_custom_raxml_library.sh

If you want to test your own raxml-ng-mpi.so, you can do the following:

git clone --recursive https://github.com/BenoitMorel/ParGenes.git
cd ParGenes
./install_scheduler_only.sh
cd tests
./test_custom_raxml_library.sh path_to_your_raxml_library

Is this what you wanted?

@tardigradus
Author

Not exactly. Actually I wanted to know whether libpll had been compiled into raxml-ng.so properly. I have recompiled from a recursive pull of the Github repo and raxml-ng.so now contains strings such as

.../libs/pll-modules/libs/libpll/src/compress.c

(the ellipsis is mine).

Is there a simple test I can do to check that the

raxml-ng-mpi.so: undefined symbol: pll_hardware

error isn't thrown?

@BenoitMorel
Collaborator

ParGenes is the only program that can run raxml as a library. So the simplest way to test that the error is not thrown is to run the 4 commands I gave you in my last message, replacing path_to_your_raxml_library with the path to your raxml-ng-mpi.so.
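As a quicker pre-check before running the ParGenes test, ldd -r asks the dynamic loader to process relocations and report symbols it cannot resolve. The sketch below filters that output; report_undefined is a made-up helper name, the path is a placeholder, and it assumes the "undefined symbol: NAME" message format used by glibc's loader. A clean result does not replace the ParGenes test, because ParGenes loads the library inside its own MPI environment.

```shell
#!/bin/sh
# Print unresolved symbols found in `ldd -r` output read from stdin.
# Returns 0 when none are reported and 1 otherwise.
# report_undefined is a hypothetical helper, not part of raxml-ng or ParGenes.
report_undefined() {
  awk '
    /undefined symbol:/ {
      # Drop everything up to the message prefix, then print the symbol name.
      sub(/^.*undefined symbol:[ \t]*/, "")
      print $1
      bad = 1
    }
    END { exit bad + 0 }
  '
}

# Usage (the path is a placeholder; stderr is merged in case the loader
# writes its warnings there):
#   ldd -r /path/to/raxml-ng-mpi.so 2>&1 | report_undefined
```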
