Moco examples causing segmentation fault on M1 Mac #3168

Open
DanielFNG opened this issue Mar 22, 2022 · 16 comments

@DanielFNG

I recently followed the install-from-source instructions on an M1 Mac so I could use a later version of the system Python when scripting. However, I've found that the Moco C++ examples no longer work. They seem to cause a segmentation fault at some point during the study.solve() call, e.g.:

This is Ipopt version 3.12.8, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:     4400
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

zsh: segmentation fault  ./exampleTracking

This is Ipopt version 3.12.8, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:   412582
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:        0

zsh: segmentation fault  ./example2DWalking

It's not a general environment issue, because the plain OpenSim C++ examples compile and run just fine, e.g., the simple optimization example:

objective evaluation #: 33 elbow flexion angle = 99.5574 BICshort moment arm  = 0.0493903
Elapsed time = 0s
OpenSim example completed successfully. 

I'm just wondering whether there are any known issues when compiling from source on ARM, or any special steps to take in the compilation process. Is it recommended to compile for x86 in the meantime? Cheers!

@nickbianco
Member

Hi @DanielFNG, we are also encountering this issue with the new Macs. We don't have a solution yet, but we can update our progress here.

@carmichaelong
Member

The following tests FAILED:
	  3 - test_generic_optimization (SEGFAULT)
	  4 - test_sliding_mass_minimum_effort (SEGFAULT)
	  6 - test_analytic_optimal_control_solutions (SEGFAULT)
	  7 - test_double_pendulum (SEGFAULT)
	  9 - test_optimal_control_initial_guess (SEGFAULT)
	 10 - test_parameter_optimization (SEGFAULT)
	 60 - testMocoInterface (SEGFAULT)
	 61 - testMocoGoals (SEGFAULT)
	 62 - testMocoParameters (SEGFAULT)
	 63 - testMocoImplicit (SEGFAULT)
	 64 - testMocoConstraints (SEGFAULT)
	 65 - testMocoContact (SEGFAULT)
	 66 - testMocoActuators (SEGFAULT)
	 67 - testMocoInverse (SEGFAULT)
	 68 - testMocoTrack (SEGFAULT)
	 69 - testMocoAnalytic (SEGFAULT)

Failures are all related to OptimizeTNLP() for both casadi and tropter with error EXC_BAD_ACCESS
tropter:

status = app->OptimizeTNLP(nlp);

casadi: in casadi file ipopt_interface.cpp L415

Still need to dig in further. Perhaps the first thing I'll check is whether the build instructions for OpenSim default to x64 or arm64. Given that others are using the pre-built artifacts just fine, x64 through Rosetta is likely working, but maybe something here is building for arm64 and ipopt doesn't like that.
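
As a rough illustration of the architecture check mentioned above, here is a minimal C++ sketch that reports which architecture a binary was compiled for, using the compiler's predefined macros. The helper names `targetArchitecture` and `isAppleSiliconBuild` are hypothetical and are not part of OpenSim, Moco, or IPOPT.

```cpp
#include <string>

// Hypothetical helper (not part of OpenSim): report the CPU architecture this
// translation unit was compiled for, based on predefined compiler macros.
inline std::string targetArchitecture() {
#if defined(__aarch64__) || defined(__arm64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#else
    return "unknown";
#endif
}

// Hypothetical helper: true only for a native arm64 build on an Apple platform.
inline bool isAppleSiliconBuild() {
#if defined(__APPLE__)
    return targetArchitecture() == "arm64";
#else
    return false;
#endif
}
```

Printing `targetArchitecture()` at the start of an example or test executable would show whether a given build is native arm64 or x86_64 running through Rosetta. It reflects only how that one executable was compiled, not how its linked libraries were built.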

@nickbianco
Member

It seems like this might be an IPOPT problem. After a quick Google, I haven't found anyone with a similar issue though.

@carmichaelong
Member

Yes, whoops, I should have made it clearer that this is NOT a casadi/tropter problem but rather an IPOPT problem. I did find that the default settings for our build instructions do create arm64 libraries and executables, and confirmed that the GitHub artifacts make x86_64 libraries and executables. I'm trying a build with x86_64 locally to double check that this will work around the issue.

It seems like at least updated IPOPT packages will work natively with arm given that they're available on brew: https://formulae.brew.sh/formula/ipopt

Do you update IPOPT versions all that much?

@nickbianco
Member

> Do you update IPOPT versions all that much?

No, we don't. In fact, for Windows, we use pre-built binaries that Chris hosts because no one else hosts them and building from source is a pain.

@aymanhab
Member

aymanhab commented Apr 4, 2022

Can we give an error/warning early on rather than giving users the false impression that this is working, @carmichaelong?

aymanhab added this to the OpenSim 4.4 milestone on Apr 4, 2022

@carmichaelong
Member

It's a good question... unfortunately still investigating workarounds/fixes. Even if you check for architecture, arm64 will build but fail tests and x86_64 won't build dependencies.

@carmichaelong
Member

Some promising signs for a workaround. This leads to no segfaults during run time, but some tests fail (some very narrowly). On a high level: 1. install metis using brew, 2. install IPOPT using coinbrew (which also installs MUMPS), 3. change the dependencies CMake to use IPOPT install folder from coinbrew

Building dependencies

  1. In addition to the Homebrew packages from the current instructions, also install metis (brew install metis)
  2. Install coinbrew (coin-or's package manager). Helpful documentation for installing coinbrew and then working with it can be found here: https://coin-or.github.io/user_introduction and https://coin-or.github.io/coinbrew/. Then fetch and install IPOPT:
./coinbrew fetch [email protected]
./coinbrew build Ipopt --prefix=Ipopt-install --without-asl --without-hsl --disable-java --with-metis-cflags=-I/opt/homebrew/Cellar/metis/5.1.0/include --with-metis-lflags="-L/opt/homebrew/Cellar/metis/5.1.0/lib -lmetis" --reconfigure --no-prompt
  3. In the dependencies CMake file, replace the existing SUPERBUILD_ipopt section with the following (the variable IPOPT_INSTALL_DIR is supplied by the user in the next step):
if(SUPERBUILD_ipopt)
      set(IPOPT_INSTALL_CMD "${CMAKE_COMMAND}" -E copy_directory
          "${IPOPT_INSTALL_DIR}"
          "${CMAKE_INSTALL_PREFIX}/ipopt")
      ExternalProject_Add(ipopt
          DOWNLOAD_COMMAND ""
          CONFIGURE_COMMAND ""
          BUILD_COMMAND ""
          INSTALL_COMMAND ${IPOPT_INSTALL_CMD}
          )
endif()
  4. Build the dependencies and pass in where you installed ipopt (depends on --prefix= in step 2) through IPOPT_INSTALL_DIR, e.g.:
cmake ../source/dependencies \
-DCMAKE_INSTALL_PREFIX="../opensim_dependencies_install" \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DIPOPT_INSTALL_DIR=/Users/<username>/repos/coinbrew/Ipopt-install

OpenSim Moco test errors

Building OpenSim and Moco should be the same at this point. Overall test errors as follows:

The following tests FAILED:
      3 - test_generic_optimization (Failed)
      7 - test_double_pendulum (Failed)
     60 - testMocoInterface (Failed)
     68 - testMocoTrack (Failed)
Errors while running CTest

More details below. Mostly tropter failures, one casadi failure. @nickbianco would be interested in your thoughts on some of these test errors.

test_generic_optimization
Looks to be some narrow misses, only with tropter

-------------------------------------------------------------------------------
IPOPT C++ tutorial problem HS071; has constraints.
  Finite differences, Ipopt Jacobian, limited memory Hessian
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:136
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:144: FAILED:
  REQUIRE( solution.variables[0] == 1.0 )
with expansion:
  0.99999999 == 1.0
 
-------------------------------------------------------------------------------
IPOPT C++ tutorial problem HS071; has constraints.
  Finite differences, tropter Jacobian, limited memory Hessian
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:151
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:159: FAILED:
  REQUIRE( solution.variables[0] == 1.0 )
with expansion:
  0.99999999 == 1.0
 
-------------------------------------------------------------------------------
IPOPT C++ tutorial problem HS071; has constraints.
  Finite differences, tropter Jacobian and Hessian
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:166
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:174: FAILED:
  REQUIRE( solution.variables[0] == 1.0 )
with expansion:
  0.9999999923 == 1.0
 
-------------------------------------------------------------------------------
IPOPT C++ tutorial problem HS071; has constraints.
  ADOL-C
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:181
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_generic_optimization.cpp:187: FAILED:
  REQUIRE( solution.variables[0] == 1.0 )
with expansion:
  0.9999999923 == 1.0
 
===============================================================================
test cases:  5 |  4 passed | 1 failed
assertions: 31 | 27 passed | 4 failed

test_double_pendulum
Small-ish(?) miss with tropter

-------------------------------------------------------------------------------
Double pendulum coordinate tracking
  IPOPT, trapezoidal
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_double_pendulum.cpp:382
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/Vendors/tropter/tests/test_double_pendulum.cpp:400: FAILED:
  REQUIRE( (Approx(explicit_solution.states.bottomRows(2)(ir, ic)) .epsilon(1e-2).scale(1.0) == implicit_solution.states.bottomRows(2)(ir, ic)) )
with expansion:
  false
with message:
  (0,0): 1.61165 vs 1.58289
 
 
===============================================================================
test cases:    2 |    1 passed | 1 failed
assertions: 1449 | 1448 passed | 1 failed

testMocoInterface
Bigger(?) miss with tropter

-------------------------------------------------------------------------------
Guess time-stepping - MocoTropterSolver
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/OpenSim/Moco/Test/testMocoInterface.cpp:1235
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/OpenSim/Moco/Test/testMocoInterface.cpp:1266: FAILED:
  REQUIRE( solutionSim.compareContinuousVariablesRMS(guess) < 1e-2 )
with expansion:
  0.1262155023 < 0.01
 
 
===============================================================================
test cases:  36 |  35 passed | 1 failed
assertions: 254 | 253 passed | 1 failed

testMocoTrack
Bigger(?) miss with casadi

-------------------------------------------------------------------------------
MocoTrack gait10dof18musc
-------------------------------------------------------------------------------
/Users/carmichaelong/repos/opensim-core/source/OpenSim/Moco/Test/testMocoTrack.cpp:47
...............................................................................
 
/Users/carmichaelong/repos/opensim-core/source/OpenSim/Moco/Test/testMocoTrack.cpp:68: FAILED:
  CHECK( std.compareContinuousVariablesRMS( solution, {{"controls",{}}}) < 1e-2 )
with expansion:
  0.0983720594 < 0.01
 
===============================================================================
test cases: 2 | 1 passed | 1 failed
assertions: 3 | 2 passed | 1 failed

@nickbianco
Member

Great progress @carmichaelong! Here are my thoughts:

The test_generic_optimization tests are based on an example from the Ipopt documentation: https://coin-or.github.io/Ipopt/INTERFACES.html. I think the variable check that is failing should be equal to exactly 1.0; the other variables in the test are checked using Approx(), but I think that is just to avoid typing out all the decimal places.
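
To illustrate why the reported values fail that check, here is a minimal C++ sketch contrasting an exact `==` comparison with an absolute-tolerance comparison. The helper names and the tolerance of 1e-6 are assumptions chosen for illustration only; this is not code from tropter or Catch, and it is not a proposal to loosen the test.

```cpp
#include <cmath>

// Exact comparison, equivalent to what `REQUIRE(x == 1.0)` evaluates.
inline bool exactlyEqual(double a, double b) {
    return a == b;
}

// Hypothetical helper: absolute-tolerance comparison, |a - b| <= tol.
inline bool withinAbsTol(double a, double b, double tol) {
    return std::fabs(a - b) <= tol;
}
```

With the values from the log, `exactlyEqual(0.9999999923, 1.0)` is false, while `withinAbsTol(0.9999999923, 1.0, 1e-6)` is true, so the solver is landing within about 1e-8 of the expected bound rather than exactly on it.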

The check on test_double_pendulum seems like a genuine failure; the relative margin on that test (i.e., .epsilon(1e-2)) is already fairly wide.
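
For a sense of how far outside the margin that value is, here is a small C++ sketch of the comparison. It assumes the Catch2 v2-style rule for `Approx(target).epsilon(eps).scale(scale) == other` with the default margin of 0, namely |target - other| <= eps * (scale + |target|). The exact rule depends on the Catch version vendored with tropter, so treat the formula as an assumption; `approxEqual` is a hypothetical name.

```cpp
#include <cmath>

// Hypothetical re-implementation of the assumed Catch2 v2-style Approx rule
// (default margin 0): |target - other| <= eps * (scale + |target|).
inline bool approxEqual(double target, double other, double eps, double scale) {
    const double allowed = eps * (scale + std::fabs(target));
    return std::fabs(target - other) <= allowed;
}
```

With the logged values 1.61165 and 1.58289, the difference is about 0.0288, while the allowed difference is about 0.0261 (or about 0.0258 if the operands are swapped), so `approxEqual(1.61165, 1.58289, 1e-2, 1.0)` is false under this assumed rule.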

The testMocoInterface test also seems like a genuine failure. That test compares a time-stepping simulation to the solution from a MocoProblem with no costs, so it should be a close match.

testMocoTrack is a regression test, so I wouldn't be too surprised if the solution changed slightly with different Ipopt dependencies.

@carmichaelong
Member

Thanks @nickbianco for the insight. I tried some quick things to debug a little, but unfortunately no fix from those. A quick(ish) summary:

  • For testMocoInterface and testMocoTrack, I enabled casadi and tropter for both. Answers from the two interfaces were very similar (within 1e-5 of each other for the value being tested).
  • Tested building IPOPT with different dependencies: 1) using lapack from openblas (usually configure script finds lapack already on Mac), and 2) without metis. Neither changed any test values.

Still some open questions with IPOPT/MUMPS versioning.

  • These recent tests are based on IPOPT 3.14.5, which uses MUMPS 5.4.1 when built with coinbrew. This MUMPS version is a major version higher than what's used in the superbuild.
  • Using Superbuild grabs IPOPT 3.12.8, which uses MUMPS 4.10.0. This leads to segfaults.
  • Using coinbrew to install IPOPT 3.13.4 will use MUMPS 4.10.0 (IPOPT 3.13.4 is the latest version that does this). While MUMPS and IPOPT both build, MUMPS tests pass but IPOPT tests segfault, possibly for the same reason as 3.12.8.
  • It is unclear if the earlier IPOPT 3.12/3.13 vs 3.14, or MUMPS 4 vs 5, is the cause of the segfaults. It might be possible to build IPOPT 3.14 with MUMPS 4 to further debug.

Possible next steps:

  • Patch dependencies CMake to use an installation directory of IPOPT (e.g., to point to a coinbrew installation of IPOPT)
  • Try IPOPT 3.14.5 (and MUMPS 5.4.1) on Windows to see if this is platform independent.
  • See if anything in the MUMPS user guide may point at a breaking change between MUMPS 4 and 5 (http://mumps.enseeiht.fr/doc/userguide_5.4.1.pdf)
  • Try to build IPOPT 3.14 with MUMPS 4

@carmichaelong
Member

It turns out that both the IPOPT (3.13 vs 3.14) and MUMPS (4.10.0 vs 5.4.1) versions explain the differences in tests. I built various combinations of these (see the end of this post for some setup notes).

Note that we currently use an even older version of IPOPT (3.12.8) that uses MUMPS 4.10.0. A big change between IPOPT 3.12 and 3.13 is that IPOPT no longer builds the dependencies MUMPS and Metis directly (i.e., does a monolithic build), so this is something to consider if we want to upgrade IPOPT. I'm using 3.13 here since it's easier to mix and match different versions without the monolithic build.

Test results

tl;dr MUMPS 5.4.1 makes 2 of the tests fail, and IPOPT 3.14 makes the other 2 fail. IPOPT 3.13 with MUMPS 4.10.0 passes all tests.

IPOPT 3.13 (tag stable/3.13), MUMPS 4.10.0 (tag stable/2.1)

100% tests passed, 0 tests failed out of 118

IPOPT 3.13, MUMPS 5.4.1 (tag stable/3.0)

The following tests FAILED:
	  7 - test_double_pendulum (Failed)
	 68 - testMocoTrack (Failed)

IPOPT 3.14 (tag stable/3.14), MUMPS 4.10.0

The following tests FAILED:
	  3 - test_generic_optimization (Failed)
	 60 - testMocoInterface (Failed)

Rough steps for building dependencies:

  • Metis: brew install metis (could consider doing this from source too)
  • MUMPS: pull the source from GitHub and check out the desired release (e.g., stable/2.1, stable/3.0). Then configure with options, e.g.:
    ../ThirdParty-Mumps/configure --prefix=/Users/<username>/repos/mumps/install --with-metis-cflags=-I/opt/homebrew/Cellar/metis/5.1.0/include --with-metis-lflags="-L/opt/homebrew/Cellar/metis/5.1.0/lib -lmetis"
  • IPOPT: pull the source from GitHub and check out the desired release (e.g., stable/3.13, stable/3.14). Then configure with options, e.g.:
    ../Ipopt/configure --prefix=/Users/<username>/repos/ipopt/install --disable-java --with-mumps-cflags=-I/Users/<username>/repos/mumps/install/include/coin-or/mumps --with-mumps-lflags="-L/Users/<username>/repos/mumps/install/lib -lcoinmumps"

@carmichaelong
Member

carmichaelong commented Apr 12, 2022

IPOPT 3.12.8 with pre-built MUMPS 4.10.0 can work too (this could let us avoid bumping IPOPT version to 3.13). Two important steps:

  • configure step: ../Ipopt/configure --prefix=/Users/<username>/repos/ipopt/install --with-metis-incdir=-I/opt/homebrew/Cellar/metis/5.1.0/include --with-metis-lib="-L/opt/homebrew/Cellar/metis/5.1.0/lib -lmetis" --with-mumps-incdir=/Users/<username>/repos/mumps/install/include/coin-or/mumps --with-mumps-lib="-L/Users/<username>/repos/mumps/install/lib -lcoinmumps"
  • Patch IpMumpsSolverInterface.cpp to add #include "mumps_compat.h". This is used in 3.13+ to define COIN_USE_MUMPS_MPI_H (https://github.com/coin-or/Ipopt/blob/stable/3.13/src/Algorithm/LinearSolvers/IpMumpsSolverInterface.cpp#L26) but is not in 3.12.

@aymanhab @nickbianco Could be good to discuss in a meeting what the right scope for fixing the dependencies setup could be for IPOPT.

@carmichaelong
Member

@DanielFNG Would you be able to test whether the branch with some fixes for an ARM Mac build works for you? So far this should work for the C++ build (i.e., it hasn't been tested with Java/Python wrapping). You can see some discussion about it in #3192, and some amended terminal instructions below.

brew install cmake swig gcc pkgconfig autoconf libtool automake wget doxygen metis
git clone https://github.com/opensim-org/opensim-core.git
cd opensim-core
git switch dependencies-mac-arm64
cd ..
mkdir opensim_dependencies_build
cd opensim_dependencies_build
cmake ../opensim-core/dependencies \
      -DCMAKE_INSTALL_PREFIX="../opensim_dependencies_install" \
      -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DSUPERBUILD_ezc3d=ON \
      -DOPENSIM_WITH_CASADI=ON \
      -DOPENSIM_WITH_TROPTER=ON
make -j4
cd ..
mkdir opensim_build
cd opensim_build
cmake ../opensim-core \
      -DCMAKE_INSTALL_PREFIX="../opensim_install" \
      -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DBUILD_PYTHON_WRAPPING=OFF \
      -DBUILD_JAVA_WRAPPING=OFF \
      -DOPENSIM_DEPENDENCIES_DIR="../opensim_dependencies_install"
make -j8
ctest -j8

@DanielFNG
Author

Hello,

Apologies for the delay. I am having issues installing IPOPT in particular on this branch, with the following error:

clang: error: no such file or directory: '/usr/local/lib/libcoinhsl.dylib'
make[5]: *** [libipopt.la] Error 1
make[4]: *** [all-recursive] Error 1
make[3]: *** [all-recursive] Error 1
make[2]: *** [ipopt-prefix/src/ipopt-stamp/ipopt-build] Error 2
make[1]: *** [CMakeFiles/ipopt.dir/all] Error 2
make: *** [all] Error 2

At one point I manually installed libcoinhsl in x86 mode to test different IPOPT linear solvers (ma57, ma97 etc) and I think this may be conflicting somehow (previously I was getting an error that it was trying to compile for arm64 but the libcoinhsl library was x86 - now it doesn't seem to install it correctly). I have tried clearing out the old libraries and rerunning all of these steps but no luck. I will update if I can later fix this on my machine.

@carmichaelong
Member

@DanielFNG Thanks for the feedback, sorry to hear libcoinhsl may be conflicting. A couple of ideas to work around:

  1. There are some flags in the IPOPT configure step that might help. This would involve updating the configure command line in the CMakeLists file here. You could add the options: --without-hsl --without-hsl-lib --without-hsl-incdir. This might help the configure step skip looking for HSL.
  2. If that's still an issue, IPOPT re-did how the configure/build system works for 3.13, specifically to avoid a monolithic build which might be causing some issues in your case. If that could be helpful, I could make another branch that works with 3.13 and see if that fixes your problem. Please let me know and I'd be happy to do so!

@DanielFNG
Author

DanielFNG commented Jun 22, 2022

Hello,

I was unable to get this working with the HSL libraries. I couldn't get to the bottom of whether my own previous installation was conflicting or if OpenSim tries to install it as a dependency but it wasn't working for some reason?

However, adding the options you specify above to the CMakeLists.txt file did skip the HSL step and the rest of the build and test worked as normal. All tests passed with this installation for me.

[Screenshot 2022-06-22 at 08:18:18, image not reproduced]

However, I do then get an error when trying to run the final installation step:

-- Installing: /Users/daniel/External/opensim_install/lib/libipopt.dylib
CMake Error at Vendors/tropter/cmake/cmake_install.cmake:41 (file):
  file INSTALL cannot find
  "/Users/daniel/External/opensim_dependencies_install/ipopt/lib/libcoinmumps.1.6.0.dylib":
  No such file or directory.
Call Stack (most recent call first):
  Vendors/tropter/cmake_install.cmake:66 (include)
  Vendors/cmake_install.cmake:47 (include)
  cmake_install.cmake:114 (include)

The only files I have in the directory mentioned above are:

[Screenshot 2022-06-22 at 08:53:18, image not reproduced]

On my machine, I end up with some libcoinmumps.dylib files in the dependencies install directory (under the 'mumps' folder) but none matching the exact filename listed above. To get around this I modified the cmake_install.cmake file in the build directory to exclude the reference to Tropter and everything else was able to install okay, and now when I run Moco executables they show as being 'Apple' format in Activity Monitor.

I'm getting significantly better performance when running things natively so definitely think this is a worthwhile step for M1 users:

[Image not reproduced]

(Image above generated from 5 Moco simulations using different sets of weights)
