A simple example of using the DSE Cassandra driver in a Graal native image
This repository aims to provide a minimal example of using the modern version of the DataStax Java driver in a native image generated by the Graal toolchain. These examples explicitly target the driver only; adding other libraries and dependencies would complicate Graal's image generation process and obscure the functionality being illustrated here.
As this effort went along it became clear that the driver's interaction with native code represented a unique challenge. This interaction (and its functionality when running in a native image) is explored in more depth below.
The keyspace counter application can be run directly as a Java application using Gradle:
./gradlew run
An argument of "debug" will enable log4j output to the console:
./gradlew debug
The native image is generated using the gradle-graal plugin:
./gradlew nativeImage
Once generation completes the native image can be run directly:
build/graal/keyspaceCounter
The Java driver attempts to use various jnr libs for a few native operations. These include:
- Retrieving system time with microsecond granularity via libc through jnr-ffi and jffi
- Retrieving PID via jnr-posix
- Retrieving CPU information via the Platform class in jnr-ffi
The Native class provides a reasonable starting point for all of this functionality.
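To make the first item concrete: gettimeofday() reports the time as a pair of values (whole seconds plus microseconds within that second), which have to be combined into a single timestamp. The following is a minimal sketch of that arithmetic only. The class and method names are hypothetical, no native call is made, and the millisecond-based fallback is an assumption about what a caller might do when libc is unavailable, not a description of the driver's code.

```java
final class MicrosSketch {

  /** Combines the two fields of a timeval-style result into microseconds since the epoch. */
  static long toMicros(long seconds, long microsOfSecond) {
    // multiplyExact/addExact throw ArithmeticException on overflow instead of wrapping silently.
    return Math.addExact(Math.multiplyExact(seconds, 1_000_000L), microsOfSecond);
  }

  /** Assumed fallback: millisecond precision scaled to microseconds. */
  static long fallbackMicros() {
    return System.currentTimeMillis() * 1_000L;
  }

  public static void main(String[] args) {
    System.out.println("from timeval fields: " + toMicros(1L, 5L));
    System.out.println("fallback: " + fallbackMicros());
  }
}
```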
Normally jnr-ffi will generate a proxy for the native library interface (Native.LibCLoader.LibC in our case) via asm. In this case the methods of the interface result in calls to the underlying native functionality via jffi. If this process fails an "error proxy" is returned. This proxy is a conventional java.lang.reflect.Proxy instance which throws a specified exception on every method call. This error proxy is defined here and used in catch blocks of the load() method of the same class.
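The error proxy idea can be sketched with nothing but the JDK. The example below is an illustration of the technique, not jnr-ffi's actual implementation: the Clock interface and the errorProxy helper are hypothetical names standing in for the library interface and for jnr-ffi's internal factory.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

final class ErrorProxySketch {

  /** Hypothetical stand-in for a native library interface such as LibC. */
  interface Clock {
    long currentTimeMicros();
  }

  /** Returns a java.lang.reflect.Proxy instance whose every method call throws the given exception. */
  static <T> T errorProxy(Class<T> iface, RuntimeException failure) {
    // The handler ignores the method and arguments and always throws.
    InvocationHandler handler = (proxy, method, args) -> {
      throw failure;
    };
    Object instance =
        Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    return iface.cast(instance);
  }

  public static void main(String[] args) {
    Clock clock =
        errorProxy(Clock.class, new UnsupportedOperationException("libc could not be loaded"));
    try {
      clock.currentTimeMicros();
    } catch (UnsupportedOperationException e) {
      System.out.println("Call failed: " + e.getMessage());
    }
  }
}
```

Because the proxy class is created at run time from a list of interfaces, Graal needs to know that interface list ahead of time, which is what the proxy configuration discussed next provides.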
For Graal to do its work all classes must be available at build time. Note that proxies like this error proxy are supported but must be declared in the proxy config and evaluated at build time.
Note that these proxies are only required if we have a problem loading the underlying native code at run-time, but since there's no way to control for this at build-time they must be included here.
Once we've accounted for the dynamic proxies above we immediately run into another problem.
As mentioned above, jnr-ffi will attempt to dynamically create a class implementing the library interface via asm. The output of this process is then used to generate a dynamic class via ClassLoader.defineClass() here. This results in the following error message in the log4j output:
31 [s0-admin-0] DEBUG com.datastax.oss.driver.internal.core.os.Native - Error loading libc
java.lang.RuntimeException: com.oracle.svm.core.jdk.UnsupportedFeatureError: Unsupported method java.lang.ClassLoader.defineClass(String, byte[], int, int) is reachable: The declaring class of this element has been substituted, but this element is not present in the substitution class
    at jnr.ffi.provider.jffi.AsmLibraryLoader.generateInterfaceImpl(AsmLibraryLoader.java:247)
    at jnr.ffi.provider.jffi.AsmLibraryLoader.loadLibrary(AsmLibraryLoader.java:89)
    at jnr.ffi.provider.jffi.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:44)
    at jnr.ffi.LibraryLoader.load(LibraryLoader.java:325)
    at jnr.ffi.LibraryLoader.load(LibraryLoader.java:304)
This is consistent with the stated limitation on dynamic class loading in the Graal docs.
There is a workaround for this problem, but it's not terribly intuitive. jnr-ffi allows users to opt out of the ASM-based loading logic mentioned above in favor of a reflection-based implementation which uses dynamic proxies in all cases (not just for the error case discussed above). This switch is enabled via the system property "jnr.ffi.asm.enabled" (see here for more detail). The generated proxy object still uses jffi for execution.
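One way to set this property from application code is sketched below. The class and method names are hypothetical. The sketch also assumes that the property is read the first time jnr-ffi's loader classes are initialized, so the call has to happen before the driver touches them. In a native image, where some classes may be initialized at build time, the property may need to be supplied to the image build instead.

```java
final class DisableAsmLoaderSketch {

  // Property name taken from the text above.
  static final String ASM_PROPERTY = "jnr.ffi.asm.enabled";

  /** Sets the property to "false" unless a value was already supplied, e.g. with -D. */
  static void preferReflectionLoader() {
    if (System.getProperty(ASM_PROPERTY) == null) {
      System.setProperty(ASM_PROPERTY, "false");
    }
  }

  public static void main(String[] args) {
    // Assumption: nothing has triggered jnr-ffi class initialization before this point.
    preferReflectionLoader();
    System.out.println(ASM_PROPERTY + "=" + System.getProperty(ASM_PROPERTY));
  }
}
```

On a standard JVM the same effect can be had without code by passing -Djnr.ffi.asm.enabled=false on the command line.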
With the AsmLibraryLoader disabled we now come across a new issue:
28 [s0-admin-0] DEBUG com.datastax.oss.driver.internal.core.util.Reflection - Building from unqualified name AtomicTimestampGenerator
28 [s0-admin-0] DEBUG com.datastax.oss.driver.internal.core.util.Reflection - Trying with default package com.datastax.oss.driver.internal.core.time.AtomicTimestampGenerator
31 [s0-admin-0] DEBUG com.datastax.oss.driver.internal.core.os.Native - Error accessing libc.gettimeofday()
java.lang.NoClassDefFoundError: Ljava/lang/UnsatisfiedLinkError;
    at com.oracle.svm.jni.functions.JNIFunctions.FindClass(JNIFunctions.java:328)
    at com.kenai.jffi.Foreign.dlopen(Foreign.java)
    at com.kenai.jffi.Library.dlopen(Library.java:96)
    at com.kenai.jffi.Library.openLibrary(Library.java:158)
    at com.kenai.jffi.Library.getCachedInstance(Library.java:131)
    at jnr.ffi.provider.jffi.NativeLibrary.openLibrary(NativeLibrary.java:101)
    at jnr.ffi.provider.jffi.NativeLibrary.loadNativeLibraries(NativeLibrary.java:79)
    at jnr.ffi.provider.jffi.NativeLibrary.getNativeLibraries(NativeLibrary.java:70)
    at jnr.ffi.provider.jffi.NativeLibrary.getSymbolAddress(NativeLibrary.java:49)
    at jnr.ffi.provider.jffi.DefaultInvokerFactory.createInvoker(DefaultInvokerFactory.java:107)
    at jnr.ffi.provider.jffi.ReflectionLibraryLoader$LazyLoader.get(ReflectionLibraryLoader.java:151)
    at jnr.ffi.provider.jffi.ReflectionLibraryLoader$LazyLoader.get(ReflectionLibraryLoader.java:87)
At first glance this looks like another error driven by a failure to include a class in the reflection config but there's actually something else going on here. The exception in question is being generated by C code as a result of a failed dlopen() call; see this code and this declaration for the relevant context. After some debugging it became clear that this only happens for the gettimeofday() ops in Native; the PID retrieval logic (which uses jnr-posix) appears to work without issue even on Graal. But we're loading libc, which should be a fairly standard process.
Except when it isn't. We load libc via the following code:
try {
  libc = LibraryLoader.create(LibC.class).load("c");
  runtime = Runtime.getRuntime(libc);
} catch (Throwable t) {
  libc = null;
  LOG.debug("Error loading libc", t);
}
Note that we explicitly specify the name of the lib here. jnr-posix, by contrast, defers to jnr-ffi's utility for determining the standard C library name, which yields "libc.so.6" on Linux rather than "c" (see here for more detail). This difference appears to matter in my local environment (Fedora Core 28): libc is available at /lib64/libc.so.6, while /lib64/libc.so exists but is an ld script. jnr-ffi's loading code has logic which attempts to detect this case, but that logic isn't triggered when this code runs on Graal; the load just fails outright.
Switching the code above to use "libc.so.6" rather than "c" causes the native logic to work correctly, albeit in a stripped-down sandbox app which only covers this functionality. Either name works when running on a standard JVM; it's only Graal that requires the more precise name. But since this does bring our jnr-ffi usage into agreement with what jnr-posix does this seems entirely reasonable.
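The name selection described above can be sketched as a small helper. This is an illustration only: libcName is a hypothetical function, not jnr-ffi's utility, and it only distinguishes Linux from everything else, which is the one case discussed here.

```java
import java.util.Locale;

final class LibcNameSketch {

  /** Returns the library name to hand to the loader for the given os.name value. */
  static String libcName(String osName) {
    String normalized = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
    // On Linux use the precise soname; elsewhere keep the generic name.
    return normalized.startsWith("linux") ? "libc.so.6" : "c";
  }

  public static void main(String[] args) {
    String name = libcName(System.getProperty("os.name"));
    // The result would replace the hard-coded "c" in LibraryLoader.create(LibC.class).load(...).
    System.out.println("libc name for this platform: " + name);
  }
}
```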
As mentioned above the driver attempts to retrieve CPU information via jnr.ffi.Platform. This class computes a set of platform-specific values based on common system properties: it doesn't execute any native code. It's loaded via reflection from within com.datastax.oss.driver.internal.core.os.Native so including it within the reflection config appears to be adequate.
Note that we also have to add platform-specific type support; thus the inclusion of jnr.ffi.Platform$Linux in addition to jnr.ffi.Platform. This makes the generated image platform-specific.
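The kind of reflective lookup involved can be sketched as follows. This is a generic illustration with hypothetical names, not the driver's code. On a standard JVM a by-name lookup succeeds whenever the class is on the classpath. In a native image the same lookup can only succeed for classes that were registered for reflection at build time, which is why both class names above end up in the reflection config.

```java
import java.util.Optional;

final class OptionalClassSketch {

  /** Looks a class up by name, returning empty instead of throwing if it is unavailable. */
  static Optional<Class<?>> tryLoad(String className) {
    try {
      Class<?> clazz = Class.forName(className);
      return Optional.<Class<?>>of(clazz);
    } catch (ClassNotFoundException | LinkageError e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // jnr.ffi.Platform is only found if jnr-ffi is on the classpath.
    System.out.println("jnr.ffi.Platform present: " + tryLoad("jnr.ffi.Platform").isPresent());
  }
}
```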
com.datastax.oss.driver.internal.core.os.Native leverages jnr-posix to obtain the current PID. Ultimately this appears to use the same jnr-ffi library loading code used above.