Crash when enabling weak refs #184

Open
wenyuzhao opened this issue Oct 13, 2022 · 6 comments

@wenyuzhao
Member

MMTK_NO_REFERENCE_TYPES=false MMTK_PLAN=Immix ./openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java  -XX:MetaspaceSize=1G -XX:-UseBiasedLocking -Xms91M -Xmx91M -XX:+UseThirdPartyHeap  --add-exports java.base/jdk.internal.ref=ALL-UNNAMED -Dprobes=RustMMTk -Djava.library.path=./evaluation/probes -cp ./evaluation/probes:./evaluation/probes/probes.jar:/usr/share/benchmarks/dacapo/dacapo-evaluation-git-f480064.jar Harness -n 5 -c probe.DacapoChopinCallback fop
--------------------------------------------------------------------------------
IMPORTANT NOTICE:  This is NOT a release build of the DaCapo suite.
Since it is not an official release of the DaCapo suite, care must be taken when
using the suite, and any use of the build must be sure to note that it is not an
offical release, and should note the relevant git hash.

Feedback is greatly appreciated.   The preferred mode of feedback is via github.
Please use our github page to create an issue or a pull request.
    https://github.com/dacapobench/dacapobench.
--------------------------------------------------------------------------------

===== DaCapo evaluation-git-f480064 fop starting warmup 1 =====
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f7589f4cbc8, pid=3185663, tid=3185664
#
# JRE version: OpenJDK Runtime Environment (11.0.15) (build 11.0.15-internal+0-adhoc.wenyuz.openjdk)
# Java VM: OpenJDK 64-Bit Server VM (11.0.15-internal+0-adhoc.wenyuz.openjdk, mixed mode, tiered, third-party gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xa7bbc8]  Klass::external_name() const+0x18
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/wenyuz/MMTk-Dev/core.3185663)
#
# An error report file with more information is saved as:
# /home/wenyuz/MMTk-Dev/hs_err_pid3185663.log
Compiled method (c1)    1140 2674       3       java.lang.invoke.LambdaFormEditor::getInCache (186 bytes)
 total in heap  [0x00007f7575c20490,0x00007f7575c21940] = 5296
 relocation     [0x00007f7575c20608,0x00007f7575c20710] = 264
 main code      [0x00007f7575c20720,0x00007f7575c215a0] = 3712
 stub code      [0x00007f7575c215a0,0x00007f7575c21600] = 96
 metadata       [0x00007f7575c21600,0x00007f7575c21630] = 48
 scopes data    [0x00007f7575c21630,0x00007f7575c21770] = 320
 scopes pcs     [0x00007f7575c21770,0x00007f7575c218f0] = 384
 dependencies   [0x00007f7575c218f0,0x00007f7575c218f8] = 8
 nul chk table  [0x00007f7575c218f8,0x00007f7575c21940] = 72
Compiled method (c1)    1141 2674       3       java.lang.invoke.LambdaFormEditor::getInCache (186 bytes)
 total in heap  [0x00007f7575c20490,0x00007f7575c21940] = 5296
 relocation     [0x00007f7575c20608,0x00007f7575c20710] = 264
 main code      [0x00007f7575c20720,0x00007f7575c215a0] = 3712
 stub code      [0x00007f7575c215a0,0x00007f7575c21600] = 96
 metadata       [0x00007f7575c21600,0x00007f7575c21630] = 48
 scopes data    [0x00007f7575c21630,0x00007f7575c21770] = 320
 scopes pcs     [0x00007f7575c21770,0x00007f7575c218f0] = 384
 dependencies   [0x00007f7575c218f0,0x00007f7575c218f8] = 8
 nul chk table  [0x00007f7575c218f8,0x00007f7575c21940] = 72
Compiled method (c1)    1141 2674       3       java.lang.invoke.LambdaFormEditor::getInCache (186 bytes)
 total in heap  [0x00007f7575c20490,0x00007f7575c21940] = 5296
 relocation     [0x00007f7575c20608,0x00007f7575c20710] = 264
 main code      [0x00007f7575c20720,0x00007f7575c215a0] = 3712
 stub code      [0x00007f7575c215a0,0x00007f7575c21600] = 96
 metadata       [0x00007f7575c21600,0x00007f7575c21630] = 48
 scopes data    [0x00007f7575c21630,0x00007f7575c21770] = 320
 scopes pcs     [0x00007f7575c21770,0x00007f7575c218f0] = 384
 dependencies   [0x00007f7575c218f0,0x00007f7575c218f8] = 8
 nul chk table  [0x00007f7575c218f8,0x00007f7575c21940] = 72
Compiled method (c1)    1141 2674       3       java.lang.invoke.LambdaFormEditor::getInCache (186 bytes)
 total in heap  [0x00007f7575c20490,0x00007f7575c21940] = 5296
 relocation     [0x00007f7575c20608,0x00007f7575c20710] = 264
 main code      [0x00007f7575c20720,0x00007f7575c215a0] = 3712
 stub code      [0x00007f7575c215a0,0x00007f7575c21600] = 96
 metadata       [0x00007f7575c21600,0x00007f7575c21630] = 48
 scopes data    [0x00007f7575c21630,0x00007f7575c21770] = 320
 scopes pcs     [0x00007f7575c21770,0x00007f7575c218f0] = 384
 dependencies   [0x00007f7575c218f0,0x00007f7575c218f8] = 8
 nul chk table  [0x00007f7575c218f8,0x00007f7575c21940] = 72
Compiled method (c1)    1143 2674       3       java.lang.invoke.LambdaFormEditor::getInCache (186 bytes)
 total in heap  [0x00007f7575c20490,0x00007f7575c21940] = 5296
 relocation     [0x00007f7575c20608,0x00007f7575c20710] = 264
 main code      [0x00007f7575c20720,0x00007f7575c215a0] = 3712
 stub code      [0x00007f7575c215a0,0x00007f7575c21600] = 96
 metadata       [0x00007f7575c21600,0x00007f7575c21630] = 48
 scopes data    [0x00007f7575c21630,0x00007f7575c21770] = 320
 scopes pcs     [0x00007f7575c21770,0x00007f7575c218f0] = 384
 dependencies   [0x00007f7575c218f0,0x00007f7575c218f8] = 8
 nul chk table  [0x00007f7575c218f8,0x00007f7575c21940] = 72
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
fish: Job 1, 'MMTK_NO_REFERENCE_TYPES=false M…' terminated by signal SIGABRT (Abort)
@wenyuzhao
Member Author

I'm not sure what the problem is, but our CI did not catch this.

@qinsoon
Member

qinsoon commented Oct 17, 2022

Can you check whether it works without the harness/probe?

@wenyuzhao
Member Author

This is what's happening:

https://github.com/mmtk/mmtk-openjdk/blob/master/mmtk/src/object_scanning.rs#L121

impl OopIterate for InstanceRefKlass {
    fn oop_iterate(&self, oop: Oop, closure: &mut impl EdgeVisitor<OpenJDKEdge>) {
        ...
            match self.instance_klass.reference_type {
                ...
                ReferenceType::Soft => add_soft_candidate(reference),
                ...
                ReferenceType::Final | ReferenceType::Other => {
                    Self::process_ref_as_strong(oop, closure)
                }

When allow_new_candidate is false, add_soft_candidate basically does nothing to the referent and leaves it unmarked and untracked. In this case, we should treat referents as strong references.
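
A minimal sketch of that fallback, written against a toy model rather than the real mmtk-openjdk code. `SoftRefScanner`, `scan_soft_ref`, and the integer object ids are hypothetical names introduced only for illustration; the only thing taken from the discussion is the idea that a referent should be traced as a strong edge whenever new candidates are not accepted.

```rust
// Hypothetical, simplified model; this is NOT the mmtk-openjdk API.
#[derive(Debug, Default)]
struct SoftRefScanner {
    /// Mirrors the flag discussed above: may new candidates be registered?
    allow_new_candidate: bool,
    /// Reference objects remembered for later weak processing.
    candidates: Vec<usize>,
    /// Referents that were handed to the tracer as ordinary strong edges.
    strong_edges: Vec<usize>,
}

impl SoftRefScanner {
    /// Scan one soft reference object `reference` whose referent is `referent`.
    fn scan_soft_ref(&mut self, reference: usize, referent: usize) {
        if self.allow_new_candidate {
            // Normal case: remember the reference and skip the referent for now.
            self.candidates.push(reference);
        } else {
            // Fallback: nobody will process this reference later,
            // so the referent must be traced like any strong field.
            self.strong_edges.push(referent);
        }
    }
}
```

With the flag set, the referent is deferred to weak processing; with the flag cleared, it is kept alive by the normal trace instead of being left unmarked.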

I have a partially working fix for soft-refs only. I'll try to fix it for all the reference types.

@qinsoon
Member

qinsoon commented Oct 18, 2022

allow_new_candidate is only set to false if we forward refs, which is only applicable for mark compact. Have you seen allow_new_candidate as false in Immix?

@wenyuzhao
Member Author

wenyuzhao commented Oct 18, 2022

The weak processing on JikesRVM and OpenJDK is different.

For JikesRVM, here are the steps of soft processing:

  1. Build a list of ALL soft-refs in the heap during the mutator phase
  2. During GC, do a transitive closure to mark all the live/reachable objects.
  3. Scan the soft-refs list
    a. Remove dead references
    b. For live references, either clear or trace the referent.
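
The JikesRVM-style step 3 above can be sketched roughly as follows. This is a toy model under stated assumptions: `SoftRef`, `process_soft_refs`, the `retain` policy flag, and the use of integer ids with a `HashSet` as the mark state are all hypothetical and do not correspond to actual JikesRVM or MMTk types.

```rust
use std::collections::HashSet;

// Hypothetical model of a soft reference: its own id and an optional referent id.
#[derive(Debug, Clone, PartialEq)]
struct SoftRef {
    id: usize,
    referent: Option<usize>,
}

/// Scan the pre-built soft-ref list after the marking closure has finished.
/// `marked` is the set of objects found live in step 2.
/// `retain` is an assumed policy switch: keep soft referents alive, or clear them.
/// Returns the referents that were newly retained and still need to be traced.
fn process_soft_refs(
    list: &mut Vec<SoftRef>,
    marked: &mut HashSet<usize>,
    retain: bool,
) -> Vec<usize> {
    // 3a: remove reference objects that are themselves dead.
    list.retain(|r| marked.contains(&r.id));

    let mut to_trace = Vec::new();
    for r in list.iter_mut() {
        if let Some(referent) = r.referent {
            if marked.contains(&referent) {
                continue; // referent is already live, nothing to do
            }
            // 3b: for live references, either trace or clear the referent.
            if retain {
                marked.insert(referent);
                to_trace.push(referent);
            } else {
                r.referent = None;
            }
        }
    }
    to_trace
}
```

Because the list already contains every soft-ref in the heap, no reference can be missed here, which is the property the discovery-based scheme does not have.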

For OpenJDK (vanilla, without mmtk):

  1. Does nothing during the mutator phase.
  2. During GC, in the marking transitive closure, find and remember all reachable soft-refs. These are called "discovered refs". There may be undiscovered soft-refs that are reachable only from the discovered soft-refs, and they are not marked at this stage.
  3. Process the discovered soft-refs
    a. Either trace or clear the referent.
    b. When discovering new soft-refs, treat them as strong.
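
The discovery-based steps above can be sketched like this. Again this is only a toy model: `Obj`, `trace`, `collect`, the `retain` flag, and the `HashMap`-based heap are hypothetical and are not HotSpot or MMTk code. The point it tries to show is step 3b: soft-refs first reached while processing discovered refs have their referents traced as strong edges, so nothing is left unmarked.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical heap object: strong fields plus an optional soft referent.
#[derive(Debug, Clone, Default)]
struct Obj {
    strong: Vec<usize>,
    soft_referent: Option<usize>,
}

/// Transitive closure from `start`.
/// With `discover == true`, soft referents are skipped and their holder is recorded.
/// With `discover == false`, soft referents are traced like strong fields.
fn trace(
    heap: &HashMap<usize, Obj>,
    start: &[usize],
    marked: &mut HashSet<usize>,
    discovered: &mut Vec<usize>,
    discover: bool,
) {
    let mut stack: Vec<usize> = start.to_vec();
    while let Some(id) = stack.pop() {
        if !marked.insert(id) {
            continue; // already visited
        }
        let obj = &heap[&id];
        stack.extend(obj.strong.iter().copied());
        if let Some(referent) = obj.soft_referent {
            if discover {
                discovered.push(id);
            } else {
                stack.push(referent);
            }
        }
    }
}

/// One collection cycle; returns the set of live object ids.
/// `retain` is an assumed policy switch: trace soft referents, or clear them.
fn collect(heap: &mut HashMap<usize, Obj>, roots: &[usize], retain: bool) -> HashSet<usize> {
    let mut marked = HashSet::new();
    let mut discovered = Vec::new();
    // Step 2: marking closure with discovery enabled.
    trace(heap, roots, &mut marked, &mut discovered, true);
    // Step 3: process the discovered soft-refs.
    for id in discovered {
        let referent = heap[&id]
            .soft_referent
            .expect("discovered refs always have a referent");
        if marked.contains(&referent) {
            continue;
        }
        if retain {
            // 3a + 3b: trace the referent; soft-refs found now are treated as strong.
            let mut unused = Vec::new();
            trace(heap, &[referent], &mut marked, &mut unused, false);
        } else {
            // 3a: clear the referent.
            heap.get_mut(&id).unwrap().soft_referent = None;
        }
    }
    marked
}
```

In a chain such as root -> soft-ref A -> object B -> soft-ref C -> object D, retaining A's referent also keeps C and D alive, because C is only found during step 3 and is therefore scanned as strong.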

Yes, my previous comment about allow_new_candidate is probably wrong. For Immix, it is never set to false.

But this means that soft references that were previously unreachable and are newly discovered during reference processing are not properly marked and forwarded. They're simply remembered by the reference processor.

This is correct for JikesRVM, because the soft-refs list is built ahead of time, as part of the SoftReference's constructor. But for OpenJDK, we gradually discover more soft-refs during the transitive closure.

@wenyuzhao
Member Author

So for OpenJDK, we need to (1) disable allow_new_candidate before weak processing, and (2) during weak processing, treat all references as strong.

(2) is still a problem. When allow_new_candidate is set to true, the object scanning code ignores the referents.
