Skip to content

Dead lock on class loader #54

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Dead lock on class loader #54

wants to merge 1 commit into from

Conversation

Godin
Copy link
Member

@Godin Godin commented Jun 3, 2016

We got dead lock on class loading.
In our case appClassLoader is excluded. But I see, that jacoco tries to instrument class that is loaded by appClassLoader. How can i workaround this situation

junit launcher:
<jacoco:coverage enabled="@{test.coverage}" destfile="${coverage.File}" includes="com.xxx.tool.:com.xxx.ci.:com.nomagic.cts.:com.xxx.common."
excludes="org.:java.:com.nomagic.rcpf.:com.xxx.registration.:sun.:javax.:com.oracle.:com.sun." exclclassloader="sun.misc.Launcher$AppClassLoader:sun.reflect.DelegatingClassLoader">

Thread dump

Found one Java-level deadlock:

"pool-xxx-thread-1:Loading...":
  waiting to lock monitor 0x0000000042b85060 (object 0x00000000c03c2bb0, a sun.misc.Launcher$AppClassLoader),
  which is held by "Thread-16"
"Thread-16":
  waiting to lock monitor 0x00000000420f2438 (object 0x00000000c03c2bf8, a sun.misc.Launcher$ExtClassLoader),
  which is held by "pool-xxx-thread-1:Loading..."


"pool-tool-thread-1:Loading..." prio=10 tid=0x00007fde1c070800 nid=0x63e7 waiting for monitor entry [0x00007fde1a77f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.jacoco.agent.rt_mlrowg.CoverageTransformer.transform(CoverageTransformer.java:78)
    at sun.instrument.TransformerManager.transform(TransformerManager.java:169)
    at sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:365)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at sun.misc.Launcher$ExtClassLoader.findClass(Launcher.java:229)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    - locked <0x00000000b58a44a0> (a sun.misc.Launcher$ExtClassLoader)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at com.xxx.tool.persistence.ci.d.j.<init>(j.java:151)
    at com.xxx.tool.persistence.ci.d.d.doCreateProject(d.java:76)
    at com.xxx.ci.persistence.spi.local.LocalProjectOpenUtil.createProject(LocalProjectOpenUtil.java:84)
    at com.xxx.ci.persistence.spi.local.LocalProjectOpenUtil.doOpenProject(LocalProjectOpenUtil.java:53)
    at com.xxx.ci.persistence.spi.local.LocalProjectRepository.doOpenProject(LocalProjectRepository.java:92)
    at com.xxx.ci.persistence.spi.ProjectOpenUtil.doOpenProject(ProjectOpenUtil.java:50)
    at com.xxx.ci.persistence.spi.ProjectOpenUtil.openProject(ProjectOpenUtil.java:34)
    at com.xxx.ci.persistence.spi.DefaultProjectRepository.openProject(DefaultProjectRepository.java:218)
    at com.xxx.tool.persistence.ProjectLoadService.loadStorage(ProjectLoadService.java:310)
    at com.xxx.tool.persistence.ProjectLoadService.load(ProjectLoadService.java:203)
    at com.xxx.tool.persistence.ProjectLoadService.load(ProjectLoadService.java:163)
    at com.xxx.tool.persistence.ProjectLoadService.loadAndHandleErrors(ProjectLoadService.java:54)
    at com.xxx.tool.persistence.ProjectLoadService$1.execute(ProjectLoadService.java:109)
    at com.xxx.task.Task.construct(Task.java:188)
    at com.xxx.task.SwingWorker$3.call(SwingWorker.java:227)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at com.xxx.task.SwingWorker$SwingWorkerFuture.run(SwingWorker.java:573)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at com.xxx.task.PooledThread.run(PooledThread.java:64)

"Thread-20" prio=10 tid=0x00007fde5dbdf800 nid=0x63e6 waiting for monitor entry [0x00007fde2a372000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:291)
    - waiting to lock <0x00000000b58a44a0> (a sun.misc.Launcher$ExtClassLoader)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:295)
    - locked <0x00000000b58a4458> (a sun.misc.Launcher$AppClassLoader)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    - locked <0x00000000b58a4458> (a sun.misc.Launcher$AppClassLoader)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.beans.Introspector.instantiate(Introspector.java:1470)
    at java.beans.Introspector.findExplicitBeanInfo(Introspector.java:431)
    at java.beans.Introspector.<init>(Introspector.java:380)
    at java.beans.Introspector.getBeanInfo(Introspector.java:167)
    at org.apache.axis.utils.BeanUtils$1.run(BeanUtils.java:92)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.axis.utils.BeanUtils.getPropertyDescriptors(BeanUtils.java:73)
    at org.apache.axis.utils.BeanUtils.getPd(BeanUtils.java:63)
    at org.apache.axis.description.TypeDesc.makePropertyDescriptors(TypeDesc.java:460)
    - locked <0x00000000fed08428> (a org.apache.axis.description.TypeDesc)
    at org.apache.axis.description.TypeDesc.getPropertyDescriptors(TypeDesc.java:439)
    at org.apache.axis.encoding.ser.BeanSerializerFactory.init(BeanSerializerFactory.java:56)
    at org.apache.axis.encoding.ser.BeanSerializerFactory.<init>(BeanSerializerFactory.java:42)
    at org.apache.axis.encoding.ser.BaseSerializerFactory.createFactory(BaseSerializerFactory.java:235)
    at org.apache.axis.client.Call.registerTypeMapping(Call.java:2315)
    at com.xxx.registration.user.Registration_serviceSoapBindingStub.createCall(Unknown Source)
    - locked <0x00000000feda1040> (a com.xxx.registration.user.Registration_serviceSoapBindingStub)
    at com.xxx.registration.user.Registration_serviceSoapBindingStub.checkActivation(Unknown Source)
    at com.xxx.rcpf.product.c.a.n.w(n.java:405)
    at com.xxx.rcpf.product.p$0.run(p$0.java:360)
    at java.lang.Thread.run(Thread.java:662)

@ghost ghost assigned marchof Jan 10, 2013
@Godin Godin modified the milestones: 0.6.6, 0.6.5 Mar 3, 2014
@Godin Godin modified the milestones: 0.7.2, 0.7.1 May 8, 2014
@Godin Godin modified the milestones: 0.7.2, 0.7.3 Sep 12, 2014
@Godin Godin modified the milestones: 0.7.4, 0.7.3, 0.7.5 Feb 20, 2015
@Godin
Copy link
Member

Godin commented Mar 14, 2015

@marchof do you have any thoughts about this ticket?

@marchof marchof modified the milestone: 0.7.5 May 20, 2015
@Godin
Copy link
Member

Godin commented May 25, 2016

@marchof I can't find released version of agent with package org.jacoco.agent.rt_mlrowg , rt_ prefix was used between 0.3.2 and 0.6.1 inclusive. So given the age of the ticket, absence of reproducer, information about JVM and so on, I propose to close it as "can't reproduce".

@Godin
Copy link
Member

Godin commented May 25, 2016

However additional note:
line from stack trace

at org.jacoco.agent.rt_mlrowg.CoverageTransformer.transform(CoverageTransformer.java:78)

maps to valid code in commits between 9daf4f9 and b056199 and corresponds to

logger.logException(ex);

e.g. https://github.com/jacoco/jacoco/blob/v0.4.1/org.jacoco.agent.rt/src/org/jacoco/agent/rt/CoverageTransformer.java#L78

@Godin
Copy link
Member

Godin commented May 25, 2016

Might be related to https://bugs.openjdk.java.net/browse/JDK-8140477 , which contains traces of JaCoCo version 0.7.4.

@bjkail
Copy link
Contributor

bjkail commented May 25, 2016

ClassLoader.loadClass locks itself then delegates to parent, which does the same, so class loader locking order must be child then parent to avoid deadlocks. AppClassLoader delegates to ExtClassLoader, which delegates to boot class loader, so threads must acquire the lock on AppClassLoader then on ExtClassLoader to avoid deadlocks. Also note that -javaagent classes are loaded by AppClassLoader.

Thread-20 has attempted to load a class through AppClassLoader, so AppClassLoader locks itself as normal and then delegates to ExtClassLoader, which attempts to lock itself.

pool-tool-thread-1 is attempting to load a class through ExtClassLoader, so ExtClassLoader locks itself as normal, and then the JaCoCo transformer is called. The JaCoCo transformer requires a new class to be loaded from its class loader (AppClassLoader), which attempts to lock AppClassLoader itself. This thread is attempting to acquire locks in the wrong order, so the deadlock occurs.

To completely avoid the deadlock, I think either JaCoCo instrumentation needs to be disabled for classes loaded by any parents of AppClassLoader, so bootstrap class loader or ExtClassLoader (e.g., add inclextclasses agent option that defaults to false ala inclbootstrapclasses), or JaCoCo needs to preload all classes (e.g., during premain) that it might need during instrumentation, which must include any class that could be loaded from the bootstrap class loader. The former might lose some functionality, and note doing so only fixes the problem for the default options: enabling inclbootstrapclasses=true or inclextclasses=true would still risk a deadlock. The latter is challenging (perhaps use ASM to inspect JaCoCo classes at build time to create a complete list the agent should preload?) and not good for startup performance, but it would only affect inclbootstrapclasses=true and inclextclasses=true JVMs that are likely slow anyway.

@marchof
Copy link
Member

marchof commented May 25, 2016

@bjkail Thanks for the detailed explanation! Do you think you could create a minimal reproducer?

JaCoCo agent is already minimized. So preloading all classes would actually be an option if this solves deadlocks. But would only do this if can proof with a reproducer.

@Godin
Copy link
Member

Godin commented May 25, 2016

@bjkail Yeah, As usual - thank you very much for your valuable comments.

Just some thoughts, guys, about reproducer: So that seems that to trigger loading of classes on a line with logger, we should trigger exception during instrumentation. Or in other cases trigger loading of classes by ASM - I do remember that there are some conditions, which cause ASM to do this, but don't remember exactly which ones.

@bjkail
Copy link
Contributor

bjkail commented May 25, 2016

I don't know how to write a reliable reproducer without modifying the JaCoCo agent to pause to get another thread stuck.

However, while investigating to see if I could write a reproducer, I remembered that this problem is mostly solved as of Java 7 due to AppClassLoader and ExtClassLoader being registered as parallel class loaders (my ClassLoader expertise mostly came from Java 6 and earlier era :-)). Does JaCoCo still support Java 6 and earlier? If yes, perhaps add the inclextclasses option and recommend that users enable that in older releases to avoid the deadlock. If no, then perhaps just close this issue.

@bjkail
Copy link
Contributor

bjkail commented May 25, 2016

@Godin The infamous scenario where ASM loads classes unexpectedly is ClassWriter.getCommonSuperclass when COMPUTE_FRAMES is enabled. Perhaps that's what you're remembering.

@Godin
Copy link
Member

Godin commented May 25, 2016

@bjkail as of today we support quite a huge range - from dinosaur JDK 5 up to bleeding edge JDK 9 EA 😉

As this is about race condition, reproducer with sleep in agent will be a really good start. Then it might be possible to do the same operations in a tight loop or to clock down CPU or slow down JDK to try to demonstrate the same without sleep. So could you please share it?

Given the facts that this ticket was created in Dec 2012 against JaCoCo version <= 0.6.0, most likely problem was faced with Java 6. Not sure whether it worth it or not, but could be possible to determine more precisely using JDK parts of stack trace as a breadcrumbs 😈

Regarding parallel class loaders and Java 7: mentioned earlier https://bugs.openjdk.java.net/browse/JDK-8140477 registered as affecting JDK 9 and with comment that problem comes exactly from parallel capability of class loaders (not sure if it means that earlier versions don't suffer from the same problem).

And you're probably right about my memory regarding unexpected loading of classes by ASM.

@marchof
Copy link
Member

marchof commented May 25, 2016

@bjkail @Godin We have an pending issue for this: #177. ASM folks have a branch which fixes this. But I never found the time to verify.

@Godin
Copy link
Member

Godin commented May 25, 2016

@marchof wow, wasn't aware that they worked out fix. I can't find branch, or you meant patch attached to http://forge.ow2.org/tracker/?func=detail&aid=317551&group_id=23&atid=100023 ? Can try to give you a hand in testing it.

@marchof
Copy link
Member

marchof commented May 25, 2016

@Godin Please see #177 for follow-up.

@bjkail
Copy link
Contributor

bjkail commented May 25, 2016

I can reproduce the issue reliably on Java 6 with jacoco54.zip, and it passes on Java 8 as expected. Rather than adding a sleep, I "faked" the main thread with a manual synchronized+defineClass, but hopefully this is still useful to demonstrate the problem.

Yes, JDK-8140477 is the same problem as this issue. In JDK 7/8, it was sun.misc.Launcher$AppClassLoader and $ExtClassLoader (see stack traces above), but in JDK 9 those classes were moved/rewritten as part of JEP 261 to jdk.internal.misc.ClassLoaders$AppClassLoader and $ExtClassLoader, so I guess they regressed and then fixed the issue by registering the new class loader implementations as parallel capable.

@Godin
Copy link
Member

Godin commented May 25, 2016

@bjkail I didn't yet looked inside, so maybe I'm doing something wrong, but seems that it doesn't work for me:

$ sed '/Override/d' src/jacoco54/Main.java -i.bak

$ java-select 1.5
java version "1.5.0_22"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_22-b03)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_22-b03, mixed mode)

$ javac -source 1.5 -target 1.5 src/jacoco54/Main.java

$ java -javaagent:jacocoagent-0.7.6.jar -cp src jacoco54/Main
appCL = sun.misc.Launcher$AppClassLoader@27858610
extCL = sun.misc.Launcher$ExtClassLoader@2f2498b5
Waiting for thread to lock ExtClassLoader
begin loadClass thread
Defining class
end loadClass thread

$ ls jacoco.exec 
jacoco.exec

$ rm jacoco.exec

$ java-select 1.6
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

$ java -javaagent:jacocoagent-0.7.6.jar -cp src jacoco54/Main
appCL = sun.misc.Launcher$AppClassLoader@3326b249
extCL = sun.misc.Launcher$ExtClassLoader@3182f0db
Waiting for thread to lock ExtClassLoader
begin loadClass thread
Defining class
end loadClass thread

$ ls jacoco.exec
jacoco.exec

@Godin
Copy link
Member

Godin commented May 25, 2016

@bjkail my bad, should have read your comments inside

java -verbose:class -javaagent:jacocoagent-0.7.6.jar=excludes=jacoco54.Main -cp src jacoco54.Main

perfectly reproduces the problem for me 👍 Thank you!

@bjkail
Copy link
Contributor

bjkail commented May 25, 2016

@Godin I should have given clearer instructions, sorry. jacoco54.Main needs to be excluded because the test requires that synchronized blocks are set up before the JaCoCo code runs for the first time and loads its dependency classes. Glad you found my comment and got it working.

@Godin Godin assigned Godin and unassigned marchof May 27, 2016
@Godin
Copy link
Member

Godin commented Jun 3, 2016

As was pointed out by @bjkail - set of classes for preloading should include not only our classes, but also JVM classes reachable from ours, so that it can't be precomputed at build time and should be computed at runtime. I tried to implement computation of this set using ASM followed by their preloading - see attached commits. There are two versions even. But they anyway fail on a given example:

JVM version is 20.45-b01
Deadlock Detection:

Found one Java-level deadlock:
=============================

"main":
  waiting to lock Monitor@0x00007f0b60004968 (Object@0x00000007b1810000, a sun/misc/Launcher$AppClassLoader),
  which is held by "Thread-1"
"Thread-1":
  waiting to lock Monitor@0x00007f0b60005e68 (Object@0x00000007b18105b0, a sun/misc/Launcher$ExtClassLoader),
  which is held by "main"

Found a total of 1 deadlock.

Thread 32460: (state = BLOCKED)
 - java.lang.ClassLoader.loadClass(java.lang.String, boolean) @bci=0, line=291 (Interpreted frame)
 - java.lang.ClassLoader.loadClass(java.lang.String, boolean) @bci=23, line=295 (Interpreted frame)
 - sun.misc.Launcher$AppClassLoader.loadClass(java.lang.String, boolean) @bci=41, line=301 (Interpreted frame)
 - java.lang.ClassLoader.loadClass(java.lang.String) @bci=3, line=247 (Interpreted frame)
 - jacoco54.Main.run() @bci=17, line=53 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)


Thread 32409: (state = BLOCKED)


Thread 32408: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=118 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove() @bci=2, line=134 (Interpreted frame)
 - java.lang.ref.Finalizer$FinalizerThread.run() @bci=16, line=171 (Interpreted frame)


Thread 32407: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
 - java.lang.ref.Reference$ReferenceHandler.run() @bci=46, line=116 (Interpreted frame)


Thread 32397: (state = BLOCKED)
 - org.jacoco.agent.rt.internal_e8dcf09.core.internal.flow.ClassProbesAdapter.visitMethod(int, java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) @bci=33, line=73 (Interpreted frame)
 - org.jacoco.agent.rt.internal_e8dcf09.asm.ClassReader.readMethod(org.jacoco.agent.rt.internal_e8dcf09.asm.ClassVisitor, org.jacoco.agent.rt.internal_e8dcf09.asm.Context, int) @bci=524, line=911 (Interpreted frame)
 - org.jacoco.agent.rt.internal_e8dcf09.asm.ClassReader.accept(org.jacoco.agent.rt.internal_e8dcf09.asm.ClassVisitor, org.jacoco.agent.rt.internal_e8dcf09.asm.Attribute[], int) @bci=1161, line=693 (Interpreted frame)
 - org.jacoco.agent.rt.internal_e8dcf09.asm.ClassReader.accept(org.jacoco.agent.rt.internal_e8dcf09.asm.ClassVisitor, int) @bci=7, line=506 (Interpreted frame)
 - org.jacoco.agent.rt.internal_e8dcf09.core.instr.Instrumenter.instrument(org.jacoco.agent.rt.internal_e8dcf09.asm.ClassReader) @bci=43, line=84 (Interpreted frame)
 - org.jacoco.agent.rt.internal_e8dcf09.core.instr.Instrumenter.instrument(byte[], java.lang.String) @bci=38, line=108 (Interpreted frame)
 - org.jacoco.agent.rt.internal_e8dcf09.CoverageTransformer.transform(java.lang.ClassLoader, java.lang.String, java.lang.Class, java.security.ProtectionDomain, byte[]) @bci=36, line=91 (Interpreted frame)
 - sun.instrument.TransformerManager.transform(java.lang.ClassLoader, java.lang.String, java.lang.Class, java.security.ProtectionDomain, byte[]) @bci=50, line=169 (Interpreted frame)
 - sun.instrument.InstrumentationImpl.transform(java.lang.ClassLoader, java.lang.String, java.lang.Class, java.security.ProtectionDomain, byte[], boolean) @bci=34, line=365 (Interpreted frame)
 - java.lang.ClassLoader.defineClass1(java.lang.String, byte[], int, int, java.security.ProtectionDomain, java.lang.String, boolean) @bci=0 (Interpreted frame)
 - java.lang.ClassLoader.defineClassCond(java.lang.String, byte[], int, int, java.security.ProtectionDomain, boolean) @bci=32, line=631 (Interpreted frame)
 - java.lang.ClassLoader.defineClass(java.lang.String, byte[], int, int, java.security.ProtectionDomain) @bci=9, line=615 (Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[]) @bci=0 (Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=87, line=39 (Interpreted frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=25 (Interpreted frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=161, line=597 (Interpreted frame)
 - jacoco54.Main.main(java.lang.String[]) @bci=333, line=44 (Interpreted frame)

For the time being I did not dig deeper why this happens.

@bjkail
Copy link
Contributor

bjkail commented Jun 6, 2016

Preloader is closer to what I imagined than Preloader2 since you need to be sure that no code path in JaCoCo will attempt to load a class.

If you want to keep digging, I suppose the challenge is determining which class didn't get preloaded? Perhaps -Djava.system.class.loader (with a custom class loader with diagnostics in loadClass) would be the fastest way to do that. For a proper unittest, you could write a custom class loader that loads the agent JAR, and then fail if loadClass is called after Preloader is done.

@Godin
Copy link
Member

Godin commented Jun 6, 2016

@bjkail Preloader2 supposed to behave the same way as Preloader, as far as I can see invocations of Remapper.mapare the same as invocations of Preloader.addType, so just a kind of simplified implementation. Anyway usage of any of them ends up in the same deadlock currently.

I completely forgot about -Djava.system.class.loader - indeed this will simplify digging. Thanks for suggestions!

@bjkail
Copy link
Contributor

bjkail commented Jun 7, 2016

Ah, ok. I did not look closely enough, but if ClassRemapper can be used to simplify finding the dependencies, then that seems reasonable.

@Godin
Copy link
Member

Godin commented Jun 7, 2016

Weird, but

Instrumenter instrumenter = new Instrumenter(runtime);
instrumenter.instrument(new ClassReader(PreMain.class.getName()));

in PreMain avoids deadlock, while according to -verbose:class less classes are loaded in this case (subset) compared to Preloader.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants