Title: CA Clarity Tuesday Tip: JGroups Blocking

Discussion created by Shawn_Moore Employee on Oct 26, 2011
Latest reply on Oct 26, 2011 by Chris_Hackett
CA Clarity Tuesday Tip by Shawn Moore, Sr. Principal Support Engineer for 10/25/2011

Thanks to Josh Leone, Director Engineering Services, for providing much of the content of this article.

This article is somewhat complex, but it provides a good real-world example of threads blocking other threads.

In 12.1.2, we've resolved a defect (CLRT-62608) that caused thread deadlocking during JGroups activity. The defect is present in 12.1.0 and 12.1.1.

JGroups is used for our multicast communication. It is most commonly exercised by process engine communication, by cache consistency updates (notifications), and by the admin services when they discover nodes within a cluster.

Thread deadlocking causes a thread of execution to appear hung, because the JVM is not able to resolve which thread can advance next.

This can manifest in a number of ways, but in a thread dump it shows up as one particular thread being waited on by many other threads (see the example below). If a thread dump shows this behavior, it is likely that the environment is running into this problem. It can also manifest as database blocking on SQL Server based installs.

Example:

** Note: the classes org.jgroups.protocols.FC and org.jgroups.protocols.FRAG2 are good indicators of this problem when they appear in the stack of the thread that other threads are waiting on.

"Post Condition Transition Pipeline 2" daemon prio=10 tid=0x00002aab3ca5b800 nid=0xf25 waiting on condition [0x0000000043dd3000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
at org.jgroups.protocols.FC.handleDownMessage(FC.java:549)
at org.jgroups.protocols.FC.down(FC.java:423)
at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)
at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:215)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:894)
at org.jgroups.JChannel.down(JChannel.java:1623)
at org.jgroups.JChannel.send(JChannel.java:724)
at com.niku.union.utility.SimpleMessenger.broadcast(SimpleMessenger.java:325)
at com.niku.union.utility.SimpleMessenger.broadcast(SimpleMessenger.java:308)
at com.niku.union.utility.caching.CacheMessenger.remove(CacheMessenger.java:45)
at com.niku.union.utility.caching.CacheController.remove(CacheController.java:929)
at com.niku.union.utility.caching.CacheController.remove(CacheController.java:882)
at com.niku.security.cache.UserSessionCache.removeFromPersistence(UserSessionCache.java:294)
at com.niku.security.service.AuthenticationService.delete(AuthenticationService.java:217)
at com.niku.bpm.utilities.BpmUtils.logout(BpmUtils.java:93)
at com.niku.bpm.engine.exprevaluator.ExpressionEvaluator.evaluate(ExpressionEvaluator.java:231)
at com.niku.bpm.engine.rules.PostConditionTransitionPipeline.evaluatePostConditions(PostConditionTransitionPipeline.java:199)
at com.niku.bpm.engine.rules.PostConditionTransitionPipeline.execute(PostConditionTransitionPipeline.java:79)
at com.niku.bpm.engine.rules.Pipeline.run(Pipeline.java:221)
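
To check a larger dump for the indicator classes mentioned in the note, a small scan like the following may help. This is only a sketch: the class and method names (JGroupsFlowControlScan, threadsInFlowControl) are made up for illustration and are not part of Clarity or JGroups, and it assumes a HotSpot-style dump in which each thread starts with a quoted name and each frame starts with "at".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Clarity or JGroups). */
class JGroupsFlowControlScan {
    // Frame prefixes the article names as indicators of the problem.
    static final String[] INDICATORS = {
        "at org.jgroups.protocols.FC.",
        "at org.jgroups.protocols.FRAG2."
    };

    /** Returns the names of threads whose stack contains an indicator frame. */
    static List<String> threadsInFlowControl(List<String> dumpLines) {
        List<String> hits = new ArrayList<>();
        String current = null;     // name of the thread whose stack is being read
        boolean reported = false;  // avoid listing the same thread twice
        for (String raw : dumpLines) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Thread header, e.g. "Post Condition Transition Pipeline 2" daemon ...
                int end = line.indexOf('"', 1);
                current = end > 0 ? line.substring(1, end) : line;
                reported = false;
            } else if (current != null && !reported) {
                for (String indicator : INDICATORS) {
                    if (line.startsWith(indicator)) {
                        hits.add(current);
                        reported = true;
                        break;
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Abbreviated sample based on the dump shown above.
        List<String> sample = Arrays.asList(
            "\"Post Condition Transition Pipeline 2\" daemon prio=10 waiting on condition",
            "at sun.misc.Unsafe.park(Native Method)",
            "at org.jgroups.protocols.FC.handleDownMessage(FC.java:549)",
            "at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)");
        for (String name : threadsInFlowControl(sample)) {
            System.out.println("Thread in FC/FRAG2: " + name);
        }
    }
}
```

A hit only means the thread is inside JGroups flow control or fragmentation; it still has to be matched against the threads that are waiting before concluding this defect is involved.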

The blocked threads can be found by searching the thread dump for the same condition object address:

[clarity1@app830 threaddumps]$ grep 0x00002aaab87ca690 bg2-6.txt
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
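
When the address is not known in advance, the same idea as the grep above can be turned around: count the "parking to wait for" lines per address and look for one with an unusually high count. The following is a minimal sketch under that assumption. ParkedLockCounter and countParked are hypothetical names, and the pattern assumes the HotSpot dump format shown in this article.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Clarity or JGroups). */
class ParkedLockCounter {
    // Matches lines such as:
    //   - parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks...)
    private static final Pattern PARKED =
        Pattern.compile("- parking to wait for\\s+<(0x[0-9a-fA-F]+)>");

    /** Returns address -> number of threads parked on it, in first-seen order. */
    static Map<String, Integer> countParked(List<String> dumpLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : dumpLines) {
            Matcher m = PARKED.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Abbreviated sample; in practice the lines would come from a thread dump file.
        List<String> sample = Arrays.asList(
            "- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)",
            "at sun.misc.Unsafe.park(Native Method)",
            "- parking to wait for <0x00002aaab87ca690> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)");
        for (Map.Entry<String, Integer> e : countParked(sample).entrySet()) {
            System.out.println(e.getKey() + " waiters: " + e.getValue());
        }
    }
}
```

Many threads parked on one address is normal for idle thread pools, so a high count is a hint to inspect those stacks, not proof of the defect.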

-shawn