Automic Workload Automation

Expand all | Collapse all

AE 12 on Linux? Multiple Servers? May want to check your syslog ...

  • 1.  AE 12 on Linux? Multiple Servers? May want to check your syslog ...

    Posted Apr 19, 2018 09:05 AM

    Just as a heads-up, before you possibly get a visit from a (rightfully) annoyed Linux admin:

     

    Our AE 12.1.1+hf2 appears to have been dumping a Java exception about some locking issue with the search indexer for many weeks now. And it dumped it every 10 seconds. And it dumped it into /var/log/messages!

     

    It's possible this only happens when you have a setup with multiple servers (we have a "NonStop Cluster"), maybe it's one JWP trying to update the index while the other is already having the lock.

     

    Just so you know. I'll now contemplate whether to incur the wrath of a server admin (by not cleaning /var/log/messages of this) or some auditor (by doing so).

     

    I have opened an Incident for this, also asking what syslog facility this is sent to and thus, how to redirect this into it's own logfile (where, imho, it really should have been in the first place).

    Apr 15 09:59:54 sudeb000096 ucybsmgr: java.lang.RuntimeException: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /opt/ae/12.1.1_hf2/server/bin/indexCache-12.1/0/write.lock
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.search.lucene.builder.IndexBuilderTransaction.createIndexWriterForClient(IndexBuilderTransaction.java:165)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.search.lucene.builder.IndexBuilderTransaction.lambda$run$1(IndexBuilderTransaction.java:65)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.Collectors.lambda$toMap$58(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.ReduceOps$3ReducingSink.accept(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.HashMap$KeySpliterator.forEachRemaining(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.stream.ReferencePipeline.collect(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.search.lucene.builder.IndexBuilderTransaction.run(IndexBuilderTransaction.java:65)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.search.lucene.builder.IndexBuilder.buildIndex(IndexBuilder.java:118)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.search.lucene.builder.IndexBuilder.run(IndexBuilder.java:102)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.kernel.impl.TimerImpl$CoordinationTask.execute(TimerImpl.java:186)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.kernel.impl.DBAction.execute(DBAction.java:87)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.kernel.impl.DBAction.execute(DBAction.java:46)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.kernel.impl.DBAction.executeInHelperConnection(DBAction.java:32)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.kernel.impl.TimerImpl$CoordinationTask.run(TimerImpl.java:168)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at java.lang.Thread.run(Unknown Source)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /opt/ae/12.1.1_hf2/server/bin/indexCache-12.1/0/write.lock
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:127)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:776)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: at com.automic.search.lucene.builder.IndexBuilderTransaction.createIndexWriterForClient(IndexBuilderTransaction.java:163)
    Apr 15 09:59:54 sudeb000096 ucybsmgr: ... 24 more

    Best,

    Carsten



  • 2.  Re: AE 12 on Linux? Multiple Servers? May want to check your syslog ...

    Posted Apr 19, 2018 09:50 AM

    Automic: It's a known bug.



  • 3.  Re: AE 12 on Linux? Multiple Servers? May want to check your syslog ...

    Posted Apr 24, 2018 10:09 AM

    ... and when this error condition happens, it may begin dumping trace files like there's no tomorrow. I had 17 GB of automatic trace files over a short time, so you may seriously want to pay attention not to have your partitions run full about this.

     

    You can cure this by stopping and restarting JWPs until the flood of output stops. It seems to happen only occasionally, stopping and restarting a JWP can make it stop, or begin again.