Automic Workload Automation

  • 1.  Moving from Oracle 11g to 12c and problems

    Posted Sep 17, 2018 10:35 AM

    Good Morning,

    I know I have another question out there, but I'm not seeing it show up in the forum, so hopefully a second post will help.

    Since upgrading from Oracle 11g to 12c, we've been having random WP crashes: sometimes several a day, other days maybe one.  Sometimes the WP starts back up on its own, and other times it does not.

    The error in the trace before the dump data is always the same:

    20180914/074742.731 - U00029108 UCUDB: SQL_ERROR Database handles DB-HENV: 1dad730 DB-HDBC: 1df8068
    20180914/074742.731 - U00003591 UCUDB - DB error info: OPC: 'OCIStmtExecute' Return code: 'ERROR'
    20180914/074742.731 - U00003592 UCUDB - Status: '' Native error: '3135' Msg: 'ORA-03135: connection lost contact
    Process ID: 23342
    Session ID: 245 Serial number: 13079'
    20180914/074742.731 - U00003536 UCUDB: FATAL DATA BASE ERROR: Re-connection will be attempted in 10 seconds.
    20180914/074742.746 - U00003537 UCUDB - RECONNECT: DB call 'OCITransRollback': Return code: '-1'.
    20180914/074742.746 - U00003590 UCUDB - DB error: 'OCITransRollback', 'ERROR ', '', 'ORA-03114: not connected to ORACLE'
    20180914/074742.746 - U00003592 UCUDB - Status: '' Native error: '3114' Msg: 'ORA-03114: not connected to ORACLE'
    20180914/074742.825 - U00003538 UCUDB: Re-connection to database successful. Processing will continue.
    20180914/074742.840 - U00000006 DEADLOCK
    20180914/074742.840 - U00003594 UCUDB Ret: '6' opcode: 'INSR' SQL Stmnt: 'INSERT INTO ABLOB (ABLOB_AH_Idnr, ABLOB_Key, ABLOB_Content, ABLOB_OH_Idnr, ABLOB_Name, ABLOB_ModCnt, ABLOB_Type, ABLOB_ModDate, ABLOB_ReplaceVar, ABLOB_Version) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'
    20180914/074742.934 - Access violation at 00007FFF272E1102 on read 000000001CB36040
    20180914/074742.934 - U00009907 Memory dump 'Parameter List ' (Address='0000000000BA68A0', Length='1296')
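
For anyone comparing their own WP traces against the excerpt above, here is a minimal Python sketch that pulls out the ORA codes logged before the access violation. It assumes only the line layout visible in the excerpt (`YYYYMMDD/HHMMSS.mmm - [U-number] text`); the function names and the `SAMPLE` string are hypothetical helpers made up for illustration and are not part of any Automic tooling.

```python
import re

# Line layout assumed from the excerpt: "YYYYMMDD/HHMMSS.mmm - U######## text"
# The U-number is optional (the "Access violation" line has none).
LINE_RE = re.compile(r"^\s*(\d{8})/(\d{6}\.\d{3}) - (U\d{8})?\s*(.*)$")
ORA_RE = re.compile(r"ORA-\d{5}")


def parse_trace(trace_text):
    """Return one dict per timestamped line; continuation lines are skipped."""
    events = []
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # e.g. "Process ID: 23342", which carries no timestamp
        date, time, msg_no, text = m.groups()
        events.append({
            "date": date,
            "time": time,
            "msg": msg_no,               # None when the line has no U-number
            "ora": ORA_RE.findall(text),
            "text": text,
        })
    return events


def ora_codes_before_crash(events):
    """ORA codes seen before the first 'Access violation' line, in order, de-duplicated.

    Returns an empty list if no access violation is present in the events.
    """
    codes = []
    for ev in events:
        if "Access violation" in ev["text"]:
            return codes
        for code in ev["ora"]:
            if code not in codes:
                codes.append(code)
    return []


# Shortened copy of the trace lines quoted in the post.
SAMPLE = """\
20180914/074742.731 - U00003592 UCUDB - Status: '' Native error: '3135' Msg: 'ORA-03135: connection lost contact
Process ID: 23342
20180914/074742.746 - U00003590 UCUDB - DB error: 'OCITransRollback', 'ERROR ', '', 'ORA-03114: not connected to ORACLE'
20180914/074742.825 - U00003538 UCUDB: Re-connection to database successful. Processing will continue.
20180914/074742.840 - U00000006 DEADLOCK
20180914/074742.934 - Access violation at 00007FFF272E1102 on read 000000001CB36040
"""

if __name__ == "__main__":
    print(ora_codes_before_crash(parse_trace(SAMPLE)))
```

Run over several crash traces, something like this may make it easier to confirm that the same ORA-03135 / ORA-03114 pair precedes every crash.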

     

    Our DBAs have reviewed the settings and are reviewing them again, but I am running out of ideas on how to resolve this one.  I have a sev 2 ticket open with support, and unfortunately the only next step for us is to enable a rolling trace and hope that a WP crashes while it is on.  We can't leave the trace on for long periods, as it impacts usage of the application.  We are also not able to reproduce the problem easily in DEV, as some of the workload cannot be run there for various reasons.

     

    The only other things I've found are some trace files being created in the Oracle client folder.  All of this information has also been sent to support:

    Trace file E:\oracle12c\diag\clients\user_SYSTEM\host_1615971910_82\trace\ora_2920_2652.trc
    DDE: Flood control is not active
    Incident 89 created, dump file: E:\oracle12c\diag\clients\user_SYSTEM\host_1615971910_82\incident\incdir_89\ora_2920_2652_i89.trc
    oci-24550 [3221225477] [Unhandled exception: Code=c0000005 Flags=0
    ] [] [] [] [] [] [] [] [] [] []
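
A small aside on reading the incident line above: the decimal value in brackets and the hex `Code=` value are the same number, and on Windows 0xC0000005 is the status code for an access violation, which lines up with the "Access violation" entry in the WP trace. A minimal Python sketch of the conversion follows; the helper names are hypothetical, and the lookup table holds only this one code.

```python
# Only the code seen in this thread is listed; extend as needed.
KNOWN_NTSTATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def to_hex32(code: int) -> str:
    """Format a status code as an 8-digit, 32-bit hex string."""
    return format(code & 0xFFFFFFFF, "08x")


def describe(code: int) -> str:
    """Hex form plus the symbolic name, if it is in the small table above."""
    name = KNOWN_NTSTATUS.get(code & 0xFFFFFFFF, "unknown")
    return f"{to_hex32(code)} ({name})"


if __name__ == "__main__":
    print(describe(3221225477))  # c0000005 (STATUS_ACCESS_VIOLATION)
```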

    Also in the Incident folder there is this:

    Dump file E:\oracle12c\diag\clients\user_SYSTEM\host_1615971910_82\incident\incdir_89\ora_2920_2652_i89.trc
    [TOC00000]
    Jump to table of contents
    Dump continued from file: E:\oracle12c\diag\clients\user_SYSTEM\host_1615971910_82\trace\ora_2920_2652.trc
    [TOC00001]
    oci-24550 [3221225477] [Unhandled exception: Code=c0000005 Flags=0
    ] [] [] [] [] [] [] [] [] [] []
    [TOC00001-END]
    [TOC00002]
    ========= Dump for incident 89 (oci 24550 [3221225477]) ========
    Tracing is in restricted mode!
    [TOC00003]
    ----- Short Call Stack Trace -----
    dbgexPhaseII()+925<-dbgexProcessError()+2688<-dbgeExecuteForError()+65<-dbgePostErrorDirect()+2313<-kpeDbgSignalHandler()+343<-skgesig_Win_UnhandledExceptionFilter()+167<-00007FFEE9551B82<-00007FFEEC23F1B3<-00007FFEEC221E26<-00007FFEEC23349D<-00007FFEEC1F48D7<-00007FFEEC23262A<-00007FFEEAF91867<-00007FFEE39CFA4D<-00007FFEE39D37A5<-00007FFEE39D3C84<-00007FF74226DC50<-00007FF7422785E9<-00007FF7422523A3<-00007FFEEB1E13D2<-00007FFEEC1B54F4[TOC00003-END]
    [TOC00004]
    ----- START Event Driven Actions Dump ----
    ---- END Event Driven Actions Dump ----
    [TOC00004-END]
    [TOC00005]
    ----- START DDE Actions Dump -----
    Executing SYNC actions
    Executing ASYNC actions
    ----- END DDE Actions Dump (total 0 csec) -----
    [TOC00005-END]
    End of Incident Dump
    [TOC00002-END]
    TOC00000 - Table of contents
    TOC00001 - Error Stack
    TOC00002 - Dump for incident 89 (oci 24550 [3221225477])
    | TOC00003 - Short Call Stack Trace
    | TOC00004 - START Event Driven Actions Dump
    | TOC00005 - START DDE Actions Dump
    End of TOC

     

    Again, I've had our DBAs and also support review these files, but there are no definitive answers, so we are now back in active monitoring and trying to predict a crash so logging can be turned on for the WPs at the time.

     

    If anyone has had any issues like this, please let me know.

     

    Thanks in advance for all of the help



  • 2.  Re: Moving from Oracle 11g to 12c and problems

    Posted Sep 19, 2018 01:51 PM

    So after many traces and logs sent to support, they have determined that, even though nothing has changed in our Automic processes, we need to upgrade to 11.2.8.  They found entries relating to a memory leak, and honestly, we've run updates in the past to deal with other memory leaks, which were not resolved.

    We were provided an event to help in case a WP crashes and does not come back, but we do not have any other workarounds at this time.

    If anyone else has any ideas or workarounds, please let me know, as we are going to have a difficult time scheduling an update right now.

    Thanks!



  • 3.  Re: Moving from Oracle 11g to 12c and problems
    Best Answer

    Posted Oct 12, 2018 10:00 AM

    Just wanted to follow up on this one.  We got a workaround from Oracle that has been helpful for us.

    In sqlnet.ora on the server and on the local Oracle Client, we had to add SQLNET.SEND_TIMEOUT=60000.

     

    We did not have to restart anything and the changes have helped us remain stable.
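
To double-check that the sqlnet.ora change described above is in place on both the server and the client, a rough Python sketch like the following could be used. The parameter name and the value 60000 come from this post; the sample file contents and the helper names are hypothetical, and the parser only handles simple single-line `NAME=value` entries (indented continuation lines are skipped). It is also worth confirming the unit of SQLNET.SEND_TIMEOUT in the Oracle Net Services Reference for your client version, since the sketch treats the value as an opaque string.

```python
def read_sqlnet_params(text):
    """Parse simple single-line NAME=value entries from sqlnet.ora text.

    Assumptions: '#' starts a comment, and lines that begin with whitespace
    are continuations of a multi-line value, so they are skipped.
    """
    params = {}
    for raw in text.splitlines():
        if raw[:1].isspace():
            continue  # continuation of a previous multi-line value
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip().upper()] = value.strip()
    return params


def has_send_timeout(text, expected="60000"):
    """True if SQLNET.SEND_TIMEOUT is set to the expected value."""
    return read_sqlnet_params(text).get("SQLNET.SEND_TIMEOUT") == expected


# Illustrative contents only; a real sqlnet.ora will differ.
SAMPLE_SQLNET = """\
# sqlnet.ora (illustrative)
SQLNET.AUTHENTICATION_SERVICES = (NTS)
NAMES.DIRECTORY_PATH = (TNSNAMES,
    EZCONNECT)
SQLNET.SEND_TIMEOUT=60000
"""

if __name__ == "__main__":
    print(has_send_timeout(SAMPLE_SQLNET))
```

In practice the text would be read from the actual sqlnet.ora on each machine rather than from a sample string.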

    On our end, we will either plan an update to 11.2.8, since the bug fix also addresses an issue we raised several months ago, or start work on moving to the newest version.