CA Clarity Tuesday Tip: Time Slicing Stability Tests

  • 1.  CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 19, 2011 07:53 PM
    CA Clarity Tuesday Tip by Shawn Moore, Sr. Principal Support Engineer for 4/19/2011

    Time Slicing Stability Testing:

    A few days ago a question came up about how stable Time Slicing is when it is abruptly stopped, or aborted by a major database failure. We've always felt that it had a strong recovery mechanism, but I really wanted to test this out. So I decided to conduct a series of tests to see how time slicing would recover from various failures. (The goal was to try to break the execution of the job so that it just wouldn't run on startup.)

    1) I first decided to do some basic stop and start tests, so I ran 3 iterations of stopping bg during the actual creation of slices.

    INFO 2011-04-18 16:51:13,500 [Dispatch Thread-4 : bg@server] niku.blobcrack (none:none:none) Processing 18 new requests.
    INFO 2011-04-18 16:51:13,500 [Dispatch Thread-4 : bg@server] niku.blobcrack (none:none:none) ### Processing blobcrack.modifyTeam_set
    INFO 2011-04-18 16:51:13,860 [Dispatch Thread-4 : bg@server] niku.blobcrack (none:none:none) ### Curve set size is 1000
    INFO 2011-04-18 16:51:24,048 [Dispatch Thread-4 : bg@server] niku.blobcrack (none:none:none) ### Curve set size is 1000

    RESULT: Upon startup, time slicing resumed as it should have.

    2) Next I ran several iterations of stopping bg right after startup, prior to actual slice processing.

    RESULT: Again upon starting, the time slicing resumed as it should have.

    The logs noted the following message, which was expected.

    Caused by: java.sql.SQLException: [CA Clarity][Oracle JDBC Driver]Object has been closed.

    at com.ca.clarity.jdbc.base.BaseExceptions.createException(Unknown Source)

    at com.ca.clarity.jdbc.base.BaseExceptions.getException(Unknown Source)

    at com.ca.clarity.jdbc.base.BaseResultSet.getMetaData(Unknown Source)

    at com.niku.union.persistence.PersistenceController.extractResultSet(PersistenceController.java:1586)


    3) I decided to be a bit more drastic and start killing db sessions. After allowing time slicing to start processing, I killed several db sessions, which included the job scheduler.

    The logs noted the following error.

    ERROR 2011-04-19 15:44:46,927 [Dispatch Thread-97 : bg@server] niku.njs (none:none:none) Database error for job 5009012
    com.niku.union.persistence.PersistenceException:
    SQL error code: 28
    Error message: [CA Clarity][Oracle JDBC Driver][Oracle]ORA-00028: your session has been killed

    Executed:
    update cmn_sch_jobs
    set schedule_date = ?,
    status_code = ?,
    last_updated_date = ?,
    last_updated_by = ?
    where id = ?
    and status_code != ?

    4) The Job Scheduler did not automatically restart. (This is known behavior: after a db failure, the bg server needs to be restarted.)

    5) I then manually stopped and started the job scheduler to bring it back online.

    6) I then observed that slice processing had continued.

    7) I decided to do one more test and cancel the job after it had failed. I first allowed time slicing to start processing (after resetting slicing), then killed several sessions. Again, I ended up killing the job scheduler.

    .
    .
    INFO 2011-04-19 15:54:50,153 [Dispatch Thread-8 : bg@server] niku.blobcrack (none:none:none) ### Curve set size is 1000
    INFO 2011-04-19 15:55:39,826 [Dispatch Thread-8 : bg@server] niku.blobcrack (none:none:none) ### Curve set size is 1000
    ERROR 2011-04-19 15:56:17,217 [Dispatch Thread-8 : bg@server] niku.blobcrack (none:none:none) Exception during blobcrack process
    java.sql.SQLException: [CA Clarity][Oracle JDBC Driver]No more data available to read.

    at com.ca.clarity.jdbc.base.BaseExceptions.createException(Unknown Source)

    at com.ca.clarity.jdbc.base.BaseExceptions.getException(Unknown Source)

    at com.ca.clarity.jdbc.base.BaseExceptions.getException(Unknown Source)

    at com.ca.clarity.jdbc.oracle.net8.OracleNet8NSPTDAPacket.sendRequest(Unknown Source)

    at com.ca.clarity.jdbc.oracle.OracleImplConnection.rollbackTransaction(Unknown Source)

    8) Observed Job Scheduler not restarting.
    9) Fired up App (didn't want to accidentally kill the process)
    10) Observed the job from the Clarity UI
    11) Cancelled the job (I technically shouldn't have to do this, it should just start up again as in step 6, but I wanted to introduce this as a factor because some users will cancel a job after a failure.)
    12) Finally, I created a new immediate-mode, single-run Time Slicing job.
    13) Within a minute, the job started up and began processing.


    The lesson from this exercise: the Time Slicing job recovers very well and will continue running unless there is a major db failure. In that case you'll need to restart the job scheduler (bg) to get it working again.

    Rule of thumb: give the job some time to start processing again unless you know you had a database failure. Chances are the job will recover nicely.
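
    The rule of thumb above could be turned into a quick log check. The following is only a sketch: the message strings are copied from the log excerpts in this post, but the function name and the grouping into "needs a restart" versus "expected on a normal stop" are assumptions made for illustration, not part of the product.

```python
# Messages copied from the bg log excerpts in this post. Treating them as
# markers for "db failure" vs "normal stop" is an assumption for illustration.
DB_FAILURE_MARKERS = (
    "ORA-00028",                       # "your session has been killed"
    "No more data available to read",  # connection lost during blobcrack
)
EXPECTED_ON_STOP = ("Object has been closed",)


def classify_bg_log(lines):
    """Return 'db_failure', 'stopped', or 'ok' for an iterable of log lines."""
    saw_stop = False
    for line in lines:
        if any(marker in line for marker in DB_FAILURE_MARKERS):
            # A db failure takes priority: bg will not restart on its own.
            return "db_failure"
        if any(marker in line for marker in EXPECTED_ON_STOP):
            saw_stop = True
    return "stopped" if saw_stop else "ok"
```

    A result of "db_failure" corresponds to the cases in steps 3 and 7, where the job scheduler had to be restarted by hand; anything else suggests simply giving the job time to pick up again.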

    Shawn Moore
    CA Technologies


    ps: There is one way I know of to get the job into a stuck state: perform a hot backup at a time of significant db activity, then restore that backup. What may happen is that the job processing tables will be out of sync. This can almost always be fixed by simply canceling and deleting the offending job.


  • 2.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 20, 2011 05:11 AM
    Nice analysis Shawn! Thanks.

    I'll sleep easier now. :blink:


  • 3.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

     
    Posted Apr 20, 2011 10:50 AM
    Great tip Shawn!

    Thanks,
    Chris


  • 4.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 20, 2011 03:24 PM
    Nice work. Thanks.

    Is that v12.1?

    I guess it does not cover the r8.1 problems where you just can't start bg?

    So that is just slicing, and it does not cover actuals and ETC not matching between timeentries and prassignments, for which you had verification queries being passed around for 7.5.3.

    Martti K.


  • 5.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 21, 2011 02:35 PM
    Thanks Shawn for sharing this information.

    Just an observation on Clarity 12.0.6 with the latest patch, 14(?), on SQL Server 2005 SP2: when the Timeslicing job is running and it is cancelled from the Clarity UI, transactions get blocked (apparently deadlocked) and no user is able to update anything on the Project object. Upon investigation we found a lock on the INV_INVESTMENT table. After the blocking session was killed in the database, things went fine. Alternatively, restarting the BG service also released the lock. This behavior could be replicated every time the Timeslicing job was cancelled while it was running.

    Thanks
    Sangeet


  • 6.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 22, 2011 12:13 AM
    During our time slice stability test with our customization, we recently noticed a Slice Status of 4 being set on the allocation curve, with the records never getting processed after that. We are still waiting on a description of Slice Status 4, since it isn't specified in the Technical Reference.


  • 7.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 26, 2011 06:05 PM
    Ramganesh:

    Status 4 was introduced to fix defect CLRT-8124. It handles situations where a slice request definition is changed or newly created while some objects have already been marked for reslicing.

    CLRT-8124: If Timeslice is processing a new assignment slice request, other assignment data modified while it's processing may become out of sync.

    For reference here are the Time slicing statuses:

    4 - set on all unmodified object records, so that records of the same object type already marked as changed are not affected
    3 - marked for slicing due to rollover
    2 - currently processing this object's slices
    1 - marked for processing
    null - up to date
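
    For anyone scripting against these values, the list above can be written as a small lookup. This is just a sketch: the dictionary and function names are made up for illustration, and the wording for status 4 is a paraphrase of the description above.

```python
# Slice status values as listed in this post. None stands for SQL NULL.
SLICE_STATUS = {
    None: "up to date",
    1: "marked for processing",
    2: "currently processing this object's slices",
    3: "marked for slicing due to rollover",
    # Paraphrase of the description above (introduced for CLRT-8124).
    4: "unmodified record set aside while a slice request is new or changed",
}


def describe_slice_status(value):
    """Return a readable description for a slice status value."""
    return SLICE_STATUS.get(value, "unknown status")
```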

    Here's a KB article as well (the article would apply to assignments and team records - might be a little vague due to the verbiage)
    https://comm.support.ca.com/?legacyid=TEC500607

    Hope that helps.

    -shawn


  • 8.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 27, 2011 12:52 PM
    Dear Shawn,

    Many thanks for your Tuesday tips, which are very useful. We learn new things every week.

    Since this tip is related to time slices, I want to ask you about the slices that are not set up in the UI.

    I have two questions for you:

    1. Take DAILYRESOURCEALLOCCURVE as an example. It is there in the UI (where we define the slice period, rollover interval, and start and end date of the slice) and it has an entry in prj_blb_slicerequests. We also have the PRJ_BLB_SLICES_D_ALC table, which also holds daily slice values. How is the data in PRJ_BLB_SLICES_D_ALC populated? We have not defined a slice period or a start and end date for it, so on what basis is the data populated? Some of the OOB portlets, like resource planning, fetch their data from the PRJ_BLB_SLICES_D_ALC table. How does it work?

    2. In Financial plans we have TSVs where the data is stored in ODF_SSL_*. I read that this data is populated without depending on Time slice. What is the logic behind it? I have read the TEC440146 article but am eager to know how it works internally.

    cheers,
    sundar


  • 9.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 28, 2011 12:22 AM
    Sundar,

    That's a good point you have raised. All the internal slices (which don't show up in the UI) are 'Insta Slices'.
    I am also interested to know the hidden secrets behind their existence, behavior, reconfigurability, importance and impact. I have seen that wherever Clarity uses time scales (in Resource Planning, Team details etc.) these 'Insta Slices' are used. Start date, periods and offset decide the range/extent of the slices. In practice I have seen that setting Expiration_date and request_completed date to NULL forces a rebuild of these Insta Slices. Theoretically we could also set periods, offset and start date along with setting the above two fields to NULL, and rebuild the slices to fit 'needs'. However, according to CA, these 'Insta Slices' are not user configurable or customizable.
    All in all, these Insta Slices are 'officially' a mystery for customers. In my opinion there should be guidelines and an 'official' way to reconfigure/customize these slices.


    Shawn,

    Through this forum I would like to ask CA why Timeslices are still a 'mystery' for customers. Little has been published, described, discussed or explained about Timeslices. There is no scenario-based AI included that would advise users about the impact of ranges on data size and the subsequent dangers. For example, for timeslice X, if I increase the daily slice period from 1 yr to 2 yr, how many records will be added (or what percentage)? Why can't Clarity give a warning or advice if a user tries to create a custom timeslice that is already present out of the box (or maybe overlaps others)? Clarity should show the 'health' of timeslices based on many AI factors and advise users on an optimal timeslice configuration.

    Thanks
    Sangeet


  • 10.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 02, 2011 08:04 PM
    Sundar,

    That is a good question. It sounds like we still need more published on Time Slices. I'll keep that in mind for future tips.

    As far as time slice estimation and health of the system go, those are good ideas but tricky to implement. The biggest challenge with estimation is that there is so much potential variation in what results when a parent object is sliced.

    For example, I'll take the easiest and most familiar case to me: time entries.

    The "parent object" for time entries is the PRTimeentry record. And within that record the prActCurve is what is being sliced. Let's take a couple of example scenarios that will show why estimation would be difficult.

    For any given time entry on a timesheet, a user could enter some number of hours per day or specify a total.

    i.e.

    Example 1:

    Thursday, May 5th, 2011: 1 hour

    Example 2:

    Total hours for the time period beginning on May 2nd, 2011: 10 hours. (approx 2 hours per work day, entered into the totals field on the timesheet)

    The PRTimesheet.prActSum will show 3600 (1 hour) and 36000 (10 hours) for Examples 1 and 2 respectively.

    In the blob, both items are represented by a single date range and a value (they would probably be similar in size as well). However, Example 1 will generate only one slice, while Example 2 will generate 5 slice records. We don't really know how many slices will be output until we actually inspect the curve, and such an operation could be expensive because we would have to pull the blob into memory and deserialize it into an object.
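
    To make the 1-versus-5 difference concrete, here is a minimal sketch that counts daily slices for a single date range. It assumes one slice per Monday-to-Friday work day and ignores calendars, holidays and everything else the real blobcrack process considers. The function name is made up for illustration.

```python
from datetime import date, timedelta


def daily_slice_count(start, end):
    """Count Monday-Friday days in the inclusive range [start, end].

    Assumption: one daily slice per work day, no holidays or custom calendars.
    """
    total_days = (end - start).days + 1
    return sum(
        1
        for offset in range(total_days)
        if (start + timedelta(days=offset)).weekday() < 5  # 0-4 = Mon-Fri
    )


# Example 1: a single day, Thursday, May 5th, 2011.
example_1 = daily_slice_count(date(2011, 5, 5), date(2011, 5, 5))

# Example 2: the time period beginning Monday, May 2nd, 2011 (a 7-day
# period is assumed here; the end date is not stated in the post).
example_2 = daily_slice_count(date(2011, 5, 2), date(2011, 5, 8))
```

    Both inputs are one date range, but the counts differ by a factor of five, which is why the slice count cannot be read off the parent record alone.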

    So at best we could estimate how many parent objects would need reslicing (in this case how many time entries) but we would only be able to derive a range of 1X to 5X the number of slices, which wouldn't be too helpful.

    i.e. if 20,000 time entries needed slicing (assuming these all had some amount of time), we would have anywhere from 20,000 to 100,000 potential slice records generated. That's a big difference!
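
    The range quoted above is simple multiplication. As a sketch (the function name and default bounds are illustrative, taken from the 1X to 5X figures in this post):

```python
def slice_count_range(parent_objects, min_per_object=1, max_per_object=5):
    """Return (low, high) bounds on generated slice records.

    The 1 and 5 defaults come from the time entry example in this post:
    a single-day entry versus a weekly total spread over five work days.
    """
    return parent_objects * min_per_object, parent_objects * max_per_object


low, high = slice_count_range(20_000)
```

    For 20,000 time entries this gives bounds of 20,000 and 100,000, matching the figures above.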

    Hope this helps! I'll definitely have to add some slicing content to our next tips.


  • 11.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 03, 2011 01:25 AM
    Thanks Shawn for taking the time for a detailed reply.

    I acknowledge that in the current design it will be difficult to get those kinds of estimates.
    However, if CA can spend some energy on a redesign of Timeslices, a lot can be achieved.

    For example, Allocation slices occupy the most space in the slice tables. In my environment, slices with a value of 0 account for 75% of all Allocation slices. If we could remove the slice=0 records from the slice table, it would eliminate 75% of Allocation slices and 48% of all slices (in my environment, for example).
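
    As a side calculation, the two percentages above imply how large a share of all slices the Allocation slices are in that environment. A small sketch, with a made-up function name:

```python
def allocation_share(zero_share_of_allocation, zero_share_of_all):
    """Share of all slices that are Allocation slices.

    If zero-value Allocation slices are a fraction a of Allocation slices
    and a fraction t of all slices, then Allocation slices are t / a of
    all slices.
    """
    return zero_share_of_all / zero_share_of_allocation


share = allocation_share(0.75, 0.48)  # about 0.64, i.e. roughly 64%
```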

    For estimating Timeentry slices, for example, a field like slice_count could be added to store the slice count per BLOB (when the time entry is saved), so that things can be estimated more accurately. However, BLOBs like Assignment may need additional handling as they contain more than one type of slice (Actuals, ETC etc.).

    To summarize, Timeslicing and related operations/handling start giving 'growing' pains when the Timeslice table size crosses, say, the 10m mark and grows higher. It would help customers a lot if better sizing/estimation tools were available so that they can make informed decisions when deciding on their slices. To start with, an estimation guide could include details like which slices impact which area/module of Clarity and the possible impact of each slice on size/functionality. The blobcrack process could also be redesigned and some intelligence added to it.


  • 12.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 03, 2011 12:26 PM
    Shawn,

    Thanks for considering this for future tips and for replying promptly. I really appreciate it. Time slices have been a vague topic since the Niku days, and I have never been able to understand the full functionality and design. I will eagerly wait for you to cover those.

    Sangeet,

    Thanks to you too.

    cheers,
    sundar


  • 13.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 03, 2011 07:56 PM
    Sundar,

    Very good points. I would also like to see some additional investment in time slicing. Of course one of my greatest wishes would be that we don't use blobs at all, so that there would be no need for timeslicing. But I've been wishing that for years. ;)

    -shawn


  • 14.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 06, 2011 07:48 PM
    In my opinion, Timeslice/BLOB is a great concept in itself; it has just been poorly implemented in Clarity (in the current context).

    Let's take an example:

    A customer has the past 4 yrs of Actuals data in Clarity. If they don't use BLOBs, then every data element will be in its expanded form (as it currently is in the Timeslice tables). With the many time-varying data elements we now have, like Actuals, ETC, Timeentry, Baseline etc., that will result in millions of records in the expanded tables. In that situation, queries, inserts, updates and deletes would be a nightmare. This is just like configuring every timeslice for, say, 1500-2000 daily periods.
    The Timeslice concept says: let the 'time-varying' data be compressed in BLOBs and extract only what you need.

    The Timeslice design worked well in the old NIKU days, when these 'BLOBs' were low in number and handling them was easy and efficient. Now, after 10 years, such BLOBs in the system have grown many times over (data has grown and Clarity has introduced new BLOB attributes), but the Timeslicing approach remains the same as it was 10 years back. What concerns me is that if this continues and CA delays a "Timeslice Redesign", it will be too late, and one of two customers will have serious performance issues in their system 'OR' they will have to keep minimum slice ranges, which will affect their ability to make better business decisions.

    Timeslices : Wake up call for CA ...

    - Sangeet


  • 15.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 10, 2011 05:39 PM
    Good points Sangeet:

    With the nature of the web and higher demands for lots of smaller transactions on databases, I think today's databases could be effective at handling millions of records. Where blobs still shine is in handling situations where the data is slightly different.

    i.e.

    5 hours between Monday and Friday.
    vs.
    1 hour each day

    But maybe someone doesn't know how many hours they spent each day, but knows they spent 5 hours total for the week. The blob is good at handling this situation.

    Always fun to ponder the blob vs. no blob topic.

    -shawn


  • 16.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 03, 2011 12:36 PM
    I'd humbly direct anyone interested in "what is going on with timeslicing" to my tip :*)

    13156632

    --

    Also Shawn, any comment on the vast timeslice performance improvement that can apparently be achieved when the default "sequence number cache" is changed from 8 to something sensible? I understand that this "fix" is coming in a release soon(???) but is it something that can be easily retrofitted to older Clarity versions ourselves?


  • 17.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 03, 2011 08:15 PM
    Dave,

    Good call on pointing folks over to your time slice monitoring query! Your participation has been great. (It is nice to have others contributing!)

    As far as the defect you mentioned, here's some info:

    CLRT-57412: Sequence cache size is too small, results in Oracle ALTER SEQUENCE and CMN_ID_SP high CPU load during timeslicing with large numbers of inserts.

    AWR reports show calls to CMN_ID_SP and Alter Sequence in the top activity (by CPU time)

    This fix has been targeted for release in 12.1.1, but until the actual release we won't know for sure if it will be in the actual build. The proposed change is in Java so it requires a rebuild of core code (at least 1 class file) to be integrated into the product.
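
    To give a rough feel for why a small cache hurts during large insert runs, here is a back-of-the-envelope sketch. It assumes that each time a block of cached IDs runs out, one extra round of sequence work is needed; that model is a simplification for illustration, not a description of how CMN_ID_SP actually works. The cache size of 8 is the default mentioned earlier in this thread, and 1000 is an arbitrary comparison value.

```python
import math


def sequence_refills(inserts, cache_size):
    """Number of times a cached block of IDs is used up during a run.

    Simplified model: every `cache_size` inserts exhaust the cached block
    and trigger one more round of sequence work.
    """
    return math.ceil(inserts / cache_size)


small_cache = sequence_refills(100_000, 8)
large_cache = sequence_refills(100_000, 1000)
```

    Under that assumption, 100,000 slice inserts exhaust the cache 12,500 times with a cache of 8, versus 100 times with a cache of 1000.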


    -shawn


  • 18.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 04, 2011 05:19 AM
    ^ CLRT-57412 ; yes that's the one I meant - thanks!

    I only knew the number before now, not the text; it's not a "public" bug reference and there is no KB article (yet?).

    (The fact the "fix" is in the java rather than one of the "flat files" on the server is enough info to convince me that we can't retro-fix it ourselves)


  • 19.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 18, 2011 05:29 AM

    Dave wrote:

    ^ CLRT-57412 ; yes that's the one I meant - thanks!

    I only knew the number before now, not the text; it's not a "public" bug reference and there is no KB article (yet?).

    (The fact the "fix" is in the java rather than one of the "flat files" on the server is enough info to convince me that we can't retro-fix it ourselves)

    Just following up on this a little (you can tell this fix is one close to my heart!).

    There is a comment about this issue in the new 12.1 SP1 release notes (and I have noticed the same comment is in the re-released 12.1 release notes, but not in the original version of those release notes). The comment says:

    CLRT-57412

    When you timeslice with a large number of inserts, the sequence cache size is too small and causes Oracle ALTER SEQUENCE and CMN_ID_SP to result in a high CPU load. For details on the potential impact this issue has on your Clarity implementation, search for technote: 27581 on CA Support Online

    But I cannot find 27581 on the CA support site - I get no KB results.

    (If I search for the 27581 or the bug reference all I see is the links to the release notes - i.e. the comment above.)

    So can someone "de-classify" TEC27581 so we can see it? :blink:

    Ta!


  • 20.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted May 25, 2011 02:16 PM
    Hi Dave,

    I can't seem to find it either. I'll check with our doc folks to find out the status of the article. Thanks!

    -shawn


  • 21.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Jul 01, 2011 05:56 AM
    and just to follow up a little more,

    The recent (June 2011) generic fix packs for 12.0.6 and 12.1.0 apparently contain the fix for CLRT-57412

    :grin:

    see;
    TEC522707 for 12.0.6
    TEC542313 for 12.1.0


  • 22.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Sep 27, 2011 10:15 AM
    But no reference to it (CLRT-57412 that is) in the fixed defects for SP2 for 12.1 (see I said I was keen on this fix!)

    (which I can't find online at the moment but is uploaded in this thread; CA Clarity 12.1/SP2 is now Electronically Available )

    Any comments?


  • 23.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Nov 01, 2011 06:39 PM
    Shawn,

    On our TRN system, we're facing the SLICE_STATUS = 4 issue, where all but 3 of the PRTIMEENTRY records are set to 4 and the Timeslice job runs for days without doing anything. I did not see any recommendations in this thread for correcting the situation.

    I have killed the Timeslice job and will delete it. I'm thinking of running a script to set SLICE_STATUS = 1 on all PRTIMEENTRY records, then starting a new Timeslice job.
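
    The idea can be tried out safely on a throwaway copy before touching a real schema. The sketch below uses an in-memory SQLite table as a stand-in: only the PRTIMEENTRY table name and the SLICE_STATUS column come from this thread, while the `prid` key, the sample rows and the helper function are assumptions for illustration. It resets only the rows currently at status 4, which is narrower than resetting all records, and it is not a CA-endorsed fix.

```python
import sqlite3

# In-memory stand-in for the real table; the row contents are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prtimeentry (prid INTEGER PRIMARY KEY, slice_status INTEGER)"
)
conn.executemany(
    "INSERT INTO prtimeentry (slice_status) VALUES (?)",
    [(4,)] * 7 + [(None,)] * 3,  # 7 rows stuck at status 4, 3 rows up to date
)


def status_counts(connection):
    """Return {slice_status: row count}; SQL NULL appears as None."""
    cursor = connection.execute(
        "SELECT slice_status, COUNT(*) FROM prtimeentry GROUP BY slice_status"
    )
    return dict(cursor.fetchall())


before = status_counts(conn)

# Mark the stuck rows for processing (status 1), leaving other rows alone.
conn.execute("UPDATE prtimeentry SET slice_status = 1 WHERE slice_status = 4")
conn.commit()

after = status_counts(conn)
```

    Comparing the counts before and after the update is a cheap way to confirm that only the intended rows were changed.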

    We're a bit desperate on this one, as we are trying to test our Crystal Reports on TRN before converting our PRD system from Actuate to Crystal. Without testing, we can't move from TRN to PRD, and the project is facing a one-month delay: if we don't make the switchover window, we have to wait until after month end for the next window.

    Any suggestions are appreciated.

    Dale


  • 24.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 26, 2011 05:50 PM
    Sangeet,

    That's an interesting behavior. I'll have to try that out, I wouldn't expect it to fail that way. I'll add that to my project list and let you know what I come up with.

    -shawn


  • 25.  RE: CA Clarity Tuesday Tip: Time Slicing Stability Tests

    Posted Apr 29, 2011 09:47 AM
    Thanks! Always thought it would be fun to test this out.

    Michael