Automic Workload Automation

How to limit the number of simultaneous runs based on runtime parameters

  • 1.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 08:14 AM
    I have an SQL job that activates thousands of instances of the same workflow, using ACTIVATE_UC_OBJECT. The SQL job’s post-process starts one instance of the target workflow for each row returned by the query; it passes some of the fields returned by the query as input parameters to the activated workflows. (It uses :PUT_PROMPT_BUFFER to pass these values.)

    Two of the fields that are passed to the target workflow are called server_name and node_name. I need to limit the number of workflows that are running simultaneously, based on these fields:
    1. Based on server_name: only 100 workflows may run simultaneously for any given server_name.
    2. Based on node_name: only one workflow may run at any given time for any given node_name.
    There are only a few different values for server_name, so I could conceivably create SYNC objects for these, and assign them at runtime using :ATTACH_SYNC. So at least the first case seems relatively straightforward.

    However, there are several thousand different values of node_name, and new values may appear in the DB table without notice. So it would be difficult to use SYNC objects for the second case. The activated workflows are assigned aliases that include server_name and node_name. So I thought it might be possible to use the Tasks running in parallel option to limit the number of tasks having a particular name. However, I discovered that this option enforces the limit based on the object name and not the task alias. So unfortunately, this won’t work either.
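
    To make the two limits concrete, here is a minimal sketch of the admission rule in plain Python (not Automic script). The class and method names are hypothetical; only the two limits come from the requirements above. It shows what has to be tracked, not how AE would enforce it.

```python
from collections import Counter

MAX_PER_SERVER = 100  # requirement 1: at most 100 runs per server_name
MAX_PER_NODE = 1      # requirement 2: at most one run per node_name


class RunLimiter:
    """Hypothetical helper: tracks active runs and admits new ones."""

    def __init__(self):
        self.per_server = Counter()
        self.per_node = Counter()

    def try_start(self, server_name, node_name):
        """Return True and record the run only if both limits allow it."""
        if self.per_server[server_name] >= MAX_PER_SERVER:
            return False
        if self.per_node[node_name] >= MAX_PER_NODE:
            return False
        self.per_server[server_name] += 1
        self.per_node[node_name] += 1
        return True

    def finish(self, server_name, node_name):
        """Release the slots held by a finished run."""
        self.per_server[server_name] -= 1
        self.per_node[node_name] -= 1


limiter = RunLimiter()
first = limiter.try_start("SERVER_A", "NODE_1")   # admitted
second = limiter.try_start("SERVER_A", "NODE_1")  # refused: node already busy
limiter.finish("SERVER_A", "NODE_1")
third = limiter.try_start("SERVER_A", "NODE_1")   # admitted again
```

    The per-node counter is the awkward part in AE terms, because its keys are not known ahead of time.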

    Does anyone have any other ideas?



  • 2.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 09:28 AM
    Hi Michael

    Task preconditions support the use of alias names. As an action, you could trigger a "re-evaluation in x minutes". It might not be as clean as a SYNC solution, but it scales better.

    Regards
    Joel


  • 3.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 10:30 AM
    We use Group objects and the Start type attribute to handle a similar requirement. While Groups do have some limitations, they have proven very effective for our needs. We use them mainly to control a system that has multi-threaded jobs: the same program is executed concurrently in a number (up to 999) of distinct jobs. Each job selects the appropriate host during activation to spread the workload across multiple servers. We do not use Host Groups, as we find the Group method more suitable for our needs.

    Perhaps it could be used to address your requirements.


  • 4.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 10:36 AM
    Mark_Hadler_430 I don’t think job groups will work in the second case. We would have to create thousands of them — one for each potential value of node_name. Obviously we would also need to know all of the values of node_name ahead of time.


  • 5.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 11:24 AM

    joel_wiesmann_automic, I tried your approach. I added to the pre-conditions a check for any occurrence of the task alias in the activity window, as shown below.

    [Screenshot: precondition checking for occurrences of the task alias in the activity window]

    Unfortunately, the predefined variable &$ALIAS# does not appear to be resolved as expected in this context. Here is an excerpt from the activation log of the task:
    2015-09-03 17:09:27 - IF  resides > 0 times with status ANY_ACTIVE in activity window
    2015-09-03 17:09:27 -    False: 0 matching task(s) were found in activities.
    2015-09-03 17:09:27 - FINALLY run task
    The value of the &$ALIAS# predefined variable should appear between IF and resides. Perhaps the task alias is not set yet during evaluation of pre-conditions. Do you think this is a bug? Do you have a solution?

    Update: The task alias definitely is set when the pre-conditions are evaluated. I’m leaning towards calling this a bug.


  • 6.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 11:26 AM
    Would it be possible to create a number of "generic" Groups as I'm assuming that not every node_name will be active concurrently?  When a node_name is activated an available generic Group is chosen as the Start type and assigned with a PUT_ATT for those jobs.  The obvious trick is to keep track of available Groups, but given functions such as SYS_ACTIVE_COUNT this shouldn't be too hard.
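
    The bookkeeping for such a pool could look roughly like the sketch below. It is plain Python rather than AE script, and the group names and the GroupPool class are made up for illustration; in AE the "is this group free" test would presumably come from something like SYS_ACTIVE_COUNT instead of an in-memory dictionary.

```python
class GroupPool:
    """Hypothetical pool that lends generic group names to node_names."""

    def __init__(self, group_names):
        self.free = list(group_names)
        self.assigned = {}  # node_name -> group name currently lent to it

    def acquire(self, node_name):
        """Return the group for node_name, or None if the pool is exhausted."""
        if node_name in self.assigned:
            return self.assigned[node_name]
        if not self.free:
            return None
        group = self.free.pop(0)
        self.assigned[node_name] = group
        return group

    def release(self, node_name):
        """Give the group back once no job for node_name is active."""
        group = self.assigned.pop(node_name, None)
        if group is not None:
            self.free.append(group)


pool = GroupPool(["GENERIC.GROUP.1", "GENERIC.GROUP.2"])
g1 = pool.acquire("NODE_1")
g2 = pool.acquire("NODE_2")
g3 = pool.acquire("NODE_3")     # pool exhausted
again = pool.acquire("NODE_1")  # same node gets the same group
pool.release("NODE_1")
g4 = pool.acquire("NODE_3")     # reuses the released group
```

    The pool only needs to be as large as the number of node_names that are active at the same moment, not the number that exist.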


  • 7.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 11:33 AM
    Mark Hadler said:
    Would it be possible to create a number of "generic" Groups as I'm assuming that not every node_name will be active concurrently?  When a node_name is activated an available generic Group is chosen as the Start type and assigned with a PUT_ATT for those jobs.  The obvious trick is to keep track of available Groups, but given functions such as SYS_ACTIVE_COUNT this shouldn't be too hard.
    Mark_Hadler_430, several thousand will be activated around the same time. I guess we would still have to create a lot of groups.

    By the way, the likelihood that more than one workflow instance with the same node_name would be activated at the same time is very low (probably zero); but it is conceivable that one could still be active from the previous day’s run.


  • 8.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 11:51 AM
    Though I'm not really recommending it, have you thought about having the Pre Process of each job check to see how many of its kind are currently active and if it exceeds the threshold post some type of wait?

    If these jobs are relatively quick then perhaps you could have a WHILE loop that checks and WAITs for an appropriate time period.  If the jobs run for an extended period of time the wait could be performed within the loop by an ACTIVATE_UC_OBJECT of a Time Event with the WAIT parameter.
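
    As a rough illustration of that check-and-wait loop, here is a Python sketch (not AE script). The function name, the parameters and the fake counter are all assumptions; count_active stands in for whatever returns the number of active runs.

```python
import time


def wait_for_slot(count_active, threshold, poll_seconds=60,
                  max_polls=10, sleep=time.sleep):
    """Poll until fewer than `threshold` runs are active.

    Returns True if a slot opened within `max_polls` checks, else False.
    """
    for _ in range(max_polls):
        if count_active() < threshold:
            return True
        sleep(poll_seconds)
    return False


# Simulated activity counts for successive checks; no real waiting is done.
counts = iter([3, 2, 1, 0])
naps = []
got_slot = wait_for_slot(lambda: next(counts), threshold=1, sleep=naps.append)
```

    Capping the number of polls keeps a stuck run from waiting forever; what should happen after the cap is reached (cancel, alert, retry) is a separate decision.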

    Another thought would be to have the jobs themselves perform the concurrent execution checking and waiting.  I doubt that this would be very difficult in most any operating system that I'm familiar with.  UC4 activates them all and it is up to the executing jobs to duke it out amongst themselves.


  • 9.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 12:09 PM
     
    Though I'm not really recommending it, have you thought about having the Pre Process of each job check to see how many of its kind are currently active and if it exceeds the threshold post some type of wait?

    If these jobs are relatively quick then perhaps you could have a WHILE loop that checks and WAITs for an appropriate time period.  If the jobs run for an extended period of time the wait could be performed within the loop by an ACTIVATE_UC_OBJECT of a Time Event with the WAIT parameter.
    I’ve heard that :WAIT statements occupy a work process while they are running. Otherwise, I guess this would be an acceptable approach, if a bit obscure and opaque.
    Another thought would be to have the jobs themselves perform the concurrent execution checking and waiting.  I doubt that this would be very difficult in most any operating system that I'm familiar with.  UC4 activates them all and it is up to the executing jobs to duke it out amongst themselves.
    That could also be an option. The only challenge would be maintaining visibility in UC4 of the fact that one is running and the others are waiting. Honestly I think in this particular scenario it might be acceptable to cancel the subsequent runs if one with the same node_name is active. But that’s not how it’s set up in the current system; and I have been tasked with replicating the current set-up. Still, it got me thinking. I’ll meet with this team tomorrow, so I can ask them if cancelling instead of queuing might be acceptable.


  • 10.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 02:14 PM
    The documentation page for the Preconditions Tab states:
    You can also use  predefined variables as parameter values. To open the Variable picker, you click the element name (= blue text left to the element) in the parameter dialog. The relevant input field changes to a gray text field and the button at the lower right edge becomes active. Click it to open the Variable picker dialog.        
    I clicked on the blue text Task or alias, clicked on the variable picker button in the lower right corner, and selected Alias from the Object category.
    [Screenshot: Variable picker with Alias selected from the Object category]

    This unfortunately did not work. The same sort of problem occurred. &$ALIAS# was resolved to an empty string.
    2015-09-03 20:01:25 - IF  resides > 1 times with status ANY_ACTIVE in activity window
    2015-09-03 20:01:25 -    False: 0 matching task(s) were found in activities.
    2015-09-03 20:01:25 - FINALLY run task
    I found that if I hard-coded the alias name in the object variable &ALIAS#, the precondition activity check worked fine. Obviously, a hard-coded name is not useful, because the task alias is not predictable.

    However, this gave me an idea. I tried a work-around, using the ordinary object variable &ALIAS# as a go-between. I added a SET VALUE step to the beginning of the preconditions.
    [Screenshot: SET VALUE step added at the beginning of the preconditions]

    The preconditions then looked as follows.
    [Screenshot: the resulting preconditions]

    This resulted in an even stranger error. The Activation log of the task showed this:
    2015-09-03 20:09:02 - U0020237 The object variable '&' in object: 'XC_INC.ACTION.SET_VALUE', line: '00025' (RunID: '0026031275') has been created with the value '' by  using the command :SET/PUBLISH_VALUE.
    The task details contained another error:
    U0021719 Syntax error in object 'UC0.MAL.TESTJOB1.JOBS_UNIX', line '00000'. 'U1001307 A variable name with the length 0 is not allowed.'.
    I think I’ve found at least a couple of bugs here.




  • 11.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 02:31 PM
    How about an old school idea...

    Set up a folder to contain indicator files.  When a workflow starts, the first thing it does is create a new indicator file called <SERVER>_<NODE>.TXT.  When it is done, it deletes this file.  It would also have a pre-condition rule that keeps the workflow from starting until all undesirable indicator files are clear.

    Same idea could be done using a static variable... add rows and retrieve rows from it.
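
    If the indicator-file route is taken, the file should be created atomically so that two workflows cannot both believe they were first. A minimal Python sketch, assuming a shared folder and hypothetical helper names (the <SERVER>_<NODE>.TXT naming is from the post above):

```python
import os
import tempfile


def acquire_indicator(folder, server, node):
    """Atomically create <SERVER>_<NODE>.TXT.

    Returns the file path, or None if the indicator already exists.
    """
    path = os.path.join(folder, f"{server}_{node}.TXT")
    try:
        # O_EXCL makes the create fail if the file is already there.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    os.close(fd)
    return path


def release_indicator(path):
    """Delete the indicator when the workflow is done."""
    os.remove(path)


with tempfile.TemporaryDirectory() as folder:
    first = acquire_indicator(folder, "SERVER_A", "NODE_1")
    second = acquire_indicator(folder, "SERVER_A", "NODE_1")  # already held
    release_indicator(first)
    third = acquire_indicator(folder, "SERVER_A", "NODE_1")   # free again
```

    One thing to watch with this approach: if a workflow aborts before deleting its file, the indicator stays behind and blocks later runs until someone cleans it up.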


  • 12.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 03:47 PM
    Pete Wirfs, I’m sure that would work, but isn’t that what we have scheduling software for?  ;)


  • 13.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 05:03 PM
    Regarding your strange error;

    2015-09-03 20:09:02 - U0020237 The object variable '&' in object: 'XC_INC.ACTION.SET_VALUE', line: '00025' (RunID: '0026031275') has been created with the value '' by  using the command :SET/PUBLISH_VALUE.


    I suspect that the "set object variable" dialog does not want a preceding '&' as part of the target object name(?)  The typical function of '&' is to tell UC4 that the following string is a variable name, but in this context, it already knows it is expecting a variable name.


  • 14.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 05:13 PM
    Ah ha, found proof in the documentation;
    http://docs.automic.com/documentation/AE/9_SP11/english/AE_WEBHELP/uc4.htm#ucacua.htm?Highlight=pre-conditions

    It says under "SET VALUE";
    "Specify the variable name without a leading &"

    I've run into this before...


  • 15.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 05:53 PM
    One last thought for an approach.  How about having the SQL job’s Post Process activate a Time Event that activates the jobs, instead of activating them from the Post Process directly?  At each interval, the Event's !Process checks the number of active jobs: if the threshold has been reached, it does nothing until the next interval; otherwise it activates one or more jobs, up to the threshold.  Once all of the jobs have been activated, the Event ends.
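
    The per-interval logic might be sketched like this in Python (not AE script). The function name, the queue of node names and the activate callback are hypothetical stand-ins for the Event's script and ACTIVATE_UC_OBJECT.

```python
from collections import deque


def dispatch_tick(pending, active, threshold, activate):
    """One timer interval: start pending items until `threshold` are active.

    Returns True when nothing is left to activate, so the event can end.
    """
    while pending and len(active) < threshold:
        item = pending.popleft()
        activate(item)
        active.add(item)
    return not pending


pending = deque(["NODE_1", "NODE_2", "NODE_3", "NODE_4", "NODE_5"])
active = set()
started = []

done_1 = dispatch_tick(pending, active, threshold=2, activate=started.append)
active.discard("NODE_1")  # pretend one job finished before the next interval
done_2 = dispatch_tick(pending, active, threshold=2, activate=started.append)
```

    The list of rows still to be activated has to live somewhere between intervals; in this sketch it is just an in-memory queue.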


  • 16.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 06:16 PM
    Ah ha, found proof in the documentation;
    http://docs.automic.com/documentation/AE/9_SP11/english/AE_WEBHELP/uc4.htm#ucacua.htm?Highlight=pre-conditions

    It says under "SET VALUE";
    "Specify the variable name without a leading &"

    I've run into this before...
    Good catch. I should have thought of that. Unfortunately, the original problem with the &$ALIAS# predefined variable remains, even if I pass the value through an object variable.

    Activation log:
    2015-09-04 00:11:17 - U0020230 Value '&ALIAS#' in object: 'XC_INC.ACTION.SET_VALUE', line: '00025' (RunID: '0026048425') was changed from '' to ' ' using the command :SET/PUBLISH_VALUE.
    2015-09-04 00:11:17 - U0020206 Variable 'ALIAS#' was stored with value ''.
    Preconditions log:
    2015-09-04 00:11:17 - THEN set object variable ALIAS# to  in scope Task
    2015-09-04 00:11:17 - IF  resides > 1 times with status ANY_ACTIVE in activity window
    2015-09-04 00:11:17 -    False: 0 matching task(s) were found in activities.
    2015-09-04 00:11:17 - FINALLY run task



  • 17.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 03, 2015 06:26 PM
    Ok, here is an interesting update: the problem does not happen in V11.1.1, with the exact same objects.
    • V9 SP11 HF1: &$ALIAS# returns a null value when evaluated in task pre-conditions.
    • V11.1.1 (SP1): &$ALIAS# returns the correct value (the task alias) when evaluated in task pre-conditions.
    I will open a new incident about this shortly, and will post the bug ID here as soon as I have it.


  • 18.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 04, 2015 01:46 AM
    Hi Michael

    Make sure Generate at runtime is enabled on the task object as well as on the job plan. I have also seen an empty alias (we're on V10) when the Generate at runtime setting was not ticked.

    Something else I reported to our AE engineering: if you use variables in the precondition activities check, they should always be compared as upper case. I have a cool FOREACH workflow solution that activates an object using the VARA key encoded into the alias name. However, the precondition activities check does not work out of the box, because the alias is always upper case once the object is activated while the VARA key remains lower case (resulting in an activity check for an impossible "MYOBJECT_aliaskey" instead of "MYOBJECT_ALIASKEY").
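
    The mismatch is easy to reproduce outside AE. A small Python sketch, with hypothetical function names and the example alias from above:

```python
def expected_alias(object_name, vara_key):
    """Build the alias as it appears once activated (always upper case)."""
    return f"{object_name}_{vara_key}".upper()


def alias_is_active(object_name, vara_key, active_aliases):
    """Compare in upper case so a lower-case VARA key still matches."""
    wanted = expected_alias(object_name, vara_key)
    return wanted in {alias.upper() for alias in active_aliases}


active = ["MYOBJECT_ALIASKEY"]
naive_match = "MYOBJECT_aliaskey" in active                        # False
normalized_match = alias_is_active("MYOBJECT", "aliaskey", active)  # True
```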

    As you stated, all the major functions for detecting running objects seem not to support aliases, even though they allow really cool solutions for recurring requirements. This confuses me a bit, as it looks like someone had a cool idea (hey, let's add aliases to tasks!) but no one thought about possible use cases.

    Regards
    Joel


  • 19.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 08, 2015 06:49 PM
    Make sure Generate at runtime is enabled on the task object as well as on the job plan. I have also seen an empty alias (we're on V10) when the Generate at runtime setting was not ticked.
    I almost always turn Generate at runtime on. It is on for these objects.

    Something else I reported to our AE engineering: if you use variables in the precondition activities check, they should always be compared as upper case. I have a cool FOREACH workflow solution that activates an object using the VARA key encoded into the alias name. However, the precondition activities check does not work out of the box, because the alias is always upper case once the object is activated while the VARA key remains lower case (resulting in an activity check for an impossible "MYOBJECT_aliaskey" instead of "MYOBJECT_ALIASKEY").
    Good idea, but the names are all upper case in these objects.



  • 20.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 09, 2015 05:51 AM
    • V9 SP11 HF1: &$ALIAS# returns a null value when evaluated in task pre-conditions.
    • V11.1.1 (SP1): &$ALIAS# returns the correct value (the task alias) when evaluated in task pre-conditions.
    I found a rather kludgy work-around for the null &$ALIAS# in V9 preconditions. I created SEC_SQLI VARA objects to find running tasks with a particular alias, and set up user-defined preconditions that query those VARAs. This, however, depended on having consistent task aliases, which in turn required using MODIFY_TASK. I generally eschew this command, so I thought I might encode the pertinent information in the archive keys instead of (or in addition to) the task alias. Now that I’m just querying the EH table, I can look for whatever I like.

    Unfortunately I ran into a new obstacle when pursuing this approach: it seems task preconditions are evaluated before the archive keys of the task have been set, in the following situations:
    • when the archive key fields of the job definition contain variable(s)
    • when the archive keys are set via :PUT_ATT commands in the pre-process of the job
    So although I can find other running tasks, I cannot compare them to the archive keys of the current task because the predefined variables &$ARCHIVE_KEY1# and &$ARCHIVE_KEY2# resolve to null when evaluated in pre-conditions. Anyone else noticed this behavior? It does not seem to be merely a limitation of V9. It happens also in V11.

    The order in which things happen is obviously important, but it is not clear. If this is documented, I could not find the pertinent docs. The documentation of the four stages of execution does not mention when evaluation of variables or pre-conditions occurs.


  • 21.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 09, 2015 06:36 AM
    I also did some testing on this. To me it looks like the archive key variable is resolved during the post-process (or at least after the process, if there is no post-process). If you do a :STOP within the "preprocess" or "process", variables within the archive key won't be resolved.

    Agent & Login fields behave completely differently: they get resolved during activation. We're using the {VARA,key,#} notation there.


  • 22.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 09, 2015 01:08 PM
    Here is how I worked around the above limitations. First, the big picture.

    [Screenshot: overview of the work-around]

    Now, I’ll explain what each piece does. I set up SQLI VARA objects that check for running tasks matching certain criteria.

    TSM.DEV.CHECK_FOR_RUNNING_BACKUPS.VARA_SEC_SQLI
    select
      case when count(EH_AH_Idnr) = 0
        then 'FALSE'
        else 'TRUE'
      end task_exists
    from EH left join OH on EH.EH_OH_Idnr = OH.OH_Idnr
    where EH_OType = 'JOBS'
      and OH.OH_Name like 'TSM.DEV.BACKUP_%'
      and EH.EH_Status < 1800
      and EH.EH_Archive1 = ?
      and EH.EH_Archive2 = ?
    This returns TRUE if any backup job is running (not just active) on a particular node (Archive key 1) and TSM server (Archive key 2). Otherwise, it returns FALSE. If another backup is running with the same details, the new workflow is cancelled.

    TSM.DEV.COUNT_RUNNING_BACKUPS_ON_TSM_SERVER.VARA_SEC_SQLI
    select count(EH_AH_Idnr)
    from EH left join OH on EH.EH_OH_Idnr = OH.OH_Idnr
    where EH_OType = 'JOBS'
      and OH.OH_Name like 'TSM.DEV.BACKUP_%'
      and EH.EH_Status < 1800
      and EH.EH_Archive2 = ?
    This returns the number of running backups on the TSM server specified in Archive key 2. We want to limit the number of backup jobs running simultaneously on the same server. The static VARA TSM.SERVER_MAX_JOBS.VARA_STATIC has the TSM server names as keys, and the maximum number of simultaneous backups for each server in Value 1. If the maximum number of backups are already running, the job will wait and check again in a minute.
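
    For anyone who wants to try the two queries without an AE database, here is a sketch that runs them against a throwaway SQLite database from Python. The tables are cut down to the columns the queries use, and the sample rows, status values, node names, server name and per-server limit are invented for the test; none of this is the real AE schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table OH (OH_Idnr integer primary key, OH_Name text);
    create table EH (EH_AH_Idnr integer primary key, EH_OH_Idnr integer,
                     EH_OType text, EH_Status integer,
                     EH_Archive1 text, EH_Archive2 text);
    insert into OH values (1, 'TSM.DEV.BACKUP_FULL');
    -- illustrative rows: two below status 1800, one above it
    insert into EH values (101, 1, 'JOBS', 1550, 'NODE_1', 'TSM_A');
    insert into EH values (102, 1, 'JOBS', 1550, 'NODE_2', 'TSM_A');
    insert into EH values (103, 1, 'JOBS', 1900, 'NODE_3', 'TSM_A');
""")

CHECK_SQL = """
    select case when count(EH_AH_Idnr) = 0 then 'FALSE' else 'TRUE' end task_exists
    from EH left join OH on EH.EH_OH_Idnr = OH.OH_Idnr
    where EH_OType = 'JOBS'
      and OH.OH_Name like 'TSM.DEV.BACKUP_%'
      and EH.EH_Status < 1800
      and EH.EH_Archive1 = ?
      and EH.EH_Archive2 = ?
"""

COUNT_SQL = """
    select count(EH_AH_Idnr)
    from EH left join OH on EH.EH_OH_Idnr = OH.OH_Idnr
    where EH_OType = 'JOBS'
      and OH.OH_Name like 'TSM.DEV.BACKUP_%'
      and EH.EH_Status < 1800
      and EH.EH_Archive2 = ?
"""


def backup_running(node, server):
    """Bind node and server to the two placeholders of the check query."""
    return conn.execute(CHECK_SQL, (node, server)).fetchone()[0] == "TRUE"


def running_on_server(server):
    """Bind the server to the single placeholder of the count query."""
    return conn.execute(COUNT_SQL, (server,)).fetchone()[0]


max_jobs = {"TSM_A": 2}  # stand-in for the static VARA of per-server limits
node1_busy = backup_running("NODE_1", "TSM_A")
node3_busy = backup_running("NODE_3", "TSM_A")
count_a = running_on_server("TSM_A")
decision = "wait" if count_a >= max_jobs["TSM_A"] else "run"
```

    With these sample rows the count has already reached the limit for TSM_A, so a new backup for that server would wait and re-check.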

    I think this should work. I just need to tweak it a bit and run some tests.


  • 23.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 09, 2015 03:02 PM
    Neat workaround. Guess my operations team would kill me if I start to introduce logic like this ;-).

    The "?" in your sql query seems to be part of a prepared statement - how does that one get filled-in with the appropriate archive key?


  • 24.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 09, 2015 03:26 PM
    I empathize with the operations team. If I could find a simpler and more elegant solution, I would highly prefer not to make things so convoluted. If it were possible, for example, to use script/object variables like &TSM_SERVER# in the names of the SYNC objects used by jobs, a lot of this would be unnecessary. The batch would be much easier to understand, and it would be more straightforward to see how many backup jobs were running on each TSM server.

    Each ? is a placeholder for a bind parameter. I use the archive keys (and occasionally even the object title) to pass values from object variables into bind parameters via predefined variables. E.g., in the first SQLI above, bind parameter 1 is set to &$ARCHIVE_KEY1# and bind parameter 2 is set to &$ARCHIVE_KEY2#. I use the archive keys for this because in V9, predefined variables for object attributes work fine in bind parameters, but ordinary script/object variables do not. This is a limitation of V9, and is fixed in V10 and later. (When we have completed our upgrade to V11, I’ll have to spend some weeks removing the V9-specific hacks and workarounds!)


  • 25.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 10, 2015 03:33 AM
    Oh, this means you can bind &$ARCHIVE_KEY?# in the SEC_SQL VARA? Because the variable picker (v10) doesn't show this as an option. I can only pick AE variables or use the {}-notation. That's quite cool then.


  • 26.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Sep 10, 2015 03:39 AM
    Joel Wiesmann said:
    Oh, this means you can bind &$ARCHIVE_KEY?# in the SEC_SQL VARA? Because the variable picker (v10) doesn't show this as an option. I can only pick AE variables or use the {}-notation. That's quite cool then.
    Yes, and in V10 and later, even ordinary script/object variables are resolved correctly in bind parameters. It’s quite useful when one needs to choose predicates or other parameters of the SQL statement based on the result or output of a previous job.


  • 27.  How to limit the number of simultaneous runs based on runtime parameters

    Posted Oct 08, 2015 04:54 AM
    I have discovered that variables in the archive key fields are not resolved until after pre-conditions have passed. This means at the time the pre-conditions are evaluated, archive key 1 contains &NODE_NAME#, and archive key 2 contains &TSM_SERVER#, even though these variables have values in the parent workflow. If I Ignore preconditions, then the task runs and the variables in the archive keys are resolved.

    This is an unexpected obstacle to my current approach; obviously using SQLI queries in pre-conditions will not work if the variables that are going into the bind parameters are not resolved. I thought I had this working before, but now I’m no longer certain. I would be glad to get suggestions on how to work around this problem.