
Best way to distribute files to multiple folders?

Discussion created by daryl.brown_ACI on Aug 29, 2014
Latest reply on Sep 2, 2014 by Günter_Schulmeister_351
We have an application that loads/consumes files placed in a given input folder, one at a time.  There are multiple such folders used for this type of processing, which lets us load files in a multi-threaded manner.
The business requirement we've been given is to roll over to the "next" folder for each new file we need to load.  For example, if we have three of these import folders and four files to load, the first file should be sent to folder #1, the second file to folder #2, the third file to folder #3, and then the fourth file would go back to folder #1, and so on.
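For what it's worth, the round-robin assignment itself is just modular arithmetic. A minimal sketch (folder names hypothetical, not from our actual setup):

```python
def next_folder(counter, folders):
    """Map a monotonically increasing counter to a folder, round-robin."""
    return folders[counter % len(folders)]

folders = ["/import/folder1", "/import/folder2", "/import/folder3"]

# Four files cycle through the three folders and wrap back to folder1.
for i, name in enumerate(["a.csv", "b.csv", "c.csv", "d.csv"]):
    print(name, "->", next_folder(i, folders))
```

The hard part, of course, isn't the arithmetic -- it's sharing the counter safely between concurrent workflows, which is what the ideas below are about.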


So...any thoughts from the community at large at how best to implement something like this [in V9 OM]?

I've got a couple thoughts on this already, but I'm always game for other elegant solutions...

In each case here, I start by creating a variable object -- FOLDER_LIST_VAR -- containing the various folders, each with a numerical keyword (e.g., keyword "1" = (folder #1), keyword "2" = (folder #2), etc.). 

Idea #1 -- Add another row to this variable -- keyword "CURRENT_INDEX" -- with a value[1] of "1" and a value[2] of (total # of folders).  Each workflow that needs to drop off a file reads the CURRENT_INDEX value[1], then immediately increments it by 1 (or resets it to 1 if it matches value[2]) and writes the updated value back to the variable object.  Meanwhile, the folder path associated with the CURRENT_INDEX value we read in becomes the drop-off folder for that file.
drawbacks -- if multiple workflows try to read from this variable at the same time, they may all retrieve the same value, rather than an incremented value for each read
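To make idea #1 concrete, here's a sketch of the read/increment/write-back logic in Python (the dict stands in for the variable object; names are made up, this isn't Automic syntax). Note that the read and the write-back are separate steps, which is exactly where the race lives:

```python
def bump_index(var):
    """Read CURRENT_INDEX, then increment it (wrapping at the max) and write it back.

    The read and the write-back are two separate operations, so two workflows
    running this at the same time can both read the same value -- the race
    described above.
    """
    current, maximum = var["CURRENT_INDEX"]
    nxt = 1 if current == maximum else current + 1
    var["CURRENT_INDEX"] = (nxt, maximum)   # write-back; NOT atomic vs. other readers
    return current

# Stand-in for FOLDER_LIST_VAR: numeric keywords map to folders,
# plus the CURRENT_INDEX row holding (current value, total # of folders).
var = {"CURRENT_INDEX": (1, 3), "1": "folder1", "2": "folder2", "3": "folder3"}

picks = [var[str(bump_index(var))] for _ in range(4)]
print(picks)  # cycles folder1, folder2, folder3, then wraps to folder1
```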

Idea #2a -- Create a sync object with an initial status of 1, and a defined action that increases the status value by 1, or back to 1 once we reach the max.  Somewhere in the workflow, use SET_SYNC to increment the counter (e.g., the sync object), and GET_SYNC to see what the counter has been set to.  The value corresponds to the row in FOLDER_LIST_VAR to be used.
drawbacks -- without locking the sync object in some way, you risk the same problem as in idea #1 -- multiple objects could be performing these operations at the same time and could wind up with the same values.

Idea #2b -- Create a sync object with an initial status of 1, a status of LOCKED and UNLOCKED, and a defined LOCK action that sets an UNLOCKED sync object to LOCKED, and increases the status value by 1, or back to 1 once we reach the max.  A second action, UNLOCK, will set the state back to UNLOCKED.  Somewhere in the workflow, add a script object that uses this sync object (LOCK / UNLOCK / UNLOCK / Wait) and does a GET_SYNC to read the sync object value (while it is locked), and thereby derive the folder to use from the FOLDER_LIST_VAR.  PSET this value so it can be used downstream and so we don't tie up the sync object longer than needed.
drawbacks -- Slightly more complex than the other solutions...and you have to deal with selecting the sync object from the picklist in order to add it to the object.  (Annoying!)  It's also possible that you wind up hanging all your workflows if the sync object gets stuck in a funky state.
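The shape of idea #2b -- do the increment only while holding a lock -- looks like this in Python, with `threading.Lock` playing the role of the sync object's LOCK/UNLOCK states (a sketch of the pattern, not of Automic itself):

```python
import threading

class RoundRobinCounter:
    """Analogue of idea #2b: the counter is only touched while the lock is held."""

    def __init__(self, maximum):
        self._lock = threading.Lock()   # plays the role of the sync object's LOCK state
        self._current = 0
        self._max = maximum

    def next_index(self):
        # LOCK ... read + increment ... UNLOCK, all as one critical section,
        # so concurrent callers can never see the same value.
        with self._lock:
            self._current = 1 if self._current == self._max else self._current + 1
            return self._current

counter = RoundRobinCounter(3)
print([counter.next_index() for _ in range(6)])  # 1, 2, 3, 1, 2, 3
```

The key difference from ideas #1 and #2a is that read and increment happen inside the same critical section, so no two callers can observe the same index -- at the cost of the extra LOCK/UNLOCK bookkeeping (and the stuck-lock risk) noted above.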

So...any other good ideas?
