IT Process Automation

  • 1.  Robust logic for handling >512 rows returned in SQL query

    Posted Dec 11, 2013 10:03 PM

    We use ITPAM to provide a consistent web-services method for accessing SQL data, amongst other things.  Normally our queries return a smallish number of records.  Occasionally we have a query that returns more than the 512 maximum allowed rows within ITPAM.  Here's a way to handle more than 512 rows safely and easily.  I used ITPAM 4.1 (without SP1).  I would like to know if anybody else has come up with a different or better way to handle the possibility of more than 512 rows being returned.

    (Note: This example shows SQL Server 2012 syntax, but you should be able to adapt this easily to any SQL server that supports OFFSET/LIMIT parameters.)

    The Process consists of 3 operators: 

    1. An initial Query Database operator to get the row count
    2. A Loop Operator to split the data query into batches
    3. A Query Database operator inside the Loop Operator to fetch the data records

    The initial Query Database operator performs a simple count() query to determine the number of expected rows, which drives the number of batches required.

    Expression ="select count(1) as 'Count' from (select ...) as X"

    Post-Execution code:

    var numRecordsReturned = Process[OpName].QueryResults.ResultSets[0].Rows[0].Count;
    
    Process.numRecordsReturned = numRecordsReturned;
    
    var numLoopsRequired = Math.ceil(numRecordsReturned / 512);
    
    Process.numLoopsRequired = numLoopsRequired;
     
    The Loop operator has:
    Repeat Count = Process.numLoopsRequired
    Pre-Execution code:
    var rowOffset = (Process[OpName].CurrentLoopIteration) * 512;
    
    Process.rowOffset = rowOffset;
     
    The Query Database operator inside the Loop has:
    Expression = "select myColumnName from myTableName OFFSET " + Process.rowOffset + " ROWS FETCH NEXT 512 ROWS ONLY"

    Post-Execution code:

    var endingIndex=Process.rowOffset;
    
    Process.endingIndex=endingIndex;
    
    for (  var i=0;   i<Process[OpName].QueryResults.ResultSets[0].Rows.length;   i++) {
    
      var actualIndex = endingIndex + i;
    
      Process.OutputList[actualIndex] = Process[OpName].QueryResults.ResultSets[0].Rows[i].myColumnName;
    
    }
     
    The finished product is an accessible Process-level string array called OutputList which contains all the SQL records in the original query result order, even if you have more than 512 values.  I like this method a lot better than looking for the TruncateWarning, exception handling it, and trying to figure out which records were truncated.
     
    Hope this helps!


  • 2.  RE: Robust logic for handling >512 rows returned in SQL query

    Posted Dec 12, 2013 03:21 AM

    THANKS!

    Jürg (Juerg)



  • 3.  Re: Robust logic for handling >512 rows returned in SQL query

    Posted Sep 18, 2015 03:55 PM

    Thank you for posting this. I just implemented something similar to this. However, it is slightly different because I am in SQL 2008 and OFFSET is only available in SQL 2012. First, I had to get the max sequence value of the table: Select MAX(<fieldname>) from <Table Name>

     

    Then, I used the BETWEEN parameter in my SQL Statement, getting the first in the last value each time I looped through. The Looped continued until I reached the max sequence value and the SQL statement was as follows (this is an example): select * from <Table Name> where <field name (from above)> BETWEEN Process.first AND Process.last



  • 4.  Re: Robust logic for handling >512 rows returned in SQL query

    Posted Oct 30, 2015 12:18 PM

    As long as you're not extracting so much data from every row in your query that you bring your SQL server to it's knees you can query more than 512 rows using the OTB 'Query Database' operator. It provides you with a 'RowCount' which can be plugged directly into the 'Repeat Count' of your loop, then accessing your data becomes as easy as accessing an arrays indexes. The key piece is the set rowcount parameter at the beginning of the example below. (be sure to set it high enough that you're not bumping your head on it in the foreseeable future)

     

    SET ROWCOUNT 100000

    SELECT

    [user_id]

    FROM [mdb].[dbo].[usm_contact_domain_role]

    WHERE domain = 'Management'

    order by user_id

     

    I did one project using ranged queries before discovering this method and I'm an ardent fan of it's elegant simplicity compared to ranged queries



  • 5.  Re: Robust logic for handling >512 rows returned in SQL query

    Posted Nov 12, 2015 10:46 AM

    Thanks for posting this option. That does seem a bit simpler to do, but would you essentially just up the row count by 512 each time your run the query?



  • 6.  Re: Robust logic for handling >512 rows returned in SQL query

    Posted Nov 12, 2015 12:22 PM

    Here's a quick picture of the difference between the two approaches, the one on the left using the SQL 'SET ROWCOUNT' to over ride the OTB limitation of 512 rows per query.

    On the left you execute the query once and loop over your results. On the right you first calculate how many times to run your query outside the loop then run your query and loop over the results, then run your query and loop over the results, then run your query and loop over the results, then run your query and loop over the results, then run your query and loop over the results, etc.... (it's as fun as as it sounds )

     

    To answer your question, the row count is dynamic on the left and fixed at 512 on the right; as the outside loop iterates it just pulls successive sets of 512 rows which the inside loop then iterates over.



  • 7.  Re: Robust logic for handling >512 rows returned in SQL query



  • 8.  Re: Robust logic for handling >512 rows returned in SQL query

    Posted Jul 08, 2017 01:21 PM

    Hi,

     

    What does Evaluate_Expression does in the above process?

     

    Thank you,

    Venkat



  • 9.  Re: Robust logic for handling >512 rows returned in SQL query

    Posted Jul 10, 2017 07:48 AM

    Venkat,

     

    You can use it to make sure you've iterated over all the rows returned by Query_Database, and if not it will mark the process as 'failed' so you can know to take a look. I use this in combination with a 'Process.' variable for error logging in processes that I have scheduled off hours to make sure they're functioning as intended.

     

    The expression they evaluate can read something like this for the left example:

    (Process.Query_Database_3.QueryResults.RowCount == Process.Loop_1.CurrentLoopIteration)

     

    And for the right:

    ((Process.Query_Database_4.QueryResults.RowCount == Process.Loop_2.CurrentLoopIteration) && (Process.Query_Database_2.QueryResults.RowCount == Process.Loop_4.CurrentLoopIteration))

     

    Elwynn.