The SMPolicyTraceAnalysis tool can now graph transactions from the DataDirect odbctrace.out file.
The SMPolicyTraceAnalysis is a tool that can graph transactions from trace logs, giving graphs of distribution of transaction times, number of concurrent requests, variation of number of input request and transaction times per timestep. The main usage is policy server and webagent trace logs - but it also can parse and give graphs for a number of other types of SSO related trace logs.
The latest version can be downloaded from the communities site here : Siteminder Policy Trace Analysis
The ODBC driver from DataDirect is used by the policy server to connect to ODBC policy store, session store or user store. When there are performance problems - it may be response times from the database - so then enabling the DataDirect odbctrace.out file and running in through the Analysis tool will now give some graphs that may help.
1. The new option - selecting the ODBC trace log:
2. Setup DataDirect ODBC trace on Linux.
For Unix systems the DataDirect ODBC trace log is enabled via settings in : system_odbc.ini file. The following settings are needed:
More details on setting up the ODBC tracing here :
Analysis of the input files gives a pdf report file, and a directory "longtrans" of long, terminated, or incomplete transactions.
As well as printing out a section for stats on "AllRequests", we also have separate sections that report stats for some of the more commonly "slow" ODBC requests : SQLConnect, SQLExecute, SQLRowCount, SQLFetch, SQLAllocHandle.
The following reports a few of the graphs and explanation in the overall "AllRequests" section:
3.1 Stats Results
Overall stats of the trace analysis :
Here we can see all transactions were < 10sec, av time was 5ms response etc.
3.2 Distribution of Transaction Times
Overall distribution of the request times, it is a kind-of exponential x axis, so we can report both ms, sec and minute response times visually in a meaningful way (and a table of values is printed as well).
Here we can see that most transactions are quick, a number took > 300ms, and there is one that took, 2-3sec.
3.2 Concurrent Requests
The number of requests being processed at the same time per timestep, both a max, and av value are reported.
We can that generally in the 10sec interval we have on av 1-2 requests occurring (blue), but also within that 10sec interval there were peaks (red) where some 5-10 requests are occurring in parallel. We can also see we have a spike of 50 requests occurring at once in the graph - so the policy server at that time is executing 50 ODBC requests, up from background of 5-10 requests.
3.2 Graph of Requests Started and Completed.
Graphs a count of the number of "start" and "end" requests that we see per interval. When requests are completed at the same rate as they are started, the blue and red graphs overlap and we only see the red.
We can see that ODBC requests are about 4,000 per 10sec earlier in the day, and from the prior graph we can see that matches the slight increase in av active threads. Then it drops to about 500 per 10sec, then we have that peak going up to 3,000 requests come in at that time the connections increase.
Note: 3,000 may appear a lot, but from the ODBC trace level, it logs each SQLExecute and each SQLFetch as a separate operation, so that could mean one execute, and 2,999 fetch operations (it doesn't quite mean that in this case, since we have 50 requests happening in parallel, but just thought it best to explain why the numbers can be fairly high).
3.3 Graph of Av Response time per timestep.
To give an idea of when performance slows down during a trace we plot the av time for the last five time intervals. Then we can see where there are slowdowns in transaction performance over the analysis time.
Here for example we can see that despite the increase to 50 concurrent requests being executed, the av response is still < 50ms, and max response time only spikes to about 750ms - so it seems even through we have many more requests, that they are all being executed and response returned quite quickly - ie no real slowdown is seen from the database.
Note: To get the average time for a transaction we need to wait until the transaction has completed, but if a transaction is "hung" or slow, then that slowdown is not visible in this graph until the transaction completes, which can lead to showing a spike slightly "after" the problem an event, rather than during the problem.
To help with that here is another graph "Accumulated Open Transaction Time" in the pdf report - it is not shown here, but it plots the total accumulated time for all "in progress" transactions - and that is better at identifying when exactly transactions start going into a "hung" state or there is a spike in transactions being processed.
3.3 Long Transactions
We also extract and transaction that take longer than the cutoff time, are aborted, or did not complete, and place then in a subdirectory "longtrans". A number of them are also reported in the pdf report (the number depends on how long the are, for the ODBC transactions they are usually fairly short) eg:
Section 8.15. Trans : Time Taken :15 sec
2017-06-23 09:03:19.868536 ppid=4031:pid=4112:4c: ENTER SQLAllocHandle
2017-06-23 09:03:19.868536 ppid=4031:pid=4112:4c: SQLSMALLINT 3 <SQL_HANDLE_STMT>
2017-06-23 09:03:19.868536 ppid=4031:pid=4112:4c: SQLHANDLE 0x256402d4
2017-06-23 09:03:19.868536 ppid=4031:pid=4112:4c: SQLHANDLE * 0xf4a5a0f4
2017-06-23 09:03:35.074379 ppid=4031:pid=4112:4c: EXIT SQLAllocHandle with return code 0 (SQL_SUCCESS)
Cheers - Mark
Snr Principal Support Engineer - Global Customer Success