For the next two months, we will talk about several APM CE scenarios. These scenarios provide useful functionality but can carry hidden costs, including complex support cases.
The scenario discussed this month is using APM CE (CEM) only to generate metrics that are viewed and alerted on in the Investigator. This data is then analyzed for short- and long-term trends. Any alerts are discussed and resolved in conjunction with the application team. As long as the dashboards are populated, the metrics are produced, and the alerts send proper notifications, all is considered good.
But this approach comes with lost opportunities and potential future setbacks. These include the following:
1) Typically, APM CE (CEM) maintenance is ignored completely. As a result, incidents and defects can pile up, depending on volume and data retention settings. In the worst case, the APM database and the APM CE GUI become unresponsive. If periodic maintenance and cleanup are not done, you risk hours or days of downtime caused by neglecting the database. So if you take this approach, ensure that the database is maintained and not ignored.
2) If the APM CE transaction definitions are not maintained, then invalid metrics, including invalid defect-dependent metrics, may be produced. Using fresh recordings increases the likelihood that the application's current components are being captured. If they are not, the average response time metric may be lower than reality, because missing or invalid components mean an invalid response time. So a review of transaction definitions at least once a quarter is a must.
3) Analyzing the metrics on the APM CE side lets you see the subtleties in defect and incident trends over time. Even better, you can resolve the low-hanging defects first and then focus on the incidents that truly impact application health.
4) As of today, looking at the Introscope metrics alone does not allow you to analyze user or user group trends. So you are unable to tell whether a performance problem is affecting a small group of users or is system-wide.
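To make point 2 concrete, here is a minimal Python sketch of how a stale transaction definition can understate response time. It rests on a simplifying assumption that is not taken from the product: that a transaction's reported time is the sum of the times of the components listed in its definition. The component names, the timings, and the transaction_time helper are all hypothetical and are not part of any APM CE API.

```python
# Illustrative only: assumes a transaction's reported time is the sum of the
# response times of the components named in its definition (a simplification).

def transaction_time(component_times_ms, defined_components):
    """Sum the times of only those components the definition knows about."""
    return sum(
        time_ms
        for name, time_ms in component_times_ms.items()
        if name in defined_components
    )

# Hypothetical timings for what the application actually serves today.
actual_components_ms = {
    "login.html": 120,
    "app.js": 80,
    "api/session": 300,
    "new-widget.js": 150,  # added after the last recording
}

# Stale definition: recorded before new-widget.js existed.
stale_definition = {"login.html", "app.js", "api/session"}
# Fresh definition: re-recorded against the current application.
fresh_definition = set(actual_components_ms)

stale_ms = transaction_time(actual_components_ms, stale_definition)
fresh_ms = transaction_time(actual_components_ms, fresh_definition)

print(f"stale definition reports {stale_ms} ms")
print(f"fresh definition reports {fresh_ms} ms")
print(f"understated by {fresh_ms - stale_ms} ms")
```

Under these made-up numbers the stale definition misses the newest component entirely, so the reported time looks better than what users actually experience.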
It takes words and music to make up a song. So it is with APM. Both customer experience and application analysis are needed to see the complete picture.
1. Are you one of the companies using this approach? Is it working well for you?
2. Do you have anything to add about the tradeoffs of this approach?
3. Are there other topics that you wish to see covered?