CA Tuesday Tip: Optimization, The Forgotten Task

Discussion created by Hallett_German Employee on Mar 9, 2013
Latest reply on Apr 1, 2013 by MaryGreening
CA Wily Tuesday Tip by Hallett German, Sr. Support Engineer for 3/12/2013

In these APM Tuesday Tips, a framework has been presented in parts. One important component has been the APM Transaction Definition Lifecycle, which included an optimization step. This optimization can be extended to APM in general. (This tip is based on the Performance and Sizing Guide and various internal health check/optimization documents.) It also fits well with the proactive system administration philosophy discussed in earlier articles.

The first step is to know your environment.
- Is it stable or growing at some pace? Are more applications planned to be monitored soon? Are there changes planned to versions of the APM environment or its integrations?
- Are there performance glitches in your APM environment? Are they customer-impacting or administrator-impacting? Are they recent or ongoing?

The second step is to do a health check two to four times a year. It should include:
- Documenting the environment
- Summarizing performance issues in the last 4-6 months
- Historical analysis
- Recent configuration changes
- Recent installs/upgrades
- Overall assessment of environment
- Reports and Key Metrics
- Customization added recently
- Errors in logs
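
As one way to approach the "Errors in logs" item, here is a minimal Python sketch that counts ERROR and WARN entries per component. The log line shape, the function names, and the idea of grouping by a bracketed component name are assumptions for illustration, not a documented APM log format; adjust the pattern to match your own logs.

```python
import re
from collections import Counter

# Assumed (hypothetical) log line shape:
#   "3/09/13 10:15:02 AM EST [ERROR] [Manager] some message"
# The level is in square brackets, optionally followed by a bracketed component.
LEVEL_PATTERN = re.compile(r"\[(ERROR|WARN)\]\s*(?:\[([^\]]+)\])?")


def summarize_log_lines(lines):
    """Count ERROR and WARN entries, keyed by (level, component)."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            level = match.group(1)
            # Lines without a bracketed component are grouped as "unknown".
            component = match.group(2) or "unknown"
            counts[(level, component)] += 1
    return counts


def summarize_log_file(path):
    """Read a log file and summarize it; undecodable bytes are replaced."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_log_lines(handle)
```

Running this against the same logs at each health check gives a rough trend: a component whose error count keeps rising between checks is a candidate for closer review.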

This should address Introscope specifics such as:
- Cluster health (Overall, EM, Database, SmartStor, Load vs Capacity, Harvest/SmartStor Durations)
- Supportability metrics
- Naming standards
- Authentication/Authorization issues
- Operating system and database configuration settings

It should also include APM CE specifics such as:
- SSL/Network Data Quality
- Number of False Positive Defects
- Filters (Hardware, Span/tap, web server)
- User Groups (too many, not well organized, etc.)
- Naming standards
- Stats and Defects Aggregation Health
- HTTP Plug-ins
- APM CE Configuration

The third step is to make the needed changes. A sample of these is below:
- Remove unneeded things (metrics, calculators, alerts, user groups, definitions)
- Change the scope of metric groupings
- Redistribute the agent connections
- Tune settings on the EM (increasing EM Heap Size, Transport Outgoing Message Queue Size, others)
- Disable unneeded agent tracers
- Tune settings on APM CE (Domain Settings, User Accounts, Web Server Filters, My Reports, Data Retention, Incident and Business Impact, Introscope Settings, SLA and Defect Thresholds according to the application group's specifications)
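
To decide where agent connections might be redistributed, the load vs capacity figures gathered in the health check can be compared per collector. The sketch below is only an illustration: the `CollectorLoad` record, its fields, and the 80%/50% thresholds are hypothetical and are not taken from the Performance and Sizing Guide, so substitute the limits that apply to your environment.

```python
from dataclasses import dataclass


@dataclass
class CollectorLoad:
    """Hypothetical per-collector figures recorded during a health check."""
    name: str
    agent_count: int
    metric_count: int
    metric_capacity: int  # assumed sizing limit for this collector

    def __post_init__(self):
        if self.metric_capacity <= 0:
            raise ValueError("metric_capacity must be positive")

    @property
    def utilization(self):
        # Fraction of the assumed capacity currently in use.
        return self.metric_count / self.metric_capacity


def find_rebalance_candidates(collectors, high=0.8, low=0.5):
    """Return (overloaded, underused) collectors.

    The high/low thresholds are illustrative assumptions, not product limits.
    Overloaded collectors are sorted busiest first; underused ones idlest first.
    """
    overloaded = sorted(
        (c for c in collectors if c.utilization >= high),
        key=lambda c: c.utilization,
        reverse=True,
    )
    underused = sorted(
        (c for c in collectors if c.utilization <= low),
        key=lambda c: c.utilization,
    )
    return overloaded, underused
```

Agents on the busiest collectors in the first list are the natural candidates to move to collectors in the second list; the actual move is still done through your normal agent configuration.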

If you spend the time to be proactive and optimize your environment, you should have fewer outages and performance problems. Otherwise, you risk more outages and sluggish performance.

These are the discussion questions for this article:
1. Are you doing health checks on your APM environment? How often?
2. Have they improved overall performance?
3. What additional documentation do you need to have a healthy environment?