Four things within APM can cause the APM cluster to slow down:
1. Number of Metrics (live and historic)
2. Number of agents & applications
3. Number of traces
4. EM Resources (number of EMs/CPU/Memory/Disk access/Network)
You mentioned that you have clamps. What is being clamped?
Since you included the transaction trace settings, let us assume that the traces are causing the issue. Traces are taken at random and on error. An error can be an exception or a stalled transaction (a transaction that runs longer than 30 seconds). Look at the traces from the agent that hit the transaction trace clamp and see if you can find a pattern or any hints about why the traces are occurring.
On our system we found transactions taking more than 10 minutes each, because the service was being used for large batch processing. We changed our stall threshold from 30 seconds to 4200 seconds, which helped cut down on the transaction traces from those agents.
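To see why raising the stall threshold cuts down on traces, here is a minimal sketch. The durations are made-up values for illustration and `count_stalls` is a hypothetical helper, not part of APM. It only counts how many transactions would run longer than a given threshold.

```python
# Hypothetical transaction durations in seconds (illustrative values only,
# not measurements from any real system).
durations = [2, 45, 610, 900, 15, 3700, 4500]


def count_stalls(durations_s, threshold_s):
    """Count transactions that run longer than the stall threshold."""
    return sum(1 for d in durations_s if d > threshold_s)


# Compare the 30-second threshold with the raised 4200-second threshold.
default_stalls = count_stalls(durations, 30)
raised_stalls = count_stalls(durations, 4200)
print("stalls at 30s:", default_stalls)
print("stalls at 4200s:", raised_stalls)
```

With long-running batch transactions, most of them exceed 30 seconds, so nearly every one is treated as a stall. Raising the threshold leaves only the real outliers.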
Now, as for resources, we tend to cut the number of days the traces are kept from the default of 14 days to 7, and then increase the trace disk space from 1 GB to 4 GB. But we only did that because we saw "Out of trace space" messages in the APM Status console.
As for the number of metrics and the number of agents, an increase in either would usually start to show up as a longer SmartStor duration or harvest duration.
So, instead of jumping around guessing what your issue might be, could you provide more details about your environment?
Number of collectors
Host/Servers - physical or virtual
Collector resources (CPU cores, RAM, OS, disk type (NAS, DASD, SAN, physical RAID))
Number of agents (application agents, Java and .NET)
Total Number of metrics
Number of applications
And then a history lesson: when did this slowness start? Has it always been this way?
If your APM suddenly started to behave poorly one day, we need to backtrack to see whether there was a dramatic increase in metrics/agents/applications/traces, or whether some shared resource (CPU/memory/disk/network, virtual hosts) is being overtaxed.
You can also pull the perf log from the Enterprise Managers and check for an increase in the harvest duration or in the number of metrics, which might help narrow down this issue.
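As a rough sketch of that comparison, assume you have already pulled the harvest duration values out of the perf log into a plain list, oldest first. The extraction itself is not shown here, the sample values are made up, and `harvest_increase` is a hypothetical helper. It compares the average before and after the point where you think the slowness began.

```python
from statistics import mean


def harvest_increase(samples, split_index):
    """Return mean harvest duration after split_index divided by the mean before it."""
    before = samples[:split_index]
    after = samples[split_index:]
    if not before or not after:
        raise ValueError("need samples on both sides of the split")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean is zero")
    return mean(after) / baseline


# Illustrative values only: harvest durations in milliseconds, oldest first.
samples_ms = [120, 130, 110, 125, 400, 450, 420]

# Index 4 is where we suspect the slowness started.
ratio = harvest_increase(samples_ms, 4)
print("harvest duration grew by a factor of", round(ratio, 2))
```

A ratio well above 1 suggests the harvest really did get slower around that time, which gives you a date to line up against changes in agents, metrics, or shared resources.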
Also, in <EM_HOME>/example there are two management modules that I highly suggest you deploy and customize for your APM environment:
MOM_Infra_Monitoring_MM.jar
Collector_1.jar (copy this jar for each of your collectors and customize the MOM dashboards/alerts to include all of your collectors)
There is also a "Supportability.jar" that is useful, since it gives you more of an overview of the APM cluster.