I administer four production clusters (10 collectors each) and two non-production clusters (5 collectors each), and monitoring them all can be a pain. Some of the production clusters are under heavy load, so I cannot always rely on them to self-alert when they are having issues, and when the issues are serious I can't even log into Workstation/Webview to see what is going on.
So I have been using EPAgents that report to my most stable cluster to monitor all the MOMs/Collectors across all clusters. The attached Perl script has been the most useful part of that monitoring. It parses the EM perflog file and reports the performance values as normal metrics. With it, I no longer have to go through the tedious task of downloading the perflog files from each MOM/Collector and formatting them in Excel to interpret the data; I can just look at a dashboard and see live data.
If you have multiple large environments as well, you may find this script useful.
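For anyone who just wants the general idea before opening the attachment, here is a minimal sketch of the parse-and-report step in Python. It is not the attached Perl script. It assumes a simplified, hypothetical perflog layout (one tab-delimited header line of column names, then tab-delimited rows of values), and the column names and the "EM Perflog" metric prefix are made up for illustration. The output lines follow the XML metric form that EPAgent plugins write to stdout, as I understand it, so check the format against your EPAgent version before relying on it.

```python
from xml.sax.saxutils import quoteattr


def perflog_to_metrics(header_line, data_line, prefix="EM Perflog"):
    """Turn one perflog sample into EPAgent-style XML metric lines.

    Assumption: header_line and data_line are tab-delimited and have
    the same column order. The real perflog layout may differ.
    """
    names = header_line.rstrip("\n").split("\t")
    values = data_line.rstrip("\n").split("\t")
    lines = []
    for name, raw in zip(names, values):
        try:
            value = int(float(raw))
        except (ValueError, OverflowError):
            # Skip timestamps and any other non-numeric columns.
            continue
        # Hypothetical metric path: "<prefix>:<column name>".
        metric_name = quoteattr(f"{prefix}:{name}")
        lines.append(
            f'<metric type="IntCounter" name={metric_name} value="{value}" />'
        )
    return lines
```

A wrapper would read the newest row of each perflog file on a schedule and print the returned lines to stdout, which is where the EPAgent picks them up.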