I wanted to share an issue I recently ran into while working with the vmware probe. It started with a dashboard (a list view) whose data did not match the alerts being raised for the same QoS. The list view in question gives quick access to the cdm metrics: it displays CPU, disk and memory utilisation for servers that are highly utilized. On one occasion we received an alert that CPU was highly utilized on a server, but the list view did not display that server. To troubleshoot, I created a list view for just that server, and it matched both the alert and the QoS. With all the servers included, it did not. This is what I found and how I fixed it:
1. We monitor VM host CPU data in the vmware probe, and the QoS for that metric was mapped to the same QoS name that the cdm probe uses. This caused the mismatch, and the wrong data was displayed.
2. The CPU list view showed the wrong information because I had selected * in the targets list for the CPU usage QoS from the cdm probe, so the view was not limited to the cdm targets.
Because the cdm probe and the vmware probe used the same QoS name for CPU usage, the data got mixed up, and I had to change the QoS name for many other metrics under the vmware probe as well. Watch out for this: the vmware probe offers both QOS_CPU_USAGE and QOS_VMWARE_HOST_CPU_USAGE. I had to switch all of them to the vmware-specific names.
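
To illustrate the collision, here is a small Python sketch. It is not the probe's or the dashboard's actual logic, just a toy model under my own assumptions: the sample records, the host names, the values, and the helper functions list_view and rename_qos are all made up, and I am assuming the list view simply shows the top rows for a QoS name across every matching target.

```python
from fnmatch import fnmatch

# Hypothetical QoS samples. Hosts, values, and field names are invented for illustration.
BEFORE = [
    {"qos": "QOS_CPU_USAGE", "probe": "cdm", "target": "server01", "value": 95.0},
    {"qos": "QOS_CPU_USAGE", "probe": "cdm", "target": "server02", "value": 40.0},
    # The vmware probe publishing host CPU under the same name as cdm.
    {"qos": "QOS_CPU_USAGE", "probe": "vmware", "target": "esxhost01", "value": 97.0},
    {"qos": "QOS_CPU_USAGE", "probe": "vmware", "target": "esxhost02", "value": 98.0},
]


def list_view(samples, qos, target_pattern="*", top_n=2):
    """Return the top_n (target, value) rows for one QoS name, highest value first."""
    rows = [
        (s["target"], s["value"])
        for s in samples
        if s["qos"] == qos and fnmatch(s["target"], target_pattern)
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top_n]


def rename_qos(samples, probe, old, new):
    """Return a copy of samples where one probe's QoS name is changed from old to new."""
    return [
        dict(s, qos=new) if s["probe"] == probe and s["qos"] == old else s
        for s in samples
    ]


# The fix described above: give the vmware data its own QoS name.
AFTER = rename_qos(BEFORE, "vmware", "QOS_CPU_USAGE", "QOS_VMWARE_HOST_CPU_USAGE")

print(list_view(BEFORE, "QOS_CPU_USAGE"))  # vmware hosts crowd out server01
print(list_view(AFTER, "QOS_CPU_USAGE"))   # only cdm targets remain
```

With the shared name and a * target, the vmware hosts take the top slots and server01 drops out of the view even though it is at 95%. Once the vmware data uses its own QoS name, the same view only contains the cdm servers.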
I am still wondering whether this QoS was made available intentionally or whether it needs another look. The same issue occurred with memory usage, and I applied the same fix there.
Has anyone else noticed this while working with the vmware probe? Let me know.