Currently, the engine generates the following alert when it detects that a hypervisor's network usage is high:
Used Network resources of host $hypervisor [100%] exceeded defined threshold [95%].
This does not help when we are investigating a sporadic or intermittent network usage spike after the fact, for example by looking back through the audit_log. This RFE is to adjust the message to include which logical network(s) are using the most bandwidth, e.g.:
Networks $network1, $network2 on hypervisor $hypervisor are using a total of $usage_perc of network resources.
Or anything similar that will identify the networks triggering the alert.
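To make the request concrete, here is a minimal sketch of how such a message could be assembled. It is not the engine's code: the function name, the per-network usage mapping, and the top_n cutoff are all hypothetical, and it assumes per-network usage percentages are already available for the host.

```python
def format_network_alert(hypervisor, usage_by_network, top_n=2):
    """Build an alert naming the busiest logical networks on a hypervisor.

    usage_by_network is a hypothetical input: it maps a logical network
    name to its share of the host's network capacity, in percent.
    """
    # Rank networks by usage, highest first; break ties by name so the
    # output is stable.
    ranked = sorted(usage_by_network.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ranked[:top_n]
    names = ", ".join(name for name, _ in top)
    # The reported total covers only the networks named in the message.
    total = sum(perc for _, perc in top)
    return (
        f"Networks {names} on hypervisor {hypervisor} are using a total of "
        f"{total}% of network resources."
    )


if __name__ == "__main__":
    # Illustrative values only.
    print(format_network_alert("host1", {"ovirtmgmt": 5, "vlan100": 60, "vlan200": 32}))
```

Reporting only the top few networks keeps the audit_log entry short while still pointing at the likely source of the spike.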
Starting with 3.4, the alert should specify the individual NIC(s) and not just the hypervisor name; see BZ 1070667. This should already be a significant improvement. Can you take a look at that and see whether it helps?
For this customer request, I guess that several networks (VLANs) share the same NIC, so the customer would prefer to get the information at the per-network level? Also, keep in mind that the congestion/network load can be caused by outside speakers on the network, so it is not always trivial to provide this kind of information at the hypervisor level.
Jake, can you please comment on this?
After going over the situation with the customer, the individual NIC report should be sufficient for our purposes right now.
However, we have numerous customers with a fair number of VLANs on the same physical NIC, so I could see us still wanting to pin the alert to a specific logical network in the future.
We will be resolving this as part of alerting on the metric store.
This seems to be network-related logging on the engine; moved to network.