Created attachment 768879 [details]
log

Description of problem:
I ran into an issue after changing my Cinder setup to use a remote NFS share: I forgot to start rpcbind on one of the compute nodes. This gave me the idea that we should perhaps create a monitoring package that connects to Horizon and lets the user define different monitoring tools. For example, it could alert on issues during installation, or when one of the required services has crashed for some reason.

Version-Release number of selected component (if applicable):
python-django-horizon-2013.1.2-1.el6ost.noarch
openstack-nova-compute-2013.1.2-2.el6ost.noarch
openstack-cinder-2013.1.2-3.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Stop rpcbind on one of the compute nodes and try to attach a volume.
2. Stop libvirt on one of the compute nodes.
3.

Actual results:
There is no alert system that lets us query the state of important services that cause failures when they are down.

Expected results:
We should create a monitoring package that, if installed, allows different monitoring queries and reports their results in Horizon.

Additional info:
Log from the rpcbind issue.
We ship Nagios as part of the Red Hat Enterprise Linux OpenStack Platform product; it should be able to cover this and various other monitoring use cases.
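For illustration, a check like the one requested could be written as a small Nagios-style plugin. The sketch below is only an assumption of how that might look and is not part of the shipped product. The function names (`service_is_running`, `check_services`, `main`) are hypothetical, the service names `rpcbind` and `libvirtd` are taken from the reproduction steps, and it assumes SysV init scripts whose `service <name> status` exits with 0 when the service is running.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def service_is_running(name):
    """Return True if `service <name> status` exits with 0.

    Assumption: SysV-style init scripts, as on RHEL 6.
    """
    try:
        return subprocess.call(
            ["service", name, "status"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        ) == 0
    except OSError:
        # The `service` command is missing or could not be executed.
        return False


def check_services(names, probe=service_is_running):
    """Return (exit_code, message) in the form Nagios expects from a plugin."""
    down = [n for n in names if not probe(n)]
    if down:
        return CRITICAL, "CRITICAL - not running: " + ", ".join(down)
    return OK, "OK - running: " + ", ".join(names)


def main():
    # Hypothetical service list for a compute node.
    code, message = check_services(["rpcbind", "libvirtd"])
    print(message)
    return code
```

To use it as a plugin, the script would end with `sys.exit(main())` so that Nagios can read the exit code. The `probe` parameter exists only so that the check logic can be exercised without touching real services.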