Hide Forgot
Description of problem: when restarting vdsm, mom also restarting, and immediately failed. [journalctl --unit vdsmd] Nov 08 18:25:27 RHEL7.2Server systemd[1]: Stopping Virtual Desktop Server Manager... Nov 08 18:25:27 RHEL7.2Server vdsmd_init_common.sh[3053]: vdsm: Running run_final_hooks Nov 08 18:25:27 RHEL7.2Server systemd[1]: Starting Virtual Desktop Server Manager... [journalctl --unit mom-vdsm] Nov 08 18:17:23 RHEL7.2Server systemd[1]: Starting MOM instance configured for VDSM purposes... Nov 08 18:25:16 RHEL7.2Server systemd[1]: Stopping MOM instance configured for VDSM purposes... Nov 08 18:25:29 RHEL7.2Server systemd[1]: Started MOM instance configured for VDSM purposes. But mom then immediately failed with: 2016-11-08 18:25:08,008 - mom.RPCServer - INFO - ping() 2016-11-08 18:25:08,010 - mom.RPCServer - INFO - getStatistics() 2016-11-08 18:25:17,028 - mom.RPCServer - INFO - RPC Server ending 2016-11-08 18:25:24,705 - mom.GuestManager - INFO - Guest Manager ending 2016-11-08 18:25:26,575 - mom.HostMonitor - INFO - Host Monitor ending 2016-11-08 18:25:29,869 - mom - INFO - MOM starting 2016-11-08 18:25:29,905 - mom.HostMonitor - INFO - Host Monitor starting 2016-11-08 18:25:29,905 - mom - INFO - hypervisor interface vdsmjsonrpcbulk 2016-11-08 18:25:30,029 - mom.vdsmInterface - ERROR - Cannot connect to VDSM! [Errno 111] Connection refused 2016-11-08 18:25:30,030 - mom - ERROR - Failed to initialize MOM threads Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in run hypervisor_iface = self.get_hypervisor_interface() File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217, in get_hypervisor_interface return module.instance(self.config) File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcbulkInterface.py", line 47, in instance return JsonRpcVdsmBulkInterface() File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcbulkInterface.py", line 29, in __init__ super(JsonRpcVdsmBulkInterface, self).__init__() File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcInterface.py", line 43, in __init__ .orRaise(RuntimeError, 'No connection to VDSM.') File "/usr/lib/python2.7/site-packages/mom/optional.py", line 28, in orRaise raise exception(*args, **kwargs) RuntimeError: No connection to VDSM. Version-Release number of selected component (if applicable): Red Hat Virtualization Manager Version: 4.0.5.4-0.1.el7ev vdsm-4.18.15-23.git9656d58.el7.centos.x86_64 mom-0.5.8-1.el7ev.noarch How reproducible: Steps to Reproduce: 1. install vdsm from master and mom-0.5.8-1.el7ev.noarch 2. restart vdsmd service and look at mom logs 3. Actual results: Expected results: Additional info:
*** Bug 1396826 has been marked as a duplicate of this bug. ***
I am moving this to VDSM component as that is where the fix will be published.
*** Bug 1397268 has been marked as a duplicate of this bug. ***
I don't think this bug should be ON_QA, it actually should failed QA. messages log still spammed with 'vdsm MOM WARN MOM not available, KSM stats will be missing.' on vdsm-4.18.17-1.el7ev.x86_64
The fix will be part of 4.0.6. It is already merged too, but the automation is not smart enough to know that it is not present in the last weeks build..
I still see this bug on 4.0.6.3-0.1.el7ev vdsm-4.18.18-4.git198e48d.el7ev.x86_64 Dec 11 11:38:11 silver-vdsa journal: vdsm MOM WARNING MOM not available. Dec 11 11:38:11 silver-vdsa journal: vdsm MOM WARNING MOM not available, KSM stats will be missing. Dec 11 11:38:27 silver-vdsa journal: vdsm MOM WARNING MOM not available. Dec 11 11:38:27 silver-vdsa journal: vdsm MOM WARNING MOM not available, KSM stats will be missing. Dec 11 11:38:43 silver-vdsa journal: vdsm MOM WARNING MOM not available. Don't think this report should be ON_QA
Michael: Check /usr/lib/systemd/system/mom-vdsm.service, it should contain this: https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=init/systemd/mom-vdsm.service.in;h=e83acb40fcfd12f78d27503e92f7e5e4068b731f;hb=e4be1485f2a905c247d935aed72b106026713478 The link will show you the content of the file we ship with 4.18.18 and it contains the proper fix. So you either - haven't reloaded systemd, have an .rpmnew file or systemd override laying somewhere (/etc/systemd/system/mom-vdsm.*) or triggered the maximum failure count (So check /var/log/vdsm/mom.log and journalctl for mom-vdsm).
Martin, cat /usr/lib/systemd/system/mom-vdsm.service [Unit] Description=MOM instance configured for VDSM purposes Requires=vdsmd.service After=vdsmd.service Conflicts=momd.service ksmtuned.service [Service] Type=simple LimitCORE=infinity ExecStart=/usr/sbin/momd -c /etc/vdsm/mom.conf Restart=always RestartSec=5 User=vdsm Group=kvm TimeoutStopSec=10 StartLimitBurst=10 StartLimitInterval=120 [Install] WantedBy=multi-user.target Looks like my vdsm vdsm-4.18.18-4.git198e48d.el7ev.x86_64 do contain the fix but messages log is still spammed with those errors. Do i need to do something after updating vdsm?
Check whether mom is up or down: systemctl status mom-vdsm.service and get the reason why it is down. Check journalctl for mom to see why it was not restarted: journalctl --unit mom-vdsm.service Check mom log for errors: cat /var/log/vdsm/mom.log Check what happens when you try starting mom manually: systemctl start mom-vdsm.service If mom was not started because of triggering the burst limits, use the following to reset them: systemctl reset-failed mom-vdsm.service
Btw, you can also try systemctl daemon-reload to make sure systemd knows about the updated unit files.
Try to verify on : vdsm-4.18.18-4.git198e48d.el7ev.x86_64 mom-0.5.8-1.el7ev.noarch Verification steps: 0. look at mom logs 1. restart Vdsm 2. Check in mom logs again immediately after upgrade vdsm, mom is down with the same failure: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in run hypervisor_iface = self.get_hypervisor_interface() File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217, in get_hypervisor_interface return module.instance(self.config) File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcbulkInterface.py", line 47, in instance return JsonRpcVdsmBulkInterface() File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcbulkInterface.py", line 29, in __init__ super(JsonRpcVdsmBulkInterface, self).__init__() File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcInterface.py", line 43, in __init__ .orRaise(RuntimeError, 'No connection to VDSM.') File "/usr/lib/python2.7/site-packages/mom/optional.py", line 28, in orRaise raise exception(*args, **kwargs) RuntimeError: No connection to VDSM. But after restarting vdsmd service, mom is running OK. I'm not sure that this is valid behaviour, Martin is this OK that mom didn't work after upgrading vdsm?
Verifying this bug, msivak check and this is a valid behavior.