Created attachment 1199867 [details]
logs

Description of problem:
MOM is not available for vdsClient commands on NGN VLAN devices. After adding a rhvh-4.0-0.20160906.0+1 server to rhv-m 4.0.4 on top of a VLAN device (configured via ifcfg-* files), MOM isn't available.

vdsClient -s 0 getVdsCaps
/usr/share/vdsm/vdsClient.py:33: DeprecationWarning: vdscli uses xmlrpc. since ovirt 3.6 xmlrpc is deprecated, please use vdsm.jsonrpcvdscli
  from vdsm import utils, vdscli, constants
Traceback (most recent call last):
  File "/usr/share/vdsm/vdsClient.py", line 2980, in <module>
    code, message = commands[command][0](commandArgs)
  File "/usr/share/vdsm/vdsClient.py", line 543, in do_getCap
    return self.ExecAndExit(self.s.getVdsCapabilities())
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
    self.send_content(h, request_body)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
    connection.endheaders(request_body)
  File "/usr/lib64/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib64/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib64/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/usr/lib/python2.7/site-packages/vdsm/m2cutils.py", line 203, in connect
    sock = socket.create_connection((self.host, self.port), self.timeout)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 110] Connection timed out

Version-Release number of selected component (if applicable):
mom-0.5.5-1.el7ev.noarch
rhvh-4.0-0.20160906.0+1
vdsm-4.18.11-1.el7ev.x86_64

How reproducible:
100% on rhvh-4.0-0.20160906.0+1 with VLAN devices that were configured via ifcfg-* files. Non-VLAN devices worked as expected. On RHEL servers all devices worked.

Steps to Reproduce:
1. Create a VLAN device (nic/bond/static/dhcp) on a clean rhvh-4.0-0.20160906.0+1 with ifcfg-* files and restart the network
2. Successfully add the host to rhv-m 4.0.4
3. Run a vdsClient command on the host

Actual results:
Connection timed out because MOM isn't available. In the UI, refresh caps seems to work.

Expected results:
Should work
That's quite a confusing bug - is that a mom bug? Is that because of the VLAN, because of RHVH, or a combo of both?
I'm not sure if it's a mom bug, but the issue is caused and seen in a combo of both VLAN and RHVH.
1) I am not sure we support manual edits of ifcfg files at all
2) VDSM uses unix domain sockets to talk to MOM, that should not be affected by VLANs at all
3) MOM uses xmlrpc client as provided by VDSM and that might stall in case the network gets bad... but there is no data to indicate this
4) Your traceback is coming from vdsClient when trying to connect to VDSM, not to MOM

So my main question is... where did you see anything MOM related when testing this?
Hi Martin, I'm not sure it's a MOM-related issue, but I see it related to MOM when trying to run vdsClient commands on the server. Please advise whom I should contact to investigate this; I have a server with this issue waiting for someone to take a look. Thanks!
(In reply to Michael Burman from comment #4)
> I see it related to MOM when trying to run vdsClient commands on the server.

Where do you see it? The issue is definitely not on the MOM side, since you cannot connect to VDSM using either MOM or vdsClient.

> Please advise whom I should contact to investigate this; I have a server
> with this issue waiting for someone to take a look. Thanks!

Someone from VDSM networking? But they will ask for vdsm logs too.
I see it in the journalctl output:

Sep 13 10:53:36 orchid-vds2.qa.lab.tlv.redhat.com vdsm[22629]: vdsm MOM WARNING MOM not available.
Sep 13 10:53:36 orchid-vds2.qa.lab.tlv.redhat.com vdsm[22629]: vdsm MOM WARNING MOM not available, KSM stats will be missing.
Sep 13 10:53:36 orchid-vds2.qa.lab.tlv.redhat.com vdsm[22629]: vdsm root ERROR Report host stats failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 113, in report_stats
    report[prefix + '.cpu.ksm_pages'] = hoststats['ksmPages']
KeyError: 'ksmPages'
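Editor's note: the KeyError in the traceback above follows directly from the warning before it. When MOM is unavailable, the 'ksmPages' key is absent from the host stats, so the direct dictionary lookup fails. A minimal sketch of a tolerant lookup is shown here. It is not vdsm's actual code or its fix; the function name add_ksm_stats and the default prefix are hypothetical, and only the 'ksmPages' to '.cpu.ksm_pages' mapping is taken from the traceback.

```python
def add_ksm_stats(report, hoststats, prefix="hosts.example"):
    """Copy KSM counters into `report`, skipping any that are absent.

    Hypothetical helper for illustration only. Returns the list of
    source keys that were missing (e.g. because MOM was not reachable).
    """
    # Source key in hoststats -> suffix of the reported metric name.
    mapping = {
        "ksmPages": ".cpu.ksm_pages",
    }
    missing = []
    for src, suffix in mapping.items():
        if src in hoststats:
            report[prefix + suffix] = hoststats[src]
        else:
            # A direct hoststats[src] lookup would raise KeyError here.
            missing.append(src)
    return missing
```

With an empty hoststats dictionary the helper leaves report untouched and returns the missing key names, so the caller can log them instead of failing the whole stats report.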
Created attachment 1200379 [details] vdsm logs
Danken has found the reason that prevents running a vdsClient -s 0 command on VLAN devices: vdsm tries to connect to an FQDN that isn't the VLAN one. For example, on our server orchid-vds2.qa.lab.tlv.redhat.com, the FQDN for VLAN 162 is orchid-vds2-vlan162.qa.lab.tlv.redhat.com. When we configure a device with VLAN 162 and restart the network, the hostname isn't updated, so vdsm tries to connect to orchid-vds2.qa.lab.tlv.redhat.com, and that is why it fails. Running the vdsClient command with the IP or with the correct FQDN works, but this does not explain the 'MOM not available' errors in the logs.
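Editor's note: the stale-hostname diagnosis above can be checked by comparing which candidate names resolve and accept a TCP connection. The following is a rough sketch using only the Python standard library, not part of vdsm or vdsClient. The helper names are hypothetical, and port 54321 is assumed to be the vdsm management port on this setup.

```python
import socket

VDSM_PORT = 54321  # assumed vdsm management port; adjust if your setup differs


def resolve_addresses(name):
    """Return the sorted IP addresses `name` resolves to ([] on failure)."""
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port=VDSM_PORT, timeout=3.0):
    """Return True if a plain TCP connect to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and resolution failures.
        return False


def diagnose(candidates, port=VDSM_PORT, timeout=3.0):
    """Map each candidate name or IP to (resolved addresses, reachable?)."""
    return {
        name: (resolve_addresses(name), can_connect(name, port, timeout))
        for name in candidates
    }
```

On the affected host, calling diagnose with both socket.getfqdn() and the VLAN FQDN would be expected to show the stale name as unreachable and the VLAN name as reachable, if the diagnosis holds. This only tests TCP reachability; it does not perform the SSL handshake that vdsClient -s uses.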
Ahh so this might be the same issue we saw in https://bugzilla.redhat.com/show_bug.cgi?id=1377161 and https://bugzilla.redhat.com/show_bug.cgi?id=1358530 Can you please check it and confirm that it is the same behaviour? If it is, then this can be marked as test only, because it was fixed in https://gerrit.ovirt.org/#/c/63308/2
MOM not available messages are caused by network timeout, as MOM itself is stuck (or down) waiting for vdsm connection.
(In reply to Martin Sivák from comment #9)
> Ahh so this might be the same issue we saw in
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1377161
>
> and
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1358530
>
>
> Can you please check it and confirm that it is the same behaviour? If it is,
> then this can be marked as test only, because it was fixed in
>
> https://gerrit.ovirt.org/#/c/63308/2

Hi Martin,

I can't confirm that these bugs are the same issue, but I can confirm that when using rhvh-4.0-0.20160919.0+1 with vdsm-4.18.13-1.el7ev.x86_64 everything works as expected. Running vdsClient -s 0 getVdsCaps on a VLAN device is OK.

Thanks
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Verified on - rhvh-4.0-0.20160919.0+1 and vdsm-4.18.13-1.el7ev.x86_64