Bug 846307
Summary: | [vdsm] super-vdsm is restarted upon IO error and vdsm communicates with old socket | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Meni Yakove <myakove> | ||||||
Component: | vdsm | Assignee: | Saggi Mizrahi <smizrahi> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Meni Yakove <myakove> | ||||||
Severity: | urgent | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 6.3 | CC: | abaron, bazulay, hateya, iheim, ilvovsky, lpeer, mavital, smizrahi, ykaul, zdover | ||||||
Target Milestone: | rc | Keywords: | ZStream | ||||||
Target Release: | --- | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | infra | ||||||||
Fixed In Version: | vdsm-4.9.6-39.0 | Doc Type: | Bug Fix | ||||||
Doc Text: |
Previously, an IO error occurred when all of the following conditions were met:
* A bond had been created with two VLAN networks, one
with MTU 5000 and the other with MTU 9000
* A virtual machine was created with the above-mentioned
networks attached
* The networks were deactivated on the virtual machine
* SetupNetwork was opened, and the bond was broken
When these conditions were met, the IO error caused supervdsm to restart.
VDSM has been updated so that supervdsm no longer restarts when an IO error occurs.
|
Story Points: | --- | ||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2012-12-04 19:04:45 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Yaniv.B, I'm sure there is an easier way to reproduce this than the above. Please try.

Seems to be fixed in this patch: http://gerrit.ovirt.org/#/c/7214/6
The exception doesn't appear.

The above patch fixes the specific error thrown in the above scenario. However, this BZ is about net being able to recover from super-vdsm restart.

supervdsmServer doesn't spawn itself again after a crash. As I understand it, we don't have a recovery process for superVdsm if it is killed. superVdsmServer has a _restartSupervdsm function that can be called, but I'm not sure that this is the case here... I can check whether restartSuperVdsm works well with networking after execution, if this is what you mean.

I managed to reproduce this:
1. Attach network (VM_1) to NIC.
2. Delete ifcfg-VM_1 from /etc/sysconfig/network-scripts/
3. Try to remove network VM_1 using SetupNetwork > Failed.
vdsm log attached.
After I restart the VDSM service, brctl still shows the VM_1 net without an interface attached to it, and the VM_1 net also still exists as a virsh network.

Created attachment 611370 [details]
new vdsm.log

You are right. After running an operation that throws an exception that we don't catch in supervdsmServer, we catch it in the ProxyCaller::__call__ method. In this except code we reset supervdsmServer and call the same method again. If the exception is thrown again, we leave supervdsmServer down, and this is how supervdsmServer remains. This is what appears in your log, and this is what I suggest: http://gerrit.ovirt.org/#/c/7901
Please verify that this fixes the error you saw.

Posted modified patch http://gerrit.ovirt.org/#/c/7901/ please review.

vdsm-4.9.6-41.0.el6_3.x86_64
supervdsm not restarting upon I/O error.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-1508.html |
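The retry behaviour described for ProxyCaller::__call__ can be illustrated with a small sketch. This is not the vdsm code and not necessarily what the linked gerrit patch does; the classes FakeSupervdsm and ProxyCaller below, and their methods, are hypothetical stand-ins written for Python 3. The idea shown is one possible approach: restart and retry only when the connection itself fails, and let errors raised by the called function propagate without touching the server.

```python
class FakeSupervdsm:
    """Hypothetical stand-in for the privileged supervdsm process."""

    def __init__(self):
        self.alive = True
        self.restarts = 0

    def restart(self):
        self.restarts += 1
        self.alive = True

    def call(self, func_name, *args):
        if not self.alive:
            # Transport-level failure: nobody is listening on the socket.
            raise BrokenPipeError(32, "Broken pipe")
        if func_name == "setupNetworks":
            # Application-level failure: the server is healthy, the call is not.
            raise FileNotFoundError(2, "No such file or directory", "ifcfg-11")
        return "ok"


class ProxyCaller:
    """Retry once after a restart, but only for transport failures."""

    def __init__(self, server, func_name):
        self._server = server
        self._func_name = func_name

    def __call__(self, *args):
        try:
            return self._server.call(self._func_name, *args)
        except (ConnectionError, EOFError):
            # The connection is gone, so a restart is justified.
            self._server.restart()
            return self._server.call(self._func_name, *args)
        # Any other exception (for example FileNotFoundError raised by the
        # called function) propagates to the caller and triggers no restart.


server = FakeSupervdsm()
try:
    ProxyCaller(server, "setupNetworks")()
except FileNotFoundError as exc:
    print("call failed, server left running:", exc.strerror)
print("restarts after application error:", server.restarts)

server.alive = False  # simulate a crashed server
print("ping after crash:", ProxyCaller(server, "ping")())
print("restarts after transport error:", server.restarts)
```

In this sketch an IOError from the called function never kills the server, so a second call cannot hit a stale connection; only a genuinely broken connection leads to a restart.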
Created attachment 602729 [details]
vdsm.log

Description of problem:
When there is an IO error, supervdsm restarts.

MainProcess|Thread-981::ERROR::2012-08-07 14:20:49,016::configNetwork::1261::setupNetworks::(setupNetworks) [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-11'
Traceback (most recent call last):
  File "/usr/share/vdsm/configNetwork.py", line 1203, in setupNetworks
    implicitBonding=False)
  File "/usr/share/vdsm/configNetwork.py", line 871, in delNetwork
    configWriter.setNewMtu(network)
  File "/usr/share/vdsm/configNetwork.py", line 459, in setNewMtu
    mtu = self._getConfigValue(cf, 'MTU')
  File "/usr/share/vdsm/configNetwork.py", line 362, in _getConfigValue
    with open(conffile) as f:
IOError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-11'
MainProcess|Thread-981::ERROR::2012-08-07 14:20:49,017::supervdsmServer::61::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer.py", line 59, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer.py", line 107, in setupNetworks
    return configNetwork.setupNetworks(networks, bondings, **options)
  File "/usr/share/vdsm/configNetwork.py", line 1203, in setupNetworks
    implicitBonding=False)
  File "/usr/share/vdsm/configNetwork.py", line 871, in delNetwork
    configWriter.setNewMtu(network)
  File "/usr/share/vdsm/configNetwork.py", line 459, in setNewMtu
    mtu = self._getConfigValue(cf, 'MTU')
  File "/usr/share/vdsm/configNetwork.py", line 362, in _getConfigValue
    with open(conffile) as f:
IOError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-11'
Thread-981::DEBUG::2012-08-07 14:20:49,018::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /usr/bin/kill -9 424' (cwd None)
Thread-981::DEBUG::2012-08-07 14:20:49,037::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
Thread-981::DEBUG::2012-08-07 14:20:49,038::supervdsm::81::SuperVdsmProxy::(_launchSupervdsm) Launching Super Vdsm
Thread-981::DEBUG::2012-08-07 14:20:49,038::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /usr/bin/python /usr/share/vdsm/supervdsmServer.py 764fa685-44bb-483f-8bf1-2e1a8a1e0437 29056' (cwd None)
MainThread::DEBUG::2012-08-07 14:20:49,205::supervdsmServer::245::SuperVdsm.Server::(main) Making sure I'm root
MainThread::DEBUG::2012-08-07 14:20:49,212::supervdsmServer::249::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2012-08-07 14:20:49,212::supervdsmServer::252::SuperVdsm.Server::(main) Creating PID file
MainThread::DEBUG::2012-08-07 14:20:49,212::supervdsmServer::256::SuperVdsm.Server::(main) Cleaning old socket
MainThread::DEBUG::2012-08-07 14:20:49,213::supervdsmServer::260::SuperVdsm.Server::(main) Setting up keep alive thread
MainThread::DEBUG::2012-08-07 14:20:49,213::supervdsmServer::265::SuperVdsm.Server::(main) Creating remote object manager
MainThread::DEBUG::2012-08-07 14:20:49,214::supervdsmServer::276::SuperVdsm.Server::(main) Started serving super vdsm object
Thread-984::DEBUG::2012-08-07 14:20:49,388::BindingXMLRPC::864::vds::(wrapper) client [10.35.97.119]::call ping with () {} flowID [4064a45a]
Thread-984::DEBUG::2012-08-07 14:20:49,389::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
VM Channels Listener::DEBUG::2012-08-07 14:20:49,554::vmChannels::103::vds::(_handle_unconnected) Trying to connect fileno 22.
VM Channels Listener::DEBUG::2012-08-07 14:20:49,554::guestIF::79::vm.Vm::(_connect) vmId=`2e927215-14ce-4d62-bc86-11b082a49cac`::Attempting connection to /var/lib/libvirt/qemu/channels/VM1_MTU.com.redhat.rhevm.vdsm
Thread-985::DEBUG::2012-08-07 14:20:49,902::BindingXMLRPC::864::vds::(wrapper) client [10.35.97.119]::call ping with () {} flowID [4064a45a]
Thread-985::DEBUG::2012-08-07 14:20:49,902::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
Thread-986::DEBUG::2012-08-07 14:20:50,414::BindingXMLRPC::864::vds::(wrapper) client [10.35.97.119]::call ping with () {} flowID [4064a45a]
Thread-986::DEBUG::2012-08-07 14:20:50,415::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
VM Channels Listener::DEBUG::2012-08-07 14:20:50,555::vmChannels::103::vds::(_handle_unconnected) Trying to connect fileno 22.
VM Channels Listener::DEBUG::2012-08-07 14:20:50,555::guestIF::79::vm.Vm::(_connect) vmId=`2e927215-14ce-4d62-bc86-11b082a49cac`::Attempting connection to /var/lib/libvirt/qemu/channels/VM1_MTU.com.redhat.rhevm.vdsm
Thread-987::DEBUG::2012-08-07 14:20:50,927::BindingXMLRPC::864::vds::(wrapper) client [10.35.97.119]::call ping with () {} flowID [4064a45a]
Thread-987::DEBUG::2012-08-07 14:20:50,927::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
Thread-981::DEBUG::2012-08-07 14:20:51,049::supervdsm::102::SuperVdsmProxy::(_connect) Trying to connect to Super Vdsm
Thread-981::ERROR::2012-08-07 14:20:51,053::BindingXMLRPC::879::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 869, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 346, in setupNetworks
    return api.setupNetworks(networks, bondings, options)
  File "/usr/share/vdsm/API.py", line 1114, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/share/vdsm/supervdsm.py", line 62, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 57, in <lambda>
    callMethod = lambda : getattr(self._supervdsmProxy._svdsm, self._funcName)(*args, **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 725, in _callmethod
    conn.send((self._id, methodname, args, kwds))
IOError: [Errno 32] Broken pipe

Version-Release number of selected component (if applicable):
vdsm-4.9.6-26.0.el6_3.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a BOND, create two VLAN networks with MTU 5000 and 9000 (VLAN_NET10-MTU5000, VLAN_NET11-MTU9000), and attach the VLAN networks to the BOND.
2. Create a VM with two networks (VLAN_NET10-MTU5000, VLAN_NET11-MTU9000) and start the VM.
3. On the VM, select the two networks and deactivate them.
4. Open SetupNetwork and break the BOND.

Actual results:
There is an IO error and supervdsm restarts.

Expected results:
supervdsm shouldn't restart itself.
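The first traceback above starts in _getConfigValue, where open(conffile) fails because the ifcfg file is already gone. As an illustration only, the sketch below shows a reader that treats a missing file as "no value" instead of raising. The function name get_config_value, its default parameter, and the KEY=value parsing are assumptions made for this example (written for Python 3); they are not taken from configNetwork.py.

```python
import os
import tempfile


def get_config_value(conffile, key, default=None):
    """Return the value of KEY=value from conffile.

    Hypothetical helper: returns `default` when the file or the key is
    missing, instead of letting the error propagate to the caller.
    """
    try:
        with open(conffile) as f:
            for line in f:
                line = line.strip()
                if line.startswith(key + "="):
                    # Drop optional surrounding quotes around the value.
                    return line.split("=", 1)[1].strip("\"'")
    except FileNotFoundError:
        pass  # The ifcfg file was already removed.
    return default


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "ifcfg-11")
    print(get_config_value(path, "MTU"))  # file does not exist yet: None

    with open(path, "w") as f:
        f.write("DEVICE=bond0.11\nMTU=9000\n")
    print(get_config_value(path, "MTU"))  # 9000
```

Whether silently returning a default is the right policy depends on the caller; a delete path such as delNetwork can plausibly tolerate a missing file, while a path that must apply a specific MTU probably should not.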