Bug 846307 - [vdsm] super-vdsm is restarted upon IO error and vdsm communicates with old socket
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Saggi Mizrahi
Meni Yakove
: ZStream
Reported: 2012-08-07 08:06 EDT by Meni Yakove
Modified: 2012-12-04 14:04 EST (History)
Fixed In Version: vdsm-4.9.6-39.0
Doc Type: Bug Fix
Doc Text:
Previously, an IO error occurred and supervdsm restarted when the following conditions were met:
* A bond had been created with two VLAN networks, one with MTU 5000 and the other with MTU 9000.
* A virtual machine was created with the above-mentioned networks attached.
* The networks were deactivated on the virtual machine.
* SetupNetwork was opened, and the bond was broken.
An update to VDSM now makes sure that supervdsm is not restarted when such IO errors occur.
Last Closed: 2012-12-04 14:04:45 EST
Type: Bug

Attachments
vdsm.log (12.58 MB, text/x-log)
2012-08-07 08:06 EDT, Meni Yakove
new vdsm.log (4.21 MB, text/x-log)
2012-09-10 05:36 EDT, Meni Yakove

Description Meni Yakove 2012-08-07 08:06:47 EDT
Created attachment 602729 [details]

Description of problem:
When an IO error occurs, supervdsm restarts.

MainProcess|Thread-981::ERROR::2012-08-07 14:20:49,016::configNetwork::1261::setupNetworks::(setupNetworks) [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-11'
Traceback (most recent call last):
  File "/usr/share/vdsm/configNetwork.py", line 1203, in setupNetworks
  File "/usr/share/vdsm/configNetwork.py", line 871, in delNetwork
  File "/usr/share/vdsm/configNetwork.py", line 459, in setNewMtu
    mtu = self._getConfigValue(cf, 'MTU')
  File "/usr/share/vdsm/configNetwork.py", line 362, in _getConfigValue
    with open(conffile) as f:
IOError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-11'
MainProcess|Thread-981::ERROR::2012-08-07 14:20:49,017::supervdsmServer::61::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer.py", line 59, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer.py", line 107, in setupNetworks
    return configNetwork.setupNetworks(networks, bondings, **options)
  File "/usr/share/vdsm/configNetwork.py", line 1203, in setupNetworks
  File "/usr/share/vdsm/configNetwork.py", line 871, in delNetwork
  File "/usr/share/vdsm/configNetwork.py", line 459, in setNewMtu
    mtu = self._getConfigValue(cf, 'MTU')
  File "/usr/share/vdsm/configNetwork.py", line 362, in _getConfigValue
    with open(conffile) as f:
IOError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-11'
Thread-981::DEBUG::2012-08-07 14:20:49,018::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /usr/bin/kill -9 424' (cwd None)
Thread-981::DEBUG::2012-08-07 14:20:49,037::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = ''; <rc> = 0
Thread-981::DEBUG::2012-08-07 14:20:49,038::supervdsm::81::SuperVdsmProxy::(_launchSupervdsm) Launching Super Vdsm
Thread-981::DEBUG::2012-08-07 14:20:49,038::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /usr/bin/python /usr/share/vdsm/supervdsmServer.py 764fa685-44bb-483f-8bf1-2e1a8a1e0437 29056' (cwd None)
MainThread::DEBUG::2012-08-07 14:20:49,205::supervdsmServer::245::SuperVdsm.Server::(main) Making sure I'm root
MainThread::DEBUG::2012-08-07 14:20:49,212::supervdsmServer::249::SuperVdsm.Server::(main) Parsing cmd args
MainThread::DEBUG::2012-08-07 14:20:49,212::supervdsmServer::252::SuperVdsm.Server::(main) Creating PID file
MainThread::DEBUG::2012-08-07 14:20:49,212::supervdsmServer::256::SuperVdsm.Server::(main) Cleaning old socket
MainThread::DEBUG::2012-08-07 14:20:49,213::supervdsmServer::260::SuperVdsm.Server::(main) Setting up keep alive thread
MainThread::DEBUG::2012-08-07 14:20:49,213::supervdsmServer::265::SuperVdsm.Server::(main) Creating remote object manager
MainThread::DEBUG::2012-08-07 14:20:49,214::supervdsmServer::276::SuperVdsm.Server::(main) Started serving super vdsm object
Thread-984::DEBUG::2012-08-07 14:20:49,388::BindingXMLRPC::864::vds::(wrapper) client []::call ping with () {} flowID [4064a45a]
Thread-984::DEBUG::2012-08-07 14:20:49,389::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
VM Channels Listener::DEBUG::2012-08-07 14:20:49,554::vmChannels::103::vds::(_handle_unconnected) Trying to connect fileno 22.
VM Channels Listener::DEBUG::2012-08-07 14:20:49,554::guestIF::79::vm.Vm::(_connect) vmId=`2e927215-14ce-4d62-bc86-11b082a49cac`::Attempting connection to /var/lib/libvirt/qemu/channels/VM1_MTU.com.redhat.rhevm.vdsm
Thread-985::DEBUG::2012-08-07 14:20:49,902::BindingXMLRPC::864::vds::(wrapper) client []::call ping with () {} flowID [4064a45a]
Thread-985::DEBUG::2012-08-07 14:20:49,902::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
Thread-986::DEBUG::2012-08-07 14:20:50,414::BindingXMLRPC::864::vds::(wrapper) client []::call ping with () {} flowID [4064a45a]
Thread-986::DEBUG::2012-08-07 14:20:50,415::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
VM Channels Listener::DEBUG::2012-08-07 14:20:50,555::vmChannels::103::vds::(_handle_unconnected) Trying to connect fileno 22.
VM Channels Listener::DEBUG::2012-08-07 14:20:50,555::guestIF::79::vm.Vm::(_connect) vmId=`2e927215-14ce-4d62-bc86-11b082a49cac`::Attempting connection to /var/lib/libvirt/qemu/channels/VM1_MTU.com.redhat.rhevm.vdsm
Thread-987::DEBUG::2012-08-07 14:20:50,927::BindingXMLRPC::864::vds::(wrapper) client []::call ping with () {} flowID [4064a45a]
Thread-987::DEBUG::2012-08-07 14:20:50,927::BindingXMLRPC::870::vds::(wrapper) return ping with {'status': {'message': 'Done', 'code': 0}}
Thread-981::DEBUG::2012-08-07 14:20:51,049::supervdsm::102::SuperVdsmProxy::(_connect) Trying to connect to Super Vdsm
Thread-981::ERROR::2012-08-07 14:20:51,053::BindingXMLRPC::879::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 869, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 346, in setupNetworks
    return api.setupNetworks(networks, bondings, options)
  File "/usr/share/vdsm/API.py", line 1114, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/share/vdsm/supervdsm.py", line 62, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 57, in <lambda>
    callMethod = lambda : getattr(self._supervdsmProxy._svdsm, self._funcName)(*args, **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 725, in _callmethod
    conn.send((self._id, methodname, args, kwds))
IOError: [Errno 32] Broken pipe

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a bond, create two VLAN networks with MTU 5000 and 9000 (VLAN_NET10-MTU5000, VLAN_NET11-MTU9000), and attach the VLAN networks to the bond.
2. Create a VM with both networks (VLAN_NET10-MTU5000, VLAN_NET11-MTU9000) and start the VM.
3. On the VM, select the two networks and deactivate them.
4. Open SetupNetwork and break the bond.

Actual results:
An IO error occurs and supervdsm restarts.

Expected results:
supervdsm should not restart itself.
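
The log in the description shows vdsm killing supervdsm (`kill -9 424`), launching a new instance, and then failing with `[Errno 32] Broken pipe` when the call is sent again. One plausible reading, consistent with the bug title, is that the repeated call still went through a proxy bound to the old socket. The Python sketch below illustrates the general idea of discarding a cached proxy whenever the helper process is restarted. The class `ProxyHolder` and the `launch` and `connect` callables are hypothetical names used for illustration; this is not vdsm's code.

```python
class ProxyHolder(object):
    """Caches a proxy to a helper process and drops it on restart.

    Hypothetical illustration; not the vdsm SuperVdsmProxy class.
    """

    def __init__(self, launch, connect):
        self._launch = launch      # callable that (re)starts the helper process
        self._connect = connect    # callable that returns a fresh proxy object
        self._proxy = None

    def get(self):
        # Connect lazily so that a restart is followed by a fresh connection.
        if self._proxy is None:
            self._proxy = self._connect()
        return self._proxy

    def restart(self):
        # Forget the old proxy first: it is bound to the old socket, and any
        # send() on it would fail once the old process has been killed.
        self._proxy = None
        self._launch()


# Usage with stand-in callables (no real process is started here).
launches = []
holder = ProxyHolder(launch=lambda: launches.append("started"),
                     connect=lambda: object())
old_proxy = holder.get()
holder.restart()
new_proxy = holder.get()
```

Callers that always go through `holder.get()` never keep a reference to a proxy from before the restart, which is the property the sketch is meant to show.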
Comment 2 Barak 2012-08-19 12:55:11 EDT

I'm sure there is an easier way to reproduce this than the above.
Please try.
Comment 3 Yaniv Bronhaim 2012-09-02 11:29:23 EDT
This seems to be fixed by this patch: http://gerrit.ovirt.org/#/c/7214/6
The exception no longer appears.
Comment 4 Barak 2012-09-07 09:39:02 EDT
The above patch fixes the specific error thrown in the above scenario.
However, this BZ is about not being able to recover from a super-vdsm restart.
Comment 5 Yaniv Bronhaim 2012-09-09 08:11:53 EDT
supervdsmServer doesn't respawn itself after a crash. As I understand it, we have no recovery process for superVdsm if it is killed.
superVdsmServer has a _restartSupervdsm function that can be called, but I'm not sure that this is the case here...
I can check whether restartSuperVdsm works well with networking after it runs, if this is what you mean.
Comment 6 Meni Yakove 2012-09-10 05:35:26 EDT
I managed to reproduce this:

1. Attach a network (VM_1) to a NIC.
2. Delete ifcfg-VM_1 from /etc/sysconfig/network-scripts/
3. Try to remove network VM_1 using SetupNetwork; the operation fails.

vdsm log attached.

After I restart the VDSM service, brctl still shows the VM_1 net with no interface attached to it, and the VM_1 net also still exists as a virsh network.
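
The reproduction above comes down to reading a value from an ifcfg file that no longer exists, which is where the traceback in the description ends (`with open(conffile) as f:` in `_getConfigValue`). As a minimal sketch of one way to tolerate that case, the helper below returns `None` when the file is missing instead of raising. The function `get_config_value` is a hypothetical stand-in written for this example; it is not vdsm's `_getConfigValue`, and it is not claimed to be what the gerrit patches do.

```python
import errno
import os
import tempfile


def get_config_value(conffile, key):
    """Return the value of key from an ifcfg-style file, or None.

    Hypothetical helper: a missing file is treated as "no value" rather
    than as a fatal error. Other IO errors are still raised.
    """
    try:
        with open(conffile) as f:
            for line in f:
                line = line.strip()
                if line.startswith('#') or '=' not in line:
                    continue
                name, value = line.split('=', 1)
                if name.strip() == key:
                    return value.strip().strip('"')
    except IOError as e:
        if e.errno != errno.ENOENT:
            raise
    return None


# Usage: read the MTU, delete the file as in step 2 above, and read again.
workdir = tempfile.mkdtemp()
conffile = os.path.join(workdir, "ifcfg-VM_1")
with open(conffile, "w") as f:
    f.write('DEVICE=VM_1\nMTU="9000"\n')
mtu_before = get_config_value(conffile, "MTU")
os.remove(conffile)
mtu_after = get_config_value(conffile, "MTU")
os.rmdir(workdir)
```

Whether a missing ifcfg file should be ignored or reported to the caller is a design decision; the sketch only shows that it does not have to surface as an unhandled `IOError`.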
Comment 7 Meni Yakove 2012-09-10 05:36:26 EDT
Created attachment 611370 [details]
new vdsm.log
Comment 8 Yaniv Bronhaim 2012-09-10 07:12:53 EDT
You are right. When an operation throws an exception that we don't catch in supervdsmServer, we catch it in the ProxyCaller::__call__ method.

In that except block we reset supervdsmServer and call the same method again.

If the exception is thrown again, we leave supervdsmServer down, and this is how supervdsmServer remains.

This is what appears in your log, and this is what I suggest:

Please verify that this fixes the error you saw.
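
The behaviour described in this comment (catch any exception, reset the server, call again) can be contrasted with a variant that restarts only when the channel itself is broken. The Python sketch below is an assumption-laden illustration, not the patch under review: `call_with_restart`, `is_transport_error`, and the `TRANSPORT_ERRNOS` set are hypothetical names. It classifies by `errno` because on Python 2.6 both the missing-file error and the broken pipe in the log are plain `IOError`.

```python
import errno

# errno values treated as "the channel to the helper is broken" (an assumption).
TRANSPORT_ERRNOS = frozenset([errno.EPIPE, errno.ECONNRESET,
                              errno.ECONNREFUSED])


def is_transport_error(exc):
    """Return True if exc looks like a broken channel, not a failed operation."""
    if isinstance(exc, EOFError):
        return True
    if isinstance(exc, (IOError, OSError)):
        return exc.errno in TRANSPORT_ERRNOS
    return False


def call_with_restart(call, restart):
    """Run call(); restart the helper and retry once only on transport errors."""
    try:
        return call()
    except Exception as e:
        if not is_transport_error(e):
            # The helper ran the operation and reported a failure; it is
            # healthy, so propagate the error without restarting anything.
            raise
    restart()
    return call()
```

With this split, an error such as `[Errno 2] No such file or directory` reaches the caller unchanged and the helper keeps running, while `[Errno 32] Broken pipe` triggers one restart and one retry. A known limitation of the sketch is that an operation which itself fails with one of the listed errno values would be misclassified as a transport failure.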
Comment 9 Yaniv Bronhaim 2012-09-27 10:45:36 EDT
Posted a modified patch: http://gerrit.ovirt.org/#/c/7901/
Please review.
Comment 14 Meni Yakove 2012-11-07 03:40:16 EST

supervdsm does not restart upon an I/O error.
Comment 16 errata-xmlrpc 2012-12-04 14:04:45 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

