Bug 1150718 - Unable to restart vdsm on host with running vms
Summary: Unable to restart vdsm on host with running vms
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.1
Assignee: Barak
QA Contact: Gil Klein
URL:
Whiteboard: network
Depends On:
Blocks:
Reported: 2014-10-08 19:21 UTC by Adam Litke
Modified: 2016-02-10 19:36 UTC

Fixed In Version: ovirt-3.5.1_rc1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-01-21 16:02:59 UTC
oVirt Team: Network
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 33993 | 0 | master | MERGED | nwfilter: Do not attempt to re-define the nwfilter when not undefining | Never
oVirt gerrit | 34206 | 0 | ovirt-3.5 | MERGED | nwfilter: Do not attempt to re-define the nwfilter when not undefining | Never

Description Adam Litke 2014-10-08 19:21:25 UTC
Description of problem:
After upgrading vdsm on a host with running vms, I am unable to restart vdsmd.

Version-Release number of selected component (if applicable):
vdsm-4.16.6-0.fc20.x86_64


How reproducible: Always


Steps to Reproduce:
1. Start a VM on a host
2. Reinstall vdsm package on that host
3. Restart vdsmd

Actual results:
Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.


Expected results:
The service restarts


Additional info:

$ sudo service vdsmd restart
Redirecting to /bin/systemctl restart  vdsmd.service

Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.
[alitke@lager ~]$ 
[alitke@lager ~]$ systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: failed (Result: start-limit) since Wed 2014-10-08 15:17:16 EDT; 4s ago
  Process: 24507 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 27246 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=1/FAILURE)
 Main PID: 1603 (code=exited, status=0/SUCCESS)

Oct 08 15:17:15 lager.alitke.net systemd[1]: Failed to start Virtual Desktop....
Oct 08 15:17:15 lager.alitke.net systemd[1]: Unit vdsmd.service entered fail....
Oct 08 15:17:16 lager.alitke.net systemd[1]: vdsmd.service holdoff time over....
Oct 08 15:17:16 lager.alitke.net systemd[1]: Stopping Virtual Desktop Server....
Oct 08 15:17:16 lager.alitke.net systemd[1]: Starting Virtual Desktop Server....
Oct 08 15:17:16 lager.alitke.net systemd[1]: vdsmd.service start request rep....
Oct 08 15:17:16 lager.alitke.net systemd[1]: Failed to start Virtual Desktop....
Oct 08 15:17:16 lager.alitke.net systemd[1]: Unit vdsmd.service entered fail....
Hint: Some lines were ellipsized, use -l to show in full.
[alitke@lager ~]$ sudo /usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running check_is_configured
libvirt is already configured for vdsm
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
libvirt: Network Filter Driver error : Requested operation is not valid: nwfilter is in use
libvirt: Network Filter Driver error : operation failed: filter 'vdsm-no-mac-spoofing' already exists with uuid c3b98710-c74b-4300-b2e7-693bb259a369
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 209, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/nwfilter.py", line 40, in main
    NoMacSpoofingFilter().defineNwFilter(conn)
  File "/usr/lib64/python2.7/site-packages/vdsm/tool/nwfilter.py", line 70, in defineNwFilter
    nwFilter = conn.nwfilterDefineXML(self.buildFilterXml())
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 4041, in nwfilterDefineXML
    if ret is None:raise libvirtError('virNWFilterDefineXML() failed', conn=self)
libvirtError: operation failed: filter 'vdsm-no-mac-spoofing' already exists with uuid c3b98710-c74b-4300-b2e7-693bb259a369
vdsm: stopped during execute nwfilter task (task returned with error code 1).

Comment 1 Dan Kenigsberg 2014-10-09 20:19:37 UTC
We should stop re-creating the vdsm-no-mac-spoofing nwfilter on every startup. The filter should be created by a vdsm-tool configurator. If we ever need to change the filter's implementation, we'd need to do a complicated upgrade (or just define a vdsm-no-mac-spoofing2 to be used by Engine).
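
A minimal sketch of that idea, following the title of the linked gerrit change ("Do not attempt to re-define the nwfilter when not undefining"); it is not the merged patch itself. The helper name ensure_nwfilter and its return strings are hypothetical. conn is assumed to expose nwfilterLookupByName and nwfilterDefineXML the way a libvirt-python connection does, and error stands in for libvirt.libvirtError.

```python
def ensure_nwfilter(conn, name, xml, error=Exception):
    """Define filter `name`, skipping the re-define when undefine fails.

    Hypothetical helper. `conn` is assumed to behave like a libvirt-python
    connection; `error` is the exception type it raises
    (libvirt.libvirtError in real use).

    Returns "defined", "redefined" or "kept".
    """
    try:
        existing = conn.nwfilterLookupByName(name)
    except error:
        # No filter with this name yet, so a plain define is safe.
        conn.nwfilterDefineXML(xml)
        return "defined"

    try:
        existing.undefine()
    except error:
        # The filter is in use (for example by a running VM). Keep the
        # existing definition rather than defining a second copy, which
        # is the call that fails in the traceback above.
        return "kept"

    # Undefine succeeded, so the name is free again.
    conn.nwfilterDefineXML(xml)
    return "redefined"
```

With this shape, a host with running VMs that reference the filter would get "kept" and the pre-start task would not fail; whether a stale definition then needs a separate upgrade path is the open question raised above.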

Comment 2 Petr Horáček 2014-10-13 13:27:55 UTC
Hello,

I've been trying to reproduce this problem for the last few days, but with no success.

This is my setup and steps; do you have any idea where the problem is?

vdsm 4.16.7-1.gitdb83943.fc20
libvirt 1.1.3.6-1.fc20

1) yum install vdsm vdsm-jsonrpc vdsm-python vdsm-python-zombiereaper vdsm-xmlrpc vdsm-yajsonrpc vdsm-hook-macspoof

2) changed /etc/vdsm/vdsm.conf: ssl = false

3) vdsm-tool configure --force

4) service vdsmd restart
   service supervdsm restart

5) qemu-img create -f qcow2 /images/f20test/f20test.qcow2 8G

6) virsh net-start default

7) virt-install -r 1024 --accelerate -n f20test -f /images/f20test/f20test.qcow2 --cdrom /isos/Fedora-20-x86_64-DVD.iso

8) sudo virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     5     f20test                        running

9) virsh edit f20test:
    ...
    <interface type='network'>
      <mac address='52:54:00:04:12:1f'/>
      <source network='default'/>
      <model type='rtl8139'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    ...

10) virsh shutdown f20test
    virsh start f20test

11) yum reinstall vdsm

12) service vdsmd restart

13) service vdsmd status
    Redirecting to /bin/systemctl status  vdsmd.service
    vdsmd.service - Virtual Desktop Server Manager
       Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
       Active: active (running) since Mon 2014-10-13 15:11:45 CEST; 8min ago
      Process: 24353 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
      Process: 24356 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
     Main PID: 24505 (vdsm)
       CGroup: /system.slice/vdsmd.service
               └─24505 /usr/bin/python /usr/share/vdsm/vdsm

    Oct 13 15:17:46 phoracek-fedora vdsm[24505]: vdsm vds ERROR Vm's recovery failed
                                                 Traceback (most recent call last):
                                                   File "/usr/share/vdsm/clientIF.py", line 419, in _recoverExistingVms...
    Oct 13 15:18:01 phoracek-fedora vdsm[24505]: vdsm vds ERROR Vm's recovery failed
                                                 Traceback (most recent call last):
    ...

Comment 3 Dan Kenigsberg 2014-10-13 15:18:04 UTC
Petr, I believe that you need a newer libvirt version to reproduce this bug (libvirt-daemon-1.2.9-3.fc20.x86_64 maybe?)

The stack trace that you report seems unrelated - could you provide it in full elsewhere?

To reproduce the bug, you do not need to have vdsm running. Calling

  vdsm-tool nwfilter

should be enough.
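
A small sketch of scripting that check, assuming the duplicate-filter message from the original report appears on stdout or stderr of the command. The function name hits_duplicate_filter_error and the injectable run parameter are hypothetical conveniences, not part of vdsm.

```python
import subprocess


def hits_duplicate_filter_error(run=subprocess.run):
    """Run `vdsm-tool nwfilter` and report whether the bug's error shows up.

    Hypothetical helper. `run` defaults to subprocess.run and can be
    replaced with a stub so the check can be exercised without vdsm.
    """
    result = run(["vdsm-tool", "nwfilter"], capture_output=True, text=True)
    output = (result.stdout or "") + (result.stderr or "")
    # Message text taken from the libvirt error quoted in the description.
    return "already exists with uuid" in output
```

On an affected host with a running VM that references the filter, this would be expected to return True; it needs enough privileges to talk to libvirt, as the sudo in the original report suggests.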

Comment 4 Adam Litke 2014-10-14 09:34:51 UTC
Please try to create the VM directly with vdsm.

Comment 6 Sandro Bonazzola 2015-01-15 14:25:40 UTC
This is an automated message: 
This bug should be fixed in oVirt 3.5.1 RC1, moving to QA

Comment 7 Sandro Bonazzola 2015-01-21 16:02:59 UTC
oVirt 3.5.1 has been released. If problems still persist, please make note of it in this bug report.

