Bug 970627

Summary: can not stop vdsmd service using the service command
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ohad Basan <obasan>
Component: vdsm
Assignee: Yaniv Bronhaim <ybronhei>
Status: CLOSED DUPLICATE
QA Contact:
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.3.0
CC: abaron, acathrow, alonbl, bazulay, danken, dcaroest, dougsland, eedri, hateya, iheim, knesenko, lpeer, mgoldboi, obasan, ybronhei, ykaul
Target Milestone: ---
Keywords: Reopened, Triaged
Target Release: 3.3.0
Hardware: x86_64
OS: Linux
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-13 08:06:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments:
Description: logs from engine & host
Flags: none

Description Ohad Basan 2013-06-04 13:26:31 UTC
Description of problem:

When ovirt-host-deploy is installing a host,
one of the commands it runs is "service vdsmd stop".
The stop fails, so it looks like there is a problem with the vdsmd service:

2013-06-04 13:30:55 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:412 execute-output: ('/sbin/service', 'vdsmd', 'stop') stdout:
Shutting down vdsm daemon: 
[  OK  ]
vdsm watchdog stop[  OK  ]
[FAILED]
vdsm stop[FAILED]

2013-06-04 13:30:55 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:417 execute-output: ('/sbin/service', 'vdsmd', 'stop') stderr:


2013-06-04 13:30:55 DEBUG otopi.context context._executeMethod:132 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-vtl5GyfllR/pythonlib/otopi/context.py", line 122, in _executeMethod
    method['method']()
  File "/tmp/ovirt-vtl5GyfllR/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 96, in _packages
    self.services.state('vdsmd', False)
  File "/tmp/ovirt-vtl5GyfllR/otopi-plugins/otopi/services/rhel.py", line 188, in state
    'start' if state else 'stop'
  File "/tmp/ovirt-vtl5GyfllR/otopi-plugins/otopi/services/rhel.py", line 96, in _executeServiceCommand
    raiseOnError=raiseOnError
  File "/tmp/ovirt-vtl5GyfllR/pythonlib/otopi/plugin.py", line 422, in execute
    command=args[0],
RuntimeError: Command '/sbin/service' failed to execute
2013-06-04 13:30:55 ERROR otopi.context context._executeMethod:141 Failed to execute stage 'Package installation': Command '/sbin/service' failed to execute
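
The traceback above shows a non-zero exit status from /sbin/service being turned into a RuntimeError. A minimal sketch of that pattern follows, assuming a hypothetical helper named execute with a raise_on_error flag; it is an illustration only, not otopi's actual plugin.execute implementation.

```python
import subprocess


def execute(args, raise_on_error=True):
    """Run a command and return (rc, stdout_lines, stderr_lines).

    Hypothetical helper for illustration; the name and signature are
    assumptions and do not reproduce otopi's plugin.execute.
    """
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # A non-zero exit status becomes an exception unless the caller opts out.
    if proc.returncode != 0 and raise_on_error:
        raise RuntimeError("Command '%s' failed to execute" % args[0])
    return proc.returncode, proc.stdout.splitlines(), proc.stderr.splitlines()
```

With raise_on_error=False the caller gets the exit status back and can decide for itself whether a failed stop is fatal.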

Comment 4 Barak 2013-06-08 20:24:10 UTC
Ohad,

1 - which version were you upgrading from?
2 - to which version?
3 - at what stage did it fail (beginning of the upgrade?)

Comment 5 Ohad Basan 2013-06-09 11:03:07 UTC
This did not happen during an upgrade
it was a clean 3.3 setup with host installation.
this system was not upgraded.

Comment 6 Alon Bar-Lev 2013-06-09 11:13:37 UTC
(In reply to Ohad Basan from comment #5)
> This did not happen during an upgrade
> it was a clean 3.3 setup with host installation.
> this system was not upgraded.

So how come we had vdsmd service to stop?

    @plugin.event(
        stage=plugin.Stages.STAGE_PACKAGES,
    )
    def _packages(self):
        if self.services.exists('vdsmd'):
            self.services.state('vdsmd', False)
        self.packager.install(('qemu-kvm-tools',))
        self.packager.installUpdate(('vdsm', 'vdsm-cli'))

Comment 7 Ohad Basan 2013-06-09 11:18:03 UTC
the vdsm version that is being used is vdsm-4.10.3-0.416.git5358ed2.el6.x86_64.rpm

Comment 8 Ohad Basan 2013-06-09 11:19:30 UTC
(In reply to Alon Bar-Lev from comment #6)
> (In reply to Ohad Basan from comment #5)
> > This did not happen during an upgrade
> > it was a clean 3.3 setup with host installation.
> > this system was not upgraded.
> 
> So how come we had vdsmd service to stop?
> 
>     @plugin.event(
>         stage=plugin.Stages.STAGE_PACKAGES,
>     )
>     def _packages(self):
>         if self.services.exists('vdsmd'):
>             self.services.state('vdsmd', False)
>         self.packager.install(('qemu-kvm-tools',))
>         self.packager.installUpdate(('vdsm', 'vdsm-cli'))

maybe when the host was reinstalled it had a running vdsmd instance.
does it make sense?
if it does > same version was installed.

Comment 9 Alon Bar-Lev 2013-06-09 11:37:03 UTC
(In reply to Ohad Basan from comment #8)
> (In reply to Alon Bar-Lev from comment #6)
> > (In reply to Ohad Basan from comment #5)
> > > This did not happen during an upgrade
> > > it was a clean 3.3 setup with host installation.
> > > this system was not upgraded.
> > 
> > So how come we had vdsmd service to stop?
> > 
> >     @plugin.event(
> >         stage=plugin.Stages.STAGE_PACKAGES,
> >     )
> >     def _packages(self):
> >         if self.services.exists('vdsmd'):
> >             self.services.state('vdsmd', False)
> >         self.packager.install(('qemu-kvm-tools',))
> >         self.packager.installUpdate(('vdsm', 'vdsm-cli'))
> 
> maybe when the host was reinstalled it had a running vdsmd instance.
> does it make sense?
> if it does > same version was installed.

Sure.
And the question was which version was installed before, and which version it was being upgraded to.

Comment 10 Alon Bar-Lev 2013-06-09 11:39:50 UTC
Based on the host-deploy log, the version being upgraded to was:

2013-06-04 13:30:51 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages packages._validation:70 Found vdsm {'display_name': 'vdsm-4.10.3-0.416.git5358ed2.el6.x86_64', 'name': 'vdsm', 'epoch': '0', 'version': '4.10.3', 'release': '0.416.git5358ed2.el6', 'operation': 'installed', 'arch': 'x86_64'}

Not sure what was installed before, because the update was never initiated: the stop failed first.

Check locally on the machine.
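
The "Found vdsm" line quoted in this comment carries the package details as a Python dict literal, so they can be read back from a saved log. A minimal sketch, assuming the line keeps exactly that layout; parse_found_package is a hypothetical helper name, not part of otopi or ovirt-host-deploy.

```python
import ast


def parse_found_package(line, marker='Found vdsm '):
    """Return the package dict that follows `marker`, or None if absent.

    Hypothetical helper; assumes the payload is a plain dict literal as
    in the host-deploy log line quoted in this comment.
    """
    _, found, payload = line.partition(marker)
    if not found:
        return None
    # literal_eval only accepts Python literals, so nothing is executed.
    return ast.literal_eval(payload.strip())


line = (
    "2013-06-04 13:30:51 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.packages "
    "packages._validation:70 Found vdsm {'display_name': "
    "'vdsm-4.10.3-0.416.git5358ed2.el6.x86_64', 'name': 'vdsm', 'epoch': '0', "
    "'version': '4.10.3', 'release': '0.416.git5358ed2.el6', "
    "'operation': 'installed', 'arch': 'x86_64'}"
)
info = parse_found_package(line)
print(info['version'], info['release'])
```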

Comment 11 Barak 2013-06-25 14:45:13 UTC
Ohad,

Does it still happen?
I have a feeling it is related to the host not being restarted after host deploy.

Comment 12 Ohad Basan 2013-06-25 15:11:09 UTC
It appeared before we stopped rebooting the host after host deploy,
so I'm not sure it's related to that.
But I don't see it anymore.
Looks like it's gone.

Comment 14 Barak 2013-07-01 12:26:46 UTC
According to comment #12 this no longer looks like a bug.
Hence moving to CLOSED NOTABUG.

Comment 16 Ohad Basan 2013-07-04 10:11:45 UTC
error from host deploy log


2013-07-04 12:35:43 DEBUG otopi.context context._executeMethod:132 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-oVvTxX9x0e/pythonlib/otopi/context.py", line 122, in _executeMethod
    method['method']()
  File "/tmp/ovirt-oVvTxX9x0e/otopi-plugins/otopi/network/firewalld.py", line 142, in _customization
    self._firewalld_version = self._get_firewalld_cmd_version()
  File "/tmp/ovirt-oVvTxX9x0e/otopi-plugins/otopi/network/firewalld.py", line 63, in _get_firewalld_cmd_version
    state=True,
  File "/tmp/ovirt-oVvTxX9x0e/otopi-plugins/otopi/services/rhel.py", line 188, in state
    'start' if state else 'stop'
  File "/tmp/ovirt-oVvTxX9x0e/otopi-plugins/otopi/services/rhel.py", line 96, in _executeServiceCommand
    raiseOnError=raiseOnError
  File "/tmp/ovirt-oVvTxX9x0e/pythonlib/otopi/plugin.py", line 451, in execute
    command=args[0],

Comment 18 Ohad Basan 2013-07-09 10:54:10 UTC
Reopening.
The problem popped up once again.

2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:446 execute-output: ('/sbin/initctl', 'status', 'vdsmd') stderr:
initctl: Unknown job: vdsmd

2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel rhel.exists:133 service vdsmd exists True upstart=False
2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel rhel.state:172 stopping service vdsmd
2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:366 execute: ('/sbin/initctl', 'status', 'vdsmd'), executable='None', cwd='None', env=None
2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:383 execute-result: ('/sbin/initctl', 'status', 'vdsmd'), rc=1
2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:441 execute-output: ('/sbin/initctl', 'status', 'vdsmd') stdout:


2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:446 execute-output: ('/sbin/initctl', 'status', 'vdsmd') stderr:
initctl: Unknown job: vdsmd

2013-07-09 11:30:19 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:366 execute: ('/sbin/service', 'vdsmd', 'stop'), executable='None', cwd='None', env=None
2013-07-09 11:30:22 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:383 execute-result: ('/sbin/service', 'vdsmd', 'stop'), rc=1
2013-07-09 11:30:22 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:441 execute-output: ('/sbin/service', 'vdsmd', 'stop') stdout:
Shutting down vdsm daemon:
[  OK  ]
vdsm watchdog stop[  OK  ]
[FAILED]
vdsm stop[FAILED]

2013-07-09 11:30:22 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:446 execute-output: ('/sbin/service', 'vdsmd', 'stop') stderr:


2013-07-09 11:30:22 DEBUG otopi.context context._executeMethod:132 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-BOFhpy3cJa/pythonlib/otopi/context.py", line 122, in _executeMethod
    method['method']()
  File "/tmp/ovirt-BOFhpy3cJa/otopi-plugins/ovirt-host-deploy/vdsm/packages.py", line 96, in _packages
    self.services.state('vdsmd', False)
  File "/tmp/ovirt-BOFhpy3cJa/otopi-plugins/otopi/services/rhel.py", line 188, in state
    'start' if state else 'stop'
  File "/tmp/ovirt-BOFhpy3cJa/otopi-plugins/otopi/services/rhel.py", line 96, in _executeServiceCommand
    raiseOnError=raiseOnError
  File "/tmp/ovirt-BOFhpy3cJa/pythonlib/otopi/plugin.py", line 451, in execute
    command=args[0],
RuntimeError: Command '/sbin/service' failed to execute
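
The log above shows the order of operations: "initctl status vdsmd" fails with "Unknown job", the service is treated as non-upstart (upstart=False), and "/sbin/service vdsmd stop" is run and exits with rc=1. A minimal sketch of that decision follows. stop_service and the injected run callable are hypothetical, and the "initctl stop" branch is an assumption about the upstart case that does not appear in this log.

```python
def stop_service(name, run):
    """Stop `name`, using upstart only when initctl knows the job.

    `run` is a caller-supplied callable (hypothetical, injected so the
    logic can be exercised without root) that takes an argv tuple and
    returns the command's exit status.
    """
    # rc != 0 from "initctl status" is read as "not an upstart job".
    upstart = run(('/sbin/initctl', 'status', name)) == 0
    if upstart:
        argv = ('/sbin/initctl', 'stop', name)  # assumed upstart path
    else:
        argv = ('/sbin/service', name, 'stop')  # SysV path seen in the log
    if run(argv) != 0:
        raise RuntimeError("Command '%s' failed to execute" % argv[0])
    return argv
```

Passing a fake run that always returns 1 reproduces the sequence in the log: one initctl status call, one service stop call, then the RuntimeError.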

Comment 30 Eyal Edri 2013-08-07 21:23:50 UTC
Created attachment 784136
logs from engine & host

Comment 33 Alon Bar-Lev 2013-08-08 09:01:09 UTC
*** Bug 994912 has been marked as a duplicate of this bug. ***

Comment 34 Alon Bar-Lev 2013-08-08 15:13:26 UTC
> Yaniv Bronhaim 2013-08-08 10:26:58 EDT
> Status: ASSIGNED → MODIFIED
> Assignee: dougsland → ybronhei
> External Bug ID: oVirt gerrit 17828

Wrong bug? should be bug#994912?

Comment 35 Ohad Basan 2013-08-08 15:47:35 UTC
Alon you are right

Comment 36 Yaniv Bronhaim 2013-08-11 12:46:02 UTC
After http://gerrit.ovirt.org/#/c/17662/11/vdsm/vdsmd.init.in, 'service vdsmd stop' returns 1 only if starts/restarts run simultaneously and the stop is received while another start instance is still in progress.
Before that patch, 1 could be returned if the after_vdsm_stop hook failed or if killproc failed. Either could cause the failure we see, and we can't tell which one it was from the attached logs.
So the patch http://gerrit.ovirt.org/#/c/17662 should fix it.

Ohad, let me know asap if you get this error again with the current branch. I haven't managed to reproduce it.
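
The two behaviors described in this comment can be written down as a toy model. This is only a restatement of the comment in code, under the assumption that no other conditions affect the exit status. The function and parameter names are hypothetical, and it is not the logic of vdsmd.init.in itself.

```python
def stop_rc_before_patch(killproc_ok, after_stop_hook_ok):
    """Exit status of 'service vdsmd stop' before the patch (toy model).

    Either failure yields 1, so the exit status alone cannot tell the
    caller which of the two went wrong.
    """
    return 0 if (killproc_ok and after_stop_hook_ok) else 1


def stop_rc_after_patch(start_in_progress):
    """Exit status after the patch (toy model).

    1 is returned only when the stop arrives while another start or
    restart instance is still running.
    """
    return 1 if start_in_progress else 0
```

In this model the pre-patch status is the same for a failed hook and a failed killproc, which matches the remark that the logs cannot separate the two causes.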

Comment 37 Yaniv Bronhaim 2013-08-13 08:06:37 UTC
Duplicate of bug 994912: the same fix covers both bugs.

*** This bug has been marked as a duplicate of bug 994912 ***