Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1155522

Summary: Restarting supervdsm service also restart vdsm service when using HOST with RHEL7
Product: Red Hat Enterprise Virtualization Manager
Reporter: Gal Amado <gamado>
Component: vdsm
Assignee: Nobody <nobody>
Status: CLOSED NOTABUG
QA Contact: Pavel Stehlik <pstehlik>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: bazulay, danken, ecohen, gamado, gklein, iheim, lpeer, lsurette, ybronhei, yeylon
Target Milestone: ---
Keywords: Triaged
Target Release: 3.5.0
Flags: gamado: needinfo-
Hardware: Unspecified
OS: Linux
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-10-22 09:58:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: relevant logs
Flags: none

Description Gal Amado 2014-10-22 09:46:27 UTC
Description of problem:
Restarting the supervdsm service also restarts the vdsm service on a RHEL7 host.
On a RHEL6.5 host it works fine (supervdsm restarts and vdsm remains up).

Version-Release number of selected component (if applicable):
Engine Red Hat Enterprise Virtualization Manager Version: 3.5.0-0.14.beta.el6ev
Vdsm: vdsm-4.16.6-1.el7.x86_64

How reproducible:
Happens all the time.

Steps to Reproduce:
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.5/job/3.5-storage_supervdsm-iscsi/

I used a running environment with the Engine, a RHEL7 host, a RHEL6.5 host and an iSCSI SD, but I think any RHEL7 host will do.


1. Run "ps -C vdsm" and note the PID.
2. Run "service supervdsmd restart".
3. Wait a few seconds.
4. Run "ps -C vdsm" again.
5. Check that the PID remains the same.
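
The manual steps above could be scripted roughly as follows. This is only a sketch, not the code of the Jenkins job: the function names, the `run` and `sleep` parameters (there so the logic can be exercised without touching a real host) and the 5-second wait are all assumptions. It relies on `ps -C <name> -o pid=`, which prints bare PIDs without a header.

```python
import subprocess
import time


def pids_of(command_name, run=subprocess.run):
    """Return the sorted PIDs that `ps -C <name> -o pid=` reports."""
    result = run(
        ["ps", "-C", command_name, "-o", "pid="],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # An empty output (no matching process) yields an empty list.
    return sorted(int(tok) for tok in result.stdout.split())


def survives_restart(process="vdsm", service="supervdsmd", wait=5.0,
                     run=subprocess.run, sleep=time.sleep):
    """True if `process` keeps the same PIDs across a restart of `service`."""
    before = pids_of(process, run)          # step 1
    run(["service", service, "restart"])    # step 2
    sleep(wait)                             # step 3 (wait length is a guess)
    after = pids_of(process, run)           # step 4
    # Step 5: the process must have been running and must not have changed.
    return bool(before) and before == after
```

Calling `survives_restart()` as root on a host performs a real restart of supervdsmd; with the behavior reported here it would be expected to return False on RHEL7 and True on RHEL6.5.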


Actual results:
[root@master-vds13 ~]# ps -C vdsm
  PID TTY          TIME CMD
17240 ?        00:00:10 vdsm
[root@master-vds13 ~]# service supervdsmd restart
Redirecting to /bin/systemctl restart  supervdsmd.service
[root@master-vds13 ~]# ps -C vdsm
  PID TTY          TIME CMD
17240 ?        00:00:11 vdsm
[root@master-vds13 ~]# ps -C vdsm
  PID TTY          TIME CMD
17699 ?        00:00:00 vdsm


Expected results:
[root@master-vds13 ~]# ps -C vdsm
  PID TTY          TIME CMD
17240 ?        00:00:10 vdsm
[root@master-vds13 ~]# service supervdsmd restart
Redirecting to /bin/systemctl restart  supervdsmd.service
[root@master-vds13 ~]# ps -C vdsm
  PID TTY          TIME CMD
17240 ?        00:00:11 vdsm
[root@master-vds13 ~]# ps -C vdsm
  PID TTY          TIME CMD
17240 ?        00:00:11 vdsm

Additional info:
1. I checked the same on RHEL6.5 hosts, and there it passed.
2. It fails the job http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.5/job/3.5-storage_supervdsm-iscsi/

Comment 1 Gal Amado 2014-10-22 09:51:26 UTC
Created attachment 949325 [details]
relevant logs

Comment 2 Dan Kenigsberg 2014-10-22 09:58:52 UTC
That's not a bug. In el6 vdsm kills itself only when it needs to use supervdsmd, and finds out that it was restarted. In el7, systemd takes care of that dependency.
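
One way to look at that dependency on an el7 host is to ask systemd which units the vdsm unit requires. The sketch below assumes the vdsm unit is named `vdsmd.service` and that the dependency is declared with `Requires=`; neither is stated in this report, and the dependency could be expressed with a different directive. With `Requires=`, systemd restarts the dependent unit when the required unit is explicitly restarted. The function names are hypothetical.

```python
import subprocess


def parse_requires(show_output):
    """Extract unit names from a 'Requires=a.service b.service' line."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "Requires":
            return value.split()
    return []


def restart_propagates(dependent="vdsmd.service",       # assumed unit name
                       dependency="supervdsmd.service",
                       run=subprocess.run):
    """True if `dependent` lists `dependency` in its Requires= property."""
    result = run(
        ["systemctl", "show", "-p", "Requires", dependent],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return dependency in parse_requires(result.stdout)
```

If `restart_propagates()` returns True on the RHEL7 host, the PID change seen in the report is what systemd is expected to do, not something vdsm does to itself.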

Comment 3 Yaniv Bronhaim 2014-10-22 10:12:10 UTC
OK, maybe for 3.5:
JsonRpc (StompReactor)::DEBUG::2014-10-22 12:32:24,803::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
Thread-15::DEBUG::2014-10-22 12:32:24,806::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpcServer::DEBUG::2014-10-22 12:32:24,804::__init__::504::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
MainThread::DEBUG::2014-10-22 12:32:40,166::vdsm::58::vds::(sigtermHandler) Received signal 15
MainThread::DEBUG::2014-10-22 12:32:40,167::protocoldetector::135::vds.MultiProtocolAcceptor::(stop) Stopping Acceptor
MainThread::INFO::2014-10-22 12:32:40,167::__init__::563::jsonrpc.JsonRpcServer::(stop) Stopping JsonRPC Server
Detector thread::DEBUG::2014-10-22 12:32:40,171::protocoldetector::106::vds.MultiProtocolAcceptor::(_cleanup) Cleaning Acceptor
MainThread::INFO::2014-10-22 12:32:40,174::vmchannels::188::vds::(stop) VM channels listener was stopped.


We get the SIGTERM and restart.
But I noticed with 3.4 over RHEL7 that when supervdsmd is restarted we get:

VM Channels Listener::INFO::2014-10-22 12:18:39,822::vmChannels::174::vds::(run) Starting VM channels listener thread.
storageRefresh::DEBUG::2014-10-22 12:18:39,854::multipath::110::Storage.Misc.excCmd::(rescan) '/usr/bin/sudo -n /sbin/multipath' (cwd None)
MainThread::WARNING::2014-10-22 12:18:39,995::BindingJsonRpc::107::BindingJsonRpc::(start) Could not listen on reactor 'AsyncoreReactor'
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingJsonRpc.py", line 101, in start
    self._createTcpListener(cfg)
  File "/usr/share/vdsm/BindingJsonRpc.py", line 69, in _createTcpListener
    self._onAccept)
  File "/usr/lib/python2.7/site-packages/yajsonrpc/asyncoreReactor.py", line 196, in createListener
    l = AsyncoreListener(self, address, acceptHandler)
  File "/usr/lib/python2.7/site-packages/yajsonrpc/asyncoreReactor.py", line 125, in __init__
    self.listen(5)
  File "/usr/lib64/python2.7/asyncore.py", line 338, in listen
    return self.socket.listen(num)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
  File "/usr/lib64/python2.7/socket.py", line 170, in _dummy
    raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor
storageRefresh::DEBUG::2014-10-22 12:18:40,064::multipath::110::Storage.Misc.excCmd::(rescan) SUCCESS: <err> = ''; <rc> = 0
storageRefresh::DEBUG::2014-10-22 12:18:40,076::lvm::497::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the

I wonder if it's related.

Comment 4 Yaniv Bronhaim 2014-10-22 10:34:42 UTC
OK, the SIGTERM from systemd led to the fd being closed, and back then (3.4) we didn't catch the exception. So as far as I can tell, it's actually not a bug, just different behavior that requires changing the test case.
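
The "Bad file descriptor" error in the 3.4 traceback can be reproduced in isolation: calling listen() on a socket that has already been closed fails with EBADF. The sketch below is not vdsm code. It targets Python 3, where the error is an OSError (the traceback above is from Python 2.7, where it is socket.error), and both function names are hypothetical.

```python
import errno
import socket


def listen_after_close():
    """Close a TCP socket, then call listen(); return the errno raised."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.close()
    try:
        sock.listen(5)
    except OSError as exc:
        return exc.errno
    return None


def start_listener_safely(sock, backlog=5):
    """Hypothetical guard: report failure instead of letting EBADF propagate."""
    try:
        sock.listen(backlog)
        return True
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return False  # socket was closed underneath us, e.g. during shutdown
        raise
```

Here `listen_after_close()` returns `errno.EBADF`, the same errno 9 that appears in the traceback.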