Bug 1425194

Summary: vdsmd not running after 3.6 NGN to 4.1 NGN upgrade: untouched logger.conf should be updated
Product: [oVirt] ovirt-node
Component: General
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: urgent
Priority: urgent
Keywords: TestBlocker
Target Milestone: ovirt-4.1.1
Target Release: ---
Reporter: Jiri Belka <jbelka>
Assignee: Ryan Barry <rbarry>
QA Contact: Huijuan Zhao <huzhao>
CC: bugs, cshao, danken, dguo, gklein, huzhao, jiawu, mgoldboi, qiyuan, sbonazzo, weiwang, yaniwang, ycui, yzhao
Flags: rule-engine: ovirt-4.1+
       rule-engine: blocker+
       mgoldboi: planning_ack+
       sbonazzo: devel_ack+
       cshao: testing_ack+
oVirt Team: Node
Type: Bug
Last Closed: 2017-02-28 14:52:23 UTC
Bug Blocks: 1421098
Attachments:
  comment 3: logs from host (flags: none)

Description Jiri Belka 2017-02-20 19:04:58 UTC
Description of problem:

I was testing a 3.6 NGN to 4.1 upgrade, and after the reboot vdsmd failed to start because of:

~~~
-- Logs begin at Mon 2017-02-20 19:33:14 CET, end at Mon 2017-02-20 19:36:36 CET. --
Feb 20 19:33:25 localhost.localdomain systemd[1]: Starting oVirt ImageIO Daemon...
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: Traceback (most recent call last):
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: server.main(sys.argv)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 50, in main
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: configure_logger()
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 68, in configure_logger
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: logging.config.fileConfig(conf, disable_existing_loggers=False)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/config.py", line 78, in fileConfig
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: handlers = _install_handlers(cp, formatters)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/config.py", line 156, in _install_handlers
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: h = klass(*args)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/handlers.py", line 117, in __init__
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: BaseRotatingHandler.__init__(self, filename, mode, encoding, delay)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/handlers.py", line 64, in __init__
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: logging.FileHandler.__init__(self, filename, mode, encoding, delay)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: StreamHandler.__init__(self, self._open())
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: stream = open(self.baseFilename, self.mode)
~~~
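The daemon dies while configuring logging: the config points its rotating file handler at a log file under /var/log/ovirt-imageio-daemon, and the upgrade did not create that directory, so the open() at the bottom of the traceback fails. A minimal defensive sketch (the path is taken from the workaround below; ensure_log_dir is a hypothetical helper, not the daemon's actual code):

~~~
# Hypothetical guard: create the log directory before
# logging.config.fileConfig() opens the file handler inside it.
import errno
import os

LOG_DIR = "/var/log/ovirt-imageio-daemon"  # assumed from the workaround below

def ensure_log_dir(path=LOG_DIR):
    try:
        os.makedirs(path, 0o755)
    except OSError as e:
        if e.errno != errno.EEXIST:  # an already-existing directory is fine
            raise

# would run before logging.config.fileConfig(conf, disable_existing_loggers=False)
~~~

Creating the directory by hand and restarting the daemon works around it on an affected host: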

[root@dell-r210ii-03 ~]# install -d -o vdsm -g kvm -m 755 /var/log/ovirt-imageio-daemon
[root@dell-r210ii-03 ~]# systemctl restart ovirt-imageio-daemon
[root@dell-r210ii-03 ~]# systemctl status ovirt-imageio-daemon
● ovirt-imageio-daemon.service - oVirt ImageIO Daemon
   Loaded: loaded (/usr/lib/systemd/system/ovirt-imageio-daemon.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-02-20 19:38:59 CET; 3s ago
 Main PID: 26443 (ovirt-imageio-d)
   CGroup: /system.slice/ovirt-imageio-daemon.service
           └─26443 /usr/bin/python /usr/bin/ovirt-imageio-daemon

Feb 20 19:38:59 dell-r210ii-03.rhev.lab.eng.brq.redhat.com systemd[1]: Starting oVirt ImageIO Daemon...
Feb 20 19:38:59 dell-r210ii-03.rhev.lab.eng.brq.redhat.com systemd[1]: Started oVirt ImageIO Daemon.

This solved the issue, but the next one appeared:

MainThread::INFO::2017-02-20 19:30:34,942::vdsm::174::vds::(run) <WorkerThread(Thread-3, started daemon 139997769438976)>
ioprocess communication (22158)::ERROR::2017-02-20 19:30:34,919::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed
ioprocess communication (22138)::ERROR::2017-02-20 19:30:34,920::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed
ioprocess communication (22167)::ERROR::2017-02-20 19:30:34,921::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed
ioprocess communication (27029)::ERROR::2017-02-20 19:30:34,921::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed

These FD closed errors turned out to be a symptom of the stale logging configuration; the following changes to the vdsm logger configs solved the issue:
# diff -uNp /etc/vdsm/svdsm.logger.conf.orig /tmp/svdsm.logger.conf 
--- /etc/vdsm/svdsm.logger.conf.orig    2017-02-20 19:56:00.369488795 +0100
+++ /tmp/svdsm.logger.conf      2017-02-20 19:54:14.693767425 +0100
@@ -25,7 +25,7 @@ qualname=Gluster
 propagate=0
 
 [handler_logfile]
-class=logUtils.UserGroupEnforcingHandler
+class=vdsm.logUtils.UserGroupEnforcingHandler
 args=('root', 'root', '/var/log/vdsm/supervdsm.log',)
 filters=storage.misc.TracebackRepeatFilter
 level=DEBUG

--- /etc/vdsm/logger.conf.orig	2017-02-20 19:59:30.459869281 +0100
+++ /tmp/logger.conf	2017-02-20 19:58:40.301629935 +0100
@@ -1,27 +1,27 @@
 [loggers]
-keys=root,vds,Storage,virt,ovirt_hosted_engine_ha,ovirt_hosted_engine_ha_config,IOProcess,connectivity
+keys=root,vds,storage,virt,ovirt_hosted_engine_ha,ovirt_hosted_engine_ha_config,IOProcess,devel
 
 [handlers]
-keys=console,syslog,logfile,connlogfile
+keys=console,syslog,logfile
 
 [formatters]
 keys=long,simple,none,sysform
 
 [logger_root]
-level=DEBUG
+level=INFO
 handlers=syslog,logfile
 propagate=0
 
 [logger_vds]
-level=DEBUG
+level=INFO
 handlers=syslog,logfile
 qualname=vds
 propagate=0
 
-[logger_Storage]
-level=DEBUG
+[logger_storage]
+level=INFO
 handlers=logfile
-qualname=Storage
+qualname=storage
 propagate=0
 
 [logger_ovirt_hosted_engine_ha]
@@ -40,40 +40,33 @@ propagate=0
 level=INFO
 handlers=logfile
 qualname=IOProcess
-propagate=1
-
-[logger_connectivity]
-level=DEBUG
-handlers=connlogfile
-qualname=connectivity
 propagate=0
 
 [logger_virt]
-level=DEBUG
+level=INFO
 handlers=logfile
 qualname=virt
 propagate=0
 
+[logger_devel]
+level=ERROR
+handlers=logfile
+qualname=devel
+propagate=0
+
 [handler_syslog]
-level=WARNING
+level=WARN
 class=handlers.SysLogHandler
 formatter=sysform
 args=('/dev/log', handlers.SysLogHandler.LOG_USER)
 
 [handler_logfile]
-class=logUtils.UserGroupEnforcingHandler
+class=vdsm.logUtils.UserGroupEnforcingHandler
 args=('vdsm', 'kvm', '/var/log/vdsm/vdsm.log',)
 filters=storage.misc.TracebackRepeatFilter
 level=DEBUG
 formatter=long
 
-[handler_connlogfile]
-class=logging.handlers.WatchedFileHandler
-args=('/var/log/vdsm/connectivity.log',)
-filters=storage.misc.TracebackRepeatFilter
-level=DEBUG
-formatter=simple
-
 [handler_console]
 class: StreamHandler
 args: []
@@ -86,7 +79,7 @@ format: %(asctime)s:%(levelname)s:%(mess
 format: %(message)s
 
 [formatter_long]
-format: %(threadName)s::%(levelname)s::%(asctime)s::%(module)s::%(lineno)d::%(name)s::(%(funcName)s) %(message)s
+format: %(asctime)s %(levelname)-5s (%(threadName)s) [%(name)s] %(message)s (%(module)s:%(lineno)d)
 
 [formatter_sysform]
 format= vdsm %(name)s %(levelname)s %(message)s
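Note the common thread in both diffs: vdsm 4.1 moved logUtils into the vdsm package, so a config that still says class=logUtils.UserGroupEnforcingHandler can no longer be resolved when logging builds its handlers. A quick validation sketch (not part of vdsm; run as root on a host with vdsm's Python packages installed) that surfaces a stale class path or a missing log directory before the daemons do:

~~~
# Sketch: let the stdlib try to build every handler in a logger config;
# a stale class= path or an unopenable log file fails here instead of
# at daemon startup.
import logging.config
import sys

def check_logger_conf(path):
    try:
        logging.config.fileConfig(path, disable_existing_loggers=False)
    except Exception as e:
        sys.stderr.write("BROKEN %s: %s\n" % (path, e))
        return False
    sys.stderr.write("OK %s\n" % path)
    return True

if __name__ == "__main__":
    # e.g. python check_conf.py /etc/vdsm/logger.conf /etc/vdsm/svdsm.logger.conf
    results = [check_logger_conf(p) for p in sys.argv[1:]]
    sys.exit(0 if results and all(results) else 1)
~~~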


Version-Release number of selected component (if applicable):
redhat-release-virtualization-host-content-4.1-0.6.el7.x86_64

How reproducible:
Always on the upgrades tested so far.

Steps to Reproduce:
1. Install a 3.6 NGN host and update it to a 4.1 node
2. Reboot the host

Actual results:
vdsmd does not start successfully after the reboot.

Expected results:
vdsmd starts and the host works after the upgrade.

Additional info:

Comment 2 Ryan Barry 2017-02-20 20:26:31 UTC
This appears to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1417534

Comment 3 Huijuan Zhao 2017-02-21 06:18:43 UTC
Can reproduce.

Test version:
From:
RHVH-3.6-20170217.5-RHVH-x86_64-dvd1.iso
To:
rhvh-4.1-0.20170208.0

Test steps:
1. Install RHVH-3.6-20170217.5-RHVH-x86_64-dvd1.iso
2. Reboot and login RHVH-3.6, add RHVH-3.6 to engine 3.6(3.6 cluster), add NFS storage to host.
3. Set up local repos in RHVH-3.6 and update it to rhvh-4.1-0.20170208.0
   # yum update
4. Reboot and login rhvh-4.1-0.20170208.0, check the vdsmd service and RHVH status in engine side

Actual results:
After step 4, vdsmd is inactive on RHVH-4.1, and RHVH-4.1 is not up on the engine 3.6 side.

# service vdsmd status
Redirecting to /bin/systemctl status  vdsmd.service
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: inactive (dead) (Result: exit-code) since Tue 2017-02-21 06:03:13 GMT; 14min ago
  Process: 25938 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 25929 ExecStart=/usr/share/vdsm/daemonAdapter -0 /dev/null -1 /dev/null -2 /dev/null /usr/share/vdsm/vdsm (code=exited, status=1/FAILURE)
  Process: 25832 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 25929 (code=exited, status=1/FAILURE)

Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Unit vdsmd.service entered failed state.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: vdsmd.service failed.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Dependency failed for Virtual Desktop Server Manager.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Job vdsmd.service/start failed with result 'dependency'.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: vdsmd.service holdoff time over, scheduling restart.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Dependency failed for Virtual Desktop Server Manager.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Job vdsmd.service/start failed with result 'dependency'.

Expected results:
After step 4, vdsmd is active on RHVH-4.1, and RHVH-4.1 is up on the engine 3.6 side.

Comment 4 Huijuan Zhao 2017-02-21 06:32:56 UTC
Created attachment 1255961 [details]
comment 3 : logs from host

Comment 5 Red Hat Bugzilla Rules Engine 2017-02-21 09:08:58 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 6 Dan Kenigsberg 2017-02-21 14:04:52 UTC
I see 3 different bugs here:
1. the upgrade did not create /var/log/ovirt-imageio-daemon with the correct permissions
2. /etc/vdsm/logger.conf was not replaced by a new version (only this one is a dup of bug 1417534)
3. if someone touches logger.conf, causing it not to be replaced on upgrade, it will still refer to logUtils.UserGroupEnforcingHandler (instead of vdsm.logUtils.UserGroupEnforcingHandler) and vdsmd will not start; a possible fixup is sketched below
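For case 3, a minimal sketch of what an upgrade-time fixup could look like (hypothetical; this is not the actual node migration code):

~~~
# Hypothetical fixup: rewrite the pre-4.1 handler class path in any
# vdsm logger config that survived the upgrade, modified or not.
import fileinput
import sys

CONF_FILES = ["/etc/vdsm/logger.conf", "/etc/vdsm/svdsm.logger.conf"]
OLD = "class=logUtils.UserGroupEnforcingHandler"
NEW = "class=vdsm.logUtils.UserGroupEnforcingHandler"

def fixup(paths=CONF_FILES):
    # inplace=True redirects stdout back into each file being read
    for line in fileinput.input(paths, inplace=True):
        sys.stdout.write(line.replace(OLD, NEW))

if __name__ == "__main__":
    fixup()
~~~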

Comment 7 Jiri Belka 2017-02-21 15:24:54 UTC
The ovirt-imageio-daemon issue has been split out into https://bugzilla.redhat.com/show_bug.cgi?id=1425502.

This BZ now covers only the vdsmd issue.

Comment 8 Ryan Barry 2017-02-28 14:52:23 UTC
Since the other bugs have been split out, closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1417534 ***