Bug 1425194 - vdsmd not running after 3.6 NGN to 4.1 NGN upgrade: untouched logger.conf should be updated
Summary: vdsmd not running after 3.6 NGN to 4.1 NGN upgrade: untouched logger.conf should be updated
Keywords:
Status: CLOSED DUPLICATE of bug 1417534
Alias: None
Product: ovirt-node
Classification: oVirt
Component: General
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-4.1.1
Assignee: Ryan Barry
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1421098
 
Reported: 2017-02-20 19:04 UTC by Jiri Belka
Modified: 2017-02-28 14:52 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-28 14:52:23 UTC
oVirt Team: Node
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+
mgoldboi: planning_ack+
sbonazzo: devel_ack+
cshao: testing_ack+


Attachments
comment 3 : logs from host (9.01 MB, application/x-gzip)
2017-02-21 06:32 UTC, Huijuan Zhao
no flags

Description Jiri Belka 2017-02-20 19:04:58 UTC
Description of problem:

I was testing a 3.6 NGN to 4.1 NGN upgrade, and after reboot vdsmd failed because of:

~~~
-- Logs begin at Mon 2017-02-20 19:33:14 CET, end at Mon 2017-02-20 19:36:36 CET. --
Feb 20 19:33:25 localhost.localdomain systemd[1]: Starting oVirt ImageIO Daemon...
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: Traceback (most recent call last):
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: server.main(sys.argv)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 50, in main
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: configure_logger()
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 68, in configure_logger
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: logging.config.fileConfig(conf, disable_existing_loggers=False)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/config.py", line 78, in fileConfig
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: handlers = _install_handlers(cp, formatters)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/config.py", line 156, in _install_handlers
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: h = klass(*args)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/handlers.py", line 117, in __init__
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: BaseRotatingHandler.__init__(self, filename, mode, encoding, delay)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/handlers.py", line 64, in __init__
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: logging.FileHandler.__init__(self, filename, mode, encoding, delay)
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/__init__.py", line 902, in __init__
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: StreamHandler.__init__(self, self._open())
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: File "/usr/lib64/python2.7/logging/__init__.py", line 925, in _open
Feb 20 19:33:28 localhost.localdomain ovirt-imageio-daemon[1130]: stream = open(self.baseFilename, self.mode)
~~~
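
The traceback is cut off above, but the failing call is the FileHandler open(): the handler's log file lives under /var/log/ovirt-imageio-daemon, which the upgrade never created (hence the install -d workaround below). A minimal sketch of that failure mode (the daemon.log name is only illustrative, not taken from the real config):

~~~
# Minimal sketch of the failure above: logging.FileHandler opens its target
# file eagerly, so a missing parent directory such as
# /var/log/ovirt-imageio-daemon aborts configure_logger() before the daemon
# ever starts serving.
import logging

try:
    # "daemon.log" is a hypothetical file name, for illustration only.
    logging.FileHandler("/var/log/ovirt-imageio-daemon/daemon.log")
except (IOError, OSError) as exc:
    print("ovirt-imageio-daemon would die here: %s" % exc)
~~~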

[root@dell-r210ii-03 ~]# install -d -o vdsm -g kvm -m 755 /var/log/ovirt-imageio-daemon
[root@dell-r210ii-03 ~]# systemctl restart ovirt-imageio-daemon
[root@dell-r210ii-03 ~]# systemctl status ovirt-imageio-daemon
● ovirt-imageio-daemon.service - oVirt ImageIO Daemon
   Loaded: loaded (/usr/lib/systemd/system/ovirt-imageio-daemon.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-02-20 19:38:59 CET; 3s ago
 Main PID: 26443 (ovirt-imageio-d)
   CGroup: /system.slice/ovirt-imageio-daemon.service
           └─26443 /usr/bin/python /usr/bin/ovirt-imageio-daemon

Feb 20 19:38:59 dell-r210ii-03.rhev.lab.eng.brq.redhat.com systemd[1]: Starting oVirt ImageIO Daemon...
Feb 20 19:38:59 dell-r210ii-03.rhev.lab.eng.brq.redhat.com systemd[1]: Started oVirt ImageIO Daemon.

Creating the missing log directory solved this issue, but then the next one appeared:

MainThread::INFO::2017-02-20 19:30:34,942::vdsm::174::vds::(run) <WorkerThread(Thread-3, started daemon 139997769438976)>
ioprocess communication (22158)::ERROR::2017-02-20 19:30:34,919::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed
ioprocess communication (22138)::ERROR::2017-02-20 19:30:34,920::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed
ioprocess communication (22167)::ERROR::2017-02-20 19:30:34,921::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed
ioprocess communication (27029)::ERROR::2017-02-20 19:30:34,921::__init__::173::IOProcessClient::(_communicate) IOProcess failure
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 128, in _communicate
    raise Exception("FD closed")
Exception: FD closed

The following config changes solved the issue:
# diff -uNp /etc/vdsm/svdsm.logger.conf.orig /tmp/svdsm.logger.conf 
--- /etc/vdsm/svdsm.logger.conf.orig    2017-02-20 19:56:00.369488795 +0100
+++ /tmp/svdsm.logger.conf      2017-02-20 19:54:14.693767425 +0100
@@ -25,7 +25,7 @@ qualname=Gluster
 propagate=0
 
 [handler_logfile]
-class=logUtils.UserGroupEnforcingHandler
+class=vdsm.logUtils.UserGroupEnforcingHandler
 args=('root', 'root', '/var/log/vdsm/supervdsm.log',)
 filters=storage.misc.TracebackRepeatFilter
 level=DEBUG

--- /etc/vdsm/logger.conf.orig	2017-02-20 19:59:30.459869281 +0100
+++ /tmp/logger.conf	2017-02-20 19:58:40.301629935 +0100
@@ -1,27 +1,27 @@
 [loggers]
-keys=root,vds,Storage,virt,ovirt_hosted_engine_ha,ovirt_hosted_engine_ha_config,IOProcess,connectivity
+keys=root,vds,storage,virt,ovirt_hosted_engine_ha,ovirt_hosted_engine_ha_config,IOProcess,devel
 
 [handlers]
-keys=console,syslog,logfile,connlogfile
+keys=console,syslog,logfile
 
 [formatters]
 keys=long,simple,none,sysform
 
 [logger_root]
-level=DEBUG
+level=INFO
 handlers=syslog,logfile
 propagate=0
 
 [logger_vds]
-level=DEBUG
+level=INFO
 handlers=syslog,logfile
 qualname=vds
 propagate=0
 
-[logger_Storage]
-level=DEBUG
+[logger_storage]
+level=INFO
 handlers=logfile
-qualname=Storage
+qualname=storage
 propagate=0
 
 [logger_ovirt_hosted_engine_ha]
@@ -40,40 +40,33 @@ propagate=0
 level=INFO
 handlers=logfile
 qualname=IOProcess
-propagate=1
-
-[logger_connectivity]
-level=DEBUG
-handlers=connlogfile
-qualname=connectivity
 propagate=0
 
 [logger_virt]
-level=DEBUG
+level=INFO
 handlers=logfile
 qualname=virt
 propagate=0
 
+[logger_devel]
+level=ERROR
+handlers=logfile
+qualname=devel
+propagate=0
+
 [handler_syslog]
-level=WARNING
+level=WARN
 class=handlers.SysLogHandler
 formatter=sysform
 args=('/dev/log', handlers.SysLogHandler.LOG_USER)
 
 [handler_logfile]
-class=logUtils.UserGroupEnforcingHandler
+class=vdsm.logUtils.UserGroupEnforcingHandler
 args=('vdsm', 'kvm', '/var/log/vdsm/vdsm.log',)
 filters=storage.misc.TracebackRepeatFilter
 level=DEBUG
 formatter=long
 
-[handler_connlogfile]
-class=logging.handlers.WatchedFileHandler
-args=('/var/log/vdsm/connectivity.log',)
-filters=storage.misc.TracebackRepeatFilter
-level=DEBUG
-formatter=simple
-
 [handler_console]
 class: StreamHandler
 args: []
@@ -86,7 +79,7 @@ format: %(asctime)s:%(levelname)s:%(mess
 format: %(message)s
 
 [formatter_long]
-format: %(threadName)s::%(levelname)s::%(asctime)s::%(module)s::%(lineno)d::%(name)s::(%(funcName)s) %(message)s
+format: %(asctime)s %(levelname)-5s (%(threadName)s) [%(name)s] %(message)s (%(module)s:%(lineno)d)
 
 [formatter_sysform]
 format= vdsm %(name)s %(levelname)s %(message)s
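
For the record, the handler class rename is what makes a stale logger.conf fatal: logging.config.fileConfig() resolves the dotted class= value of a handler by importing it, and logUtils is no longer importable as a top-level module in vdsm 4.1 (it moved to vdsm.logUtils). A minimal sketch of that behaviour, using a throwaway config that keeps only the relevant handler section:

~~~
# Minimal sketch: fileConfig() resolves the dotted "class=" value by importing
# it, so a stale module path such as logUtils.UserGroupEnforcingHandler
# (renamed to vdsm.logUtils.UserGroupEnforcingHandler in 4.1) raises before
# vdsmd gets a chance to run.
import logging.config
import tempfile

STALE_CONF = """
[loggers]
keys=root

[handlers]
keys=logfile

[formatters]
keys=long

[logger_root]
level=INFO
handlers=logfile

[handler_logfile]
class=logUtils.UserGroupEnforcingHandler
args=('vdsm', 'kvm', '/tmp/vdsm.log',)
level=DEBUG
formatter=long

[formatter_long]
format=%(message)s
"""

with tempfile.NamedTemporaryFile(mode="w", suffix=".conf") as conf:
    conf.write(STALE_CONF)
    conf.flush()
    try:
        logging.config.fileConfig(conf.name, disable_existing_loggers=False)
    except Exception as exc:
        # On a 4.1 host this is an ImportError: there is no top-level logUtils.
        print("vdsmd startup would fail here: %s" % exc)
~~~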


Version-Release number of selected component (if applicable):
redhat-release-virtualization-host-content-4.1-0.6.el7.x86_64

How reproducible:
just happens

Steps to Reproduce:
1. Get a 3.6 NGN host and update it to a 4.1 node.

Actual results:
vdsmd doesn't start successfully after reboot.

Expected results:
vdsmd should start successfully after the upgrade and reboot.

Additional info:

Comment 2 Ryan Barry 2017-02-20 20:26:31 UTC
This appears to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1417534

Comment 3 Huijuan Zhao 2017-02-21 06:18:43 UTC
Can reproduce.

Test version:
From:
RHVH-3.6-20170217.5-RHVH-x86_64-dvd1.iso
To:
rhvh-4.1-0.20170208.0

Test steps:
1. Install RHVH-3.6-20170217.5-RHVH-x86_64-dvd1.iso
2. Reboot and log in to RHVH-3.6, add RHVH-3.6 to engine 3.6 (3.6 cluster), and add NFS storage to the host.
3. Set up local repos in RHVH-3.6 and update it to rhvh-4.1-0.20170208.0:
   # yum update
4. Reboot and log in to rhvh-4.1-0.20170208.0, then check the vdsmd service and the RHVH status on the engine side.

Actual results:
After step 4, vdsmd is inactive on RHVH-4.1, and RHVH-4.1 is not up on the engine 3.6 side.

# service vdsmd status
Redirecting to /bin/systemctl status  vdsmd.service
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: inactive (dead) (Result: exit-code) since Tue 2017-02-21 06:03:13 GMT; 14min ago
  Process: 25938 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
  Process: 25929 ExecStart=/usr/share/vdsm/daemonAdapter -0 /dev/null -1 /dev/null -2 /dev/null /usr/share/vdsm/vdsm (code=exited, status=1/FAILURE)
  Process: 25832 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 25929 (code=exited, status=1/FAILURE)

Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Unit vdsmd.service entered failed state.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: vdsmd.service failed.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Dependency failed for Virtual Desktop Server Manager.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Job vdsmd.service/start failed with result 'dependency'.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: vdsmd.service holdoff time over, scheduling restart.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Dependency failed for Virtual Desktop Server Manager.
Feb 21 06:03:13 dell-per730-35.lab.eng.pek2.redhat.com systemd[1]: Job vdsmd.service/start failed with result 'dependency'.

Expected results:
After step 4, vdsmd is active on RHVH-4.1, and RHVH-4.1 is up on the engine 3.6 side.

Comment 4 Huijuan Zhao 2017-02-21 06:32:56 UTC
Created attachment 1255961 [details]
comment 3 : logs from host

Comment 5 Red Hat Bugzilla Rules Engine 2017-02-21 09:08:58 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 6 Dan Kenigsberg 2017-02-21 14:04:52 UTC
I see 3 different bugs here:
1. The upgrade did not create /var/log/ovirt-imageio-daemon with correct permissions.
2. /etc/vdsm/logger.conf was not replaced by a new version (only this part is a dup of bug 1417534).
3. If someone touches logger.conf, and thus causes it not to be replaced on upgrade, it would still refer to logUtils.UserGroupEnforcingHandler (instead of vdsm.logUtils.UserGroupEnforcingHandler) and vdsmd would not start.

Comment 7 Jiri Belka 2017-02-21 15:24:54 UTC
The ovirt-imageio-daemon issue is split out into https://bugzilla.redhat.com/show_bug.cgi?id=1425502

This BZ remains only for the vdsmd issue.

Comment 8 Ryan Barry 2017-02-28 14:52:23 UTC
Since the other bugs have been split out, closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1417534 ***

