Bug 1454536
Summary: | HostedEngine setup fails if RHV-H timezone < UTC set during installation | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Germano Veit Michel <gveitmic> | ||||
Component: | imgbased | Assignee: | Ryan Barry <rbarry> | ||||
Status: | CLOSED ERRATA | QA Contact: | Yihui Zhao <yzhao> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 4.1.1 | CC: | cshao, dfediuck, huzhao, lsurette, melewis, mgoldboi, qiyuan, rbarry, sbonazzo, srevivo, stirabos, trichard, weiwang, yaniwang, ycui, ykaul, yzhao | ||||
Target Milestone: | ovirt-4.2.0 | Keywords: | ZStream | ||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | imgbased-0.9.30-0.1.el7ev | Doc Type: | Known Issue | ||||
Doc Text: |
Red Hat Virtualization Host generates VDSM certificates at the time of the first boot. This means that if the system clock was not set correctly at install time, chronyd or ntpd may resynchronize the clock after the VDSM certificate was generated, leading to a certificate that is not valid yet if the timezone is behind UTC. Now, imgbased-configure-vdsm starts after chronyd or ntpd and waits two seconds for the clock to synchronize; however, this does guarantee that the clock will be synchronized in time, so the best workaround is to set the system clock appropriately during installation.
|
Story Points: | --- | ||||
Clone Of: | |||||||
: | 1455441 (view as bug list) | Environment: | |||||
Last Closed: | 2018-05-15 17:57:40 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1455441 | ||||||
Attachments: |
|
Description
Germano Veit Michel
2017-05-23 01:10:02 UTC
Same thing with /etc/pki/vdsm/certs/cacert.pem The certificate is not generated at install time, but rather on the first boot by vdsm itself. Reassigning... Can the customer also ensure that this isn't ntp/chrony catching the clock up after install? That is -- does the hardware clock change on the first boot (new in box server)? (In reply to Ryan Barry from comment #2) > The certificate is not generated at install time, but rather on the first > boot by vdsm itself. Reassigning... Hi Ryan, Right, thanks. I was actually expecting it to be generated during ovirt-hosted-engine-deploy using src/plugins/gr-he-setup/pki/vdsmpki.py. > Can the customer also ensure that this isn't ntp/chrony catching the clock > up after install? > > That is -- does the hardware clock change on the first boot (new in box > server)? I can reproduce it as many times as I want, without any NTP setting. Ryan, Are you saying this is generated during that vdsm_init sh script? So it would be through a vdsm-tool configure call? Any ideas on what debug to enable to get more info on this? (In reply to Germano Veit Michel from comment #5) > Ryan, > > Are you saying this is generated during that vdsm_init sh script? So it > would be through a vdsm-tool configure call? > > Any ideas on what debug to enable to get more info on this? Well, kind of. It's generated as part of the patches attached to https://bugzilla.redhat.com/show_bug.cgi?id=1429288 with 'vdsm-tool configure --force' You could disable this during RHVH install during %post, then start it at will, with debugging added to the systemd service. Ex: %post systemctl disable imgbase-config-vdsm imgbase init --layout %end But I'd expect this is also reproducible on RHEL. Simone can you have a look? Reproduced just delpoying ovirt-node-ng-installer-ovirt-4.1-2017052309.iso without starting hosted-engine-setup. Just after node installation, vdsmcert.pem is already there but invalid due to not before. We need to understand what generates it, if it's already there hosted-engine-setup will simply keep that one. [root@node412 ~]# date -u mer 24 mag 2017, 18.26.47, UTC [root@node412 ~]# openssl x509 -noout -dates -in /etc/pki/vdsm/certs/vdsmcert.pem notBefore=May 25 00:10:05 2017 GMT notAfter=May 25 00:10:05 2018 GMT This looks interesting, on my reproduction vdsmcert.pem has been generated at 17:10:05 [root@node412 ~]# ls --full-time /etc/pki/vdsm/certs/vdsmcert.pem -rw-------. 1 vdsm kvm 1224 2017-05-24 17:10:05.973100716 -0700 /etc/pki/vdsm/certs/vdsmcert.pem but at 17:13:16 chronyd moved back the time to 10:14:27; I assume that my cert is not valid due to that reason. mag 24 17:13:16 node412.localdomain NetworkManager[1063]: <info> [1495671196.9871] policy: set 'eth0' (eth0) as default for IPv6 routing and DNS mag 24 10:13:21 node412.localdomain systemd[1]: Time has been changed mag 24 17:13:21 node412.localdomain chronyd[997]: Selected source 212.45.144.206 mag 24 17:13:21 node412.localdomain chronyd[997]: System clock wrong by -25199.566962 seconds, adjustment started mag 24 10:13:21 node412.localdomain chronyd[997]: System clock was stepped by -25199.566962 seconds mag 24 10:14:27 node412.localdomain chronyd[997]: Selected source 85.199.214.99 On my opinion this is node specific and the issue is that, at least on my system, imgbase-config-vdsm started configuring vdsm (and creating its certs) at 17:09:59 while only at 17:13:21 chronyd moved back the time to 10:14:27 and so the certs create at 17:10:05 was simply invalid because they was created in a relative future. Maybe we should make imgbase-config-vdsm waiting chronyd at systemd level [root@node412 ~]# journalctl -u imgbase-config-vdsm -- Logs begin at mer 2017-05-24 10:09:51 PDT, end at mer 2017-05-24 11:48:01 PDT. -- mag 24 17:09:59 node412.localdomain systemd[1]: Starting Reconfigure vdsm... mag 24 17:10:08 node412.localdomain usermod[1204]: add 'sanlock' to group 'kvm' mag 24 17:10:08 node412.localdomain usermod[1204]: add 'sanlock' to group 'qemu' mag 24 17:10:08 node412.localdomain usermod[1204]: add 'sanlock' to shadow group 'kvm' mag 24 17:10:08 node412.localdomain usermod[1204]: add 'sanlock' to shadow group 'qemu' mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Checking configuration status... mag 24 17:10:08 node412.localdomain vdsm-tool[951]: multipath requires configuration mag 24 17:10:08 node412.localdomain vdsm-tool[951]: WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration mag 24 17:10:08 node412.localdomain vdsm-tool[951]: lvm requires configuration mag 24 17:10:08 node412.localdomain vdsm-tool[951]: libvirt is not configured for vdsm yet mag 24 17:10:08 node412.localdomain vdsm-tool[951]: FAILED: conflicting vdsm and libvirt-qemu tls configuration. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: vdsm.conf with ssl=True requires the following changes: mag 24 17:10:08 node412.localdomain vdsm-tool[951]: libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1 mag 24 17:10:08 node412.localdomain vdsm-tool[951]: qemu.conf: spice_tls=1. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Running configure... mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of certificates is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of sebool is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of multipath is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: WARNING: LVM local configuration: /etc/lvm/lvmlocal.conf is not based on vdsm configuration mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Backing up /etc/lvm/lvmlocal.conf to /etc/lvm/lvmlocal.conf.201705241710 mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Installing /usr/share/vdsm/lvmlocal.conf at /etc/lvm/lvmlocal.conf mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of lvm is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of passwd is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of sanlock is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Reconfiguration of libvirt is done. mag 24 17:10:08 node412.localdomain vdsm-tool[951]: Done configuring modules to VDSM. mag 24 17:10:08 node412.localdomain systemd[1]: Started Reconfigure vdsm. Checking the sos-reports from the original reporter, I think it happened exactly in the same way: May 22 18:30:59 ******** dracut: *** Including module: network *** May 22 14:30:59 ******** chronyd[1726]: Selected source ******* May 22 14:30:59 ******** systemd: Time has been changed May 22 14:30:59 ******** chronyd[1726]: System clock wrong by -14399.772337 seconds, adjustment started May 22 14:30:59 ******** chronyd[1726]: System clock was stepped by -14399.772337 seconds QE can reproduce this issue. Test version: rhvh-4.1-0.20170421.0+1 RHVH-4.1-20170425.3-RHVH-x86_64-dvd1.iso imgbased-0.9.23-0.1.el7ev.noarch libvirt-2.0.0-10.el7_3.5.x86_64 vdsm-4.19.10.1-1.el7ev.x86_64 ovirt-hosted-engine-ha-2.1.0.5-2.el7ev.noarch ovirt-hosted-engine-setup-2.1.0.5-1.el7ev.noarch rhvm-appliance-20170511.0-1.x86_64.rhevm.ova Test steps: 1. Boot Installation ISO 2. Select New York Timezone 3. Finish Install 4. vdsm-tool configure --module libvirt --force 5. Try to Deploy Hosted-Engine via cockpit Test result: Hosted Engine deployment failed. =================================================== # imgbase w [INFO] You are on rhvh-4.1-0.20170421.0+1 # date Wed May 24 22:58:47 EDT 2017 # ls --full-time /etc/pki/vdsm/certs/vdsmcert.pem -rw-------. 1 vdsm kvm 1224 2017-05-25 02:55:49.477178065 -0400 /etc/pki/vdsm/certs/vdsmcert.pem [root@dhcp-9-70 ~]# date Wed May 24 23:02:06 EDT 2017 # date Wed May 24 23:03:26 EDT 2017 [root@dhcp-9-70 ~]# date -u Thu May 25 03:03:29 UTC 2017 # vdsm-tool configure --module libvirt --force Checking configuration status... libvirt is already configured for vdsm SUCCESS: ssl configured to true. No conflicts Running configure... Reconfiguration of libvirt is done. Error: ServiceOperationError: _systemctlStart failed Job for libvirtd.service failed because the control process exited with error code. See "systemctl status libvirtd.service" and "journalctl -xe" for details. May 24 23:04:14 dhcp-9-70.nay.redhat.com libvirtd[21246]: The server certificate /etc/pki/vdsm/certs/vdsmcert.pem is not yet active May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: libvirtd.service: main process exited, code=exited, status=6/NOTCONFIGURED May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Failed to start Virtualization daemon. May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Dependency failed for Virtual Desktop Server Manager network restoration. May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Dependency failed for Virtual Desktop Server Manager. May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Dependency failed for MOM instance configured for VDSM purposes. May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Job mom-vdsm.service/start failed with result 'dependency'. May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Job vdsmd.service/start failed with result 'dependency'. May 24 23:04:14 dhcp-9-70.nay.redhat.com systemd[1]: Job vdsm-network.service/start failed with result 'dependency'. Host was installed 20 minutes ago. # date Wed May 24 23:18:54 EDT 2017 # openssl verify -CAfile /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/certs/vdsmcert.pem: CN = VDSM Certificate Authority error 9 at 1 depth lookup:certificate is not yet valid # openssl x509 -noout -dates -in /etc/pki/vdsm/certs/vdsmcert.pem notBefore=May 25 06:55:49 2017 GMT notAfter=May 25 06:55:49 2018 GMT (In reply to shaochen from comment #14) > QE can reproduce this issue. shaochen, could you please check if also in your reproduction, chronyd changed the system clock after cert creation by imgbase-config-vdsm? You should look for 'systemd: Time has been changed' in systemd journal or on /var/log/messages. (In reply to Simone Tiraboschi from comment #15) > (In reply to shaochen from comment #14) > > QE can reproduce this issue. > > shaochen, could you please check if also in your reproduction, chronyd > changed the system clock after cert creation by imgbase-config-vdsm? Yes. > You should look for 'systemd: Time has been changed' in systemd journal or > on /var/log/messages. May 24 22:55:57 dhcp-9-70 chronyd[892]: System clock wrong by -14400.420517 seconds, adjustment started May 24 22:55:57 dhcp-9-70 systemd: Time has been changed May 24 22:55:57 dhcp-9-70 chronyd[892]: System clock was stepped by -14400.420517 seconds Attach all logs for you reference. Created attachment 1282136 [details]
all log info
1. Cannot reproduce this issue. Test version: RHVH-4.1-20171127.0-RHVH-x86_64-dvd1.iso vdsm-4.19.40-1.el7ev.x86_64 imgbased-0.9.51-0.1.el7ev.noarch Test steps: 1. Set the hwclock is UTC+ 2. Install RHVH via ISO 3. Do not enable nic and network time 4. Continue to finish other installation steps, then reboot 5. Check whether the RTC is default to UTC using "timedatectl" 6. check vdsm certificate using: #openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -text -noout |grep Not #openssl verify -CAfile /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem 7. Check libvirtd.service status 8. Enable network and sync time 9. Repeat step6,7 10. Try to deploy HostedEngine via cockpit Test Results: 1. After step 5: [root@dhcp-8-151 certs]# openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -text -noout |grep Not Not Before: Nov 30 12:13:40 2017 GMT Not After : Dec 1 12:13:40 2018 GMT [root@dhcp-8-151 certs]# openssl x509 -in /etc/pki/vdsm/certs/cacert.pem -text -noout |grep Not Not Before: Nov 30 12:13:40 2017 GMT Not After : Dec 1 12:13:40 2018 GMT [root@dhcp-8-151 certs]# openssl verify -CAfile /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/certs/vdsmcert.pem: OK 2. After step 9: [root@dhcp-8-151 certs]# openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -text -noout |grep Not Not Before: Nov 30 12:13:40 2017 GMT Not After : Dec 1 12:13:40 2018 GMT [root@dhcp-8-151 certs]# openssl x509 -in /etc/pki/vdsm/certs/cacert.pem -text -noout |grep Not Not Before: Nov 30 12:13:40 2017 GMT Not After : Dec 1 12:13:40 2018 GMT [root@dhcp-8-151 certs]# openssl verify -CAfile /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/certs/vdsmcert.pem: OK 3. After step10, Deploy HE successfully. Due to the result of the step5,9,10, so this issue cannot reproduce, and change the bug's status to verified. According to Comment 20, moving to verified. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1524 BZ<2>Jira Resync |