Bug 1467813
Summary: [downstream clone - 4.1.4] Hosted Engine upgrade from 3.6 to 4.0 will fail if the NFS is exported with root_squash

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ |
| Component: | ovirt-hosted-engine-setup | Assignee: | Simone Tiraboschi <stirabos> |
| Status: | CLOSED ERRATA | QA Contact: | Nikolai Sednev <nsednev> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.0.7 | CC: | gveitmic, lsurette, mavital, melewis, mkalinin, nashok, pvilayat, rhodain, rjones, stirabos, ykaul, ylavi |
| Target Milestone: | ovirt-4.1.4-1 | Keywords: | Triaged, ZStream |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | ovirt-hosted-engine-setup-2.1.3.5 | Doc Type: | Bug Fix |
| Doc Text: | Previously, the upgrade tool did not drop root privileges when accessing the storage. This meant that on NFS with the root_squash option enabled the tool failed to access the storage. Now, the upgrade tool drops root privileges while accessing the storage. | | |
| Story Points: | --- | | |
| Clone Of: | 1466234 | Environment: | |
| Last Closed: | 2017-08-03 19:37:36 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1466234 | | |
| Bug Blocks: | 1467934 | | |
| Attachments: | | | |

Description
rhev-integ
2017-07-05 08:31:54 UTC
Very strange, I have root_squash enabled on my share and the described scenario worked just fine for me in the 3.6->4.0 upgrade-appliance scenario.

Do you also have 'anonuid=36,anongid=36'? root_squash maps requests from uid/gid 0 to the anonymous uid/gid, so if you also have 'anonuid=36,anongid=36' it was supposed to work. The only way to hit this was to have root_squash without anonuid=36,anongid=36 (which is not the recommended and documented configuration).

*** Bug 1469909 has been marked as a duplicate of this bug. ***

Reproduction steps:

1. I've properly configured the NFS server with the root_squash parameter enabled:

mnt]# touch 1.txt
mnt]# ls -lsha
total 8.0K
4.0K drwxrwxrwx. 2 root root 4.0K Jul 20 15:51 .
4.0K dr-xr-xr-x. 18 root root 4.0K Jul 20 15:51 ..
0 -rw-r--r--. 1 nfsnobody nfsnobody 0 Jul 20 15:51 1.txt

I cleaned the NFS share and then deployed the latest 3.6 SHE environment on that share. The environment consisted of two el7.4 HA hosts and the 3.6 engine. I also added two NFS data storage domains from different NFS shares on a different server, and the hosted-engine storage domain was auto-imported successfully.

Components on engine:
rhevm-log-collector-3.6.1-1.el6ev.noarch
rhevm-setup-plugin-vmconsole-proxy-helper-3.6.11.3-0.1.el6.noarch
rhevm-userportal-3.6.11.3-0.1.el6.noarch
rhevm-dependencies-3.6.1-1.el6ev.noarch
rhevm-branding-rhev-3.6.0-10.el6ev.noarch
rhevm-setup-plugin-websocket-proxy-3.6.11.3-0.1.el6.noarch
rhevm-setup-3.6.11.3-0.1.el6.noarch
rhevm-spice-client-x86-cab-3.6-7.el6.noarch
ovirt-engine-extension-aaa-jdbc-1.0.7-2.el6ev.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
rhev-guest-tools-iso-3.6-6.el6ev.noarch
rhevm-sdk-python-3.6.9.1-1.el6ev.noarch
ovirt-vmconsole-1.0.4-1.el6ev.noarch
rhevm-lib-3.6.11.3-0.1.el6.noarch
rhevm-websocket-proxy-3.6.11.3-0.1.el6.noarch
rhevm-backend-3.6.11.3-0.1.el6.noarch
rhevm-setup-plugins-3.6.5-1.el6ev.noarch
rhevm-tools-3.6.11.3-0.1.el6.noarch
rhevm-spice-client-x64-msi-3.6-7.el6.noarch
rhevm-iso-uploader-3.6.0-1.el6ev.noarch
ovirt-setup-lib-1.0.1-1.el6ev.noarch
rhevm-doc-3.6.10-1.el6ev.noarch
rhevm-cli-3.6.9.0-1.el6ev.noarch
rhevm-setup-base-3.6.11.3-0.1.el6.noarch
rhevm-tools-backup-3.6.11.3-0.1.el6.noarch
rhevm-restapi-3.6.11.3-0.1.el6.noarch
rhevm-vmconsole-proxy-helper-3.6.11.3-0.1.el6.noarch
rhevm-dbscripts-3.6.11.3-0.1.el6.noarch
rhevm-spice-client-x86-msi-3.6-7.el6.noarch
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch
rhevm-image-uploader-3.6.1-2.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-common-3.6.11.3-0.1.el6.noarch
rhevm-webadmin-portal-3.6.11.3-0.1.el6.noarch
rhevm-spice-client-x64-cab-3.6-7.el6.noarch
rhevm-extensions-api-impl-3.6.11.3-0.1.el6.noarch
rhevm-setup-plugin-ovirt-engine-3.6.11.3-0.1.el6.noarch
rhevm-3.6.11.3-0.1.el6.noarch
ovirt-vmconsole-proxy-1.0.4-1.el6ev.noarch
Linux version 2.6.32-696.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) ) #1 SMP Tue Feb 21 00:53:17 EST 2017
Linux RHEL6.9 2.6.32-696.el6.x86_64 #1 SMP Tue Feb 21 00:53:17 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 6.9 (Santiago)

Components on hosts:
qemu-kvm-rhev-2.9.0-16.el7_4.2.x86_64
libvirt-client-3.2.0-14.el7.x86_64
sanlock-3.5.0-1.el7.x86_64
ovirt-host-deploy-1.4.1-1.el7ev.noarch
rhevm-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.10-2.el7ev.noarch
mom-0.5.6-1.el7ev.noarch
vdsm-4.17.43-1.el7ev.noarch
rhevm-appliance-20160620.0-1.el7ev.noarch
ovirt-setup-lib-1.0.1-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.7.4-1.el7ev.noarch
Linux version 3.10.0-514.26.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Jun 20 01:16:02 EDT 2017
Linux 3.10.0-514.26.1.el7.x86_64 #1 SMP Tue Jun 20 01:16:02 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

1. I've upgraded one of the hosts to the latest 4.1 components and rebooted it.
2. I've migrated the HE-VM to the upgraded 4.1 host.
3. I've upgraded the remaining 3.6 ha-host to 4.1 components and rebooted it.
4. I've placed the ha-hosts into global maintenance.
5. I've backed up the engine's DB and copied it to the ha-host running the 3.6 engine.
6. I've started the engine's upgrade using "hosted-engine --upgrade-appliance" from the ha-host that was running the engine.
7. During the upgrade I've provided the appropriate path to the backup file and successfully restored the engine's DB.

No issues with root_squash were detected.

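The root_squash and anonuid/anongid behaviour discussed in this thread can be sketched as below. This is a minimal illustration only, not code from ovirt-hosted-engine-setup or from the NFS server: the function name `squashed_ids` is hypothetical, and the default anonymous id of 65534 (the account shown as nfsnobody in the listing above) is an assumption about the usual Linux NFS server default.

```python
from typing import Tuple

# Assumption: 65534 is the usual default anonymous uid/gid on a Linux NFS
# server (the account that shows up as nfsnobody on RHEL).
DEFAULT_ANON_ID = 65534


def squashed_ids(uid: int, gid: int, root_squash: bool = True,
                 anonuid: int = DEFAULT_ANON_ID,
                 anongid: int = DEFAULT_ANON_ID) -> Tuple[int, int]:
    """Return the (uid, gid) the server would apply to a client request.

    Hypothetical helper that models only the root_squash mapping:
    requests from uid 0 / gid 0 are mapped to the anonymous uid / gid.
    """
    if root_squash:
        uid = anonuid if uid == 0 else uid
        gid = anongid if gid == 0 else gid
    return uid, gid


# root on the client, root_squash only: files end up owned by the anonymous id
print(squashed_ids(0, 0))
# root on the client, root_squash with anonuid=36,anongid=36: mapped to 36:36
print(squashed_ids(0, 0, anonuid=36, anongid=36))
# a process already running as 36:36 is not affected by root_squash
print(squashed_ids(36, 36))
```

Under these assumptions, a tool that accesses the share as root only lands on 36:36 when the export also sets anonuid=36,anongid=36, whereas a tool that first switches to 36:36 is unaffected by root_squash either way.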
Still getting this error on these components on hosts:
qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64
ovirt-setup-lib-1.1.3-1.el7ev.noarch
sanlock-3.5.0-1.el7.x86_64
mom-0.5.9-1.el7ev.noarch
vdsm-4.19.23-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.1.3.4-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-hosted-engine-ha-2.1.4-1.el7ev.noarch
libvirt-client-3.2.0-14.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
ovirt-host-deploy-1.6.6-1.el7ev.noarch
Linux version 3.10.0-514.26.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Jun 20 01:16:02 EDT 2017
Linux 3.10.0-514.26.1.el7.x86_64 #1 SMP Tue Jun 20 01:16:02 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

[ ERROR ] Failed to execute stage 'Misc configuration': /run/user/0/libguestfsAh29q7: cannot create temporary directory: Permission denied
[ INFO ] Yum Performing yum transaction rollback
[ INFO ] Stage: Clean up
[ ERROR ] Failed to execute stage 'Clean up': 'Plugin' object has no attribute 'log'
[ ERROR ] Failed to execute stage 'Clean up': [Errno 13] Permission denied: '/tmp/tmpbKaOwu'
[ ERROR ] Failed to execute stage 'Clean up': [Errno 13] Permission denied: '/tmp/tmpgtzauo'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine upgrade failed: this system is not reliable, you can use --rollback-upgrade option to recover the engine VM disk from a backup
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20170723150218-f8x0cz.log

Returning back to assigned.

Created attachment 1303159 [details]
sosreport from host
Created attachment 1303160 [details]
ovirt-hosted-engine-setup-20170723150218-f8x0cz.log
To get back to a functional engine I had to run "hosted-engine --rollback-upgrade" on host alma04 and then cancel global maintenance so that the engine could start.

What Nikolai reported in https://bugzilla.redhat.com/show_bug.cgi?id=1467813#c16 is about ovirt-hosted-engine-setup, while temporarily running as the VDSM user, not being able to write a temporary file under /run/user/0 (as expected). The issue happens only with libguestfs 1:1.36.3-6.el7 from RHEL 7.4, while it works correctly with libguestfs 1:1.32.7-3.el7_3.3 from RHEL 7.3. The issue is well described here: https://bugzilla.redhat.com/show_bug.cgi?id=1469134#c4

As Simone says, the error in the log file is:

/run/user/0/libguestfsAh29q7: cannot create temporary directory: Permission denied

which is indeed caused by the su / XDG_RUNTIME_DIR problem described fully in this comment: https://bugzilla.redhat.com/show_bug.cgi?id=1469134#c4

I've deployed a clean 3.6.11.3 HE on a RHEL 7.4 host over NFS and attached an NFS data storage domain to get the HE-VM and the HE storage domain auto-imported.

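The su / XDG_RUNTIME_DIR problem mentioned above can be illustrated with a short sketch. It is not the fix that shipped in ovirt-hosted-engine-setup, and the helper name `env_for_uid` is hypothetical. It assumes the /run/user/<uid> layout seen in the error message and that uid 36 is the VDSM user, and it only shows the idea: a process that switches away from root should not hand its child an XDG_RUNTIME_DIR that still points at /run/user/0.

```python
import os
from typing import Dict, Mapping, Optional


def env_for_uid(target_uid: int,
                base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment, dropping an XDG_RUNTIME_DIR of another uid.

    Hypothetical helper: assumes runtime directories follow the
    /run/user/<uid> layout seen in the error message.
    """
    env = dict(os.environ if base_env is None else base_env)
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    expected = f"/run/user/{target_uid}"
    if runtime_dir is not None and runtime_dir.rstrip("/") != expected:
        # Inherited from the original session (for example root's
        # /run/user/0); the target user cannot write there, so remove it.
        del env["XDG_RUNTIME_DIR"]
    return env


# Environment as root would see it, then the copy prepared for uid 36
# (assumed here to be the VDSM user).
root_env = {"PATH": "/usr/bin", "XDG_RUNTIME_DIR": "/run/user/0"}
vdsm_env = env_for_uid(36, root_env)
print(vdsm_env)
```

A mapping built this way could be passed as the `env` argument of `subprocess.run` when starting a child process as the target user, so that a library falling back to XDG_RUNTIME_DIR for temporary files does not pick up root's directory.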
Components on host:
rhevm-appliance-20160620.0-1.el7ev.noarch
libvirt-client-3.2.0-14.el7.x86_64
mom-0.5.6-1.el7ev.noarch
sanlock-3.5.0-1.el7.x86_64
ovirt-setup-lib-1.0.1-1.el7ev.noarch
vdsm-4.17.43-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.10-2.el7ev.noarch
rhevm-sdk-python-3.6.9.1-1.el7ev.noarch
qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.7.4-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
Linux version 3.10.0-693.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jul 6 19:56:57 EDT 2017
Linux 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

Components on engine:
rhevm-spice-client-x64-cab-3.6-7.el6.noarch
rhevm-setup-plugins-3.6.5-1.el6ev.noarch
rhevm-tools-backup-3.6.11.3-0.1.el6.noarch
rhevm-setup-plugin-vmconsole-proxy-helper-3.6.11.3-0.1.el6.noarch
rhevm-branding-rhev-3.6.0-10.el6ev.noarch
rhevm-tools-3.6.11.3-0.1.el6.noarch
rhevm-setup-base-3.6.11.3-0.1.el6.noarch
rhevm-spice-client-x86-cab-3.6-7.el6.noarch
rhevm-guest-agent-common-1.0.11-6.el6ev.noarch
rhevm-restapi-3.6.11.3-0.1.el6.noarch
rhevm-3.6.11.3-0.1.el6.noarch
rhevm-setup-plugin-ovirt-engine-3.6.11.3-0.1.el6.noarch
rhevm-websocket-proxy-3.6.11.3-0.1.el6.noarch
rhevm-image-uploader-3.6.1-2.el6ev.noarch
rhevm-extensions-api-impl-3.6.11.3-0.1.el6.noarch
rhevm-log-collector-3.6.1-1.el6ev.noarch
rhevm-spice-client-x86-msi-3.6-7.el6.noarch
rhevm-webadmin-portal-3.6.11.3-0.1.el6.noarch
rhevm-backend-3.6.11.3-0.1.el6.noarch
rhevm-lib-3.6.11.3-0.1.el6.noarch
rhevm-sdk-python-3.6.9.1-1.el6ev.noarch
rhevm-setup-plugin-websocket-proxy-3.6.11.3-0.1.el6.noarch
rhevm-setup-3.6.11.3-0.1.el6.noarch
rhevm-cli-3.6.9.0-1.el6ev.noarch
rhevm-dependencies-3.6.1-1.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-common-3.6.11.3-0.1.el6.noarch
rhevm-doc-3.6.10-1.el6ev.noarch
rhevm-userportal-3.6.11.3-0.1.el6.noarch
rhev-guest-tools-iso-3.6-6.el6ev.noarch
rhevm-spice-client-x64-msi-3.6-7.el6.noarch
rhevm-iso-uploader-3.6.0-1.el6ev.noarch
rhevm-vmconsole-proxy-helper-3.6.11.3-0.1.el6.noarch
rhevm-dbscripts-3.6.11.3-0.1.el6.noarch
Linux version 2.6.32-642.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Wed Apr 13 00:51:26 EDT 2016
Linux 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 6.9 (Santiago)

Then I set the ha-host into global maintenance, backed up the engine and copied the backup files to the ha-host. Then I updated the repositories on the host to 4.1, upgraded the host and started upgrade-appliance. I provided the backup file during the upgrade and it finished successfully.

Components on upgraded host:
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
libvirt-client-3.2.0-14.el7.x86_64
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.3.5-1.el7ev.noarch
qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
rhevm-appliance-4.0.20170302.0-1.el7ev.noarch
ovirt-host-deploy-1.6.6-1.el7ev.noarch
mom-0.5.9-1.el7ev.noarch
vdsm-4.19.25-1.el7ev.x86_64
ovirt-setup-lib-1.1.3-1.el7ev.noarch
sanlock-3.5.0-1.el7.x86_64
ovirt-hosted-engine-ha-2.1.4-1.el7ev.noarch
Linux version 3.10.0-693.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jul 6 19:56:57 EDT 2017
Linux 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

Engine:
rhevm-spice-client-x64-msi-4.0-3.el7ev.noarch
rhevm-4.0.7.4-0.1.el7ev.noarch
rhevm-spice-client-x86-msi-4.0-3.el7ev.noarch
rhev-guest-tools-iso-4.0-7.el7ev.noarch
rhevm-setup-plugins-4.0.0.3-1.el7ev.noarch
rhevm-doc-4.0.7-1.el7ev.noarch
rhevm-dependencies-4.0.0-1.el7ev.noarch
rhevm-guest-agent-common-1.0.12-4.el7ev.noarch
rhevm-branding-rhev-4.0.0-7.el7ev.noarch
Linux version 3.10.0-514.6.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Fri Feb 17 19:21:31 EST 2017
Linux 3.10.0-514.6.2.el7.x86_64 #1 SMP Fri Feb 17 19:21:31 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

These were seen on the NFS share with the root_squash attribute:

# ls -lsha /home/share/
total 12K
4.0K drwxrwxrwx. 3 root root 4.0K Aug 1 13:10 .
4.0K drwxr-xr-x. 4 root root 4.0K Jul 23 11:32 ..
4.0K drwxr-xr-x. 5 36 36 4.0K Aug 1 13:11 38d23a4d-1184-4498-a198-09ec23ab4c1f
0 -rwxr-xr-x. 1 36 36 0 Aug 1 20:33 __DIRECT_IO_TEST__

Moving to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2422