Description of problem:
During deployment of HE on NGN 4.0 the FQDN changed to localhost.localdomain and the HE deployment failed; there was also insufficient space in /var/tmp, so HE could not be installed using the appliance.

[root@alma03 ~]# hosted-engine --deploy
[ INFO ] Stage: Initializing
[ INFO ] Generating a temporary VNC password.
[ INFO ] Stage: Environment setup
During customization use CTRL-D to abort.
Continuing will configure this host for serving as hypervisor and create a VM where you have to install the engine afterwards.
Are you sure you want to continue? (Yes, No)[Yes]:
It has been detected that this program is executed through an SSH connection without using screen.
Continuing with the installation may lead to broken installation if the network connection fails.
It is highly recommended to abort the installation and run it inside a screen session using command "screen".
Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO ] Hardware supports virtualization
Configuration files: []
Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160530154518-pkecdc.log
Version: otopi-1.5.0_beta1 (otopi-1.5.0-0.1.beta1.el7.centos)
[ INFO ] Stage: Environment packages setup
[ INFO ] Stage: Programs detection
[ INFO ] Stage: Environment setup
[ INFO ] Generating libvirt-spice certificates
[WARNING] Cannot locate gluster packages, Hyper Converged setup support will be disabled.
[ INFO ] Please abort the setup and install vdsm-gluster, gluster-server >= 3.7.2 and restart vdsmd service in order to gain Hyper Converged setup support.
[ INFO ] Stage: Environment customization

--== STORAGE CONFIGURATION ==--

Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]:
Please specify the full shared storage connection path to use (example: host:/path): 10.35.64.11:/vol/RHEV/Virt/nsednev_3_6_HE_1
[ INFO ] Installing on first host

--== SYSTEM CONFIGURATION ==--

--== NETWORK CONFIGURATION ==--

Please indicate a nic to set ovirtmgmt bridge on: (p1p1, p1p2, em1, em2) [em1]: p1p1
iptables was detected on your computer, do you wish setup to configure it? (Yes, No)[Yes]:
Please indicate a pingable gateway IP address [10.35.117.254]:

--== VM CONFIGURATION ==--

Booting from cdrom on RHEL7 is ISO image based only, as cdrom passthrough is disabled (BZ760885)
Please specify the device to boot the VM from (choose disk for the oVirt engine appliance) (cdrom, disk, pxe) [disk]:
Please specify the console type you would like to use to connect to the VM (vnc, spice) [vnc]:
[ INFO ] Detecting available oVirt engine appliances
The following appliance have been found on your system:
[1] - The oVirt Engine Appliance image (OVA) - 4.0-20160528.1.el7.centos
[2] - Directly select an OVA file
Please select an appliance (1, 2) [1]:
[ INFO ] Verifying its sha1sum
[ INFO ] Checking OVF archive content (could take a few minutes depending on archive size)
[ INFO ] Checking OVF XML content (could take a few minutes depending on archive size)
[WARNING] OVF does not contain a valid image description, using default.
[ ERROR ] Not enough space in the temporary directory [/var/tmp]
Please specify path to a temporary directory with at least 10 GB [/var/tmp]: /root
Would you like to use cloud-init to customize the appliance on the first boot (Yes, No)[Yes]?
Would you like to generate on-fly a cloud-init ISO image (of no-cloud type) or do you have an existing one (Generate, Existing)[Generate]?
Please provide the FQDN you would like to use for the engine appliance.
Note: This will be the FQDN of the engine VM you are now going to launch, it should not point to the base host or to any other existing machine.
Engine VM FQDN: (leave it empty to skip): []: nsednev-he-1.qa.lab.tlv.redhat.com
Automatically execute engine-setup on the engine appliance on first boot (Yes, No)[Yes]?
Automatically restart the engine VM as a monitored service after engine-setup (Yes, No)[Yes]?
Please provide the domain name you would like to use for the engine appliance.
Engine VM domain: [qa.lab.tlv.redhat.com]
Enter root password that will be used for the engine appliance (leave it empty to skip):
Confirm appliance root password:
The following CPU types are supported by this host:
- model_SandyBridge: Intel SandyBridge Family
- model_Westmere: Intel Westmere Family
- model_Nehalem: Intel Nehalem Family
- model_Penryn: Intel Penryn Family
- model_Conroe: Intel Conroe Family
Please specify the CPU type to be used by the VM [model_SandyBridge]:
Please specify the number of virtual CPUs for the VM (Defaults to appliance OVF value): [4]:
[WARNING] Minimum requirements for disk size not met
You may specify a unicast MAC address for the VM or accept a randomly generated default [00:16:3e:05:1a:74]: 00:16:3E:7B:B8:53
Please specify the memory size of the VM in MB (Defaults to appliance OVF value): [16384]:
How should the engine VM network be configured (DHCP, Static)[DHCP]?
Add lines for the appliance itself and for this host to /etc/hosts on the engine VM?
Note: ensuring that this host could resolve the engine VM hostname is still up to you (Yes, No)[No] yes

--== HOSTED ENGINE CONFIGURATION ==--

Enter the name which will be used to identify this host inside the Administrator Portal [hosted_engine_1]: alma03.qa.lab.tlv.redhat.com
Enter engine admin password:
Confirm engine admin password:
Please provide the name of the SMTP server through which we will send notifications [localhost]: smtp.redhat.com
Please provide the TCP port number of the SMTP server [25]:
Please provide the email address from which notifications will be sent [root@localhost]: nsednevhe1
Please provide a comma-separated list of email addresses which will get notifications [root@localhost]: nsednev
[ INFO ] Stage: Setup validation

--== CONFIGURATION PREVIEW ==--

Bridge interface : p1p1
Engine FQDN : nsednev-he-1.qa.lab.tlv.redhat.com
Bridge name : ovirtmgmt
Host address : alma03.qa.lab.tlv.redhat.com
SSH daemon port : 22
Firewall manager : iptables
Gateway address : 10.35.117.254
Host name for web application : alma03.qa.lab.tlv.redhat.com
Storage Domain type : nfs3
Host ID : 1
Image size GB : 10
Storage connection : 10.35.64.11:/vol/RHEV/Virt/nsednev_3_6_HE_1
Console type : vnc
Memory size MB : 16384
MAC address : 00:16:3E:7B:B8:53
Boot type : disk
Number of CPUs : 4
OVF archive (for disk boot) : /usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.0-20160528.1.el7.centos.ova
Restart engine VM after engine-setup: True
CPU Type : model_SandyBridge

Please confirm installation settings (Yes, No)[Yes]:
[ INFO ] Stage: Transaction setup
[ INFO ] Stage: Misc configuration
[ INFO ] Stage: Package installation
[ INFO ] Stage: Misc configuration
[ INFO ] Configuring libvirt
[ INFO ] Configuring VDSM
[ INFO ] Starting vdsmd
[ INFO ] Configuring the management bridge
[ INFO ] Creating Storage Domain
[ INFO ] Creating Storage Pool
[ INFO ] Connecting Storage Pool
[ INFO ] Verifying sanlock lockspace initialization
[ INFO ] Creating Image for 'hosted-engine.lockspace' ...
[ INFO ] Image for 'hosted-engine.lockspace' created successfully
[ INFO ] Creating Image for 'hosted-engine.metadata' ...
[ INFO ] Image for 'hosted-engine.metadata' created successfully
[ INFO ] Creating VM Image
[ INFO ] Extracting disk image from OVF archive (could take a few minutes depending on archive size)
[ INFO ] Validating pre-allocated volume size
[ INFO ] Image not uploaded to data domain
[ ERROR ] Failed to execute stage 'Misc configuration': Command '/bin/sudo' failed to execute
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160530161141.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160530154518-pkecdc.log

Version-Release number of selected component (if applicable):
ovirt-engine-appliance-4.0-20160528.1.el7.centos.noarch
mom-0.5.3-1.1.el7.noarch
ovirt-vmconsole-host-1.0.2-0.0.master.20160517094103.git06df50a.el7.noarch
vdsm-4.17.999-1155.gitcf216a0.el7.centos.x86_64
ovirt-setup-lib-1.0.2-0.0.master.20160502125738.gitf05af9e.el7.centos.noarch
ovirt-release40-4.0.0-0.3.beta1.noarch
ovirt-vmconsole-1.0.2-0.0.master.20160517094103.git06df50a.el7.noarch
libvirt-client-1.2.17-13.el7_2.4.x86_64
ovirt-engine-sdk-python-3.6.5.1-0.1.20160507.git5fb7e0e.el7.centos.noarch
ovirt-host-deploy-1.5.0-0.1.alpha1.el7.centos.noarch
ovirt-hosted-engine-setup-2.0.0-0.1.beta1.el7.centos.noarch
ovirt-release-host-node-4.0.0-0.3.beta1.el7.noarch
sanlock-3.2.4-2.el7_2.x86_64
ovirt-hosted-engine-ha-2.0.0-0.1.beta1.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-0.3.beta1.el7.noarch
CentOS Linux release 7.2.1511 (Core)
Linux 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Linux version 3.10.0-327.18.2.el7.x86_64 (builder.centos.org) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Thu May 12 11:03:55 UTC 2016

How reproducible:
100%

Steps to Reproduce:
1. Deploy via CLI HE on NGN 4.0 from appliance.

Actual results:
1) Deployment failed.
2) There is not enough space in /var/tmp for the appliance.

Expected results:
Deployment should succeed.

Additional info:
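The transcript above shows the setup prompting for "a temporary directory with at least 10 GB" after rejecting /var/tmp. A minimal sketch of such a free-space check (a hypothetical helper, not the actual ovirt-hosted-engine-setup code) could look like:

```python
import os

# The 10 GB minimum mentioned by the setup prompt (assumption: GiB).
REQUIRED_BYTES = 10 * 1024 ** 3


def has_enough_space(path, required=REQUIRED_BYTES):
    """Return True if the filesystem backing `path` has at least
    `required` bytes available to unprivileged callers."""
    st = os.statvfs(path)
    # f_bavail counts blocks available to non-root; f_frsize is the
    # fragment size those counts are expressed in.
    return st.f_bavail * st.f_frsize >= required
```

Note that free space alone is not sufficient here: as the rest of this bug shows, the chosen directory must also be accessible to the vdsm user.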
Created attachment 1162912 [details] sosreport from host
Can you please also attach the HE VM sos report?
The error is here:

[ ERROR ] Failed to execute stage 'Misc configuration': Command '/bin/sudo' failed to execute

But in my opinion the issue was here:

Please specify path to a temporary directory with at least 10 GB [/var/tmp]: /root

Can you please try using a scratch directory that can be accessed by the vdsm user? If the issue is here, we have a bug, since we need to enforce that.
I'm aware of two bugs that might have an impact here:

Bug 1338511 - HE does not work when /var is too small (default case) - MODIFIED
Bug 1329943 - myhostname is missing from the hosts line in nsswitch.conf

Once we've got the logs, we can probably see whether it's bug 1338511 or something else.
Yes, the issue is exactly here:

2016-05-30 16:1:33 DEBUG otopi.plugins.gr_he_common.vm.boot_disk plugin.execute:926 execute-output: ('/bin/sudo', '-u', 'vdsm', '-g', 'kvm', '/bin/qemu-img', 'info', '--output', 'json', '/root/tmpZd0aEB') stderr:
qemu-img: Could not open '/root/tmpZd0aEB': Could not open '/root/tmpZd0aEB': Permission denied

2016-05-30 16:1:33 DEBUG otopi.transaction transaction._prepare:66 exception during prepare phase
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/transaction.py", line 62, in _prepare
    element.prepare()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/vm/boot_disk.py", line 218, in prepare
    self._validate_volume()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/vm/boot_disk.py", line 101, in _validate_volume
    raiseOnError=True
  File "/usr/lib/python2.7/site-packages/otopi/plugin.py", line 931, in execute
    command=args[0],
RuntimeError: Command '/bin/sudo' failed to execute
2016-05-30 16:1:33 DEBUG otopi.transaction transaction.abort:19 aborting 'Image Transaction'
2016-05-30 16:1:33 INFO otopi.plugins.gr_he_common.vm.boot_disk boot_disk.abort:222 Image not uploaded to data domain

The scratch dir where we extract the OVF image has to be readable by the vdsm user, since we need to upload the image as the vdsm user. Having more space in /var/tmp, as for bug 1338511, could prevent this since the default dir would then have enough space, but we should also avoid letting the user enter a wrong path by validating it better.
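The qemu-img failure above happens because /root is not traversable by the vdsm user. One cheap heuristic the setup could apply before extracting the image there is to walk the directory components and check the "search" (execute) bit for others. This is a sketch only (hypothetical helper, not the boot_disk.py code); it ignores group membership, ACLs and SELinux, so it can only rule paths out, not fully validate them:

```python
import os
import stat


def parents_world_searchable(path):
    """Heuristic: every directory above `path` must grant the execute
    (search) bit to 'others', or an unrelated user such as vdsm cannot
    traverse down to the file at all (the /root case in this bug)."""
    d = os.path.dirname(os.path.abspath(path))
    while True:
        if not os.stat(d).st_mode & stat.S_IXOTH:
            return False          # e.g. /root is typically mode 0550
        parent = os.path.dirname(d)
        if parent == d:           # reached the filesystem root
            return True
        d = parent
```

With a default /root (mode 0550 or 0700), `parents_world_searchable('/root/tmpZd0aEB')` would return False, which is exactly the condition that made qemu-img fail here.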
It seems instead that hosted-engine-setup got the right hostname, and I don't see it changing in the logs, so the bug title seems a bit confusing:

2016-05-30 16:08:17 DEBUG otopi.plugins.gr_he_setup.network.bridge bridge._get_hostname_from_bridge_if:308 hostname: 'alma03.qa.lab.tlv.redhat.com', aliaslist: '[]', ipaddrlist: '['10.35.17.24']'
2016-05-30 16:08:17 DEBUG otopi.context context.dumpEnvironment:760 ENVIRONMENT DUMP - BEGIN
2016-05-30 16:08:17 DEBUG otopi.context context.dumpEnvironment:770 ENV OVEHOSTED_NETWORK/host_name=str:'alma03.qa.lab.tlv.redhat.com'
(In reply to Sandro Bonazzola from comment #2)
> Can you please attach also HE VM sos report?

It didn't get to that stage at all:

[root@localhost ~]# hosted-engine --vm-status
You must run deploy first
(In reply to Nikolai Sednev from comment #6)
> (In reply to Sandro Bonazzola from comment #2)
> > Can you please attach also HE VM sos report?
>
> It didn't got to it at all.
> [root@localhost ~]# hosted-engine --vm-status
> You must run deploy first

But when you started it was:

[root@alma03 ~]# hosted-engine --deploy
[ INFO ] Stage: Initializing

When did it change?
(In reply to Simone Tiraboschi from comment #7)
> (In reply to Nikolai Sednev from comment #6)
> > (In reply to Sandro Bonazzola from comment #2)
> > > Can you please attach also HE VM sos report?
> >
> > It didn't got to it at all.
> > [root@localhost ~]# hosted-engine --vm-status
> > You must run deploy first
>
> But when you started it was:
>
> [root@alma03 ~]# hosted-engine --deploy
> [ INFO ] Stage: Initializing
>
> when did it changed?

During the HE deployment phase.
There should be a simple check that the vdsm user can access the provided scratch dir. The user executes the action as root, and it may not be obvious that some activities during the flow are performed by another user, i.e. vdsm.
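The check suggested above could reuse the same sudo -u vdsm pattern the setup already uses for qemu-img: before accepting the scratch dir, run a read/traverse test as vdsm and reject the path on failure. A sketch (hypothetical helper name; the `runner` parameter is injectable so the logic can be exercised without an actual vdsm user or sudo rights):

```python
import subprocess


def scratch_dir_usable(path, user="vdsm", runner=subprocess.call):
    """Return True if `user` can read and traverse `path`, by running
    `test -r PATH -a -x PATH` as that user via non-interactive sudo.
    The setup runs as root, so sudo -u vdsm needs no password there."""
    cmd = ["sudo", "-n", "-u", user, "--", "test", "-r", path, "-a", "-x", path]
    return runner(cmd) == 0
```

Prompting again on a False result, instead of proceeding, would have turned the late '/bin/sudo' failure in this bug into an immediate, understandable validation error.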
HE deployment successfully completed on 4.1 RHEVH (rhvh-4.1-0.20170202.0+1):

# rpm -qa | grep appliance
rhvm-appliance-4.1.20170126.0-1.el7ev.noarch

[root@puma18 ~]# find / | grep rhvm-appliance-4.1.20170126.0-1.el7ev.noarch
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/checksum_data
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/origin_url
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/checksum_type
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/releasever
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/command_line
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/from_repo_revision
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/installed_by
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/reason
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/from_repo
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/from_repo_timestamp
/usr/share/yum/yumdb/r/50269336266f59037ccd019653df2fe785410fa3-rhvm-appliance-4.1.20170126.0-1.el7ev-noarch/var_uuid
/var/tmp/yum-root-SPSf4h/rhvm-appliance-4.1.20170126.0-1.el7ev.noarch.rpm
/var/imgbased/persisted-rpms/rhvm-appliance-4.1.20170126.0-1.el7ev.noarch.rpm

Components on host:
rhvm-appliance-4.1.20170126.0-1.el7ev.noarch
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-hosted-engine-ha-2.1.0.1-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.0.1-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-host-deploy-1.6.0-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-node-ng-nodectl-4.1.0-0.20170104.1.el7.noarch
libvirt-client-2.0.0-10.el7_3.4.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.3.x86_64
vdsm-4.19.4-1.el7ev.x86_64
sanlock-3.4.0-1.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
mom-0.5.8-1.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
ovirt-setup-lib-1.1.0-1.el7ev.noarch

Linux version 3.10.0-514.6.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Sat Dec 10 11:15:38 EST 2016
Linux 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 7.3

Moving to verified.