Created attachment 1397975 [details]
engine logs

Description of problem:
I used part of the engine's FQDN as the engine admin password. Ansible deployment of SHE on RHEL 7.5 over Gluster failed on the components listed below:

[ INFO ] TASK [Get ovirtmgmt route table id]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": true, "cmd": "ip rule list | grep ovirtmgmt | sed s/\\\\[.*\\\\]\\ //g | awk '{ print $9 }'", "delta": "0:00:00.011787", "end": "2018-02-19 19:53:09.673841", "rc": 0, "start": "2018-02-19 19:53:09.662054", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] ok: [localhost]
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180219195317.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180219193322-xb8oaq.log

[root@alma03 ~]# date
Mon Feb 19 19:54:11 IST 2018

Version-Release number of selected component (if applicable):
ovirt-engine-setup-4.2.1.5-0.1.el7.noarch
ovirt-hosted-engine-ha-2.2.5-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.10-1.el7ev.noarch
rhvm-appliance.noarch 2:4.2-20180202.0.el7
Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)
Linux 3.10.0-829.el7.x86_64 #1 SMP Tue Jan 9 23:06:01 EST 2018 x86_64 x86_64 x86_64 GNU/Linux

alma03 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.35.95.254    0.0.0.0         UG    100    0        0 enp5s0f0
10.35.92.0      0.0.0.0         255.255.252.0   U     100    0        0 enp5s0f0
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0

alma03 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.525400a85bc3       yes             virbr0-nic
                                                        vnet0

alma03 ~]# ip rule list
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default

alma03 ~]# arp -a
vm-93-254.qa.lab.tlv.redhat.com (10.35.95.254) at 9c:cc:83:52:81:60 [ether] on enp5s0f0
alma04.qa.lab.tlv.redhat.com (10.35.92.4) at a0:36:9f:3b:16:7c [ether] on enp5s0f0
nsednev-he-1.qa.lab.tlv.redhat.com (192.168.122.167) at 00:16:3e:7b:b8:53 [ether] on virbr0

nsednev-he-1 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.122.1   0.0.0.0         UG    100    0        0 eth0
192.168.122.0   0.0.0.0         255.255.255.0   U     100    0        0 eth0

nsednev-he-1 ~]# ip rule list
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default

nsednev-he-1 ~]# arp -a
gateway (192.168.122.1) at 52:54:00:a8:5b:c3 [ether] on eth0

How reproducible:
100%

Steps to Reproduce:
1. Deploy SHE Node 0 over Gluster on RHEL 7.5 and provide an engine admin password that is part of the SHE FQDN, e.g. FQDN nsednev-he-1.qa.lab.tlv.redhat.com with admin password nsednev.

Actual results:
Deployment failed with the errors described above.
Expected results:
Deployment should succeed.

Additional info:
sosreports from engine and host attached.
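For context, the failing [Get ovirtmgmt route table id] task retries (50 attempts) a shell pipeline that extracts a routing table id from `ip rule list`. A minimal Python sketch (not the actual Ansible code; the parsing is simplified) of the same check, showing why it times out when no ovirtmgmt rule ever appears:

```python
# Hypothetical re-implementation of the check performed by the
# "Get ovirtmgmt route table id" task: scan `ip rule list` output for
# a rule mentioning ovirtmgmt and pull out the table it looks up.
import re

def ovirtmgmt_table_id(ip_rule_output):
    """Return the routing table id of the ovirtmgmt rule, or None."""
    for line in ip_rule_output.splitlines():
        if 'ovirtmgmt' in line:
            m = re.search(r'lookup\s+(\S+)', line)
            if m:
                return m.group(1)
    return None

# On the failing host only the three default rules exist (see the
# `ip rule list` output above), so the lookup never succeeds and the
# task exhausts all 50 retries:
rules = """0:\tfrom all lookup local
32766:\tfrom all lookup main
32767:\tfrom all lookup default"""
print(ovirtmgmt_table_id(rules))  # prints None
```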
Created attachment 1397976 [details]
alma03 logs
The problem disappeared after running these commands on the host prior to deployment:

systemctl stop NetworkManager
systemctl restart network

Much the same as was done here: https://bugzilla.redhat.com/show_bug.cgi?id=1540451#c16.
I was able to successfully deploy Node 0 on RHEL 7.5 over Gluster after stopping NetworkManager and restarting the network on the host before starting the SHE deployment.
This seems relevant:

Feb 19 19:44:53 alma03 vdsm-tool: Traceback (most recent call last):
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/bin/vdsm-tool", line 219, in main
Feb 19 19:44:53 alma03 vdsm-tool: return tool_command[cmd]["command"](*args)
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/tool/network.py", line 94, in dump_bonding_options
Feb 19 19:44:53 alma03 vdsm-tool: sysfs_options_mapper.dump_bonding_options()
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/network/link/bond/sysfs_options_mapper.py", line 50, in dump_bonding_options
Feb 19 19:44:53 alma03 vdsm-tool: jdump(_get_bonding_options_name2numeric(), f)
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/network/link/bond/sysfs_options_mapper.py", line 90, in _get_bonding_options_name2numeric
Feb 19 19:44:53 alma03 vdsm-tool: with _bond_device(bond_name, mode):
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
Feb 19 19:44:53 alma03 vdsm-tool: return self.gen.next()
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/network/link/bond/sysfs_options_mapper.py", line 102, in _bond_device
Feb 19 19:44:53 alma03 vdsm-tool: _change_mode(bond_name, mode)
Feb 19 19:44:53 alma03 vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/network/link/bond/sysfs_options_mapper.py", line 112, in _change_mode
Feb 19 19:44:53 alma03 vdsm-tool: opt.write(mode)
Feb 19 19:44:53 alma03 vdsm-tool: IOError: [Errno 22] Invalid argument
Feb 19 19:44:53 alma03 NetworkManager[579]: <info> [1519062293.6920] manager: (bondscan-irTVhQ): new Bond device (/org/freedesktop/NetworkManager/Devices/29)
Feb 19 19:44:53 alma03 iscsid: iSCSI daemon with pid=93652 started!
Feb 19 19:44:53 alma03 systemd: vdsm-network-init.service: main process exited, code=exited, status=1/FAILURE
Feb 19 19:44:53 alma03 systemd: Failed to start Virtual Desktop Server Manager network IP+link restoration.
Feb 19 19:44:53 alma03 systemd: Dependency failed for Virtual Desktop Server Manager network restoration.
Feb 19 19:44:53 alma03 systemd: Dependency failed for Virtual Desktop Server Manager.
Feb 19 19:44:53 alma03 systemd: Dependency failed for MOM instance configured for VDSM purposes.
Feb 19 19:44:53 alma03 systemd: Job mom-vdsm.service/start failed with result 'dependency'.
Feb 19 19:44:53 alma03 systemd: Job vdsmd.service/start failed with result 'dependency'.
Feb 19 19:44:53 alma03 systemd: Job vdsm-network.service/start failed with result 'dependency'.
Feb 19 19:44:53 alma03 systemd: Unit vdsm-network-init.service entered failed state.
Feb 19 19:44:53 alma03 systemd: vdsm-network-init.service failed.
Please retry with a newer kernel.
We still have the same kernel version as reported in the description:
Linux 3.10.0-829.el7.x86_64 #1 SMP Tue Jan 9 23:06:01 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
Why was this bug moved to ON_QA? Did I miss something, or do we have an old kernel?
It should be >= -851 according to https://bugzilla.redhat.com/show_bug.cgi?id=1540451#c19; -829 is not >= -851.
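For anyone verifying a host before retrying: the comparison above is just on the el7 build number in `uname -r`. A small hypothetical helper (not part of any shipped tooling) to make that check explicit:

```python
import re

def el7_build(kernel_release):
    """Extract the el7 build number, e.g. '3.10.0-829.el7.x86_64' -> 829."""
    m = re.search(r'-(\d+)\.el7', kernel_release)
    return int(m.group(1)) if m else None

# Minimum build per https://bugzilla.redhat.com/show_bug.cgi?id=1540451#c19
MIN_BUILD = 851

print(el7_build('3.10.0-829.el7.x86_64') >= MIN_BUILD)  # prints False
print(el7_build('3.10.0-851.el7.x86_64') >= MIN_BUILD)  # prints True
```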
Works for me on these components:
ovirt-hosted-engine-ha-2.2.5-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.10-1.el7ev.noarch
rhvm-appliance-4.2-20180202.0.el7.noarch
Linux 3.10.0-851.el7.x86_64 #1 SMP Mon Feb 12 07:53:52 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)
This bugzilla is included in the oVirt 4.2.2 release, published on March 28th 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.