Description of problem:
When configuring hosted-engine-setup, the install fails on host setup. The issue can be verified by installing rhvm-appliance and then running vdsm-tool configure --force, which fails with an error about /var/run/vdsm/bonding-defaults.json not being found.

Version-Release number of selected component (if applicable):
vdsm-4.20.22-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
kernel-3.10.0-830.el7.x86_64
rhvm-appliance-4.2-20180202.0.el7.noarch
libvirt-python-3.9.0-1.el7.x86_64
libvirt-libs-3.9.0-14.el7.x86_64
mom-0.5.12-1.el7ev.noarch
ovirt-host-deploy-1.7.3-1.el7ev.noarch

How reproducible:
For 4.2.2-6 it is reproducible on my system every time (also reproduced on 4.2.2-5).

Steps to Reproduce:
1. Follow the steps here - http://ci-web.eng.lab.tlv.redhat.com/docs/4.2/Guide/install_guide/index.html
2. yum -y update
3. reboot
4. yum install rhvm-appliance
5. Configure the ovirtmgmt bridge (this wasn't done the first time but was done on subsequent attempts; it did not make a difference) - https://www.ovirt.org/develop/developer-guide/vdsm/installing-vdsm-from-rpm/
6. reboot
7. systemctl start lvm2-activation-early.service lvm2-activation.service (these didn't start)
8a. vdsm-tool configure --force (fails)
OR
8b. hosted-engine --deploy --config-append=/root/storage.conf (this is just a gluster replica 3 conf, but it never gets to this stage; it fails on host deploy)

Actual results:

[root@virt04 ~]# vdsm-tool configure --force

Checking configuration status...

abrt is not configured for vdsm
lvm is configured for vdsm
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
Current revision of multipath.conf detected, preserving
schema should be configured

Running configure...
Reconfiguration of abrt is done.
Reconfiguration of passwd is done.
Reconfiguration of sebool is done.
Reconfiguration of certificates is done.
Reconfiguration of libvirt is done.
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 219, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/__init__.py", line 38, in wrapper
    func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 141, in configure
    _configure(c)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 88, in _configure
    getattr(module, 'configure', lambda: None)()
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/bond_defaults.py", line 37, in configure
    sysfs_options_mapper.dump_bonding_options()
  File "/usr/lib/python2.7/site-packages/vdsm/network/link/bond/sysfs_options_mapper.py", line 46, in dump_bonding_options
    with open(sysfs_options.BONDING_DEFAULTS, 'w') as f:
IOError: [Errno 2] No such file or directory: '/var/run/vdsm/bonding-defaults.json'

OR it fails on host deploy in hosted-engine setup: the engine gets a DHCP IP, engine ssh works, and I can ping the host from the engine VM; engine-setup runs and completes on the engine; it then tries to configure the host and fails.

Expected results:
success

Additional info:
Errors noticed in vdsm.log:

2018-03-16 19:43:02,983-0400 INFO (MainThread) [vds] (PID: 3567) I am the actual vdsm 4.20.22-1.el7ev virt04.gamull.com (3.10.0-861.el7.x86_64) (vdsmd:149)
2018-03-16 19:43:02,984-0400 INFO (MainThread) [vds] VDSM will run with cpu affinity: frozenset([1]) (vdsmd:262)
2018-03-16 19:43:02,994-0400 INFO (MainThread) [storage.HSM] START HSM init (hsm:366)
2018-03-16 19:43:03,000-0400 INFO (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:373)
2018-03-16 19:43:03,001-0400 INFO (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
2018-03-16 19:43:03,242-0400 INFO (MainThread) [storage.udev] Registering multipath event monitor <vdsm.storage.mpathhealth.Monitor object at 0x7f5369a4df90> (udev:182)
2018-03-16 19:43:03,243-0400 INFO (MainThread) [storage.udev] Starting multipath event listener (udev:116)
2018-03-16 19:43:03,290-0400 INFO (MainThread) [storage.check] Starting check service (check:91)
2018-03-16 19:43:03,301-0400 INFO (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (dispatcher:48)
2018-03-16 19:43:03,302-0400 INFO (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x139996205984592> (asyncevent:125)
2018-03-16 19:43:03,381-0400 INFO (MainThread) [vdsm.api] START registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x7f53681b0e68>) from=internal, task_id=61af4ff5-b152-4459-a5a3-0a66b8e81af7 (api:46)
2018-03-16 19:43:03,382-0400 INFO (MainThread) [vdsm.api] FINISH registerDomainStateChangeCallback return=None from=internal, task_id=61af4ff5-b152-4459-a5a3-0a66b8e81af7 (api:52)
2018-03-16 19:43:03,382-0400 INFO (MainThread) [MOM] Preparing MOM interface (momIF:53)
2018-03-16 19:43:03,384-0400 INFO (MainThread) [MOM] Using named unix socket /var/run/vdsm/mom-vdsm.sock (momIF:62)
2018-03-16 19:43:03,385-0400 INFO (MainThread) [root] Unregistering all secrets (secret:91)
2018-03-16 19:43:03,453-0400 INFO (MainThread) [vds] Setting channels' timeout to 30 seconds. (vmchannels:222)
2018-03-16 19:43:03,454-0400 INFO (MainThread) [vds] Starting QEMU-GA poller (qemuguestagent:112)
2018-03-16 19:43:03,458-0400 INFO (MainThread) [vds.MultiProtocolAcceptor] Listening at :::54321 (protocoldetector:183)
2018-03-16 19:43:03,524-0400 INFO (vmrecovery) [vds] recovery: completed in 0s (clientIF:593)
2018-03-16 19:43:03,535-0400 ERROR (MainThread) [vds] failed to init clientIF, shutting down storage dispatcher (clientIF:147)
2018-03-16 19:43:03,536-0400 INFO (MainThread) [vdsm.api] START prepareForShutdown(options=None) from=internal, task_id=6565f58d-017d-42af-ba1b-62400467e7ae (api:46)
2018-03-16 19:43:03,591-0400 INFO (MainThread) [storage.Monitor] Shutting down domain monitors (monitor:222)
2018-03-16 19:43:03,591-0400 INFO (MainThread) [storage.check] Stopping check service (check:104)
2018-03-16 19:43:03,592-0400 INFO (check/loop) [storage.asyncevent] Stopping <EventLoop running=True closed=False at 0x139996205984592> (asyncevent:220)
2018-03-16 19:43:03,592-0400 INFO (MainThread) [storage.udev] Stopping multipath event listener (udev:149)
2018-03-16 19:43:03,593-0400 INFO (MainThread) [vdsm.api] FINISH prepareForShutdown return=None from=internal, task_id=6565f58d-017d-42af-ba1b-62400467e7ae (api:52)
2018-03-16 19:43:03,593-0400 ERROR (MainThread) [vds] Exception raised (vdsmd:158)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in run
    serve_clients(log)
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in serve_clients
    cif = clientIF.getInstance(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 250, in getInstance
    cls._instance = clientIF(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 144, in __init__
    self._prepareJSONRPCServer()
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 307, in _prepareJSONRPCServer
    bridge = Bridge.DynamicBridge()
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 67, in __init__
    self._schema = vdsmapi.Schema(paths, api_strict_mode)
  File "/usr/lib/python2.7/site-packages/vdsm/api/vdsmapi.py", line 217, in __init__
    raise SchemaNotFound("Unable to find API schema file")
SchemaNotFound: Unable to find API schema file
2018-03-16 19:43:03,595-0400 INFO (MainThread) [vds] Stopping threads (vdsmd:160)
2018-03-16 19:43:03,595-0400 INFO (MainThread) [vds] Exiting (vdsmd:171)
2018-03-16 19:43:03,637-0400 ERROR (hsm/init) [storage.HSM] FINISH thread <Thread(hsm/init, started daemon 139995448858368)> failed (concurrent:201)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 406, in storageRefresh
    sdCache.refreshStorage()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/misc.py", line 404, in helper
    return sm(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/misc.py", line 389, in __call__
    self.__lastResult = self.__func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 82, in refreshStorage
    multipath.rescan()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/multipath.py", line 83, in rescan
    udevadm.settle(timeout)
  File "/usr/lib/python2.7/site-packages/vdsm/common/udevadm.py", line 50, in settle
    _run_command(args)
  File "/usr/lib/python2.7/site-packages/vdsm/common/udevadm.py", line 105, in _run_command
    rc, out, err = commands.execCmd(cmd, raw=True)
TypeError: 'NoneType' object is not callable
2018-03-16 19:43:09,938-0400 INFO (MainThread) [vds] (PID: 3697) I am the actual vdsm 4.20.22-1.el7ev virt04.gamull.com (3.10.0-861.el7.x86_64) (vdsmd:149)
2018-03-16 19:43:09,938-0400 INFO (MainThread) [vds] VDSM will run with cpu affinity: frozenset([1]) (vdsmd:262)
2018-03-16 19:43:09,948-0400 INFO (MainThread) [storage.HSM] START HSM init (hsm:366)
2018-03-16 19:43:09,953-0400 INFO (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:373)
2018-03-16 19:43:09,954-0400 INFO (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
2018-03-16 19:43:10,182-0400 INFO (MainThread) [storage.udev] Registering multipath event monitor <vdsm.storage.mpathhealth.Monitor object at 0x7f199eccef90> (udev:182)
2018-03-16 19:43:10,183-0400 INFO (MainThread) [storage.udev] Starting multipath event listener (udev:116)
2018-03-16 19:43:10,230-0400 INFO (MainThread) [storage.check] Starting check service (check:91)
2018-03-16 19:43:10,242-0400 INFO (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (dispatcher:48)
2018-03-16 19:43:10,243-0400 INFO (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x139747972914000> (asyncevent:125)
2018-03-16 19:43:10,322-0400 INFO (MainThread) [vdsm.api] START registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x7f199c42fe68>) from=internal, task_id=d76d22bc-2164-4ec6-8a2a-3bd056410996 (api:46)
2018-03-16 19:43:10,322-0400 INFO (MainThread) [vdsm.api] FINISH registerDomainStateChangeCallback return=None from=internal, task_id=d76d22bc-2164-4ec6-8a2a-3bd056410996 (api:52)
2018-03-16 19:43:10,323-0400 INFO (MainThread) [MOM] Preparing MOM interface (momIF:53)
2018-03-16 19:43:10,325-0400 INFO (MainThread) [MOM] Using named unix socket /var/run/vdsm/mom-vdsm.sock (momIF:62)
2018-03-16 19:43:10,326-0400 INFO (MainThread) [root] Unregistering all secrets (secret:91)
2018-03-16 19:43:10,396-0400 INFO (MainThread) [vds] Setting channels' timeout to 30 seconds. (vmchannels:222)
2018-03-16 19:43:10,397-0400 INFO (MainThread) [vds] Starting QEMU-GA poller (qemuguestagent:112)
2018-03-16 19:43:10,400-0400 INFO (MainThread) [vds.MultiProtocolAcceptor] Listening at :::54321 (protocoldetector:183)
2018-03-16 19:43:10,451-0400 INFO (vmrecovery) [vds] recovery: completed in 0s (clientIF:593)
2018-03-16 19:43:10,469-0400 ERROR (MainThread) [vds] failed to init clientIF, shutting down storage dispatcher (clientIF:147)
2018-03-16 19:43:10,470-0400 INFO (MainThread) [vdsm.api] START prepareForShutdown(options=None) from=internal, task_id=2104d6f2-56c5-4019-922a-c91ee1675f90 (api:46)
2018-03-16 19:43:10,524-0400 INFO (MainThread) [storage.Monitor] Shutting down domain monitors (monitor:222)
2018-03-16 19:43:10,525-0400 INFO (MainThread) [storage.check] Stopping check service (check:104)
2018-03-16 19:43:10,525-0400 INFO (check/loop) [storage.asyncevent] Stopping <EventLoop running=True closed=False at 0x139747972914000> (asyncevent:220)
2018-03-16 19:43:10,526-0400 INFO (MainThread) [storage.udev] Stopping multipath event listener (udev:149)
2018-03-16 19:43:10,526-0400 INFO (MainThread) [vdsm.api] FINISH prepareForShutdown return=None from=internal, task_id=2104d6f2-56c5-4019-922a-c91ee1675f90 (api:52)
2018-03-16 19:43:10,527-0400 ERROR (MainThread) [vds] Exception raised (vdsmd:158)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in run
    serve_clients(log)
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in serve_clients
    cif = clientIF.getInstance(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 250, in getInstance
    cls._instance = clientIF(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 144, in __init__
    self._prepareJSONRPCServer()
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 307, in _prepareJSONRPCServer
    bridge = Bridge.DynamicBridge()
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 67, in __init__
    self._schema = vdsmapi.Schema(paths, api_strict_mode)
  File "/usr/lib/python2.7/site-packages/vdsm/api/vdsmapi.py", line 217, in __init__
    raise SchemaNotFound("Unable to find API schema file")
SchemaNotFound: Unable to find API schema file
2018-03-16 19:43:10,528-0400 INFO (MainThread) [vds] Stopping threads (vdsmd:160)
2018-03-16 19:43:10,528-0400 INFO (MainThread) [vds] Exiting (vdsmd:171)
2018-03-16 19:43:10,570-0400 ERROR (hsm/init) [storage.HSM] FINISH thread <Thread(hsm/init, started daemon 139747095738112)> failed (concurrent:201)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 406, in storageRefresh
    sdCache.refreshStorage()
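The IOError from vdsm-tool above comes from dump_bonding_options() opening a file under /var/run/vdsm before that directory exists. A minimal illustrative sketch of a defensive fix (an assumption about the shape of such a fix, not necessarily the actual upstream patch):

import errno
import json
import os

BONDING_DEFAULTS = '/var/run/vdsm/bonding-defaults.json'

def dump_bonding_options(options):
    # Create the parent directory first; Python 2.7 has no
    # os.makedirs(..., exist_ok=True), so tolerate EEXIST explicitly.
    try:
        os.makedirs(os.path.dirname(BONDING_DEFAULTS))
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
    with open(BONDING_DEFAULTS, 'w') as f:
        json.dump(options, f)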
Additional information:
I am using the vdo and kmod-kvdo packages with RHEL 7.5. I made this attempt without a bond and with the following network interface configs. NetworkManager was disabled and stopped.

[root@virt04 network-scripts]# cat ifcfg-eno1
DEVICE=eno1
ONBOOT=yes
BRIDGE=ovirtmgmt

[root@virt04 network-scripts]# cat ifcfg-eno2
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eno2
UUID=bcb17727-32de-4565-8b5e-b6d126f74345
DEVICE=eno2
ONBOOT=yes
IPADDR=10.1.200.84
PREFIX=23

[root@virt04 network-scripts]# cat ifcfg-ovirtmgmt
DEVICE=ovirtmgmt
TYPE=Bridge
ONBOOT=yes
DELAY=0
BOOTPROTO=static
IPADDR=10.1.10.84
NETMASK=255.255.255.0
GATEWAY=10.1.10.1
DNS1=10.1.10.1
DEFROUTE="yes"

[root@virt04 network-scripts]# brctl show
bridge name     bridge id               STP enabled     interfaces
ovirtmgmt       8000.001d096c4c18       no              eno1
virbr0          8000.525400555d82       yes             virbr0-nic
This may be a permissions issue. It appears that if I create /var/run/vdsm and set its owner/permissions before running vdsm-tool configure --force, it succeeds. I am proceeding to test hosted-engine --deploy now.

[root@virt04 ~]# mkdir /var/run/vdsm
[root@virt04 ~]# chown vdsm:kvm /var/run/vdsm
[root@virt04 ~]# chmod 755 /var/run/vdsm
[root@virt04 ~]# vdsm-tool configure --force

Checking configuration status...

abrt is already configured for vdsm
lvm is configured for vdsm
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Current revision of multipath.conf detected, preserving
schema should be configured

Running configure...
Reconfiguration of abrt is done.
Reconfiguration of passwd is done.
Reconfiguration of libvirt is done.
Reconfiguration of bond_defaults is done.
Reconfiguration of schema is done.

Done configuring modules to VDSM.
[root@virt04 ~]#
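Note that on EL7 /var/run is a symlink to the tmpfs /run, so a directory created there by hand disappears on the next reboot. A persistent form of the same workaround, sketched as a systemd-tmpfiles drop-in (the file name is hypothetical; vdsm is normally expected to take care of /run/vdsm itself, which is presumably what is broken here):

# /etc/tmpfiles.d/vdsm-run.conf  (hypothetical name, workaround only)
d /run/vdsm 0755 vdsm kvm -

Running "systemd-tmpfiles --create /etc/tmpfiles.d/vdsm-run.conf" applies it immediately; systemd also applies it automatically at every boot.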
hosted-engine deploy still fails:

[ INFO ] TASK [Set VLAN ID at datacenter level]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [Force host-deploy in offline mode]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Add host]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Wait for the host to be up]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Check host status]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] ok: [localhost]
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180317055458.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180317052917-4hoa56.log

journalctl from virt04 (vdsmd is running throughout this?):

Mar 17 05:51:56 virt04.gamull.com python[27827]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:51:59 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:02 virt04.gamull.com systemd[1]: Started Session 72 of user root.
Mar 17 05:52:02 virt04.gamull.com systemd[1]: Starting Session 72 of user root.
Mar 17 05:52:02 virt04.gamull.com CROND[27897]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Mar 17 05:52:02 virt04.gamull.com python[27914]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:08 virt04.gamull.com systemd[1]: Stopping Virtualization daemon...
Mar 17 05:52:08 virt04.gamull.com python[27977]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:08 virt04.gamull.com systemd[1]: Stopped Virtualization daemon.
Mar 17 05:52:08 virt04.gamull.com saslpasswd2[27986]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
Mar 17 05:52:08 virt04.gamull.com saslpasswd2[27986]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
Mar 17 05:52:08 virt04.gamull.com saslpasswd2[27986]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Starting Virtualization daemon...
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Started Virtualization daemon.
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Cockpit Web Service Socket.
Mar 17 05:52:10 virt04.gamull.com ln[28063]: /bin/ln: failed to create symbolic link ‘/run/cockpit/issue’: No such file or directory
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Listening on Cockpit Web Service Socket.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Cockpit Web Service...
Mar 17 05:52:10 virt04.gamull.com dnsmasq[2121]: read /etc/hosts - 6 addresses
Mar 17 05:52:10 virt04.gamull.com dnsmasq[2121]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Mar 17 05:52:10 virt04.gamull.com dnsmasq-dhcp[2121]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Started Cockpit Web Service.
Mar 17 05:52:10 virt04.gamull.com cockpit-ws[28075]: Using certificate: /etc/cockpit/ws-certs.d/0-self-signed.cert
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com kernel: type=1400 audit(1521280330.547:22): avc: denied { open } for pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:10 virt04.gamull.com kernel: type=1400 audit(1521280330.547:23): avc: denied { getattr } for pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Stopping Virtualization daemon...
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Starting Virtualization daemon...
Mar 17 05:52:12 virt04.gamull.com systemd[1]: Started Virtualization daemon.
Mar 17 05:52:12 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:12 virt04.gamull.com systemd[1]: Starting oVirt ImageIO Daemon...
Mar 17 05:52:13 virt04.gamull.com dnsmasq[2121]: read /etc/hosts - 6 addresses
Mar 17 05:52:13 virt04.gamull.com dnsmasq[2121]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Mar 17 05:52:13 virt04.gamull.com dnsmasq-dhcp[2121]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Started oVirt ImageIO Daemon.
Mar 17 05:52:13 virt04.gamull.com kernel: type=1400 audit(1521280333.387:24): avc: denied { open } for pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:13 virt04.gamull.com kernel: type=1400 audit(1521280333.387:25): avc: denied { getattr } for pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Starting Virtual Desktop Server Manager network restoration...
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Starting Auxiliary vdsm service for running helper functions as root...
Mar 17 05:52:14 virt04.gamull.com python[28249]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:15 virt04.gamull.com systemd[1]: Started Virtual Desktop Server Manager network restoration.
Mar 17 05:52:15 virt04.gamull.com systemd[1]: Starting Virtual Desktop Server Manager...
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running mkdirs
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running configure_coredump
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running configure_vdsm_logs
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running wait_for_network
Mar 17 05:52:16 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running run_init_hooks
Mar 17 05:52:16 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running check_is_configured
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: abrt is already configured for vdsm
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: lvm is configured for vdsm
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: libvirt is already configured for vdsm
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: Current revision of multipath.conf detected, preserving
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: schema is already configured
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running validate_configuration
Mar 17 05:52:18 virt04.gamull.com vdsmd_init_common.sh[28321]: SUCCESS: ssl configured to true. No conflicts
Mar 17 05:52:18 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running prepare_transient_repository
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running syslog_available
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running nwfilter
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: libvirt: Network Filter Driver error : Network filter not found: no nwfilter with matching name 'vdsm-no-mac-spoofing'
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running dummybr
Mar 17 05:52:20 virt04.gamull.com python[28393]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:20 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running tune_system
Mar 17 05:52:20 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running test_space
Mar 17 05:52:20 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running test_lo
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Started Virtual Desktop Server Manager.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Started MOM instance configured for VDSM purposes.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Starting MOM instance configured for VDSM purposes...
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Started oVirt VM Console SSH server daemon.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Starting oVirt VM Console SSH server daemon...
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:21 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:21 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:21 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:21 virt04.gamull.com ovirt-vmconsole-host-sshd[28426]: /usr/share/ovirt-vmconsole/ovirt-vmconsole-host/ovirt-vmconsole-host-sshd/sshd_config line 23: Deprecated option RSAAuthentication
Mar 17 05:52:21 virt04.gamull.com sshd[28426]: Server listening on 0.0.0.0 port 2223.
Mar 17 05:52:21 virt04.gamull.com sshd[28426]: Server listening on :: port 2223.
Mar 17 05:52:21 virt04.gamull.com goferd[1430]: [INFO][worker-0] root:542 - Disconnected
Mar 17 05:52:21 virt04.gamull.com goferd[1430]: [ERROR][worker-0] gofer.messaging.adapter.connect:33 - connect: proton+amqps://subscription.rhsm.redhat.com:5647, failed: Connection amqps://subscription.rhsm.redhat.com:5647 disconnected
Mar 17 05:52:21 virt04.gamull.com goferd[1430]: [INFO][worker-0] gofer.messaging.adapter.connect:35 - retry in 106 seconds
Mar 17 05:52:22 virt04.gamull.com vdsm[28417]: ERROR add_watch: cannot watch /var/run/vdsm/dhclientmon WD=-1, Errno=No such file or directory (ENOENT)
Mar 17 05:52:22 virt04.gamull.com vdsm[28417]: ERROR Unhandled exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/logutils.py", line 197, in wrapper
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/network/dhclient_monitor.py", line 139, in _monitor_dhcp_responses
    for filePath in sorted(os.listdir(MONITOR_FOLDER)):
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/dhclientmon'
Mar 17 05:52:22 virt04.gamull.com vdsm[28417]: WARN Not ready yet, ignoring event '|virt|VM_status|1e424040-ba67-4138-bd71-63cd2afdd5dd' args={'1e424040-ba67-4138-bd71-63cd2afdd5dd': {'status': 'Powering up', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '127.0.0.1', 'type': 'vnc', 'port': '5900'}], 'hash': '1638793806033292744', 'cpuUser': '0.00', 'monitorResponse': '0', 'elapsedTime': '0', 'cpuSys': '0.00', 'vcpuPeriod': 100000L, 'timeOffset': '0', 'clientIp': '', 'pauseCode': 'NOERR', 'vcpuQuota': '-1'}}
Mar 17 05:52:23 virt04.gamull.com vdsm[28417]: WARN MOM not available.
Mar 17 05:52:23 virt04.gamull.com vdsm[28417]: WARN MOM not available, KSM stats will be missing.
Mar 17 05:52:26 virt04.gamull.com python[28563]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:26 virt04.gamull.com systemd[1]: mom-vdsm.service holdoff time over, scheduling restart.
Mar 17 05:52:26 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:26 virt04.gamull.com systemd[1]: Started MOM instance configured for VDSM purposes.
Mar 17 05:52:26 virt04.gamull.com systemd[1]: Starting MOM instance configured for VDSM purposes...
Mar 17 05:52:29 virt04.gamull.com sshd[28584]: Accepted publickey for root from 192.168.122.10 port 36596 ssh2: RSA SHA256:5fIUyD//xr7SwVrjEM4Rg5ty8QG+AAidKiRbODDR+pk
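The dhclientmon ENOENT errors in the journal above point at the same root cause as comment #0: code assuming its subdirectory under /var/run/vdsm already exists. An illustrative defensive pattern for the monitor loop shown in that traceback (a hypothetical helper mirroring the names in the traceback, not the actual vdsm code):

import errno
import os

MONITOR_FOLDER = '/var/run/vdsm/dhclientmon'

def monitored_response_files():
    # Tolerate the folder not existing yet instead of crashing with
    # OSError(ENOENT) as in the traceback above.
    try:
        names = sorted(os.listdir(MONITOR_FOLDER))
    except OSError as e:
        if e.errno == errno.ENOENT:
            return []
        raise
    return [os.path.join(MONITOR_FOLDER, name) for name in names]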
It looks like the engine.log on engine.gamull.com indicates the ovirtmgmt interface is invalid. I will retry without ovirtmgmt configured before setup.

2018-03-17 05:54:43,790-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] Exception: org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: Interface ovirtmgmt is invalid for management network
        at org.ovirt.engine.core.bll.network.NetworkConfigurator.findNicToSetupManagementNetwork(NetworkConfigurator.java:166) [bll.jar:]
        at org.ovirt.engine.core.bll.network.NetworkConfigurator.createManagementNetworkIfRequired(NetworkConfigurator.java:84) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.configureManagementNetwork(InstallVdsInternalCommand.java:297) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.installHost(InstallVdsInternalCommand.java:216) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.executeCommand(InstallVdsInternalCommand.java:112) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1133) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1285) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1934) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1345) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:400) [bll.jar:]
        at org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:468) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:450) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:656) [bll.jar:]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_161]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_161]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_161]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_161]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:80) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:93) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl.jar:2.3.3.Final-redhat-1]
        at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:83) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
        at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [wildfly-ejb3-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:263) [wildfly-ejb3-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
.....
2018-03-17 05:54:43,799-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] Host installation failed for host '9eb58b8a-44f9-412c-9061-754d72d06a1c', 'virt04.gamull.com': Failed to configure management network on the host
2018-03-17 05:54:43,809-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] START, SetVdsStatusVDSCommand(HostName = virt04.gamull.com, SetVdsStatusVDSCommandParameters:{hostId='9eb58b8a-44f9-412c-9061-754d72d06a1c', status='NonOperational', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 28c858a4
2018-03-17 05:54:43,818-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] FINISH, SetVdsStatusVDSCommand, log id: 28c858a4
2018-03-17 05:54:43,831-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] EVENT_ID: VDS_INSTALL_FAILED(505), Host virt04.gamull.com installation failed. Failed to configure management network on the host.
2018-03-17 05:54:43,846-04 INFO [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] Lock freed to object 'EngineLock:{exclusiveLocks='[9eb58b8a-44f9-412c-9061-754d72d06a1c=VDS]', sharedLocks=''}'
2018-03-17 05:54:46,991-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] START, GetHardwareInfoVDSCommand(HostName = virt04.gamull.com, VdsIdAndVdsVDSCommandParametersBase:{hostId='9eb58b8a-44f9-412c-9061-754d72d06a1c', vds='Host[virt04.gamull.com,9eb58b8a-44f9-412c-9061-754d72d06a1c]'}), log id: 79a82d82
2018-03-17 05:54:47,050-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] FINISH, GetHardwareInfoVDSCommand, log id: 79a82d82
2018-03-17 05:54:47,459-04 INFO [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] Running command: SetNonOperationalVdsCommand internal: true. Entities affected : ID: 9eb58b8a-44f9-412c-9061-754d72d06a1c Type: VDS
2018-03-17 05:54:47,466-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] START, SetVdsStatusVDSCommand(HostName = virt04.gamull.com, SetVdsStatusVDSCommandParameters:{hostId='9eb58b8a-44f9-412c-9061-754d72d06a1c', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 459189d1
2018-03-17 05:54:47,475-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] FINISH, SetVdsStatusVDSCommand, log id: 459189d1
2018-03-17 05:54:47,701-04 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] Host 'virt04.gamull.com' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-03-17 05:54:47,816-04 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host virt04.gamull.com does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-03-17 05:54:47,984-04 INFO [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2d5e2769] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: 9eb58b8a-44f9-412c-9061-754d72d06a1c Type: VDS
2018-03-17 05:54:48,063-04 INFO [org.ovirt.engine.core.bll.UpdateClusterCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[]'}'
2018-03-17 05:54:48,317-04 INFO [org.ovirt.engine.core.bll.UpdateClusterCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] Running command: UpdateClusterCommand internal: true. Entities affected : ID: ccfabfac-29c7-11e8-a8a8-00163e710c5a Type: ClusterAction group EDIT_CLUSTER_CONFIGURATION with role type ADMIN
2018-03-17 05:54:48,370-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] EVENT_ID: SYSTEM_UPDATE_CLUSTER(835), Host cluster Default was updated by system
2018-03-17 05:54:48,372-04 INFO [org.ovirt.engine.core.bll.UpdateClusterCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[]'}'
2018-03-17 05:54:48,400-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] EVENT_ID: VDS_DETECTED(13), Status of host virt04.gamull.com was set to NonOperational.
The latest attempt still errors out on the host. The relevant log is engine.log from the engine:

2018-03-17 07:27:57,483-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM virt04.gamull.com command GetCapabilitiesVDS failed: Message timeout which can be caused by communication issues
2018-03-17 07:27:57,484-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand' return value 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSInfoReturn@1fd4e0c1'
2018-03-17 07:27:57,484-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] HostName = virt04.gamull.com
2018-03-17 07:27:57,485-04 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] Command 'GetCapabilitiesVDSCommand(HostName = virt04.gamull.com, VdsIdAndVdsVDSCommandParametersBase:{hostId='a55ba422-bd50-47c2-85aa-bb4427d4621c', vds='Host[virt04.gamull.com,a55ba422-bd50-47c2-85aa-bb4427d4621c]'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-03-17 07:27:57,485-04 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] Failure to refresh host 'virt04.gamull.com' runtime info: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-03-17 07:28:00,494-04 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [6ae12f83] Connecting to virt04.gamull.com/10.1.10.84
I've also tried to deploy Node 0 over FC using a tagged interface, and it failed with:

[ INFO ] TASK [Check host status]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] ok: [localhost]
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180514134621.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180514133147-32lu0a.log

In vdsm.log I saw:

2018-05-14 13:46:03,387+0300 ERROR (jsonrpc/0) [vds] connectivity check failed (API:1530)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1527, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
ConfigNetworkError: (10, 'connectivity check failed')
2018-05-14 13:46:03,388+0300 INFO (jsonrpc/0) [api.network] FINISH setupNetworks return={'status': {'message': 'connectivity check failed', 'code': 10}} from=::ffff:192.168.122.182,49644, flow_id=10bd1361 (api:52)
2018-05-14 13:46:03,388+0300 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error 10) in 159.95 seconds (__init__:573)
2018-05-14 13:46:03,407+0300 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:573)

Tested on these components:
ovirt-engine-setup-4.2.3.4-0.1.el7.noarch
rhvm-appliance-4.2-20180504.0.el7.noarch
ovirt-hosted-engine-ha-2.2.11-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.20-1.el7ev.noarch
Linux 3.10.0-862.2.3.el7.x86_64 #1 SMP Mon Apr 30 12:37:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Attaching the sosreports from both engine and host here, just in case.
Created attachment 1436128 [details]
engine logs
Created attachment 1436129 [details]
sosreport from puma18
The failure says:
'connectivity check failed', 'code': 10

Did you change anything in the network, or make sure there's a proper network defined?
(In reply to Doron Fediuck from comment #11)
> The failure says:
> 'connectivity check failed', 'code': 10
>
> Did you change anything in the network, or make sure there's a proper
> network defined?

Changed in the network? I think the only change was to configure a VLAN and use it as the interface during deployment.
The issue is here:

2018-05-14 13:46:04,908+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Failed in 'HostSetupNetworksVDS' method
2018-05-14 13:46:04,922+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM puma18-vlan404.scl.lab.tlv.redhat.com command HostSetupNetworksVDS failed: connectivity check failed
2018-05-14 13:46:04,923+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Error: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = connectivity check failed, code = 10
2018-05-14 13:46:04,923+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Command 'HostSetupNetworksVDSCommand(HostName = puma18-vlan404.scl.lab.tlv.redhat.com, HostSetupNetworksVdsCommandParameters:{hostId='749e6c33-7a22-4772-9e7b-90048b07a317', vds='Host[puma18-vlan404.scl.lab.tlv.redhat.com,749e6c33-7a22-4772-9e7b-90048b07a317]', rollbackOnFailure='true', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='true', bonding='false', networkName='ovirtmgmt', vdsmName='ovirtmgmt', nicName='enp5s0f1', vlan='404', mtu='0', vmNetwork='true', stp='false', properties='null', ipv4BootProtocol='DHCP', ipv4Address='null', ipv4Netmask='null', ipv4Gateway='null', ipv6BootProtocol='AUTOCONF', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null', nameServers='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'})' execution failed: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = connectivity check failed, code = 10
2018-05-14 13:46:04,924+03 INFO [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Host setup networks finished. Lock released. Monitoring can run now for host 'puma18-vlan404.scl.lab.tlv.redhat.com' from data-center 'Default'
2018-05-14 13:46:04,924+03 ERROR [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Command 'org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = connectivity check failed, code = 10 (Failed with error noConPeer and code 10)
2018-05-14 13:46:04,927+03 INFO [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Lock freed to object 'EngineLock:{exclusiveLocks='[HOST_NETWORK749e6c33-7a22-4772-9e7b-90048b07a317=HOST_NETWORK]', sharedLocks=''}'
2018-05-14 13:46:04,939+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] EVENT_ID: SETUP_NETWORK_FAILED_FOR_MANAGEMENT_NETWORK_CONFIGURATION(1,120), Failed to configure management network on host puma18-vlan404.scl.lab.tlv.redhat.com due to setup networks failure.

The host failed to contact the engine back to test its network configuration, so it rolled back the network configuration and we lost the VM. Let's start checking whether this is reproducible and how.
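For readers following along: setupNetworks applies the new host network configuration and then waits (connectivityTimeout='120' in the command above) for the engine to call Host.confirmConnectivity back over the new configuration; if the confirmation never arrives, vdsm rolls everything back, which is why the engine afterwards sees the host without ovirtmgmt. A rough conceptual sketch of that apply-and-rollback pattern (illustrative only, not vdsm's actual code; all names here are hypothetical):

import time

def setup_networks_with_rollback(apply_config, restore_config,
                                 connectivity_confirmed, timeout=120):
    # Apply the new network config, then roll back unless the engine
    # confirms connectivity within `timeout` seconds.
    apply_config()
    deadline = time.time() + timeout
    while time.time() < deadline:
        if connectivity_confirmed():
            return  # engine reached us over the new config; keep it
        time.sleep(1)
    restore_config()  # no confirmation: undo everything
    raise RuntimeError('connectivity check failed')  # code 10 in the logs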
puma19 ~]# ifconfig
enp5s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.163.56  netmask 255.255.252.0  broadcast 10.35.163.255
        inet6 fe80::73b1:29ec:5152:1bdf  prefixlen 64  scopeid 0x20<link>
        ether 44:1e:a1:73:39:60  txqueuelen 1000  (Ethernet)
        RX packets 14495  bytes 913716 (892.3 KiB)
        RX errors 0  dropped 1  overruns 0  frame 0
        TX packets 15  bytes 1738 (1.6 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device memory 0xfbe60000-fbe7ffff

enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.160.47  netmask 255.255.252.0  broadcast 10.35.163.255
        inet6 fe80::461e:a1ff:fe73:3961  prefixlen 64  scopeid 0x20<link>
        ether 44:1e:a1:73:39:61  txqueuelen 1000  (Ethernet)
        RX packets 14587  bytes 949847 (927.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 642  bytes 162129 (158.3 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device memory 0xfbee0000-fbefffff

enp5s0f1.404: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.147.67  netmask 255.255.255.240  broadcast 10.35.147.79
        inet6 fe80::44b0:f299:a9aa:af77  prefixlen 64  scopeid 0x20<link>
        ether 00:16:3e:7b:b8:65  txqueuelen 1000  (Ethernet)
        RX packets 4  bytes 928 (928.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10  bytes 1868 (1.8 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 32  bytes 2928 (2.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 32  bytes 2928 (2.8 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:54:00:38:8c:04  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

puma19 ~]# multipath -ll
3514f0c5a5160177c dm-0 XtremIO ,XtremApp
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  |- 6:0:0:1 sdb 8:16 active ready running
  `- 6:0:1:1 sdc 8:32 active ready running

The ovirtmgmt bridge was not created.

May 14 19:00:43 puma19-vlan404 NetworkManager[778]: <warn> [1526313643.1282] ifcfg-rh: cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0bb28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")
May 14 19:00:43 puma19-vlan404 NetworkManager[778]: <warn> [1526313643.5656] ifcfg-rh: cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0bb28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")
Created attachment 1436368 [details]
logs from puma19
Created attachment 1436369 [details]
logs from engine
Logs from comments #15 and #16 are from another clean deployment over VLAN on another host.
Nikolai, how was the VLAN created? Can you explain the conflict:

cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0bb28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")

?
(In reply to Yaniv Kaul from comment #18)
> Nikolai, how was the VLAN created? Can you explain the conflict:
>
> cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to
> conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1
> (eb6f0bb28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")
>
> ?

nmtui. I'm not sure why this conflict appeared.
(In reply to Nikolai Sednev from comment #14)
> enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         ether 44:1e:a1:73:39:61  txqueuelen 1000  (Ethernet)
>
> enp5s0f1.404: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         ether 00:16:3e:7b:b8:65  txqueuelen 1000  (Ethernet)

Please note that you were also spoofing the MAC address over the VLAN interface.
(In reply to Simone Tiraboschi from comment #20)
> (In reply to Nikolai Sednev from comment #14)
> > enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >         ether 44:1e:a1:73:39:61  txqueuelen 1000  (Ethernet)
> >
> > enp5s0f1.404: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >         ether 00:16:3e:7b:b8:65  txqueuelen 1000  (Ethernet)
>
> Please note that you were also spoofing the MAC address over the VLAN
> interface.

Should that be a problem? I received an appropriate IP address through that VLAN to the spoofed MAC, so network connectivity was fine; otherwise I could not even have gotten an IP address on that VLAN interface.
Closing as insufficient data since I never really reproduced this on a clean environment. Please reopen this if we find a clean reproducer.
I think I hit some kind of the same problem on my clean system.

My configuration is:

4 NICs => bond0 (mode balance-rr) => bond.5 => (host IP & engine IP)

I will attach a log after this post; the error occurs in line 2123. Maybe this is a reproduction of this bug? I tried this three times.
Created attachment 1470697 [details]
ovirt hosted engine setup non_operational log

The error occurs in line 2123.
(In reply to D. Tervooren from comment #25)
> I think I hit some kind of the same problem on my clean system.
>
> My configuration is:
>
> 4 NICs => bond0 (mode balance-rr) => bond.5 => (host IP & engine IP)

Your issue is in the VLAN-over-bond interface name; vdsm is pretty picky about this, see also:
https://bugzilla.redhat.com/show_bug.cgi?id=1379833

If your bond is called bond0, VLAN 5 over bond0 must be called exactly bond0.5 and nothing else.

Can you please report back whether it works correctly with bond0.5?
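For reference, with initscripts-style configuration the naming convention described above looks like this (illustrative addressing; the DEVICE name and the matching file name are the only parts that matter here):

# /etc/sysconfig/network-scripts/ifcfg-bond0.5
DEVICE=bond0.5    # must be exactly <bond name>.<VLAN id>
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.0.2.10    # illustrative value
PREFIX=24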
(In reply to Simone Tiraboschi from comment #27)
> Your issue is in the VLAN-over-bond interface name; vdsm is pretty picky
> about this, see also:
> https://bugzilla.redhat.com/show_bug.cgi?id=1379833
>
> If your bond is called bond0, VLAN 5 over bond0 must be called exactly
> bond0.5 and nothing else.
>
> Can you please report back whether it works correctly with bond0.5?

Hi Simone,

I'm sorry, that was a typo in my original post; the interface was of course named bond0.5.

Anyway, I got it working now by experimenting with the NIC configuration. I found a file in the GitHub repo (https://github.com/oVirt/ovirt-hosted-engine-setup/blob/master/src/ansible/get_network_interfaces.yml) which on line 15 defines a variable called 'acceptable_bond_modes'. My bond was running in balance-rr, which is not in that list, so I guess Ansible did not recognize the bond correctly because of the VLAN on top of it.

I have now moved the engine bridge onto a dedicated untagged bond in balance-xor mode, which worked for me.
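For reference, the variable in question is a plain Ansible list; at the time it looked roughly like the following (quoted from memory, so check the linked file for the authoritative set):

# src/ansible/get_network_interfaces.yml (excerpt, approximate)
acceptable_bond_modes: ['active-backup', 'balance-xor', 'broadcast', '802.3ad']

balance-rr is absent because round-robin bonds are not supported for bridged VM networks in oVirt, which is consistent with the workaround of moving the management bridge to a balance-xor bond.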