Bug 1557624 - hosted-engine-setup fails on host setup - vdsm fails with error finding bond-defaults.json
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.2.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ovirt-4.2.5
Target Release: ---
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-03-17 09:13 UTC by Tom Gamull
Modified: 2019-04-28 08:41 UTC
CC: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-05 10:33:17 UTC
oVirt Team: Integration
Target Upstream Version:
Embargoed:


Attachments
engine logs (9.59 MB, application/x-xz), 2018-05-14 11:26 UTC, Nikolai Sednev
sosreport from puma18 (9.50 MB, application/x-xz), 2018-05-14 11:27 UTC, Nikolai Sednev
logs from puma19 (9.46 MB, application/x-xz), 2018-05-14 17:07 UTC, Nikolai Sednev
logs from engine (9.59 MB, application/x-xz), 2018-05-14 17:08 UTC, Nikolai Sednev
ovirt hosted engine setup non_operational log (352.95 KB, text/plain), 2018-07-26 11:00 UTC, D. Tervooren

Description Tom Gamull 2018-03-17 09:13:58 UTC
Description of problem:
When running hosted-engine setup, the install fails during host setup. The issue can be reproduced by installing rhvm-appliance and then running vdsm-tool configure --force, which fails with an error about /var/run/vdsm/bonding-defaults.json not being found.


Version-Release number of selected component (if applicable):
vdsm-4.20.22-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
kernel-3.10.0-830.el7.x86_64
rhvm-appliance-4.2-20180202.0.el7.noarch
libvirt-python-3.9.0-1.el7.x86_64
libvirt-libs-3.9.0-14.el7.x86_64
mom-0.5.12-1.el7ev.noarch
ovirt-host-deploy-1.7.3-1.el7ev.noarch


How reproducible:
For 4.2.2-6 it is reproducible on my system every time (also reproduced on 4.2.2-5).

Steps to Reproduce:
1. Follow steps here - http://ci-web.eng.lab.tlv.redhat.com/docs/4.2/Guide/install_guide/index.html
2. yum -y update
3. reboot
4. yum install rhvm-appliance
5. configure ovirt-mgmt bridge (this wasn't done the first time but was done on subsequent attempts; it did not make a difference - https://www.ovirt.org/develop/developer-guide/vdsm/installing-vdsm-from-rpm/)
6. reboot
7. systemctl start lvm2-activation-early.service lvm2-activation.service (these didn't start)
8a. vdsm-tool configure --force (fails)
OR
8b. hosted-engine --deploy --config-append=/root/storage.conf (this is just a gluster replica 3 conf, but it never gets to this stage; it fails on host deploy)

Actual results:
[root@virt04 ~]# vdsm-tool configure --force

Checking configuration status...

abrt is not configured for vdsm
lvm is configured for vdsm
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
Current revision of multipath.conf detected, preserving
schema should be configured
Running configure...
Reconfiguration of abrt is done.
Reconfiguration of passwd is done.
Reconfiguration of sebool is done.
Reconfiguration of certificates is done.
Reconfiguration of libvirt is done.
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 219, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/__init__.py", line 38, in wrapper
    func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 141, in configure
    _configure(c)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 88, in _configure
    getattr(module, 'configure', lambda: None)()
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/bond_defaults.py", line 37, in configure
    sysfs_options_mapper.dump_bonding_options()
  File "/usr/lib/python2.7/site-packages/vdsm/network/link/bond/sysfs_options_mapper.py", line 46, in dump_bonding_options
    with open(sysfs_options.BONDING_DEFAULTS, 'w') as f:
IOError: [Errno 2] No such file or directory: '/var/run/vdsm/bonding-defaults.json'

OR

Alternatively, it fails on host deploy during hosted-engine setup.
It gets as far as assigning a DHCP IP for the engine and engine SSH, and I can ping the host from the engine VM.
engine-setup runs and completes on the engine.
It then tries to configure the host and fails.
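Reading the traceback in 8a: dump_bonding_options() opens /var/run/vdsm/bonding-defaults.json for writing while the parent directory /var/run/vdsm does not yet exist. A minimal sketch of the behavior one would expect instead (my illustration only; the path and function name come from the traceback, the makedirs logic is an assumption, not vdsm's actual code):

import errno
import os

BONDING_DEFAULTS = '/var/run/vdsm/bonding-defaults.json'

def dump_bonding_options_safe(data):
    # Create the parent directory first; the IOError above suggests
    # this is exactly the step that is missing on this system.
    dirname = os.path.dirname(BONDING_DEFAULTS)
    try:
        os.makedirs(dirname)  # Python 2.7: makedirs has no exist_ok keyword
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
    with open(BONDING_DEFAULTS, 'w') as f:
        f.write(data)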


Expected results:
success

Additional info:

Errors noticed in vdsm.log:
2018-03-16 19:43:02,983-0400 INFO  (MainThread) [vds] (PID: 3567) I am the actual vdsm 4.20.22-1.el7ev virt04.gamull.com (3.10.0-861.el7.x86_64) (vdsmd:149)
2018-03-16 19:43:02,984-0400 INFO  (MainThread) [vds] VDSM will run with cpu affinity: frozenset([1]) (vdsmd:262)
2018-03-16 19:43:02,994-0400 INFO  (MainThread) [storage.HSM] START HSM init (hsm:366)
2018-03-16 19:43:03,000-0400 INFO  (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:373)
2018-03-16 19:43:03,001-0400 INFO  (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
2018-03-16 19:43:03,242-0400 INFO  (MainThread) [storage.udev] Registering multipath event monitor <vdsm.storage.mpathhealth.Monitor object at 0x7f5369a4df90> (udev:182)
2018-03-16 19:43:03,243-0400 INFO  (MainThread) [storage.udev] Starting multipath event listener (udev:116)
2018-03-16 19:43:03,290-0400 INFO  (MainThread) [storage.check] Starting check service (check:91)
2018-03-16 19:43:03,301-0400 INFO  (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (dispatcher:48)
2018-03-16 19:43:03,302-0400 INFO  (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x139996205984592> (asyncevent:125)
2018-03-16 19:43:03,381-0400 INFO  (MainThread) [vdsm.api] START registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x7f53681b0e68>) from=internal, task_id=61af4ff5-b152-4459-a5a3-0a66b8e81af7 (api:46)
2018-03-16 19:43:03,382-0400 INFO  (MainThread) [vdsm.api] FINISH registerDomainStateChangeCallback return=None from=internal, task_id=61af4ff5-b152-4459-a5a3-0a66b8e81af7 (api:52)
2018-03-16 19:43:03,382-0400 INFO  (MainThread) [MOM] Preparing MOM interface (momIF:53)
2018-03-16 19:43:03,384-0400 INFO  (MainThread) [MOM] Using named unix socket /var/run/vdsm/mom-vdsm.sock (momIF:62)
2018-03-16 19:43:03,385-0400 INFO  (MainThread) [root] Unregistering all secrets (secret:91)
2018-03-16 19:43:03,453-0400 INFO  (MainThread) [vds] Setting channels' timeout to 30 seconds. (vmchannels:222)
2018-03-16 19:43:03,454-0400 INFO  (MainThread) [vds] Starting QEMU-GA poller (qemuguestagent:112)
2018-03-16 19:43:03,458-0400 INFO  (MainThread) [vds.MultiProtocolAcceptor] Listening at :::54321 (protocoldetector:183)
2018-03-16 19:43:03,524-0400 INFO  (vmrecovery) [vds] recovery: completed in 0s (clientIF:593)
2018-03-16 19:43:03,535-0400 ERROR (MainThread) [vds] failed to init clientIF, shutting down storage dispatcher (clientIF:147)
2018-03-16 19:43:03,536-0400 INFO  (MainThread) [vdsm.api] START prepareForShutdown(options=None) from=internal, task_id=6565f58d-017d-42af-ba1b-62400467e7ae (api:46)
2018-03-16 19:43:03,591-0400 INFO  (MainThread) [storage.Monitor] Shutting down domain monitors (monitor:222)
2018-03-16 19:43:03,591-0400 INFO  (MainThread) [storage.check] Stopping check service (check:104)
2018-03-16 19:43:03,592-0400 INFO  (check/loop) [storage.asyncevent] Stopping <EventLoop running=True closed=False at 0x139996205984592> (asyncevent:220)
2018-03-16 19:43:03,592-0400 INFO  (MainThread) [storage.udev] Stopping multipath event listener (udev:149)
2018-03-16 19:43:03,593-0400 INFO  (MainThread) [vdsm.api] FINISH prepareForShutdown return=None from=internal, task_id=6565f58d-017d-42af-ba1b-62400467e7ae (api:52)
2018-03-16 19:43:03,593-0400 ERROR (MainThread) [vds] Exception raised (vdsmd:158)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in run
    serve_clients(log)
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in serve_clients
    cif = clientIF.getInstance(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 250, in getInstance
    cls._instance = clientIF(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 144, in __init__
    self._prepareJSONRPCServer()
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 307, in _prepareJSONRPCServer
    bridge = Bridge.DynamicBridge()
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 67, in __init__
    self._schema = vdsmapi.Schema(paths, api_strict_mode)
  File "/usr/lib/python2.7/site-packages/vdsm/api/vdsmapi.py", line 217, in __init__
    raise SchemaNotFound("Unable to find API schema file")
SchemaNotFound: Unable to find API schema file
2018-03-16 19:43:03,595-0400 INFO  (MainThread) [vds] Stopping threads (vdsmd:160)
2018-03-16 19:43:03,595-0400 INFO  (MainThread) [vds] Exiting (vdsmd:171)
2018-03-16 19:43:03,637-0400 ERROR (hsm/init) [storage.HSM] FINISH thread <Thread(hsm/init, started daemon 139995448858368)> failed (concurrent:201)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 406, in storageRefresh
    sdCache.refreshStorage()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/misc.py", line 404, in helper
    return sm(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/misc.py", line 389, in __call__
    self.__lastResult = self.__func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 82, in refreshStorage
    multipath.rescan()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/multipath.py", line 83, in rescan
    udevadm.settle(timeout)
  File "/usr/lib/python2.7/site-packages/vdsm/common/udevadm.py", line 50, in settle
    _run_command(args)
  File "/usr/lib/python2.7/site-packages/vdsm/common/udevadm.py", line 105, in _run_command
    rc, out, err = commands.execCmd(cmd, raw=True)
TypeError: 'NoneType' object is not callable
2018-03-16 19:43:09,938-0400 INFO  (MainThread) [vds] (PID: 3697) I am the actual vdsm 4.20.22-1.el7ev virt04.gamull.com (3.10.0-861.el7.x86_64) (vdsmd:149)
2018-03-16 19:43:09,938-0400 INFO  (MainThread) [vds] VDSM will run with cpu affinity: frozenset([1]) (vdsmd:262)
2018-03-16 19:43:09,948-0400 INFO  (MainThread) [storage.HSM] START HSM init (hsm:366)
2018-03-16 19:43:09,953-0400 INFO  (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:373)
2018-03-16 19:43:09,954-0400 INFO  (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
2018-03-16 19:43:10,182-0400 INFO  (MainThread) [storage.udev] Registering multipath event monitor <vdsm.storage.mpathhealth.Monitor object at 0x7f199eccef90> (udev:182)
2018-03-16 19:43:10,183-0400 INFO  (MainThread) [storage.udev] Starting multipath event listener (udev:116)
2018-03-16 19:43:10,230-0400 INFO  (MainThread) [storage.check] Starting check service (check:91)
2018-03-16 19:43:10,242-0400 INFO  (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (dispatcher:48)
2018-03-16 19:43:10,243-0400 INFO  (check/loop) [storage.asyncevent] Starting <EventLoop running=True closed=False at 0x139747972914000> (asyncevent:125)
2018-03-16 19:43:10,322-0400 INFO  (MainThread) [vdsm.api] START registerDomainStateChangeCallback(callbackFunc=<functools.partial object at 0x7f199c42fe68>) from=internal, task_id=d76d22bc-2164-4ec6-8a2a-3bd056410996 (api:46)
2018-03-16 19:43:10,322-0400 INFO  (MainThread) [vdsm.api] FINISH registerDomainStateChangeCallback return=None from=internal, task_id=d76d22bc-2164-4ec6-8a2a-3bd056410996 (api:52)
2018-03-16 19:43:10,323-0400 INFO  (MainThread) [MOM] Preparing MOM interface (momIF:53)
2018-03-16 19:43:10,325-0400 INFO  (MainThread) [MOM] Using named unix socket /var/run/vdsm/mom-vdsm.sock (momIF:62)
2018-03-16 19:43:10,326-0400 INFO  (MainThread) [root] Unregistering all secrets (secret:91)
2018-03-16 19:43:10,396-0400 INFO  (MainThread) [vds] Setting channels' timeout to 30 seconds. (vmchannels:222)
2018-03-16 19:43:10,397-0400 INFO  (MainThread) [vds] Starting QEMU-GA poller (qemuguestagent:112)
2018-03-16 19:43:10,400-0400 INFO  (MainThread) [vds.MultiProtocolAcceptor] Listening at :::54321 (protocoldetector:183)
2018-03-16 19:43:10,451-0400 INFO  (vmrecovery) [vds] recovery: completed in 0s (clientIF:593)
2018-03-16 19:43:10,469-0400 ERROR (MainThread) [vds] failed to init clientIF, shutting down storage dispatcher (clientIF:147)
2018-03-16 19:43:10,470-0400 INFO  (MainThread) [vdsm.api] START prepareForShutdown(options=None) from=internal, task_id=2104d6f2-56c5-4019-922a-c91ee1675f90 (api:46)
2018-03-16 19:43:10,524-0400 INFO  (MainThread) [storage.Monitor] Shutting down domain monitors (monitor:222)
2018-03-16 19:43:10,525-0400 INFO  (MainThread) [storage.check] Stopping check service (check:104)
2018-03-16 19:43:10,525-0400 INFO  (check/loop) [storage.asyncevent] Stopping <EventLoop running=True closed=False at 0x139747972914000> (asyncevent:220)
2018-03-16 19:43:10,526-0400 INFO  (MainThread) [storage.udev] Stopping multipath event listener (udev:149)
2018-03-16 19:43:10,526-0400 INFO  (MainThread) [vdsm.api] FINISH prepareForShutdown return=None from=internal, task_id=2104d6f2-56c5-4019-922a-c91ee1675f90 (api:52)
2018-03-16 19:43:10,527-0400 ERROR (MainThread) [vds] Exception raised (vdsmd:158)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 156, in run
    serve_clients(log)
  File "/usr/lib/python2.7/site-packages/vdsm/vdsmd.py", line 103, in serve_clients
    cif = clientIF.getInstance(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 250, in getInstance
    cls._instance = clientIF(irs, log, scheduler)
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 144, in __init__
    self._prepareJSONRPCServer()
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 307, in _prepareJSONRPCServer
    bridge = Bridge.DynamicBridge()
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 67, in __init__
    self._schema = vdsmapi.Schema(paths, api_strict_mode)
  File "/usr/lib/python2.7/site-packages/vdsm/api/vdsmapi.py", line 217, in __init__
    raise SchemaNotFound("Unable to find API schema file")
SchemaNotFound: Unable to find API schema file
2018-03-16 19:43:10,528-0400 INFO  (MainThread) [vds] Stopping threads (vdsmd:160)
2018-03-16 19:43:10,528-0400 INFO  (MainThread) [vds] Exiting (vdsmd:171)
2018-03-16 19:43:10,570-0400 ERROR (hsm/init) [storage.HSM] FINISH thread <Thread(hsm/init, started daemon 139747095738112)> failed (concurrent:201)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 406, in storageRefresh
    sdCache.refreshStorage()

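A side note on the SchemaNotFound errors above: they match the "schema should be configured" line in the vdsm-tool output, i.e. vdsm refuses to start before configuration has completed. If only a single module needs (re)configuring, vdsm-tool also accepts a --module switch (a hedged pointer; whether it helps here is unclear, since configure aborted earlier on the bonding-defaults step):

vdsm-tool configure --module schema --force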
Comment 1 Tom Gamull 2018-03-17 09:16:19 UTC
Additional information: I am using the vdo and kmod-kvdo packages with RHEL 7.5.
I made this attempt without a bond, with the following network interface configs.
NetworkManager was disabled and stopped.

[root@virt04 network-scripts]# cat ifcfg-eno1
DEVICE=eno1
ONBOOT=yes
BRIDGE=ovirtmgmt
[root@virt04 network-scripts]# cat ifcfg-eno2
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eno2
UUID=bcb17727-32de-4565-8b5e-b6d126f74345
DEVICE=eno2
ONBOOT=yes
IPADDR=10.1.200.84
PREFIX=23
[root@virt04 network-scripts]# cat ifcfg-ovirtmgmt 
DEVICE=ovirtmgmt
TYPE=Bridge
ONBOOT=yes
DELAY=0
BOOTPROTO=static
IPADDR=10.1.10.84
NETMASK=255.255.255.0
GATEWAY=10.1.10.1
DNS1=10.1.10.1
DEFROUTE="yes"
[root@virt04 network-scripts]# brctl show
bridge name	bridge id		STP enabled	interfaces
ovirtmgmt		8000.001d096c4c18	no		eno1
virbr0		8000.525400555d82	yes		virbr0-nic

Comment 2 Tom Gamull 2018-03-17 09:27:32 UTC
This may be a permissions issue. It appears that if I create /var/run/vdsm and set its owner/permissions before running vdsm-tool configure --force, it succeeds. I am proceeding to test hosted-engine --deploy now.

[root@virt04 ~]# mkdir /var/run/vdsm
[root@virt04 ~]# chown vdsm:kvm /var/run/vdsm
[root@virt04 ~]# chmod 755 /var/run/vdsm
[root@virt04 ~]# vdsm-tool configure --force

Checking configuration status...

abrt is already configured for vdsm
lvm is configured for vdsm
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Current revision of multipath.conf detected, preserving
schema should be configured
Running configure...
Reconfiguration of abrt is done.
Reconfiguration of passwd is done.
Reconfiguration of libvirt is done.
Reconfiguration of bond_defaults is done.
Reconfiguration of schema is done.

Done configuring modules to VDSM.
[root@virt04 ~]#
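Note that /var/run is a tmpfs on RHEL 7, so a manually created /var/run/vdsm disappears at the next reboot. A properly installed vdsm should ship its own tmpfiles configuration for this; as a stopgap, the workaround can be made persistent with a tmpfiles.d entry (the file name below is hypothetical):

# /etc/tmpfiles.d/vdsm-workaround.conf (hypothetical)
# type  path           mode  user  group  age
d       /var/run/vdsm  0755  vdsm  kvm    -

It can then be applied without a reboot via systemd-tmpfiles --create /etc/tmpfiles.d/vdsm-workaround.conf.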

Comment 3 Tom Gamull 2018-03-17 09:58:35 UTC
hosted-engine deploy still fails:
[ INFO  ] TASK [Set VLAN ID at datacenter level]
[ INFO  ] skipping: [localhost]
[ INFO  ] TASK [Force host-deploy in offline mode]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [Add host]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [Wait for the host to be up]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Check host status]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO  ] Stage: Clean up
[ INFO  ] Cleaning temporary resources
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] ok: [localhost]
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180317055458.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180317052917-4hoa56.log


journalctl from virt04 (vdsmd is running throughout this?)

Mar 17 05:51:56 virt04.gamull.com python[27827]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:51:59 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:01 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:02 virt04.gamull.com systemd[1]: Started Session 72 of user root.
Mar 17 05:52:02 virt04.gamull.com systemd[1]: Starting Session 72 of user root.
Mar 17 05:52:02 virt04.gamull.com CROND[27897]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Mar 17 05:52:02 virt04.gamull.com python[27914]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:08 virt04.gamull.com systemd[1]: Stopping Virtualization daemon...
Mar 17 05:52:08 virt04.gamull.com python[27977]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:08 virt04.gamull.com systemd[1]: Stopped Virtualization daemon.
Mar 17 05:52:08 virt04.gamull.com saslpasswd2[27986]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
Mar 17 05:52:08 virt04.gamull.com saslpasswd2[27986]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
Mar 17 05:52:08 virt04.gamull.com saslpasswd2[27986]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Starting Virtualization daemon...
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Started Virtualization daemon.
Mar 17 05:52:09 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Cockpit Web Service Socket.
Mar 17 05:52:10 virt04.gamull.com ln[28063]: /bin/ln: failed to create symbolic link ‘/run/cockpit/issue’: No such file or directory
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Listening on Cockpit Web Service Socket.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Cockpit Web Service...
Mar 17 05:52:10 virt04.gamull.com dnsmasq[2121]: read /etc/hosts - 6 addresses
Mar 17 05:52:10 virt04.gamull.com dnsmasq[2121]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Mar 17 05:52:10 virt04.gamull.com dnsmasq-dhcp[2121]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Started Cockpit Web Service.
Mar 17 05:52:10 virt04.gamull.com cockpit-ws[28075]: Using certificate: /etc/cockpit/ws-certs.d/0-self-signed.cert
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com kernel: type=1400 audit(1521280330.547:22): avc:  denied  { open } for  pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:10 virt04.gamull.com kernel: type=1400 audit(1521280330.547:23): avc:  denied  { getattr } for  pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:10 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Stopping Virtualization daemon...
Mar 17 05:52:11 virt04.gamull.com systemd[1]: Starting Virtualization daemon...
Mar 17 05:52:12 virt04.gamull.com systemd[1]: Started Virtualization daemon.
Mar 17 05:52:12 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:12 virt04.gamull.com systemd[1]: Starting oVirt ImageIO Daemon...
Mar 17 05:52:13 virt04.gamull.com dnsmasq[2121]: read /etc/hosts - 6 addresses
Mar 17 05:52:13 virt04.gamull.com dnsmasq[2121]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Mar 17 05:52:13 virt04.gamull.com dnsmasq-dhcp[2121]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Started oVirt ImageIO Daemon.
Mar 17 05:52:13 virt04.gamull.com kernel: type=1400 audit(1521280333.387:24): avc:  denied  { open } for  pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:13 virt04.gamull.com kernel: type=1400 audit(1521280333.387:25): avc:  denied  { getattr } for  pid=26687 comm="systemd-machine" path="/proc/26686/cgroup" dev="proc" ino=88676 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_t:s0:c72,c338 tclass=file
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Starting Virtual Desktop Server Manager network restoration...
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
Mar 17 05:52:13 virt04.gamull.com systemd[1]: Starting Auxiliary vdsm service for running helper functions as root...
Mar 17 05:52:14 virt04.gamull.com python[28249]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:15 virt04.gamull.com systemd[1]: Started Virtual Desktop Server Manager network restoration.
Mar 17 05:52:15 virt04.gamull.com systemd[1]: Starting Virtual Desktop Server Manager...
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running mkdirs
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running configure_coredump
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running configure_vdsm_logs
Mar 17 05:52:15 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running wait_for_network
Mar 17 05:52:16 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running run_init_hooks
Mar 17 05:52:16 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running check_is_configured
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: abrt is already configured for vdsm
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: lvm is configured for vdsm
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: libvirt is already configured for vdsm
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: Current revision of multipath.conf detected, preserving
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: schema is already configured
Mar 17 05:52:17 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running validate_configuration
Mar 17 05:52:18 virt04.gamull.com vdsmd_init_common.sh[28321]: SUCCESS: ssl configured to true. No conflicts
Mar 17 05:52:18 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running prepare_transient_repository
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running syslog_available
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running nwfilter
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: libvirt: Network Filter Driver error : Network filter not found: no nwfilter with matching name 'vdsm-no-mac-spoofing'
Mar 17 05:52:19 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running dummybr
Mar 17 05:52:20 virt04.gamull.com python[28393]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:20 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running tune_system
Mar 17 05:52:20 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running test_space
Mar 17 05:52:20 virt04.gamull.com vdsmd_init_common.sh[28321]: vdsm: Running test_lo
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Started Virtual Desktop Server Manager.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Started MOM instance configured for VDSM purposes.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Starting MOM instance configured for VDSM purposes...
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Started oVirt VM Console SSH server daemon.
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Starting oVirt VM Console SSH server daemon...
Mar 17 05:52:20 virt04.gamull.com systemd[1]: Reloading.
Mar 17 05:52:21 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:21 virt04.gamull.com systemd[1]: Started Flexible Branding Service.
Mar 17 05:52:21 virt04.gamull.com systemd[1]: Starting Flexible Branding Service...
Mar 17 05:52:21 virt04.gamull.com ovirt-vmconsole-host-sshd[28426]: /usr/share/ovirt-vmconsole/ovirt-vmconsole-host/ovirt-vmconsole-host-sshd/sshd_config line 23: Deprecated option RSAAuthentication
Mar 17 05:52:21 virt04.gamull.com sshd[28426]: Server listening on 0.0.0.0 port 2223.
Mar 17 05:52:21 virt04.gamull.com sshd[28426]: Server listening on :: port 2223.
Mar 17 05:52:21 virt04.gamull.com goferd[1430]: [INFO][worker-0] root:542 - Disconnected
Mar 17 05:52:21 virt04.gamull.com goferd[1430]: [ERROR][worker-0] gofer.messaging.adapter.connect:33 - connect: proton+amqps://subscription.rhsm.redhat.com:5647, failed: Connection amqps://subscription.rhsm.redhat.com:5647 disconnected
Mar 17 05:52:21 virt04.gamull.com goferd[1430]: [INFO][worker-0] gofer.messaging.adapter.connect:35 - retry in 106 seconds
Mar 17 05:52:22 virt04.gamull.com vdsm[28417]: ERROR add_watch: cannot watch /var/run/vdsm/dhclientmon WD=-1, Errno=No such file or directory (ENOENT)
Mar 17 05:52:22 virt04.gamull.com vdsm[28417]: ERROR Unhandled exception
                                               Traceback (most recent call last):
                                                 File "/usr/lib/python2.7/site-packages/vdsm/common/logutils.py", line 197, in wrapper
                                                   return f(*a, **kw)
                                                 File "/usr/lib/python2.7/site-packages/vdsm/network/dhclient_monitor.py", line 139, in _monitor_dhcp_responses
                                                   for filePath in sorted(os.listdir(MONITOR_FOLDER)):
                                               OSError: [Errno 2] No such file or directory: '/var/run/vdsm/dhclientmon'
Mar 17 05:52:22 virt04.gamull.com vdsm[28417]: WARN Not ready yet, ignoring event '|virt|VM_status|1e424040-ba67-4138-bd71-63cd2afdd5dd' args={'1e424040-ba67-4138-bd71-63cd2afdd5dd': {'status': 'Powering up', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '127.0.0.1', 'type': 'vnc', 'port': '5900'}], 'hash': '1638793806033292744', 'cpuUser': '0.00', 'monitorResponse': '0', 'elapsedTime': '0', 'cpuSys': '0.00', 'vcpuPeriod': 100000L, 'timeOffset': '0', 'clientIp': '', 'pauseCode': 'NOERR', 'vcpuQuota': '-1'}}
Mar 17 05:52:23 virt04.gamull.com vdsm[28417]: WARN MOM not available.
Mar 17 05:52:23 virt04.gamull.com vdsm[28417]: WARN MOM not available, KSM stats will be missing.
Mar 17 05:52:26 virt04.gamull.com python[28563]: ansible-ovirt_hosts_facts Invoked with pattern=name=virt04.gamull.com fetch_nested=False nested_attributes=[] auth={'ca_file': None, 'url': 'https://engine.gamull.com/ovirt-engine/api', 'insecure': True, 'kerberos': False, 'compress': True, 'headers': None, 'token': 'RlqrCZ1HxywSdd4C0vQqT2lgui9BMWjX0D3xNH7evDw1lldDEK1W0BoB2OQLLl5KmOlHz9-vRwgtQq0Kznt3TA', 'timeout': 0}
Mar 17 05:52:26 virt04.gamull.com systemd[1]: mom-vdsm.service holdoff time over, scheduling restart.
Mar 17 05:52:26 virt04.gamull.com systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 17 05:52:26 virt04.gamull.com systemd[1]: Started MOM instance configured for VDSM purposes.
Mar 17 05:52:26 virt04.gamull.com systemd[1]: Starting MOM instance configured for VDSM purposes...
Mar 17 05:52:29 virt04.gamull.com sshd[28584]: Accepted publickey for root from 192.168.122.10 port 36596 ssh2: RSA SHA256:5fIUyD//xr7SwVrjEM4Rg5ty8QG+AAidKiRbODDR+pk

Comment 4 Tom Gamull 2018-03-17 10:02:49 UTC
Looks like engine.log on engine.gamull.com indicates the ovirtmgmt interface is invalid. Will retry without ovirtmgmt configured before setup.

2018-03-17 05:54:43,790-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] Exception: org.ovirt.engine.core.bll.network.NetworkConfigurator$NetworkConfiguratorException: Interface ovirtmgmt is invalid for management network
        at org.ovirt.engine.core.bll.network.NetworkConfigurator.findNicToSetupManagementNetwork(NetworkConfigurator.java:166) [bll.jar:]
        at org.ovirt.engine.core.bll.network.NetworkConfigurator.createManagementNetworkIfRequired(NetworkConfigurator.java:84) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.configureManagementNetwork(InstallVdsInternalCommand.java:297) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.installHost(InstallVdsInternalCommand.java:216) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand.executeCommand(InstallVdsInternalCommand.java:112) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1133) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1285) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1934) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1345) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:400) [bll.jar:]
        at org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:468) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:450) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:656) [bll.jar:]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_161]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_161]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_161]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_161]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:80) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:93) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl.jar:2.3.3.Final-redhat-1]
        at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:83) [wildfly-weld-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
        at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [wildfly-ejb3-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:263) [wildfly-ejb3-7.0.0.GA-redhat-2.jar:7.0.0.GA-redhat-2]

.....
2018-03-17 05:54:43,799-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] Host installation failed for host '9eb58b8a-44f9-412c-9061-754d72d06a1c', 'virt04.gamull.com': Failed to configure management network on the host
2018-03-17 05:54:43,809-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] START, SetVdsStatusVDSCommand(HostName = virt04.gamull.com, SetVdsStatusVDSCommandParameters:{hostId='9eb58b8a-44f9-412c-9061-754d72d06a1c', status='NonOperational', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 28c858a4
2018-03-17 05:54:43,818-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] FINISH, SetVdsStatusVDSCommand, log id: 28c858a4
2018-03-17 05:54:43,831-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] EVENT_ID: VDS_INSTALL_FAILED(505), Host virt04.gamull.com installation failed. Failed to configure management network on the host.
2018-03-17 05:54:43,846-04 INFO  [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [590ce3d9] Lock freed to object 'EngineLock:{exclusiveLocks='[9eb58b8a-44f9-412c-9061-754d72d06a1c=VDS]', sharedLocks=''}'
2018-03-17 05:54:46,991-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] START, GetHardwareInfoVDSCommand(HostName = virt04.gamull.com, VdsIdAndVdsVDSCommandParametersBase:{hostId='9eb58b8a-44f9-412c-9061-754d72d06a1c', vds='Host[virt04.gamull.com,9eb58b8a-44f9-412c-9061-754d72d06a1c]'}), log id: 79a82d82
2018-03-17 05:54:47,050-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] FINISH, GetHardwareInfoVDSCommand, log id: 79a82d82
2018-03-17 05:54:47,459-04 INFO  [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: 9eb58b8a-44f9-412c-9061-754d72d06a1c Type: VDS
2018-03-17 05:54:47,466-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] START, SetVdsStatusVDSCommand(HostName = virt04.gamull.com, SetVdsStatusVDSCommandParameters:{hostId='9eb58b8a-44f9-412c-9061-754d72d06a1c', status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 459189d1
2018-03-17 05:54:47,475-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] FINISH, SetVdsStatusVDSCommand, log id: 459189d1
2018-03-17 05:54:47,701-04 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] Host 'virt04.gamull.com' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt'
2018-03-17 05:54:47,816-04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2cae1c56] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host virt04.gamull.com does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
2018-03-17 05:54:47,984-04 INFO  [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [2d5e2769] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected :  ID: 9eb58b8a-44f9-412c-9061-754d72d06a1c Type: VDS
2018-03-17 05:54:48,063-04 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[]'}'
2018-03-17 05:54:48,317-04 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] Running command: UpdateClusterCommand internal: true. Entities affected :  ID: ccfabfac-29c7-11e8-a8a8-00163e710c5a Type: ClusterAction group EDIT_CLUSTER_CONFIGURATION with role type ADMIN
2018-03-17 05:54:48,370-04 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] EVENT_ID: SYSTEM_UPDATE_CLUSTER(835), Host cluster Default was updated by system
2018-03-17 05:54:48,372-04 INFO  [org.ovirt.engine.core.bll.UpdateClusterCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[]'}'
2018-03-17 05:54:48,400-04 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [ebcca4] EVENT_ID: VDS_DETECTED(13), Status of host virt04.gamull.com was set to NonOperational.

Comment 5 Tom Gamull 2018-03-17 11:30:42 UTC
The latest attempt still fails with errors on the host.

The relevant log is engine.log from the engine:

2018-03-17 07:27:57,483-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM virt04.gamull.com command GetCapabilitiesVDS failed: Message timeout which can be caused by communication issues
2018-03-17 07:27:57,484-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand' return value 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSInfoReturn@1fd4e0c1'
2018-03-17 07:27:57,484-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] HostName = virt04.gamull.com
2018-03-17 07:27:57,485-04 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] Command 'GetCapabilitiesVDSCommand(HostName = virt04.gamull.com, VdsIdAndVdsVDSCommandParametersBase:{hostId='a55ba422-bd50-47c2-85aa-bb4427d4621c', vds='Host[virt04.gamull.com,a55ba422-bd50-47c2-85aa-bb4427d4621c]'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-03-17 07:27:57,485-04 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] Failure to refresh host 'virt04.gamull.com' runtime info: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2018-03-17 07:28:00,494-04 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [6ae12f83] Connecting to virt04.gamull.com/10.1.10.84

Comment 7 Nikolai Sednev 2018-05-14 11:25:02 UTC
I've also tried to deploy Node 0 over FC using a tagged interface, and it failed with:

[ INFO  ] TASK [Check host status]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, fix accordingly and re-deploy.\n"}
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] changed: [localhost]
[ INFO  ] TASK [Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO  ] Stage: Clean up
[ INFO  ] Cleaning temporary resources
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] ok: [localhost]
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180514134621.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180514133147-32lu0a.log


In vdsm.log I saw:
2018-05-14 13:46:03,387+0300 ERROR (jsonrpc/0) [vds] connectivity check failed (API:1530)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1527, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
ConfigNetworkError: (10, 'connectivity check failed')
2018-05-14 13:46:03,388+0300 INFO  (jsonrpc/0) [api.network] FINISH setupNetworks return={'status': {'message': 'connectivity check failed', 'code': 10}} from=::ffff:192.168.122.182,49644, flow_id=10bd1361 (api:52)
2018-05-14 13:46:03,388+0300 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error 10) in 159.95 seconds (__init__:573)
2018-05-14 13:46:03,407+0300 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.confirmConnectivity succeeded in 0.00 seconds (__init__:573)

Tested on these components:
ovirt-engine-setup-4.2.3.4-0.1.el7.noarch
rhvm-appliance-4.2-20180504.0.el7.noarch
ovirt-hosted-engine-ha-2.2.11-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.20-1.el7ev.noarch
Linux 3.10.0-862.2.3.el7.x86_64 #1 SMP Mon Apr 30 12:37:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Attaching the sosreports from both engine and host here, just in case.

Comment 8 Nikolai Sednev 2018-05-14 11:26:03 UTC
Created attachment 1436128 [details]
engine logs

Comment 9 Nikolai Sednev 2018-05-14 11:27:20 UTC
Created attachment 1436129 [details]
sosreport from puma18

Comment 11 Doron Fediuck 2018-05-14 12:18:58 UTC
The failure says:
'connectivity check failed', 'code': 10

Did you change anything in the network, or did you make sure there's a proper network defined?

Comment 12 Nikolai Sednev 2018-05-14 12:29:24 UTC
(In reply to Doron Fediuck from comment #11)
> The failure says:
> 'connectivity check failed', 'code': 10
> 
> Did you change anything in the network, or did you make sure there's a proper
> network defined?

Changed in the network?
I think the only change was configuring a VLAN and using it as the interface during deployment.

Comment 13 Simone Tiraboschi 2018-05-14 12:38:31 UTC
The issue is here:

2018-05-14 13:46:04,908+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Failed in 'HostSetupNetworksVDS' method
2018-05-14 13:46:04,922+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM puma18-vlan404.scl.lab.tlv.redhat.com command HostSetupNetworksVDS failed: connectivity check failed
2018-05-14 13:46:04,923+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Error: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = connectivity check failed, code = 10
2018-05-14 13:46:04,923+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Command 'HostSetupNetworksVDSCommand(HostName = puma18-vlan404.scl.lab.tlv.redhat.com, HostSetupNetworksVdsCommandParameters:{hostId='749e6c33-7a22-4772-9e7b-90048b07a317', vds='Host[puma18-vlan404.scl.lab.tlv.redhat.com,749e6c33-7a22-4772-9e7b-90048b07a317]', rollbackOnFailure='true', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='true', bonding='false', networkName='ovirtmgmt', vdsmName='ovirtmgmt', nicName='enp5s0f1', vlan='404', mtu='0', vmNetwork='true', stp='false', properties='null', ipv4BootProtocol='DHCP', ipv4Address='null', ipv4Netmask='null', ipv4Gateway='null', ipv6BootProtocol='AUTOCONF', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null', nameServers='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'})' execution failed: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = connectivity check failed, code = 10
2018-05-14 13:46:04,924+03 INFO  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Host setup networks finished. Lock released. Monitoring can run now for host 'puma18-vlan404.scl.lab.tlv.redhat.com' from data-center 'Default'
2018-05-14 13:46:04,924+03 ERROR [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Command 'org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to HostSetupNetworksVDS, error = connectivity check failed, code = 10 (Failed with error noConPeer and code 10)
2018-05-14 13:46:04,927+03 INFO  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] Lock freed to object 'EngineLock:{exclusiveLocks='[HOST_NETWORK749e6c33-7a22-4772-9e7b-90048b07a317=HOST_NETWORK]', sharedLocks=''}'
2018-05-14 13:46:04,939+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [10bd1361] EVENT_ID: SETUP_NETWORK_FAILED_FOR_MANAGEMENT_NETWORK_CONFIGURATION(1,120), Failed to configure management network on host puma18-vlan404.scl.lab.tlv.redhat.com due to setup networks failure.

The host failed to contact the engine back to verify its network configuration, so it rolled back the network configuration and we lost the VM.

Let's start by checking whether this is reproducible, and how.
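For context, the mechanism at play is the one visible in the HostSetupNetworksVDS parameters above (rollbackOnFailure='true', connectivityTimeout='120'): the host applies the new network configuration and then reverts it unless the engine reaches it again in time. A rough sketch, with purely illustrative names:

import time

def setup_networks_with_check(apply_config, rollback, engine_confirmed, timeout=120):
    # Apply the candidate network config, then wait for the engine to
    # call back (Host.confirmConnectivity in the vdsm.log above).
    apply_config()
    deadline = time.time() + timeout
    while time.time() < deadline:
        if engine_confirmed():
            return True
        time.sleep(1)
    # No confirmation arrived in time: restore the previous configuration.
    rollback()
    return False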

Comment 14 Nikolai Sednev 2018-05-14 16:58:10 UTC
puma19 ~]# ifconfig
enp5s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.163.56  netmask 255.255.252.0  broadcast 10.35.163.255
        inet6 fe80::73b1:29ec:5152:1bdf  prefixlen 64  scopeid 0x20<link>
        ether 44:1e:a1:73:39:60  txqueuelen 1000  (Ethernet)
        RX packets 14495  bytes 913716 (892.3 KiB)
        RX errors 0  dropped 1  overruns 0  frame 0
        TX packets 15  bytes 1738 (1.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xfbe60000-fbe7ffff  

enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.160.47  netmask 255.255.252.0  broadcast 10.35.163.255
        inet6 fe80::461e:a1ff:fe73:3961  prefixlen 64  scopeid 0x20<link>
        ether 44:1e:a1:73:39:61  txqueuelen 1000  (Ethernet)
        RX packets 14587  bytes 949847 (927.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 642  bytes 162129 (158.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xfbee0000-fbefffff  

enp5s0f1.404: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.147.67  netmask 255.255.255.240  broadcast 10.35.147.79
        inet6 fe80::44b0:f299:a9aa:af77  prefixlen 64  scopeid 0x20<link>
        ether 00:16:3e:7b:b8:65  txqueuelen 1000  (Ethernet)
        RX packets 4  bytes 928 (928.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10  bytes 1868 (1.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 32  bytes 2928 (2.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 32  bytes 2928 (2.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:54:00:38:8c:04  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

puma19 ~]# multipath -ll 
3514f0c5a5160177c dm-0 XtremIO ,XtremApp        
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='queue-length 0' prio=1 status=active
  |- 6:0:0:1 sdb 8:16 active ready running
  `- 6:0:1:1 sdc 8:32 active ready running


ovirtmgmt bridge was not created.
puma19-vlan404 NetworkManager[778]: <warn>  [1526313643.1282] ifcfg-rh: cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0b28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")
May 14 19:00:43 puma19-vlan404 NetworkManager[778]: <warn>  [1526313643.5656] ifcfg-rh: cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0b28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")

Comment 15 Nikolai Sednev 2018-05-14 17:07:25 UTC
Created attachment 1436368 [details]
logs from puma19

Comment 16 Nikolai Sednev 2018-05-14 17:08:25 UTC
Created attachment 1436369 [details]
logs from engine

Comment 17 Nikolai Sednev 2018-05-14 17:09:28 UTC
Logs from comments #15 and #16 are from another clean deployment over VLAN on another host.

Comment 18 Yaniv Kaul 2018-05-23 10:57:24 UTC
Nikolai, how was the VLAN created? Can you explain the conflict:
cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0b28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1"

?

Comment 19 Nikolai Sednev 2018-05-23 11:47:20 UTC
(In reply to Yaniv Kaul from comment #18)
> Nikolai, how was the VLAN created? Can you explain the conflict:
> cannot load /etc/sysconfig/network-scripts/ifcfg-enp5s0f1.404 due to conflicting UUID for /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1 (eb6f0b28-957f-4a5d-8d52-a5c12e7496f8,"VLAN connection 1")
> 
> ?

nmtui.
I'm not sure why this conflict appeared.
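The warning usually means two ifcfg files under /etc/sysconfig/network-scripts carry the same UUID= value, here the nmtui-created VLAN profile and ifcfg-enp5s0f1.404. A hedged way to locate and clear such a conflict (illustrative commands, not steps that were actually taken here):

grep -H '^UUID=' /etc/sysconfig/network-scripts/ifcfg-*   # list profiles and their UUIDs
# then either remove the stale profile or give it a fresh UUID, e.g.:
# sed -i "s/^UUID=.*/UUID=$(uuidgen)/" /etc/sysconfig/network-scripts/ifcfg-VLAN_connection_1
# systemctl restart NetworkManager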

Comment 20 Simone Tiraboschi 2018-05-23 13:29:25 UTC
(In reply to Nikolai Sednev from comment #14)
> 
> enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         ether 44:1e:a1:73:39:61  txqueuelen 1000  (Ethernet)
> 
> enp5s0f1.404: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         ether 00:16:3e:7b:b8:65  txqueuelen 1000  (Ethernet)

Please note that you were also spoofing the MAC address over the VLAN interface.

Comment 21 Nikolai Sednev 2018-05-23 13:40:17 UTC
(In reply to Simone Tiraboschi from comment #20)
> (In reply to Nikolai Sednev from comment #14)
> > 
> > enp5s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >         ether 44:1e:a1:73:39:61  txqueuelen 1000  (Ethernet)
> > 
> > enp5s0f1.404: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >         ether 00:16:3e:7b:b8:65  txqueuelen 1000  (Ethernet)
> 
> Please note that you were also spoofing the MAC address over the VLAN
> interface.

Should that be a problem?
I received an appropriate IP address through that VLAN to the spoofed MAC, so network connectivity was fine; otherwise I could not even have gotten an IP address on that VLAN interface.

Comment 24 Simone Tiraboschi 2018-07-05 10:33:17 UTC
Closing as insufficient data, since I never managed to reproduce this in a clean environment.
Please reopen this if we find a clean reproducer.

Comment 25 D. Tervooren 2018-07-26 10:59:13 UTC
I think I hit some form of the same problem on my clean system.

My configuration is:

4 NICS => bond0 (mode balance-rr) => bond.5 => (host ip & engine ip)

I will attach a log after this post; the error occurs in line 2123.
Maybe this is a reproduction of this bug? I tried this 3 times.

Comment 26 D. Tervooren 2018-07-26 11:00:45 UTC
Created attachment 1470697 [details]
ovirt hosted engine setup non_operational log

The error occurs in line 2123.

Comment 27 Simone Tiraboschi 2018-07-26 11:58:12 UTC
(In reply to D. Tervooren from comment #25)
> I think I hit some form of the same problem on my clean system.
> 
> My configuration is:
> 
> 4 NICS => bond0 (mode balance-rr) => bond.5 => (host ip & engine ip)

Your issue is in the vlan over bond interface name: vdsm is pretty picky on this, see also:
https://bugzilla.redhat.com/show_bug.cgi?id=1379833

If your bond is called bond0, vlan 5 over bond0 should be called exactly bond0.5 and nothing else.

Can you please report back if it's correctly working with bond0.5?
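For illustration, a minimal initscripts-style pair following that naming convention (all values below are placeholders, not taken from this report):

ifcfg-bond0:
DEVICE=bond0
TYPE=Bond
BONDING_OPTS="mode=802.3ad miimon=100"
ONBOOT=yes
BOOTPROTO=none

ifcfg-bond0.5:
DEVICE=bond0.5
VLAN=yes
ONBOOT=yes
BOOTPROTO=dhcp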

Comment 28 D. Tervooren 2018-07-26 19:23:24 UTC
(In reply to Simone Tiraboschi from comment #27)
> (In reply to D. Tervooren from comment #25)
> > I think I hit some form of the same problem on my clean system.
> > 
> > My configuration is:
> > 
> > 4 NICS => bond0 (mode balance-rr) => bond.5 => (host ip & engine ip)
> 
> Your issue is in the vlan over bond interface name: vdsm is pretty picky on
> this, see also:
> https://bugzilla.redhat.com/show_bug.cgi?id=1379833
> 
> If your bond is called bond0, vlan 5 over bond0 should be called exactly
> bond0.5 and nothing else.
> 
> Can you please report back if it's correctly working with bond0.5?

Hi Simone,

I'm sorry, that was a typo in my original post. The interface was named bond0.5, of course.

Anyway, I got it working now by experimenting with the NIC configuration. I found a file in the GitHub repo (https://github.com/oVirt/ovirt-hosted-engine-setup/blob/master/src/ansible/get_network_interfaces.yml) which on line 15 defines a variable called 'acceptable_bond_modes'.
My bond was running in balance-rr, which is not included. I guess ansible did not recognize the bond correctly because of the VLAN on top of it.

I have now decided to move the engine bridge to a dedicated untagged bond in balance-xor mode, which worked for me.
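For anyone hitting the same wall: a quick way to confirm the active mode of an existing bond before deploying (the authoritative list of accepted modes is the acceptable_bond_modes variable in the playbook linked above):

grep -i 'bonding mode' /proc/net/bonding/bond0
# e.g. "Bonding Mode: load balancing (round-robin)" corresponds to balance-rr,
# which is not in the accepted list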

