Bug 1483935 - [RHSC] Cluster import fails - Unable to create ovirt-mgmt interface on rhgs nodes, .bak file exists!
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: vdsm
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Sahina Bose
QA Contact: Sweta Anandpara
Depends On:
Reported: 2017-08-22 10:12 UTC by Sweta Anandpara
Modified: 2018-10-24 06:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-10-24 06:21:43 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1441530 | 0 | high | CLOSED | setupNetworks fails after firewalld creates .bak files | 2021-02-22 00:41:40 UTC

Internal Links: 1441530

Description Sweta Anandpara 2017-08-22 10:12:28 UTC
Description of problem:
A freshly installed four-node RHGS cluster (layered install on RHEL 7.4) was imported into the Console management node.
The installation failed during network setup, and the hosts failed to come up. Tracebacks were seen in the vdsm and supervdsm logs.

A .bak file existed in /etc/sysconfig/network-scripts/. Removing that file and reinstalling the nodes worked: the nodes became operational. However, the tracebacks did not mention the .bak file anywhere.

The reason the .bak file exists is unknown as of now. I would not expect it to be there, as this was a freshly installed setup. Knowledge of bug https://bugzilla.redhat.com/show_bug.cgi?id=1441530 prompted us to remove the .bak file and try again, which worked.
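
Since the traceback gives no hint about the stray file, a quick check of the directory before importing a node may help. The following is a minimal sketch and not part of vdsm: the function name find_bak_files is hypothetical, and it assumes the offending files end in .bak and sit directly in /etc/sysconfig/network-scripts/.

```python
import glob
import os

# Directory named in this report; adjust if the layout differs.
NETWORK_SCRIPTS_DIR = "/etc/sysconfig/network-scripts"


def find_bak_files(directory=NETWORK_SCRIPTS_DIR):
    """Return a sorted list of paths ending in .bak directly inside directory.

    Hypothetical helper for illustration; a missing directory yields an
    empty list because glob simply finds no matches.
    """
    return sorted(glob.glob(os.path.join(directory, "*.bak")))


if __name__ == "__main__":
    # Print each leftover backup file so it can be reviewed before the import.
    for path in find_bak_files():
        print(path)
```

Running it on each node before the import, and removing any file it lists, mirrors the manual workaround described above.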

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-41 and vdsm-4.17.33-1.2.el7rhgs.noarch

How reproducible:
Hit it once

Steps to Reproduce:
1. Have a .bak file present in /etc/sysconfig/network-scripts/ on an RHGS node, and import that node into the Console.

Additional info:

Traceback in vdsm.log:

Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1650, in _rollback
    yield rollbackCtx
  File "/usr/share/vdsm/API.py", line 1502, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
ConfigNetworkError: (10, 'connectivity check failed')

Traceback in supervdsm.log:

Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 118, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 243, in setupNetworks
    return setupNetworks(networks, bondings, **options)
  File "/usr/share/vdsm/network/api.py", line 943, in setupNetworks
    options, logger)
  File "/usr/share/vdsm/network/api.py", line 800, in _check_connectivity
    'connectivity check failed')
ConfigNetworkError: (10, 'connectivity check failed')

oVirt engine logs:

2017-08-22 15:35:17,790 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (DefaultQuartzScheduler_Worker-84) [29d34180] Host dhcp37-94.lab.eng.blr.redhat.com is set to Non-Operational, it is missing the following networks: ovirtmgmt
2017-08-22 15:35:17,865 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-84) [29d34180] Correlation ID: 29d34180, Job ID: 75863589-bc76-413e-8957-88257e8e718e, Call Stack: null, Custom Event ID: -1, Message: Host dhcp37-94.lab.eng.blr.redhat.com does not comply with the cluster rhgs33_rh7_4nodeNew networks, the following networks are missing on host: 'ovirtmgmt'
2017-08-22 15:35:17,984 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler_Worker-84) [29d34180] START, GlusterServersListVDSCommand(HostName = dhcp37-94.lab.eng.blr.redhat.com, HostId = 3407832b-5093-4227-83e3-9726f7b4ed31), log id: 651da351
2017-08-22 15:35:18,307 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler_Worker-84) [29d34180] FINISH, GlusterServersListVDSCommand, return: [, dhcp37-78.lab.eng.blr.redhat.com:CONNECTED, dhcp37-86.lab.eng.blr.redhat.com:CONNECTED, dhcp37-98.lab.eng.blr.redhat.com:CONNECTED], log id: 651da351
2017-08-22 15:35:18,328 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-84) [29d34180] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Status of host dhcp37-94.lab.eng.blr.redhat.com was set to NonOperational.
2017-08-22 15:35:19,467 INFO  [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (DefaultQuartzScheduler_Worker-84) [4c65801b] Running command: HandleVdsVersionCommand internal: true. Entities affected :  ID: 3407832b-5093-4227-83e3-9726f7b4ed31 Type: VDS
2017-08-22 15:35:19,472 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-84) [4c65801b] Host 3407832b-5093-4227-83e3-9726f7b4ed31 : dhcp37-94.lab.eng.blr.redhat.com is already in NonOperational status for reason NETWORK_UNREACHABLE. SetNonOperationalVds command is skipped.
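
The engine log lines above name the affected host and the network it is missing. The following is a small sketch for pulling those pairs out of an engine log, assuming the message format shown above. The function name missing_networks is hypothetical, and the comma separator for multiple networks is an assumption, since the sample only shows a single network.

```python
import re

# Matches the SetNonOperationalVdsCommand message seen in the log above.
MISSING_NETWORKS_RE = re.compile(
    r"Host (?P<host>\S+) is set to Non-Operational, "
    r"it is missing the following networks: (?P<networks>.+)$"
)


def missing_networks(lines):
    """Map each host name to the list of networks reported as missing.

    Hypothetical helper; assumes multiple networks are comma-separated.
    """
    result = {}
    for line in lines:
        match = MISSING_NETWORKS_RE.search(line.rstrip())
        if match:
            names = [n.strip() for n in match.group("networks").split(",")]
            result[match.group("host")] = names
    return result
```

Passing an open engine log file to missing_networks works as well, since the function only iterates over lines.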

Comment 2 Sweta Anandpara 2017-08-22 10:42:49 UTC
Sosreports, vdsm logs, and ovirt-engine logs are copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/

[qe@rhsqe-repo 1483935]$ pwd
[qe@rhsqe-repo 1483935]$ hostname
[qe@rhsqe-repo 1483935]$ 
[qe@rhsqe-repo 1483935]$ ll
total 57476
drwxr-xr-x. 7 qe qe     4096 Aug 22 16:08 ovirt-engine
-rwxr-xr-x. 1 qe qe 14664612 Aug 22 16:07 sosreport-dhcp37-78.lab.eng.blr.redhat.com-20170822154252.tar.xz
-rwxr-xr-x. 1 qe qe 14843032 Aug 22 16:07 sosreport-dhcp37-86.lab.eng.blr.redhat.com-20170822154257.tar.xz
-rwxr-xr-x. 1 qe qe 14641280 Aug 22 16:07 sosreport-dhcp37-94.lab.eng.blr.redhat.com-20170822154336.tar.xz
-rwxr-xr-x. 1 qe qe 14677568 Aug 22 16:07 sosreport-dhcp37-98.lab.eng.blr.redhat.com-20170822154341.tar.xz
drwxr-xr-x. 3 qe qe     4096 Aug 22 16:09 vdsm_dhcp37_78
drwxr-xr-x. 3 qe qe     4096 Aug 22 16:09 vdsm_dhcp37_86
drwxr-xr-x. 3 qe qe     4096 Aug 22 16:09 vdsm_dhcp37_94
drwxr-xr-x. 3 qe qe     4096 Aug 22 16:09 vdsm_dhcp37_98
[qe@rhsqe-repo 1483935]$

Comment 6 Sahina Bose 2018-10-24 06:21:43 UTC
Closing as there are no further enhancements planned on RHGS-C.
