Bug 971477 - [RHSC] Host (RHS Anshi) goes to Non-operational state after coming UP.
Status: CLOSED NOTABUG
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: rhsc
Version: 2.1
Hardware: Unspecified Unspecified
Priority: high
Severity: high
Assigned To: Timothy Asir
QA Contact: Shruti Sampat
Keywords: Reopened
Depends On:
Blocks:
Reported: 2013-06-06 11:48 EDT by Shruti Sampat
Modified: 2013-07-08 08:13 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-08 08:13:13 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
engine logs (3.34 MB, text/x-log)
2013-06-06 12:52 EDT, Shruti Sampat
vdsm logs (4.64 MB, text/x-log)
2013-06-06 13:02 EDT, Shruti Sampat
Description Shruti Sampat 2013-06-06 11:48:19 EDT
Description of problem:
---------------------------------------
After being added to a 3.1 cluster, an Anshi node initially comes UP and then goes to the Non-operational state. Both glusterd and vdsmd are running.

No message about the host's change of state from UP to Non-operational appears in the Events log.

The following message is seen in the Events log multiple times -

Bridged network ovirtmgmt is attached to multiple interfaces: eth2,eth0 on Host rhs-client20.lab.eng.blr.redhat.com.

The following is seen in the engine logs -

2013-06-06 20:52:23,003 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-16) Host 'rhs-client20.lab.eng.blr.redhat.com' moved to Non-Operational state because interface/s 'eth0, ' are down which needed by network/s 'ovirtmgmt, ' in the current cluster
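
For triage, the host, interfaces and networks can be pulled out of that engine message mechanically. The following is a minimal sketch, assuming the message keeps exactly the wording quoted in this report; the regular expression and the function name parse_non_operational are illustrative and are not part of oVirt or RHSC.

```python
import re

# Pattern written against the single message quoted in this report; other
# engine versions may word the message differently (assumption).
NON_OPERATIONAL = re.compile(
    r"Host '(?P<host>[^']+)' moved to Non-Operational state because "
    r"interface/s '(?P<nics>[^']*)' are down which needed by "
    r"network/s '(?P<networks>[^']*)'"
)


def _split_names(raw):
    # The quoted message leaves a trailing ", " in each list, so drop empty items.
    return [item.strip() for item in raw.split(",") if item.strip()]


def parse_non_operational(line):
    """Return host, nics and networks from the message, or None if it does not match."""
    match = NON_OPERATIONAL.search(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "nics": _split_names(match.group("nics")),
        "networks": _split_names(match.group("networks")),
    }


line = (
    "2013-06-06 20:52:23,003 INFO  "
    "[org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] "
    "(DefaultQuartzScheduler_Worker-16) "
    "Host 'rhs-client20.lab.eng.blr.redhat.com' moved to Non-Operational state "
    "because interface/s 'eth0, ' are down which needed by network/s "
    "'ovirtmgmt, ' in the current cluster"
)
print(parse_non_operational(line))
```

Run against the line above, this reports eth0 as the interface that is down and ovirtmgmt as the network that requires it.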

Version-Release number of selected component (if applicable):
Red Hat Storage Console Version: 2.1.0-0.bb2.el6rhs
glusterfs 3.3.0.10rhs
vdsm-4.9.6-23.el6rhs.x86_64

How reproducible:
Intermittent

Steps to Reproduce:
1. Install the RHS Anshi ISO on the storage server.
2. Update glusterfs to 3.3.0.10rhs and vdsm to vdsm-4.9.6-23.el6rhs.x86_64.
3. Add the host to a 3.1 cluster via the Console.

Actual results:
The host comes UP initially, then goes to the Non-operational state. Activating the host brings it UP again, after which it goes back to Non-operational, and the cycle repeats.

Expected results:
Host should remain UP.

Additional info:
The host is a physical machine. The contents of the file /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt -

[root@rhs-client20 u5_rpms]# cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt 
DEVICE=ovirtmgmt
TYPE=Bridge
ONBOOT=yes
DELAY=0
BOOTPROTO=dhcp
NM_CONTROLLED=no

The contents of /etc/sysconfig/network-scripts/ifcfg-eth2

[root@rhs-client20 u5_rpms]# cat /etc/sysconfig/network-scripts/ifcfg-eth2
DEVICE="eth2"
BRIDGE="ovirtmgmt"
BOOTPROTO="dhcp"
HWADDR="00:25:90:93:62:02"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"

The contents of /etc/sysconfig/network-scripts/ifcfg-eth0 - 

[root@rhs-client20 u5_rpms]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
HWADDR="00:25:90:7C:2C:7A"
NM_CONTROLLED="yes"
ONBOOT="yes"
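
The three files above can be compared programmatically to see which interfaces are configured on disk as ports of the bridge. This is a minimal sketch, assuming the files contain only simple KEY=value lines with optional double quotes, as shown here; parse_ifcfg and configured_ports are hypothetical helper names, and the file contents are copied from this report.

```python
def parse_ifcfg(text):
    """Parse simple KEY=value lines, stripping optional double quotes."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def configured_ports(configs, bridge):
    """Return the devices whose ifcfg text names `bridge` in a BRIDGE= line."""
    ports = []
    for name, text in configs.items():
        cfg = parse_ifcfg(text)
        if cfg.get("BRIDGE") == bridge:
            ports.append(cfg.get("DEVICE", name))
    return sorted(ports)


# File contents as quoted in this report.
IFCFG_ETH2 = "\n".join([
    'DEVICE="eth2"',
    'BRIDGE="ovirtmgmt"',
    'BOOTPROTO="dhcp"',
    'HWADDR="00:25:90:93:62:02"',
    'IPV6INIT="yes"',
    'IPV6_AUTOCONF="yes"',
    'NM_CONTROLLED="yes"',
    'ONBOOT="yes"',
    'TYPE="Ethernet"',
])
IFCFG_ETH0 = "\n".join([
    'DEVICE="eth0"',
    'HWADDR="00:25:90:7C:2C:7A"',
    'NM_CONTROLLED="yes"',
    'ONBOOT="yes"',
])

configs = {"eth2": IFCFG_ETH2, "eth0": IFCFG_ETH0}
print(configured_ports(configs, "ovirtmgmt"))
```

With these files only eth2 declares BRIDGE="ovirtmgmt"; the ifcfg-eth0 file does not mention the bridge at all.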
Comment 2 Shruti Sampat 2013-06-06 12:52:56 EDT
Created attachment 757756
engine logs
Comment 3 Shruti Sampat 2013-06-06 13:02:29 EDT
Created attachment 757758
vdsm logs
Comment 7 Dan Kenigsberg 2013-06-19 06:26:22 EDT
What does `brctl show` have on your faulty host? (just to rule out that vdsm is lying about the ovirtmgmt being connected to two nics)

{'ovirtmgmt': {'addr': '10.70.36.44', 'cfg': {'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'dhcp', 'DEVICE': 'ovirtmgmt', 'TYPE': 'Bridge', 'ONBOOT': 'yes'}, 'mtu': '1500', 'netmask': '255.255.254.0', 'stp': 'off', 'ports': ['eth0', 'eth2']}}
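
The dictionary quoted above reads as a Python literal, so the multi-port condition can be checked directly. A minimal sketch, assuming the structure is exactly as pasted here (the helper name multi_port_bridges is illustrative):

```python
import ast

# The network report exactly as quoted in this comment.
REPORT = (
    "{'ovirtmgmt': {'addr': '10.70.36.44', 'cfg': {'DELAY': '0', "
    "'NM_CONTROLLED': 'no', 'BOOTPROTO': 'dhcp', 'DEVICE': 'ovirtmgmt', "
    "'TYPE': 'Bridge', 'ONBOOT': 'yes'}, 'mtu': '1500', "
    "'netmask': '255.255.254.0', 'stp': 'off', 'ports': ['eth0', 'eth2']}}"
)


def multi_port_bridges(networks):
    """Return {bridge: ports} for every bridge that lists more than one port."""
    return {
        name: info["ports"]
        for name, info in networks.items()
        if len(info.get("ports", [])) > 1
    }


# literal_eval only accepts Python literals, so it is safe on pasted log text.
networks = ast.literal_eval(REPORT)
print(multi_port_bridges(networks))
```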

Does it reproduce on any other system?
Does it go away once you manually run

  brctl delif ovirtmgmt eth0
Comment 8 Shruti Sampat 2013-06-19 07:04:09 EDT
This is the output of `brctl show` - 

[root@rhs-client20 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
ovirtmgmt       8000.0025907c2c7a       no              eth0
                                                        eth2

Yes, it goes away after running 'brctl delif ovirtmgmt eth0'.
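
To repeat this check on other hosts, the `brctl show` output can be parsed for bridges with more than one port. A minimal sketch, assuming the layout shown in this comment: a header row, one row per bridge, and continuation rows that start with whitespace and carry only an interface name. The function name parse_brctl_show is illustrative.

```python
def parse_brctl_show(output):
    """Map each bridge name to its list of interfaces."""
    bridges = {}
    current = None
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        fields = line.split()
        if line[0].isspace():
            # Continuation row: holds just one more interface of the current bridge.
            if current is not None:
                bridges[current].append(fields[0])
        else:
            # Bridge row: name, bridge id, STP flag, optional first interface.
            current = fields[0]
            bridges[current] = fields[3:4]
    return bridges


# Output as quoted in this comment.
SAMPLE = "\n".join([
    "bridge name     bridge id               STP enabled     interfaces",
    "ovirtmgmt       8000.0025907c2c7a       no              eth0",
    "                                                        eth2",
])

bridges = parse_brctl_show(SAMPLE)
multi_port = {name: ports for name, ports in bridges.items() if len(ports) > 1}
print(multi_port)
```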

I will try with another machine and let you know if it happens again.
Comment 10 Shruti Sampat 2013-07-08 08:13:13 EDT
Closing as NOTABUG, because I am unable to reproduce the issue.
