Bug 1567068

Summary: [Ganesha] Observing "Unable to find nic or netmask" message in pcs status while performing add node/node reboot operation
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Manisha Saini <msaini>
Component: common-ha Assignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED CURRENTRELEASE QA Contact: Manisha Saini <msaini>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.4 CC: amukherj, dang, grajoria, jthottan, kgaillot, msaini, oalbrigt, pasik, rhs-bugs, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2018-11-19 10:38:22 UTC Type: Bug

Description Manisha Saini 2018-04-13 11:40:08 UTC
Description of problem:

While performing add-node or node-reboot operations, "Unable to find nic or netmask" messages are observed in pcs status under Failed Actions:

* dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp37-103.lab.eng.blr.redhat.com 'unknown error' (1): call=137, status=complete, exitreason='Unable to find nic or netmask.',



Version-Release number of selected component (if applicable):
# rpm -qa | grep ganesha
nfs-ganesha-gluster-2.5.5-4.el7rhgs.x86_64
glusterfs-ganesha-3.12.2-7.el7rhgs.x86_64
nfs-ganesha-2.5.5-4.el7rhgs.x86_64

#pacemaker-1.1.18-11.el7.x86_64
#pcs-0.9.162-5.el7_5.1.x86_64
#corosync-2.4.3-2.el7.x86_64

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)

How reproducible:
2/2

Steps to Reproduce:
1. Create a 4-node Ganesha cluster
2. Expand the cluster from 4 nodes to 8 nodes by adding the nodes one at a time via gdeploy
3. Check pcs status

===================================

# pcs status
Cluster name: ganesha-ha-360
Stack: corosync
Current DC: dhcp47-193.lab.eng.blr.redhat.com (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Apr 13 06:35:48 2018
Last change: Fri Apr 13 00:48:05 2018 by root via crm_attribute on dhcp46-116.lab.eng.blr.redhat.com

8 nodes configured
48 resources configured

Online: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]
 Resource Group: dhcp37-103.lab.eng.blr.redhat.com-group
     dhcp37-103.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp37-103.lab.eng.blr.redhat.com
     dhcp37-103.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp37-103.lab.eng.blr.redhat.com
     dhcp37-103.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp37-103.lab.eng.blr.redhat.com
 Resource Group: dhcp37-121.lab.eng.blr.redhat.com-group
     dhcp37-121.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp37-121.lab.eng.blr.redhat.com
     dhcp37-121.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp37-121.lab.eng.blr.redhat.com
     dhcp37-121.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp37-121.lab.eng.blr.redhat.com
 Resource Group: dhcp37-218.lab.eng.blr.redhat.com-group
     dhcp37-218.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp37-218.lab.eng.blr.redhat.com
     dhcp37-218.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp37-218.lab.eng.blr.redhat.com
     dhcp37-218.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp37-218.lab.eng.blr.redhat.com
 Resource Group: dhcp46-116.lab.eng.blr.redhat.com-group
     dhcp46-116.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp46-116.lab.eng.blr.redhat.com
     dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp46-116.lab.eng.blr.redhat.com
     dhcp46-116.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp46-116.lab.eng.blr.redhat.com
 Resource Group: dhcp46-184.lab.eng.blr.redhat.com-group
     dhcp46-184.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp46-184.lab.eng.blr.redhat.com
     dhcp46-184.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp46-184.lab.eng.blr.redhat.com
     dhcp46-184.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp46-184.lab.eng.blr.redhat.com
 Resource Group: dhcp47-193.lab.eng.blr.redhat.com-group
     dhcp47-193.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-193.lab.eng.blr.redhat.com
     dhcp47-193.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-193.lab.eng.blr.redhat.com
     dhcp47-193.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-193.lab.eng.blr.redhat.com
 Resource Group: dhcp47-2.lab.eng.blr.redhat.com-group
     dhcp47-2.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-2.lab.eng.blr.redhat.com
     dhcp47-2.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-2.lab.eng.blr.redhat.com
     dhcp47-2.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-2.lab.eng.blr.redhat.com
 Resource Group: dhcp37-136.lab.eng.blr.redhat.com-group
     dhcp37-136.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp37-136.lab.eng.blr.redhat.com
     dhcp37-136.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp37-136.lab.eng.blr.redhat.com
     dhcp37-136.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp37-136.lab.eng.blr.redhat.com

Failed Actions:
* dhcp37-103.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=143, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:44 2018', queued=0ms, exec=91ms
* nfs-grace_monitor_5000 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=177, status=Timed Out, exitreason='',
    last-rc-change='Fri Apr 13 06:12:02 2018', queued=0ms, exec=0ms
* nfs-mon_monitor_10000 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=13, status=Timed Out, exitreason='',
    last-rc-change='Fri Apr 13 06:12:06 2018', queued=0ms, exec=0ms
* dhcp46-116.lab.eng.blr.redhat.com-nfs_block_monitor_10000 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=174, status=Timed Out, exitreason='',
    last-rc-change='Fri Apr 13 06:12:04 2018', queued=0ms, exec=0ms
* dhcp37-136.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=141, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:44 2018', queued=1ms, exec=118ms
* dhcp37-103.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp47-2.lab.eng.blr.redhat.com 'unknown error' (1): call=141, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:41 2018', queued=0ms, exec=102ms
* dhcp37-136.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp47-2.lab.eng.blr.redhat.com 'unknown error' (1): call=143, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:41 2018', queued=0ms, exec=92ms
* dhcp37-103.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp47-193.lab.eng.blr.redhat.com 'unknown error' (1): call=143, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:42 2018', queued=0ms, exec=95ms
* dhcp37-136.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp47-193.lab.eng.blr.redhat.com 'unknown error' (1): call=141, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:42 2018', queued=0ms, exec=100ms
* dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp37-121.lab.eng.blr.redhat.com 'unknown error' (1): call=167, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 14:16:55 2018', queued=0ms, exec=84ms
* dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp37-218.lab.eng.blr.redhat.com 'unknown error' (1): call=130, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 14:16:54 2018', queued=0ms, exec=103ms
* dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp37-103.lab.eng.blr.redhat.com 'unknown error' (1): call=137, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 14:16:56 2018', queued=0ms, exec=85ms
* dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp37-136.lab.eng.blr.redhat.com 'unknown error' (1): call=135, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 14:16:57 2018', queued=0ms, exec=86ms


Daemon Status:
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled


========================================== 
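
For triage, the Failed Actions entries in the output above can be grouped by status and exit reason. The following is only a sketch: the regular expression is an assumption derived from the line layout seen in this report (pcs 0.9.x), and the helper names parse_failed_actions and summarize are hypothetical, not part of pcs.

```python
import re
from collections import Counter

# Assumed layout of one Failed Actions entry, taken from the output in this report.
FAILED_ACTION = re.compile(
    r"^\* (?P<operation>\S+) on (?P<node>\S+) '(?P<rc_text>[^']*)' \((?P<rc>\d+)\): "
    r"call=(?P<call>\d+), status=(?P<status>[^,]+), exitreason='(?P<exitreason>[^']*)'"
)


def parse_failed_actions(status_text):
    """Return one dict per Failed Actions entry found in the given text."""
    entries = []
    for line in status_text.splitlines():
        match = FAILED_ACTION.match(line.strip())
        if match:  # continuation lines (last-rc-change=...) do not match
            entries.append(match.groupdict())
    return entries


def summarize(entries):
    """Count entries by (status, exitreason)."""
    return Counter((e["status"], e["exitreason"]) for e in entries)


# Two entries copied from the pcs status output in this report.
sample = """\
* dhcp37-103.lab.eng.blr.redhat.com-cluster_ip-1_start_0 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=143, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Thu Apr 12 12:55:44 2018', queued=0ms, exec=91ms
* nfs-grace_monitor_5000 on dhcp46-116.lab.eng.blr.redhat.com 'unknown error' (1): call=177, status=Timed Out, exitreason='',
    last-rc-change='Fri Apr 13 06:12:02 2018', queued=0ms, exec=0ms
"""

if __name__ == "__main__":
    for key, count in summarize(parse_failed_actions(sample)).items():
        print(count, key)
```

Grouping this way separates the cluster_ip start failures (status=complete with an exit reason) from the monitor operations that timed out on dhcp46-116 with an empty exit reason.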




Actual results:
Observing "Unable to find nic or netmask" messages in pcs status while performing add-node and node-reboot


Expected results:
No such messages should be seen in pcs status


Additional info:

Comment 2 Daniel Gryniewicz 2018-04-13 12:17:13 UTC
This is not an error message that originates from Ganesha.

Comment 3 Ken Gaillot 2018-04-17 00:35:24 UTC
The ocf:heartbeat:IPaddr2 resource agent (which adds the floating IP address) can accept "nic" and "cidr_netmask" parameters to specify particular ones. If they are not specified, it will try to determine them from the IP address. That isn't possible in this case -- maybe the IP is added before the interface is up?

Adding the resource-agents maintainer Oyvind Albrigtsen for additional insight.
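
To illustrate the kind of lookup described above, here is a minimal sketch of inferring a nic and netmask from an IP address by matching it against the subnets already configured on a node. It is not the resource agent's actual code (the agent is a shell script); the function name find_nic_and_netmask, the interface name, and the addresses are hypothetical and use documentation ranges.

```python
import ipaddress


def find_nic_and_netmask(floating_ip, interfaces):
    """Return (nic, prefix_length) for the first interface whose subnet
    contains floating_ip, or None when no configured subnet matches.

    interfaces maps an interface name to its address in CIDR form.
    """
    address = ipaddress.ip_address(floating_ip)
    for nic, cidr in interfaces.items():
        network = ipaddress.ip_interface(cidr).network
        if address in network:
            return nic, network.prefixlen
    return None


# Hypothetical node with a single configured interface.
interfaces = {"eth0": "192.0.2.10/24"}

if __name__ == "__main__":
    # Address inside the eth0 subnet: a nic and prefix length can be inferred.
    print(find_nic_and_netmask("192.0.2.50", interfaces))
    # Address outside every configured subnet: nothing can be inferred,
    # which is the situation the exit reason in this bug describes.
    print(find_nic_and_netmask("198.51.100.7", interfaces))
```

With this model, a start attempted on a node that has no interface in the floating IP's subnet, or whose interface is not yet up, has nothing to match against unless "nic" and "cidr_netmask" are given explicitly.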

Comment 4 Oyvind Albrigtsen 2018-04-17 07:20:09 UTC
Can you add info from running "ip a", "pcs resource show cluster_ip-1" and "pcs resource debug-start --full cluster_ip-1"?