Bug 678947 - rdma tries to stop unconfigured ib interfaces
Summary: rdma tries to stop unconfigured ib interfaces
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rdma
Version: 6.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Doug Ledford
QA Contact: Infiniband QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-02-21 03:03 UTC by Masahiro Matsuya
Modified: 2018-11-14 14:57 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 15:33:00 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHEA-2011:1639
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: RDMA stack bug fix and enhancement update
Last Updated: 2011-12-06 00:50:39 UTC

Description Masahiro Matsuya 2011-02-21 03:03:44 UTC
Description of problem:

This host has two InfiniBand ports (ib0 and ib1), but only the first one (ib0) is used and configured: /etc/sysconfig/network-scripts/ifcfg-ib0 is present, but ifcfg-ib1 is not.

In this situation, "service rdma start" correctly brings up only ib0.
However, when "service rdma stop" is executed, the following error is printed and the stop fails partway through.

# service rdma stop
Unloading OpenIB kernel modules:/etc/init.d/rdma: line 308: /etc/sysconfig/network-scripts/ifcfg-ib1: No such file or directory

This happens because the rdma init script tries to read ifcfg-ib1, even though that file does not exist and ib1 is DOWN.
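
A minimal sketch of the kind of guard that avoids this error, assuming the script reads the config file with the shell "." builtin. The function name load_ifcfg and its arguments are hypothetical and are not taken from /etc/init.d/rdma.

```shell
#!/bin/sh
# Hypothetical helper, not code from the rdma package.
# Reads ifcfg-<iface> only when the file exists, so a missing
# ifcfg-ib1 no longer produces "No such file or directory".
load_ifcfg() {
    cfg_dir=$1      # e.g. /etc/sysconfig/network-scripts
    iface=$2        # e.g. ib1
    cfg="$cfg_dir/ifcfg-$iface"
    if [ -f "$cfg" ]; then
        . "$cfg"    # pull DEVICE=..., ONBOOT=... into the current shell
        return 0
    fi
    return 1        # no config file: the caller decides what to do
}
```

A caller could then write: load_ifcfg /etc/sysconfig/network-scripts ib1 || echo "ib1 has no config file". Note that the guard alone does not bring the interface down; it only avoids the read error.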

Version-Release number of selected component (if applicable):
RHEL6.0
rdma-1.0-9.el6

How reproducible:
Always

Steps to Reproduce:
1. Get a machine with two InfiniBand ports.
2. Do not create ifcfg-ib1 in /etc/sysconfig/network-scripts.
3. Run "service rdma start".
4. Run "service rdma stop".
  
Actual results:
# service rdma stop
Unloading OpenIB kernel modules:/etc/init.d/rdma: line 308: /etc/sysconfig/network-scripts/ifcfg-ib1: No such file or directory

Expected results:
"service rdma stop" completes without errors.

Comment 2 Doug Ledford 2011-02-21 14:05:12 UTC
Your patch won't work.  The service rdma stop action unloads the ipoib module, so it doesn't matter whether or not you have an ifcfg-ib1 file in /etc/sysconfig/network-scripts: that interface *must* be downed before the ipoib module can be unloaded.  I'll think about the issue you have reported and possible solutions, but the patch as posted will break the unloading of the service under a number of conditions.
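
To illustrate the constraint described in this comment, here is a rough sketch of a check that could run before the module is unloaded. It assumes IPoIB interfaces are named ib* and relies on the sysfs "flags" file, in which bit 0x1 is IFF_UP. The function name up_ib_interfaces is hypothetical and is not part of the rdma init script.

```shell
#!/bin/sh
# Hypothetical pre-unload check; not code from the rdma package.
# Prints every ib* interface whose IFF_UP flag (bit 0x1) is still set.
up_ib_interfaces() {
    sysfs_net=${1:-/sys/class/net}
    for path in "$sysfs_net"/ib*; do
        [ -r "$path/flags" ] || continue    # glob matched nothing, or no flags file
        flags=$(cat "$path/flags")          # hex value such as 0x1003
        if [ $(( ${flags} & 1 )) -ne 0 ]; then
            basename "$path"
        fi
    done
}
```

An empty result from up_ib_interfaces means no ib* interface is administratively up, regardless of which ifcfg files exist.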

Comment 3 Doug Ledford 2011-08-04 17:39:28 UTC
I've fixed this in a manner that will always down whatever interfaces exist prior to unloading the ipoib kernel module regardless of config file, but if the config file is present, it will use it.
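
The approach described in this comment could look roughly like the sketch below. This is not the shipped fix: the function names are hypothetical, interfaces are assumed to be named ib*, and the sketch only prints the commands it would run (ifdown when a config file exists, "ip link set dev ... down" otherwise).

```shell
#!/bin/sh
# Sketch only; names are hypothetical and this is not the code in the rdma package.

# Print the ib* interface names found under a sysfs net directory.
list_ib_interfaces() {
    sysfs_net=${1:-/sys/class/net}
    for path in "$sysfs_net"/ib*; do
        [ -e "$path" ] || continue          # glob matched nothing
        basename "$path"
    done
}

# Print the command that would down one interface:
# use the config file if present, otherwise fall back to ip.
down_command() {
    cfg_dir=$1
    iface=$2
    if [ -f "$cfg_dir/ifcfg-$iface" ]; then
        echo "ifdown $iface"
    else
        echo "ip link set dev $iface down"
    fi
}

# Dry run: one command per existing interface, configured or not.
plan_ib_shutdown() {
    sysfs_net=$1
    cfg_dir=$2
    for iface in $(list_ib_interfaces "$sysfs_net"); do
        down_command "$cfg_dir" "$iface"
    done
}
```

Running plan_ib_shutdown /sys/class/net /etc/sysconfig/network-scripts on the reporter's host would be expected to choose ifdown for ib0 and the ip fallback for ib1.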

Comment 6 Honggang LI 2011-08-31 01:22:42 UTC
This issue has been verified on RHEL-6.2 x86_64.

[root@rdma4 network-scripts]# uname -a
Linux rdma4.rhts.eng.bos.redhat.com 2.6.32-191.el6.x86_64 #1 SMP Wed Aug 17 20:22:22 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@rdma4 network-scripts]# pwd
/etc/sysconfig/network-scripts
[root@rdma4 network-scripts]# ls
ifcfg-eth0  ifdown       ifdown-ippp  ifdown-ppp     ifup          ifup-ib      ifup-isdn   ifup-post    ifup-tunnel       network-functions
ifcfg-eth1  ifdown-bnep  ifdown-ipv6  ifdown-routes  ifup-aliases  ifup-ib-lhg  ifup-lhg    ifup-ppp     ifup-wireless     network-functions-ipv6
ifcfg-ib0   ifdown-eth   ifdown-isdn  ifdown-sit     ifup-bnep     ifup-ippp    ifup-plip   ifup-routes  init.ipv6-global  original.ifcfg-ib0
ifcfg-lo    ifdown-ib    ifdown-post  ifdown-tunnel  ifup-eth      ifup-ipv6    ifup-plusb  ifup-sit     net.hotplug
[root@rdma4 network-scripts]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:15:C5:E8:B1:96  
          inet addr:10.16.64.189  Bcast:10.16.71.255  Mask:255.255.248.0
          inet6 addr: fec0::f101:215:c5ff:fee8:b196/64 Scope:Site
          inet6 addr: fec0:0:a10:4000:215:c5ff:fee8:b196/64 Scope:Site
          inet6 addr: fe80::215:c5ff:fee8:b196/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:713514 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15690 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:64087335 (61.1 MiB)  TX bytes:2163718 (2.0 MiB)
          Interrupt:16 Memory:f8000000-f8012800 

eth1      Link encap:Ethernet  HWaddr 00:15:C5:E8:B1:98  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:16 Memory:f4000000-f4012800 

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib0       Link encap:InfiniBand  HWaddr 80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:13 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:17 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:728 (728.0 b)  TX bytes:264 (264.0 b)

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib1       Link encap:InfiniBand  HWaddr 80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:12 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1496 (1.4 KiB)  TX bytes:1496 (1.4 KiB)

[root@rdma4 network-scripts]# ibstat
CA 'qib0'
	CA type: InfiniPath_QLE7140
	Number of ports: 1
	Firmware version: 
	Hardware version: 1
	Node GUID: 0x001175000068709f
	System image GUID: 0x001175000068709f
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 10
		Base lid: 3
		LMC: 0
		SM lid: 1
		Capability mask: 0x06610868
		Port GUID: 0x001175000068709f
		Link layer: InfiniBand
CA 'qib1'
	CA type: InfiniPath_QLE7140
	Number of ports: 1
	Firmware version: 
	Hardware version: 1
	Node GUID: 0x0011750000687070
	System image GUID: 0x001175000068709f
	Port 1:
		State: Initializing
		Physical state: LinkUp
		Rate: 10
		Base lid: 65535
		LMC: 0
		SM lid: 65535
		Capability mask: 0x06610868
		Port GUID: 0x0011750000687070
		Link layer: InfiniBand
[root@rdma4 network-scripts]# /sbin/service rdma stop
Unloading OpenIB kernel modules:[  OK  ]
[root@rdma4 network-scripts]# /sbin/service rdma start
Loading OpenIB kernel modules:[  OK  ]
[root@rdma4 network-scripts]# /sbin/service rdma stop
Unloading OpenIB kernel modules:[  OK  ]
[root@rdma4 network-scripts]# lsmod | grep -i ib
speedstep_lib           5367  1 p4_clockmod
[root@rdma4 network-scripts]# mv /root/ifcfg-ib1 .
[root@rdma4 network-scripts]# ifup ib0
Device ib0 does not seem to be present, delaying initialization.
[root@rdma4 network-scripts]# /sbin/service rdma start 
Loading OpenIB kernel modules:[  OK  ]
[root@rdma4 network-scripts]# lsmod | grep -i ib
ib_ipoib               78964  0 
ib_ucm                 12535  0 
ib_uverbs              30899  2 rdma_ucm,ib_ucm
ib_umad                11955  0 
ib_cm                  36419  3 ib_ipoib,ib_ucm,rdma_cm
ib_addr                 6121  1 rdma_cm
ib_sa                  22820  4 ib_ipoib,rdma_ucm,rdma_cm,ib_cm
ib_qib                370240  0 
ib_mad                 40350  4 ib_umad,ib_cm,ib_sa,ib_qib
ib_core                66077  12 rds_rdma,ib_ipoib,rdma_ucm,ib_ucm,ib_uverbs,ib_umad,rdma_cm,ib_cm,iw_cm,ib_sa,ib_qib,ib_mad
speedstep_lib           5367  1 p4_clockmod
ipv6                  321844  52 ib_ipoib,ib_addr
[root@rdma4 network-scripts]# /sbin/service rdma stop
Unloading OpenIB kernel modules:[  OK  ]
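
The "lsmod | grep -i ib" check above also matches unrelated modules such as speedstep_lib. A narrower filter might look like the sketch below. The helper name remaining_rdma_modules is hypothetical, and the ib_/rdma_ prefix list is an assumption that does not cover every module in the stack (for example iw_cm or rds_rdma).

```shell
#!/bin/sh
# Hypothetical helper: reads lsmod-style output on stdin and prints
# only module names that start with ib_ or rdma_ (assumed prefixes).
remaining_rdma_modules() {
    awk '$1 ~ /^(ib_|rdma_)/ { print $1 }'
}
```

Usage: lsmod | remaining_rdma_modules. After a successful "service rdma stop" the output would be expected to be empty.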

Comment 7 errata-xmlrpc 2011-12-06 15:33:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1639.html

