Bug 965829

Summary: unable to restart rdma service
Product: Red Hat Enterprise Linux 7 Reporter: Edward Mascarenhas <edward.mascarenhas>
Component: rdma Assignee: Doug Ledford <dledford>
Status: CLOSED NOTABUG QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.0 CC: dbayly, edward.mascarenhas, keve.a.gabbert, knweiss
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-05-21 21:12:40 UTC Type: Bug

Description Edward Mascarenhas 2013-05-21 20:00:19 UTC
Description of problem:
On RHEL 7 Alpha 3, I am unable to restart the rdma service. Every time the
"service rdma restart" command is executed, an error message is printed on
screen.

Redirecting to /bin/systemctl restart  rdma.service
Failed to issue method call: Operation refused, unit rdma.service may be
requested by dependency on



Version-Release number of selected component (if applicable):


How reproducible:
Very

Steps to Reproduce:
1. service rdma restart

Actual results:

Redirecting to /bin/systemctl restart  rdma.service
Failed to issue method call: Operation refused, unit rdma.service may be
requested by dependency on

Expected results:

No errors

Additional info:
dmesg lsof | grep ib_qib shows the following output

[root@ ~]# dmesg lsof | grep ib_qib
[   13.894950] ib_qib 0000:07:00.0: irq 67 for MSI/MSI-X
[   13.894958] ib_qib 0000:07:00.0: irq 68 for MSI/MSI-X
[   13.894966] ib_qib 0000:07:00.0: irq 69 for MSI/MSI-X
[   13.894973] ib_qib 0000:07:00.0: irq 70 for MSI/MSI-X
[   13.894980] ib_qib 0000:07:00.0: irq 71 for MSI/MSI-X
[   13.894987] ib_qib 0000:07:00.0: irq 72 for MSI/MSI-X
[   13.894992] ib_qib 0000:07:00.0: irq 73 for MSI/MSI-X
[   18.250874] ib_qib 0000:07:00.0: IB0:1 got a lid: 0x7


lsmod | grep ib_qib shows that the module has a use count of 1, and hence rmmod
doesn't remove it either.

ib_qib                367196  1
ib_mad                 47134  4 ib_cm,ib_sa,ib_qib,ib_umad
ib_core                78541  11 rdma_cm,ib_cm,ib_sa,iw_cm,ib_mad,ib_qib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib

libkmod: kmod_module_remove_module: could not remove 'ib_qib': Resource temporarily unavailable
Error: could not remove module ib_qib: Resource temporarily unavailable
[root@ ~]# modprobe -r ib_qib
FATAL: Module ib_qib is in use.
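
To make the "in use" state above easier to read, here is a minimal sketch that pulls the use count and the holder list for one module out of lsmod-style text. The helper name module_usage is hypothetical and not part of kmod; it only assumes the usual lsmod column layout (name, size, use count, optional comma-separated holders).

```shell
# module_usage is a hypothetical helper, not a kmod command.
# It reads lsmod-style text on stdin and reports the use count and the
# holder modules for the module named in $1.
module_usage() {
    awk -v mod="$1" '
        $1 == mod {
            # Field 4 is absent when no other module holds a reference.
            holders = ($4 == "") ? "(none listed)" : $4
            printf "%s refcount=%s holders=%s\n", $1, $3, holders
        }'
}

# Typical use on a live system:
#   lsmod | module_usage ib_qib
```

A non-zero use count with no holders listed, as with ib_qib above, typically means the reference is held by something other than another module (for example an open device node), which is why modprobe -r reports the module as in use.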

Even after stopping ibacm, we were not able to restart the rdma service.

[root@ ~]# lsmod | grep qib
ib_qib                367196  0
ib_mad                 47134  4 ib_cm,ib_sa,ib_qib,ib_umad
ib_core                78541  11 rdma_cm,ib_cm,ib_sa,iw_cm,ib_mad,ib_qib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
[root@ ~]# systemctl restart rdma.service
Failed to issue method call: Operation refused, unit rdma.service may be
requested by dependency only.

Comment 2 Doug Ledford 2013-05-21 21:12:40 UTC
As of RHEL 7, the rdma stack is no longer restartable. Bringing up the rdma stack is now a one-way operation: when udev detects rdma-capable hardware, the stack is started automatically. The only way to remove the stack is to manually shut down every application that uses rdma, then take down all the IPoIB interfaces, the RDS rdma transport, the NFS rdma transport, iSER, the iSER target, SRP, the SRP target, and anything else that uses the rdma devices. After that, you can manually remove the modules from the kernel.
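
As a rough illustration of the manual removal described above, the sketch below reads lsmod-style text and lists the modules that depend on ib_core and currently have a use count of 0, i.e. the ones that could be removed next. The helper name removable_rdma_modules is hypothetical, and the approach (remove what is idle, then re-check) is an assumption for illustration, not a procedure taken from the rdma package. It only prints names and removes nothing.

```shell
# removable_rdma_modules is a hypothetical helper, not part of any package.
# It reads lsmod-style text on stdin and prints every module that holds a
# reference on ib_core and whose own use count is currently 0.
removable_rdma_modules() {
    awk '
        { count[$1] = $3 }                             # use count per module
        $1 == "ib_core" { n = split($4, held, ",") }   # modules holding ib_core
        END {
            for (i = 1; i <= n; i++)
                if ((held[i] in count) && count[held[i]] + 0 == 0)
                    print held[i]
        }'
}

# Typical use on a live system (prints names only):
#   lsmod | removable_rdma_modules
```

A module that does not appear in this list still has users. Those have to be stopped first (applications, IPoIB interfaces, storage and NFS transports) before the list is checked again.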

This action was taken because the rdma stack was deemed to be at a production level, which means it should not need to be restarted under normal circumstances. This brings the rdma stack more in line with the other kernel stacks that need more than a simple modprobe to run (such as the SCSI stack or the FCoE stack), all of which are non-restartable.