Description of problem:
For some reason, removing infiniband modules in the 5.3 kernel is failing if the modules are loaded during start up time. If they are loaded after startup, they remove fine.

# service openibd start
Loading OpenIB kernel modules: [ OK ]
# rmmod ib_ipath
# rmmod ib_mthca
# ibstat
#
# chkconfig --levels 345 openibd on
# reboot

Broadcast message from root (pts/0) (Thu Nov 6 11:45:35 2008):

The system is going down for reboot NOW!
# Connection to dhcp71-141.rhts.bos.redhat.com closed by remote host.

$ ssh root.bos.redhat.com
root.bos.redhat.com's password:
Last login: Thu Nov 6 11:43:36 2008 from vpn-10-69.bos.redhat.com
Agent pid 3974
Identity added: /root/.ssh/id_rsa.dhcp71-141.rhts.bos.redhat.com (/root/.ssh/id_rsa.dhcp71-141.rhts.bos.redhat.com)
# ibstat
CA 'ipath0'
    CA type: InfiniPath_QLE7140
    Number of ports: 1
    Firmware version:
    Hardware version: 1
    Node GUID: 0x001175000068709f
    System image GUID: 0x001175000068709f
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 10
        Base lid: 6
        LMC: 0
        SM lid: 9
        Capability mask: 0x02010800
        Port GUID: 0x001175000068709f
CA 'mthca0'
    CA type: MT25204
    Number of ports: 1
    Firmware version: 1.2.0
    Hardware version: a0
    Node GUID: 0x0002c9000100d050
    System image GUID: 0x0002c9000100d053
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 10
        Base lid: 3
        LMC: 0
        SM lid: 9
        Capability mask: 0x02510a68
        Port GUID: 0x0002c9000100d051
# rmmod ib_ipath
ERROR: Module ib_ipath is in use
# service openibd stop
Unloading OpenIB kernel modules:
Failed to unload rdma_ucm
Failed to unload ib_uverbs
Failed to unload rdma_cm
Failed to unload ib_cm
Failed to unload iw_cm
Failed to unload ib_ipath
Failed to unload ib_mthca
Failed to unload ib_addr
Failed to unload ib_sa
Failed to unload ib_mad
Failed to unload ib_core
[FAILED]

Version-Release number of selected component (if applicable):
# uname -a
Linux dhcp71-142.rhts.bos.redhat.com 2.6.18-121.el5 #1 SMP Mon Oct 27 21:46:55 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
Very.

Steps to Reproduce:
1. chkconfig --level 345 openibd on
2. reboot
3. try to stop openibd or unload a module.
4. chkconfig --level 345 openibd off
5. reboot
6. service openibd start
7. service openibd stop or try to rmmod an infiniband module
Are you certain that the problem isn't just that something is still using the openib stack when you try to bring it down? Specifically, if iSER, SRP, opensm, or anything similar is set to start at boot, then it will start successfully when openibd is enabled at boot and fail to start when openibd is not. If it did start and you don't stop it before trying to stop the openibd stack, the stack will still be in use. Also, on my machines here, where nothing besides the openibd stack is configured to come up, I can readily shut down the openibd stack even when it is brought up at boot.
Hmm, I was playing around with iser at the time. I'll recheck this while iser is sure to be down.
ditto ... After taking tgtd off the startup list, I no longer have issues unloading infiniband modules.
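For anyone hitting the same "Module ... is in use" error, a rough way to see what is holding the stack is to look at the use counts in lsmod output before running "service openibd stop". The helper below is only a sketch: the function name ib_busy_modules and the ib_/rdma_/iw_ prefix filter are illustrative assumptions, not part of openibd. It reads lsmod-style text on stdin and prints each matching module with a non-zero use count, plus whatever lsmod lists in the "Used by" column.

```shell
#!/bin/sh
# Hypothetical helper (not part of openibd): list InfiniBand-related
# modules that still have a non-zero use count.
# Input: lsmod-style text on stdin (header line, then "name size count [users]").
ib_busy_modules() {
    awk 'NR > 1 && $1 ~ /^(ib_|rdma_|iw_)/ && $3 > 0 {
        # $4 is empty when no dependent module is named, e.g. when the
        # reference comes from a userspace daemon rather than another module.
        holders = ($4 == "") ? "(no module listed)" : $4
        printf "%s count=%s held_by=%s\n", $1, $3, holders
    }'
}

# Typical use on a live system:
#   lsmod | ib_busy_modules
```

A module that shows a non-zero count with no module listed is usually held by something outside the module dependency chain, such as a daemon started at boot. In this report that turned out to be tgtd, so stopping it (or taking it off the startup list) before stopping openibd is the thing to check first.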