Red Hat Bugzilla – Bug 1292919
Director introspection caused disruption of IBM server IMM (IPMI) network interfaces
Last modified: 2016-10-14 15:49:45 EDT
Description of problem:
Introspection caused a reset of the IBM server IMM (IPMI) modules, which made them lose network connectivity when they were configured to use the server NIC. Configuring the servers' IMMs (IPMI) to use their embedded NICs eliminated the issue.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure your IMM (IPMI) modules to use a server NIC and not the embedded NIC
2. Run introspection
The reset of the modules left them unable to get back their DHCP reservations, and they became unavailable.
No disruption or change of a configured IMM (IPMI) as a side effect of introspection.
Servers: IBM x3550 M5
FW: DSA 10.0, IMM2 1.02, UEFI 1.03, Bootcode 1.38
In a pool of 8 IBM x3550 servers, after introspection we found that 3 servers’ IMM (IPMI) management interfaces were unreachable on the network.
On further investigation we found that the IMM modules of the impacted servers were configured to use the server’s native first NIC instead of the IMM module’s own dedicated NIC, and that those IMMs had DHCP-provided IP addresses keyed to their MAC addresses in the network environment. It seems something in the introspection caused the IMMs to reset. After the reset, the modules did not manage to get an IP address back.
The 7 unaffected servers were configured to use the IMM’s embedded NIC.
The same equipment had been used to do a 7.0 OSP install 8 weeks ago and during that introspection the IMM outage did not occur.
We’ve now switched the 3 servers over to use the embedded IMM NIC, and have been able to continue our OSP 8.0 beta2 install.
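Because the IMM addresses here come from DHCP reservations keyed to MAC address, and the ticket quoted below notes that switching an IMM to its dedicated interface changed its MAC, one quick check is to compare the MAC each IMM currently presents against the reservation list. The following is only a minimal sketch of that comparison: the function names, server names, MACs, and IPs are all hypothetical and were not taken from the affected lab.

```python
def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and strip ':', '-' and '.' separators."""
    hex_digits = "".join(ch for ch in mac.lower() if ch not in ":-.")
    if len(hex_digits) != 12 or any(ch not in "0123456789abcdef" for ch in hex_digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return hex_digits


def imms_without_reservation(imm_macs, reservations):
    """Return names of IMMs whose presented MAC has no DHCP reservation.

    imm_macs:     {server name: MAC the IMM presents on its current NIC}
    reservations: {reserved MAC: IP address}
    """
    reserved = {normalize_mac(mac) for mac in reservations}
    return sorted(name for name, mac in imm_macs.items()
                  if normalize_mac(mac) not in reserved)


# Hypothetical data for illustration only.
reservations = {"40:F2:E9:00:00:01": "192.0.2.11",
                "40:F2:E9:00:00:02": "192.0.2.12"}
imm_macs = {"node1": "40-f2-e9-00-00-01",   # matches a reservation
            "node2": "40:f2:e9:00:00:99"}   # MAC differs, e.g. after a NIC-mode switch
print(imms_without_reservation(imm_macs, reservations))  # ['node2']
```

Any IMM reported by this check would need its reservation updated before it can pull its expected address again. The check does not explain why the IMMs reset in the first place.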
Reference to Red Hat internal CEE Lab tickets:
2015-12-09 11:04:31 EST - Brian Rhatigan - Customer update
The IMMs are available now. I found the 3 servers were powered off so I powered them up and did find an IMM configuration issue probably going back to when they were first deployed as ceph servers. The IMM was configured to "share" the first ethernet interface (even though there is a dedicated interface for the IMM *and* it was cabled). So I converted them to use the dedicated interface (which then caused their MAC address to change). I updated the DHCP server with their new MAC and they instantly pulled the appropriate IP address. I would imagine the other servers have the same configuration issue as the same person set them up so I can't blame the issue on it as they seem to be working fine. But all 3 that I looked at were unable to pull an IP address prior to me making the changes...
This might be something to consider fixing down the road if we see the issue crop up again on the other servers.
Anyway, you should be good to go. Closing this ticket.
If you feel that your request should not be closed yet, please reply to this email and let us know. We want to make sure we are providing a resolution as quickly as possible.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.