Red Hat Bugzilla – Bug 857106
ipmitool sol sessions stop responding
Last modified: 2016-09-20 00:34:00 EDT
Description of problem: I use ipmitool with conserver to manage remote Fedora systems. This works fine initially, but eventually the ipmitool session becomes unresponsive. When a system crashes, it is helpful to have a log of any messages that were printed on the console, but I am not getting these messages because the SOL session has frozen. Restarting the session fixes the problem. Using the usesolkeepalive argument gives even worse results than the default keepalive method.
Version-Release number of selected component (if applicable):
Run "ipmitool -I lanplus -H <host> sol activate" and wait a while.
Steps to Reproduce:
1. Run "ipmitool -I lanplus -H <host> sol activate"
2. Wait a while for the session to become unresponsive.
3. Try adding the "usesolkeepalive" argument and see that it does not help.
Actual results: The session freezes after a while.
Expected results: The keepalive behavior should detect that the session has become unresponsive, and the program should exit or reinitialize the session.
I reproduced something that might be fixed, but I am not sure that we are both observing the same problem, protocol, and target host capabilities.
What I have reproduced can be described as: "The sol complains of send error forever until input activity."
Scenario: activate a SOL session, add an output-filtering drop rule for the ipmitool socket, deactivate the session using another ipmitool instance, then remove the output drop rule.
--- investigation comments below (current, not f14)
The -N and -R command line options appear to duplicate SOL_KEEPALIVE_TIMEOUT and SOL_KEEPALIVE_RETRIES, and also MAX_SOL_RETRY (in ipmi_sol_red_pill).
Also, _keepalive_retries is never actually used for anything.
ipmi_sol_red_pill() never receives anything other than 0 from either ipmi_sol_keepalive_using_getdeviceid() or ipmi_sol_keepalive_using_sol():
send_sol returns NULL or an rsp pointer,
the keepalive returns -1 or 0.
For an inactive session, a keepalive using the Get Device ID command probably gives no indication of whether the session is active or inactive.
But in case of network or target host issues, when there is no reply, ipmitool should be able to terminate.
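To illustrate the termination behavior described above, here is a minimal sketch of counting consecutive keepalive failures and giving up after a limit. It is not taken from the ipmitool sources: keepalive_tick, keepalive_state, scripted_probe and EXAMPLE_MAX_KEEPALIVE_FAILURES are hypothetical names, and it assumes that a keepalive probe returns 0 when a reply arrived and -1 when it did not.

```c
#include <stddef.h>

/* Hypothetical limit for this sketch; not a constant from ipmitool. */
#define EXAMPLE_MAX_KEEPALIVE_FAILURES 3

/* Hypothetical probe type. Assumption: 0 means a reply was received,
 * -1 means no reply. */
typedef int (*keepalive_probe_fn)(void *ctx);

struct keepalive_state {
    int consecutive_failures; /* failed probes since the last success */
};

enum keepalive_verdict {
    KEEPALIVE_OK = 0,      /* probe answered, session looks alive */
    KEEPALIVE_RETRY = 1,   /* probe failed, but the limit is not reached yet */
    KEEPALIVE_GIVE_UP = 2  /* too many failures in a row: exit or reinitialize */
};

/* Run one keepalive probe and update the failure counter. */
static enum keepalive_verdict
keepalive_tick(struct keepalive_state *st, keepalive_probe_fn probe, void *ctx)
{
    if (probe(ctx) == 0) {
        st->consecutive_failures = 0; /* any success resets the counter */
        return KEEPALIVE_OK;
    }

    st->consecutive_failures++;
    if (st->consecutive_failures >= EXAMPLE_MAX_KEEPALIVE_FAILURES)
        return KEEPALIVE_GIVE_UP;

    return KEEPALIVE_RETRY;
}

/* Scripted stand-in for a real probe: ctx points to the number of
 * successful replies left before the "host" stops answering. */
static int
scripted_probe(void *ctx)
{
    int *remaining_ok = ctx;

    if (*remaining_ok > 0) {
        (*remaining_ok)--;
        return 0;
    }
    return -1;
}
```

In the main loop, a caller would invoke keepalive_tick() on each keepalive interval and leave the loop (or re-activate the session) once it returns KEEPALIVE_GIVE_UP. Whether the limit should come from -R, SOL_KEEPALIVE_RETRIES or MAX_SOL_RETRY is left open here.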
A counter-scenario for changing this behavior might be a setup with a console server, where the admin cannot reach the console server and deactivates its session directly at the managed target for manual override. The buggy behaviour prevents a race in such conditions.
Created attachment 653908
This bug is currently assigned to an unsupported release. If you think this bug is still valid and should remain open, please re-assign it to a supported release (F22, F23) or to rawhide.
Bugs which will be assigned to an unsupported release are going to be closed as EOL (End Of Life) on January 26th, 2016.