Bug 857106

Summary: ipmitool sol sessions stop responding
Product: [Fedora] Fedora
Reporter: Andrew J. Schorr <aschorr>
Component: ipmitool
Assignee: Ales Ledvinka <aledvink>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 14
CC: aledvink, tsmetana
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-01-26 09:30:41 UTC
Type: Bug
Attachments:
Description Flags
rc1 none

Description Andrew J. Schorr 2012-09-13 15:12:57 UTC
Description of problem:
I use ipmitool with conserver to manage remote Fedora systems.  This works fine initially, but eventually the ipmitool SOL session becomes unresponsive.  When a system crashes, it is helpful to have a log of any messages that were printed on the console, but I am not getting these messages because the SOL session has frozen.  A restart fixes the problem.  Using the usesolkeepalive option gives even worse results than the default keepalive method.


Version-Release number of selected component (if applicable):
1.8.11

How reproducible:
Run "ipmitool -I lanplus -H <host> sol activate" and wait a while.

Steps to Reproduce:
1. Run "ipmitool -I lanplus -H <host> sol activate"
2. Wait a while for the session to become unresponsive.
3. Try adding the "usesolkeepalive" argument and see that it does not help.
  
Actual results:
The session freezes after a while.

Expected results:
The keepalive behavior should detect that the session has become unresponsive, and the program should exit or reinitialize the session.
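
One way to picture the expected behavior is a small policy that counts consecutive keepalive failures and gives up once a limit is reached, at which point the program could exit or reinitialize the session. This is only a sketch: keepalive_policy, KA_MAX_RETRIES, and the action names are hypothetical and are not taken from the ipmitool source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit for illustration; not an ipmitool constant. */
#define KA_MAX_RETRIES 3

enum ka_action { KA_CONTINUE, KA_RETRY, KA_GIVE_UP };

/*
 * Update the consecutive-failure counter from one keepalive result
 * (0 = reply received, anything else = no reply) and decide what the
 * caller should do next.
 */
static enum ka_action
keepalive_policy(int result, int *failures)
{
	assert(failures != NULL);

	if (result == 0) {
		/* A reply arrived, so the session is alive: reset the counter. */
		*failures = 0;
		return KA_CONTINUE;
	}

	(*failures)++;
	/* Too many misses in a row: the caller should exit or reinitialize. */
	return (*failures >= KA_MAX_RETRIES) ? KA_GIVE_UP : KA_RETRY;
}
```

A caller would keep one counter per session, feed it each keepalive result, and tear the session down when KA_GIVE_UP comes back.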

Additional info:

Comment 1 Ales Ledvinka 2012-11-21 17:14:04 UTC
I reproduced something that could be fixed, but I am not sure that we are both observing the same problem, protocol, and target host capabilities.

What I have reproduced can be described as: "The SOL session complains of a send error forever until there is input activity."

Scenario:
1. Activate a SOL session.
2. Add an output-filtering drop rule for the ipmitool socket.
3. Deactivate the session using another ipmitool instance.
4. Remove the output-filtering drop rule.

--- investigation comments below (current, not F14)
The -N and -R command line options appear to duplicate SOL_KEEPALIVE_TIMEOUT and SOL_KEEPALIVE_RETRIES, and also MAX_SOL_RETRY (in ipmi_sol_red_pill).
Also, _keepalive_retries is never actually used for anything.
Finally, ipmi_sol_red_pill() never receives anything other than 0 from either ipmi_sol_keepalive_using_getdeviceid() or ipmi_sol_keepalive_using_sol().

send_sol returns NULL or an rsp pointer.
The keepalive returns -1 or 0.

For an inactive session, a keepalive that uses the Get Device ID command probably gives no indication of whether the session is active or inactive.
But in case of network or target host issues, when there is no reply at all, ipmitool should be able to terminate.
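
To illustrate how a missing reply could be propagated up to the caller, here is a minimal sketch that uses a stand-in send function rather than the real send_sol. The names struct fake_rsp, fake_send, keepalive_once, and first_failed_round are hypothetical, and nothing here is copied from ipmitool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical response type standing in for a real reply structure. */
struct fake_rsp { int ccode; };

/* Stand-in for a transport send: returns NULL when no reply arrives. */
static struct fake_rsp *
fake_send(int link_up)
{
	static struct fake_rsp rsp = { 0 };
	return link_up ? &rsp : NULL;
}

/* Map the pointer result onto a -1 / 0 convention. */
static int
keepalive_once(int link_up)
{
	return (fake_send(link_up) == NULL) ? -1 : 0;
}

/*
 * Run up to `rounds` keepalives, where link_up[i] says whether round i
 * gets a reply. Returns the index of the first round whose failure
 * reaches the caller, or -1 if every round succeeds.
 */
static int
first_failed_round(const int *link_up, int rounds)
{
	assert(link_up != NULL);

	for (int i = 0; i < rounds; i++) {
		if (keepalive_once(link_up[i]) != 0)
			return i; /* the caller can now terminate or retry */
	}
	return -1;
}
```

The point of the sketch is only that the -1 has to reach the loop that owns the session; if it is swallowed on the way, the loop has no way to know that it should stop.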

A counter-scenario for changing this behavior might be a setup with a console server, where the admin cannot reach the console server and deactivates its session directly at the managed target as a manual override. The buggy behaviour prevents a race in such conditions.

Comment 2 Ales Ledvinka 2012-11-29 01:15:45 UTC
Created attachment 653908 [details]
rc1

Comment 3 Jan Kurik 2015-12-22 11:35:17 UTC
This bug is currently assigned to an unsupported release. If you think this bug is still valid and should remain open, please re-assign it to a supported release (F22, F23) or to rawhide.

Bugs which will be assigned to an unsupported release are going to be closed as EOL (End Of Life) on January 26th, 2016.