Bug 1290985 - ipmitool sel list last X fails in cases involving id wrap
ipmitool sel list last X fails in cases involving id wrap
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ipmitool (Show other bugs)
7.2
Unspecified Unspecified
unspecified Severity medium
: rc
: ---
Assigned To: Josef Ridky
Rachel Sibley
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-12-12 07:32 EST by Martin Poole
Modified: 2016-11-03 19:30 EDT (History)
4 users (show)

See Also:
Fixed In Version: ipmitool-1.8.15-7.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 19:30:48 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2016:2159 normal SHIPPED_LIVE ipmitool bug fix and enhancement update 2016-11-03 09:14:05 EDT

  None (edit)
Description Martin Poole 2015-12-12 07:32:07 EST
Description of problem:

ipmitool-1.8.13-8.el7_1


ipmitool sel list last X

can fail or print the wrong entries in various ways if the event id has wrapped.

Obvious fail example.

# ipmitool sel list last 5
fffe | 11/25/2015 | 04:13:22 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   1 | 11/25/2015 | 04:13:27 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   2 | 11/25/2015 | 04:13:29 | Power Supply #0x61 | Lower Non-critical going low  | Asserted





There are number of behaviours observed across different hardware as the ipmi event log fills.

The following variations have been seen so far

1. event logging stops at id = (number of entries-1) with "log full".
   hw - dell r710

 1ff | 12/02/2015 | 06:02:32 | Event Logging Disabled #0x72 | Log full | Asserted


2. on reaching (number of events) event logging does LRU expiry until entry 0xfffe is reached at which point logging stops.
   hw - dell r720

fffe | 12/02/2015 | 06:02:32 | Event Logging Disabled #0x72 | Log full | Asserted


3. on reaching (number of events) event logging does LRU expiry until entry id 0xfffe is reached at which point the id wraps and start at 1 again.
   hw - NEC Express5800/R120f-1M [N8100-2203Y]

fffc | 11/25/2015 | 04:13:17 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
fffd | 11/25/2015 | 04:13:19 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
fffe | 11/25/2015 | 04:13:22 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   1 | 11/25/2015 | 04:13:27 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   2 | 11/25/2015 | 04:13:29 | Power Supply #0x61 | Lower Non-critical going low  | Asserted



4. on reaching (number of events-1) the id wraps and start at 1 again.
   hw - ibm-x3650m4

 1fd | 11/30/2015 | 09:55:38 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1fe | 11/30/2015 | 09:55:38 | Power Unit #0x01 | Power off/down | Asserted
 1ff | 12/01/2015 | 12:12:16 | System Firmware Progress #0xf5 | Unspecified | Asserted
   1 | 12/01/2015 | 12:12:16 | System Firmware Progress #0xf5 | System boot initiated | Asserted
   2 | 12/01/2015 | 12:12:16 | System Firmware Progress #0xf5 | Unspecified | Asserted



Of these variations it is numbers 3 & 4 that have the problem with wrapping and selection of last 'n' events.

For number 3


# ipmitool sel list last 5
fffe | 11/25/2015 | 04:13:22 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   1 | 11/25/2015 | 04:13:27 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   2 | 11/25/2015 | 04:13:29 | Power Supply #0x61 | Lower Non-critical going low  | Asserted

# ipmitool sel list last 4

# ipmitool sel list last 3
f1ca | 11/25/2015 | 04:04:48 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   : (snip)
fffc | 11/25/2015 | 04:13:17 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
fffd | 11/25/2015 | 04:13:19 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
fffe | 11/25/2015 | 04:13:22 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   1 | 11/25/2015 | 04:13:27 | Power Supply #0x61 | Lower Non-critical going low  | Asserted
   2 | 11/25/2015 | 04:13:29 | Power Supply #0x61 | Lower Non-critical going low  | Asserted




For number 4 the following is seen.

root@ibm-x3650m4-8 ~]# ipmitool sel list last 15
Get SEL Entry fffe command failed: Requested sensor, data, or record not found
Get SEL Entry fffe command failed: Requested sensor, data, or record not found

[root@ibm-x3650m4-8 ~]# ipmitool sel list last 14

[root@ibm-x3650m4-8 ~]# ipmitool sel list last 13
 1f5 | 11/30/2015 | 09:55:37 | Entity Presence #0x89 | Absent | Deasserted
 1f6 | 11/30/2015 | 09:55:37 | Power Unit #0x01 | Power off/down | Asserted
 1f7 | 11/30/2015 | 09:55:37 | Power Unit #0x01 | Power off/down | Deasserted
 1f8 | 11/30/2015 | 09:55:37 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1f9 | 11/30/2015 | 09:55:37 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1fa | 11/30/2015 | 09:55:37 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fb | 11/30/2015 | 09:55:37 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1fc | 11/30/2015 | 09:55:38 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fd | 11/30/2015 | 09:55:38 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1fe | 11/30/2015 | 09:55:38 | Power Unit #0x01 | Power off/down | Asserted
 1ff | 12/01/2015 | 12:12:16 | System Firmware Progress #0xf5 | Unspecified | Asserted
   1 | 12/01/2015 | 12:12:16 | System Firmware Progress #0xf5 | System boot initiated | Asserted
   2 | 12/01/2015 | 12:12:16 | System Firmware Progress #0xf5 | Unspecified | Asserted
   3 | 12/02/2015 | 06:27:01 | Power Unit #0x01 | Power off/down | Asserted
   4 | 12/02/2015 | 09:32:14 | Power Unit #0x01 | Power off/down | Deasserted
   5 | 12/02/2015 | 09:32:14 | System Firmware Progress #0xf5 | Unspecified | Asserted
   6 | 12/02/2015 | 09:36:38 | System Firmware Progress #0xf5 | System boot initiated | Asserted
   7 | 12/02/2015 | 09:43:04 | Power Unit #0x01 | Power off/down | Asserted
   8 | 12/02/2015 | 09:43:15 | Power Unit #0x01 | Power off/down | Deasserted
   9 | 12/02/2015 | 09:43:25 | System Firmware Progress #0xf5 | Unspecified | Asserted
   a | 12/02/2015 | 09:47:19 | System Firmware Progress #0xf5 | System boot initiated | Asserted
   b | 12/02/2015 | 09:55:57 | System Firmware Progress #0xf5 | Unspecified | Asserted
   c | 12/02/2015 | 09:59:30 | System Firmware Progress #0xf5 | System boot initiated | Asserted
Comment 4 Boris Ranto 2016-01-08 10:31:19 EST
OK, to sum it up a bit. I've encountered several issues in code, here:

1) the way we compute delta for a negative count (last X), we get the ID of the first entry, the id of the next entry and we simply divide it, this breaks if we e.g. remove the second entry from the sel log as it will always request ~double the entries

2) the way we compute first next_id for a negative count
  a)  if your last entry id is X and you request X+1 entries (assuming delta is 1), nothing gets printed out because we get next_id 0xffff and we won't even enter the while loop
  b) if the highest entry the sel log achieved is << 0xfffe and we request more entries than is the ID of the latest one, we try to access events with IDs like 0xffXY which obviously do not exist
Comment 5 Boris Ranto 2016-01-11 08:18:13 EST
I've created an upstream bug report with two possible patches (one just lowers the impact of log rotation here and does not fix all the cases, it is more of a best effort solution, the other should provide a proper fix):

https://sourceforge.net/p/ipmitool/bugs/408/

We'll see what upstream has to say about that.
Comment 13 Boris Ranto 2016-05-02 02:32:56 EDT
The patch was accepted upstream, we can probably still make it into the 7.3 time-frame.

@Rachel: Are you ok with getting this into 7.3? Could you please qa_ack this if you are?
Comment 15 Rachel Sibley 2016-05-02 08:18:19 EDT
Hi Boris, yes I have provided qa_ack+
Comment 18 Rachel Sibley 2016-06-08 14:50:15 EDT
ALL TESTS PASSED

Before:
=================================
 ~]# rpm -q ipmitool
ipmitool-1.8.13-8.el7_1.x86_64

 ~]# ipmitool sel list last 5
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1f9 | 06/08/2016 | 16:28:02 | Power Unit #0x01 | Power off/down | Deasserted
 1fa | 06/08/2016 | 16:28:12 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

- Deleted 3rd before last:
 ~]# ipmitool sel delete 505
Deleted entry 505

 ~]# ipmitool sel list last 5
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fa | 06/08/2016 | 16:28:12 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

 ~]# ipmitool sel list | tail -n 5
 1f6 | 06/06/2016 | 15:21:11 | Entity Presence #0x8b | Absent | Deasserted
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fa | 06/08/2016 | 16:28:12 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

- Delete next to last:
 ~]# ipmitool sel delete 506
Deleted entry 505

 ~]# ipmitool sel list last 15
 1df | 05/24/2016 | 19:25:14 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1e0 | 05/24/2016 | 19:25:57 | OS Boot |  | Asserted
 1e1 | 05/24/2016 | 19:36:04 | OS Boot |  | Asserted
 1e2 | 05/24/2016 | 19:36:14 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1e3 | 05/24/2016 | 19:40:09 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1e4 | 05/26/2016 | 13:17:33 | Power Unit #0x01 | Power off/down | Asserted
 1e5 | 05/26/2016 | 13:17:49 | Power Unit #0x01 | Power off/down | Deasserted
 1e6 | 05/26/2016 | 13:17:59 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1e7 | 05/26/2016 | 13:22:04 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1e8 | 05/26/2016 | 13:31:44 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1e9 | 05/26/2016 | 13:35:49 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1ea | 06/06/2016 | 14:59:31 | Power Unit #0x01 | Power off/down | Asserted
 1eb | 06/06/2016 | 14:59:42 | Power Unit #0x01 | Power off/down | Deasserted
 1ec | 06/06/2016 | 14:59:53 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1ed | 06/06/2016 | 15:04:13 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1ee | 06/06/2016 | 15:04:57 | OS Boot |  | Asserted
 1ef | 06/06/2016 | 15:15:24 | OS Boot |  | Asserted
 1f0 | 06/06/2016 | 15:15:38 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1f1 | 06/06/2016 | 15:21:02 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1f2 | 06/06/2016 | 15:21:02 | Event Logging Disabled #0xf6 | Log almost full | Asserted
 1f3 | 06/06/2016 | 15:21:04 | Power Supply #0x7f | Redundancy Lost | Asserted
 1f5 | 06/06/2016 | 15:21:10 | Entity Presence #0x89 | Absent | Deasserted
 1f6 | 06/06/2016 | 15:21:11 | Entity Presence #0x8b | Absent | Deasserted
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

 ~]# ipmitool sel list | tail -n 15
 1ea | 06/06/2016 | 14:59:31 | Power Unit #0x01 | Power off/down | Asserted
 1eb | 06/06/2016 | 14:59:42 | Power Unit #0x01 | Power off/down | Deasserted
 1ec | 06/06/2016 | 14:59:53 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1ed | 06/06/2016 | 15:04:13 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1ee | 06/06/2016 | 15:04:57 | OS Boot |  | Asserted
 1ef | 06/06/2016 | 15:15:24 | OS Boot |  | Asserted
 1f0 | 06/06/2016 | 15:15:38 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1f1 | 06/06/2016 | 15:21:02 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1f2 | 06/06/2016 | 15:21:02 | Event Logging Disabled #0xf6 | Log almost full | Asserted
 1f3 | 06/06/2016 | 15:21:04 | Power Supply #0x7f | Redundancy Lost | Asserted
 1f5 | 06/06/2016 | 15:21:10 | Entity Presence #0x89 | Absent | Deasserted
 1f6 | 06/06/2016 | 15:21:11 | Entity Presence #0x8b | Absent | Deasserted
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

  ~]# ipmitool sel list last 3
Get SEL Entry 1f9 command failed: Requested sensor, data, or record not found
Get SEL Entry 1f9 command failed: Requested sensor, data, or record not found

  ~]# ipmitool sel list | tail -n 3
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fa | 06/08/2016 | 16:28:12 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

After:
=================================
 ~]# yum update ipmitool-1.8.15-7.el7.x86_64.rpm
...

 ~]# rpm -q ipmitool
ipmitool-1.8.15-7.el7.x86_64

 ~]# ipmitool sel list last 5
 1f6 | 06/06/2016 | 15:21:11 | Entity Presence #0x8b | Absent | Deasserted
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fa | 06/08/2016 | 16:28:12 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

 ~]# ipmitool sel list last 15
 1ea | 06/06/2016 | 14:59:31 | Power Unit #0x01 | Power off/down | Asserted
 1eb | 06/06/2016 | 14:59:42 | Power Unit #0x01 | Power off/down | Deasserted
 1ec | 06/06/2016 | 14:59:53 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1ed | 06/06/2016 | 15:04:13 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1ee | 06/06/2016 | 15:04:57 | OS Boot | Installation started | Asserted
 1ef | 06/06/2016 | 15:15:24 | OS Boot | Installation completed | Asserted
 1f0 | 06/06/2016 | 15:15:38 | System Firmware Progress #0xf5 | Unspecified | Asserted
 1f1 | 06/06/2016 | 15:21:02 | System Firmware Progress #0xf5 | System boot initiated | Asserted
 1f2 | 06/06/2016 | 15:21:02 | Event Logging Disabled #0xf6 | Log almost full | Asserted
 1f3 | 06/06/2016 | 15:21:04 | Power Supply #0x7f | Redundancy Lost | Asserted
 1f5 | 06/06/2016 | 15:21:10 | Entity Presence #0x89 | Absent | Deasserted
 1f6 | 06/06/2016 | 15:21:11 | Entity Presence #0x8b | Absent | Deasserted
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted

 ~]# ipmitool sel list last 3
 1f7 | 06/08/2016 | 16:22:11 | Temperature #0x30 | Upper Critical going high | Asserted
 1f8 | 06/08/2016 | 16:22:16 | Power Unit #0x01 | Power off/down | Asserted
 1fb | 06/08/2016 | 16:32:26 | System Firmware Progress #0xf5 | System boot initiated | Asserted
Comment 19 Josef Ridky 2016-10-20 01:35:49 EDT
Return back to ON_QA
Comment 22 Rachel Sibley 2016-10-20 08:17:18 EDT
Moving back to Verified State as it was incorrectly moved back to Assigned when switching the package owner.
Comment 25 errata-xmlrpc 2016-11-03 19:30:48 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2159.html

Note You need to log in before you can comment on or make changes to this bug.