Bug 635646 - [abrt] sysstat-9.0.6-3.fc13: Process /usr/bin/sar was killed by signal 11 (SIGSEGV)
Summary: [abrt] sysstat-9.0.6-3.fc13: Process /usr/bin/sar was killed by signal 11 (SI...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: sysstat
Version: 13
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Ivana Varekova
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:99c2dc5c5c1c1a1cdb1c84f950b...
Depends On:
Blocks:
Reported: 2010-09-20 12:25 UTC by Pytela, Zdenek
Modified: 2010-10-12 08:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-10-12 08:18:27 UTC
Type: ---
Embargoed:


Attachments
File: backtrace (6.83 KB, text/plain)
2010-09-20 12:25 UTC, Pytela, Zdenek
sar data files from the affected days (118.25 KB, application/x-gzip)
2010-10-01 17:02 UTC, Pytela, Zdenek
sar data from patched binaries (69.46 KB, application/x-gzip)
2010-10-05 09:20 UTC, Pytela, Zdenek

Description Pytela, Zdenek 2010-09-20 12:25:52 UTC
abrt version: 1.1.13
architecture: x86_64
Attached file: backtrace
cmdline: sar
component: sysstat
crash_function: sar_get_record_timestamp_struct
executable: /usr/bin/sar
kernel: 2.6.33.8-149.fc13.x86_64
package: sysstat-9.0.6-3.fc13
rating: 4
reason: Process /usr/bin/sar was killed by signal 11 (SIGSEGV)
release: Fedora release 13 (Goddard)
time: 1284985405
uid: 0

comment
-----
sar has been dumping core since a nightly F13 update whose packages should not be related to sysstat
(foomatic*, im-chooser, nspr, nss*, preupgrade, system-config-language).
Only lines up to time 2.20.02 are displayed.
sadf dumps core as well.

sar 10 10
works fine, without segfault

How to reproduce
-----
1. sar (without parameters)

Comment 1 Pytela, Zdenek 2010-09-20 12:25:54 UTC
Created attachment 448439
File: backtrace

Comment 2 Ivana Varekova 2010-10-01 13:49:20 UTC
Hello,
does sar still segfault, even on today's data?
Could you please attach the affected system activity log here (from the /var/log/sa directory, the file with the affected day's suffix)?

Comment 3 Pytela, Zdenek 2010-10-01 17:02:33 UTC
Created attachment 451062
sar data files from the affected days

sar hasn't crashed since then and almost always works as expected. However, every two or three days, after 2:10, it prints weird values such as:

01:00:00        all      0,00      0,00    8298735809331200,00    38266440620441600,00    93630287052800,00    5153960755200,00
01:00:00        all      0,00      0,00    2106251961958400,00    39252994608332800,00    41661182771200,00    1288490188800,00
End of system activity file unexpected

It exits with code 2 but does not segfault. These values appear only for -P ALL or 0; the other processors have zero system activity printed:

01:00:00          1      0,00      0,00    100,00      0,00      0,00      0,00
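
Values of this magnitude are what one would see if a counter difference were computed in unsigned arithmetic while the later sample was smaller than the earlier one. That is only an assumption about how such output can arise in general, not a confirmed diagnosis of this bug. A minimal sketch that simulates 64-bit unsigned subtraction; the function names and sample values are hypothetical and are not taken from sysstat:

```python
U64_MASK = (1 << 64) - 1  # simulate C's unsigned 64-bit arithmetic


def u64_delta(prev, curr):
    """Counter difference as unsigned 64-bit code would compute it."""
    return (curr - prev) & U64_MASK


def percent(prev, curr, interval):
    """Share of the interval spent in one CPU state, with no sanity check."""
    return u64_delta(prev, curr) / interval * 100.0


def percent_guarded(prev, curr, interval):
    """Same calculation, but a counter that went backwards is reported as 0."""
    if interval <= 0 or curr < prev:
        return 0.0
    return min(percent(prev, curr, interval), 100.0)


if __name__ == "__main__":
    print(percent(100, 150, 100))          # sane sample pair
    print(percent(150, 100, 100))          # counter went backwards: huge value
    print(percent_guarded(150, 100, 100))  # guarded variant
```

Clamping to the 0..100 range hides the symptom rather than explaining why the recorded counters are bogus, so a guard of this kind would complement, not replace, the debugging of sadc.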

Comment 4 Ivana Varekova 2010-10-04 13:21:43 UTC
There are two problems:

* The first is a problem in the sadc tool: it generates bogus data.
To fix this problem I need additional debug information, which a patched version of sysstat can generate. The SRPM is at:
http://people.redhat.com/varekova/sysstat-9.0.6.1-11.DEBUG.src.rpm
The x86_64 build is at:
http://people.redhat.com/varekova/sysstat-9.0.6.1-11.DEBUG.x86_64.rpm
Please install this version and send me the affected log file and the file /tmp/sysstat-debug, to which the DEBUG version saves additional data.

* The second is a problem in the sar tool: it reads the bogus data and then segfaults. This problem is fixed in sysstat-9.0.6.1-11.fc15.

Comment 5 Pytela, Zdenek 2010-10-05 09:20:48 UTC
Created attachment 451618
sar data from patched binaries

I rebuilt and updated the sysstat package. The output of sar looks good, and nothing happened to the logs overnight.

Attached files: the /tmp debug output of sadc, the raw data, and their interpretation by sar 9.0.6.1. The sar command ends with

End of system activity file unexpected

Could the former behaviour have been caused by a temporary (brief) lack of disk space?
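
For illustration, this is how a reader of fixed-size records can tell a clean end of file from a file that ends in the middle of a record, which is the kind of condition a message like the one above reports. A short write, for example on a full disk, would leave such a partial record; that remains an unconfirmed assumption here. The record layout below is hypothetical and is not the real sysstat file format:

```python
import io
import struct

# Hypothetical layout for illustration only: a timestamp and two counters,
# each an unsigned 64-bit integer. NOT the real sysstat data file format.
RECORD = struct.Struct("<QQQ")


class TruncatedFile(Exception):
    """Raised when the stream ends in the middle of a record."""


def read_records(stream):
    """Read whole records until EOF; fail loudly on a partial trailing record."""
    records = []
    while True:
        chunk = stream.read(RECORD.size)
        if not chunk:
            return records  # clean end of file
        if len(chunk) < RECORD.size:
            raise TruncatedFile(f"got {len(chunk)} of {RECORD.size} bytes")
        records.append(RECORD.unpack(chunk))


if __name__ == "__main__":
    data = RECORD.pack(1, 2, 3) + RECORD.pack(4, 5, 6)
    print(read_records(io.BytesIO(data)))
    try:
        read_records(io.BytesIO(data[:-5]))  # simulate a short write
    except TruncatedFile as exc:
        print("truncated:", exc)
```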

Comment 6 Ivana Varekova 2010-10-05 12:15:35 UTC
Hello,
there may be a problem in the time command or in the conversion of time by the localtime or gmtime functions.
I'm not sure what is causing the problem; the cause should be found using the debuginfo data.
If the problem happens again, please attach the debug log and the /var log here. If no problem occurs within a week, I will close this bug.
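
As background on the conversion functions mentioned above: in C, localtime and gmtime return NULL when they cannot represent the given time, and dereferencing that result without a check crashes the caller. Whether that is what happened in sar here is not established by this report. A minimal sketch of the same guard in Python, where the failure shows up as an exception rather than a NULL pointer; safe_localtime is a hypothetical helper name:

```python
import time


def safe_localtime(timestamp):
    """Convert a timestamp to local time, or return None if it is unusable.

    Python raises OverflowError when the value does not fit the platform's
    time_t, OSError when the underlying C localtime() call fails, and
    ValueError for values such as NaN.
    """
    try:
        return time.localtime(timestamp)
    except (OverflowError, OSError, ValueError):
        return None


if __name__ == "__main__":
    for ts in (1284985405, 2**70):  # the report's "time" field, then a bogus value
        tm = safe_localtime(ts)
        if tm is None:
            print(ts, "-> rejected as bogus")
        else:
            print(ts, "->", time.strftime("%Y-%m-%d %H:%M:%S", tm))
```

A reader that skips or reports a record whose timestamp cannot be converted degrades gracefully on bogus data instead of crashing.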

Comment 7 Ivana Varekova 2010-10-12 08:18:27 UTC
I'm closing this bug. If the issue appears again, please reopen it.

