Bug 1252382 - abrt-addon-kernel-oops keeps attempting to report the same oops
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: abrt
Version: 22
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: abrt
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-08-11 09:42 UTC by Berend De Schouwer
Modified: 2016-07-19 17:28 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-19 17:28:19 UTC
Type: Bug
Embargoed:



Description Berend De Schouwer 2015-08-11 09:42:37 UTC
Description of problem:

abrt-dump-journal-oops keeps reporting an oops.  It's an old oops.  It's stuck in an infinite loop.

It keeps logging the oops, writing files to /var/spool/abrt/, which abrtd is deleting (I've set DropNotReportableOopses)


Version-Release number of selected component (if applicable):

abrt-addon-kerneloops-2.6.1-2.fc22.x86_64


How reproducible:

Dunno.

The oops was generated by kernel 4.1.3-201, but I've since downgraded to 4.0.8-300 to try to stop the infinite loop (which didn't help).

The oops doesn't have enough information, which is why abrtd is deleting them continuously.


Steps to Reproduce:
1. generate oops (?)
2. wait a bit for abrt-dump-journal-oops to start
3.

Actual results:

Report oops once to /var/spool/abrt/


Expected results:

Infinite oops-es.


Additional info:

It started with 4.1.3-201.  oops*/kernel for new reports still says 4.1.3-201.  

I have rebooted into 4.0.8-300 to check if they were real oops-es.  The reports in /var/spool/abrt still say 4.1.3-201.

I have removed /var/lib/abrt/abrt-dump-journal-oops.state to see if it would stop, it doesn't help.

I have saved a copy of abrt-dump-journal-oops.state.  A new one is created with the exact same information (broken journal cursor ???)

Running abrt-dump-journal-oops manually keeps reporting two oops-es.  Even with -v -v -v it never says: delete.

I've looked for Oops-es using
journalctl SYSLOG_IDENTIFIER=kernel
and I can't find any.

There are several thousand of these.  All for a kernel paging failure in exactly the same place.  All with minimal information (no stacktrace, backtrace, or anything else.)

On reboot, abrtd tries to run several thousand abrt-handle-event processes simultaneously (until I set DropNotReportableOopses)

I've now run 'systemctl disable abrt-oops' to prevent this from continuing, but that's not a long-term solution.

Comment 1 Berend De Schouwer 2015-08-11 09:44:43 UTC
Sorry:


Actual results:

Infinite oops-es.


Expected results:

Report oops once to /var/spool/abrt/

Comment 2 Berend De Schouwer 2015-08-11 09:45:18 UTC
"watch cat abrt-dump-journal-oops.state" never changes.

Comment 3 Jakub Filak 2015-08-11 10:55:46 UTC
Can you please try stopping the abrt-oops service, removing /var/lib/abrt/abrt-dump-journal-oops.state, and starting the abrt-oops service again? The tool should then start processing the journal from the end, so it should skip all the previous kernel oopses. Please report the results here.
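
The sequence requested in this comment could be scripted roughly as below. This is only a sketch: the service name and state-file path are the ones quoted in this report, while the `RUN` variable and the `reset_oops_state` function name are hypothetical additions so that the commands can be previewed with `RUN=echo` before being run as root.

```shell
#!/bin/sh
# Sketch only: restart abrt-oops with a fresh journal cursor.
# STATE is the path quoted in this bug report.
# RUN is a hypothetical dry-run hook: RUN=echo prints the commands instead
# of executing them; leave it empty (and run as root) to execute them.
STATE=/var/lib/abrt/abrt-dump-journal-oops.state
RUN=${RUN:-}

reset_oops_state() {
    $RUN systemctl stop abrt-oops &&
    $RUN rm -f "$STATE" &&
    $RUN systemctl start abrt-oops
}
```

With `RUN=echo` set, calling `reset_oops_state` only prints the three commands in order; each step runs only if the previous one succeeded.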

Comment 4 Berend De Schouwer 2015-08-11 13:12:05 UTC
It creates an identical abrt-dump-journal-oops.state (same contents)

It continues reporting oops-es

They're still from the previous kernel.  abrt-server tries to match it with kernel-core-booted.

ok, a few hours later, it's stopped.

Comment 5 Berend De Schouwer 2015-08-12 07:32:14 UTC
Possibly related:

journalctl doesn't go to the end of the log ??  Paging doesn't reach the end.  Possibly corrupt files.

Snippet:

-- Logs begin at Sun 2015-05-31 15:18:16 SAST, end at Wed 2015-08-12 09:20:45 SAST. --
May 31 15:18:16 localhost.localdomain systemd[1645]: Reached target Sockets.
May 31 15:18:16 localhost.localdomain systemd[1645]: Starting Sockets.

but paging to the end using less (journalctl by default uses less):

Jul 22 14:44:06 sieve-deschouwer-co-za gdb[26608]: detected unhandled Python exception
Jul 22 14:44:20 sieve-deschouwer-co-za gnome-session[2094]: (gnome-settings-daemon:2228): GnomeDesktop-WARNING **: Error setting property 'PowerSaveMode' on interface org.gnome.Mutter.Displa

No August...


journalctl -f or -e do show August.

/var/log/journal indicates the journal was rotated yesterday, and has files for August.

There is a file that was rotated at Jul 22 14:45, so I guess that journalctl can't read past the rotation.

Comment 6 Frank Crawford 2015-12-25 00:11:37 UTC
I've also been bitten by this issue a few times.  Investigating the latest one, it seemed to be related to a kernel oops, which eventually hung the machine and forced me to do a hard reset.

The actual oops was in a journal file ending with ".journal~", which indicates that it wasn't closed properly, and so I guess the cursor position couldn't be set properly for it.

The only way I could stop the continual reports was to remove the journal file with the oops entry.

This was on F23 with abrt-2.7.1-1.fc23.x86_64 and abrt-addon-kerneloops-2.7.1-1.fc23.x86_64.
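
One way to check for journal files that were not closed cleanly is to look for the ".journal~" suffix mentioned above. The following is a minimal sketch, assuming the persistent journal lives under /var/log/journal (the location named in Comment 5); the function name `list_unclean_journals` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: list journal files whose names end in ".journal~".
# The directory argument defaults to /var/log/journal; pass another
# directory to inspect a copy of the journal instead.
list_unclean_journals() {
    find "${1:-/var/log/journal}" -type f -name '*.journal~' 2>/dev/null
}
```

Run as root, `list_unclean_journals` prints one path per line. `journalctl --verify` can then be used to check the journal files for internal consistency before deciding whether to move a file aside.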

Comment 7 Fedora End Of Life 2016-07-19 17:28:19 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

