Bug 2128662
Summary: | Abrt does not report a segfault which is reported in journalctl.
---|---
Product: | [Fedora] Fedora
Component: | abrt
Version: | 37
Hardware: | x86_64
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Keywords: | Reopened
Reporter: | Lukas Ruzicka <lruzicka>
Assignee: | Michal Srb <msrb>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | abrt-devel-list, abrt-sig, awilliam, jakub, jmilan, kparal, lbrabec, mgrabovs, michal.toman, msrb, robatino
Whiteboard: | RejectedBlocker AcceptedFreezeException
Fixed In Version: | abrt-2.15.1-5.fc38 abrt-2.15.1-5.fc37 abrt-2.15.1-6.fc38 abrt-2.15.1-6.fc37
Clone Of: |
: | 2211640
Last Closed: | 2022-10-24 17:50:46 UTC
Type: | Bug
Bug Blocks: | 2009540, 2211640
Description
Lukas Ruzicka
2022-09-21 11:00:55 UTC
Proposed as a Blocker for 37-final by Fedora user lruzicka using the blocker tracking app because:

In my opinion, this violates the "Basic Functionality Criterion" because Abrt is useless at the moment.

Thanks for the bug report. I just tested this in a fully-updated f37 virtual machine:

    [msrb@fedora ~]$ will_segfault
    Will segfault.
    Segmentation fault (core dumped)
    [msrb@fedora ~]$ abrt ls
    Id cec4c37
    Component will-crash
    Count 1
    Time 2022-09-21 13:20:18
    User id 1000 (msrb)
    Reported to ABRT Server https://retrace.fedoraproject.org/faf/reports/bthash/2b650f55015e773bd65e11cd4a3dee6cfeb328df
    [msrb@fedora ~]$ cat /etc/os-release
    NAME="Fedora Linux"
    VERSION="37 (Workstation Edition Prerelease)"

I got the notification immediately. Note this was a fresh install, not upgrade from f36. Could you please double-check that you have the latest-greatest packages installed? Was this upgrade from f36?

I tested in a fully updated F37 (using updates-testing, clean installed some weeks back) and I can reproduce the issue. ABRT doesn't detect any crashes (tested will_segfault, will_abort, will_cpp_segfault and manually killing gnome-calculator with SIGABRT). The journal doesn't contain any messages from abrt. Coredumpctl shows all the crashes, though. All abrt* systemd services are shown as running. Attaching logs.

Created attachment 1913310: system journal

Created attachment 1913311: rpm -qa output
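
The symptom reported above (crashes visible to systemd-coredump but not to ABRT) can be checked by triggering a test crash and comparing what both tools recorded. The following is a minimal sketch, not part of ABRT: the function names `total_abrt_count`, `verdict` and `crash_and_compare` are hypothetical, the 5-second wait is an arbitrary assumption, and the parsing assumes `abrt ls` prints a `Count N` line per problem, as in the output quoted in this bug.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; nothing here ships with ABRT.

# total_abrt_count: read 'abrt ls' output on stdin and print the sum of all
# "Count N" fields. Summing counts (rather than counting entries) also
# registers a repeated crash that is filed under an existing problem.
total_abrt_count() {
    awk '$1 == "Count" { sum += $2 } END { print sum + 0 }'
}

# verdict BEFORE AFTER: print "detected" if the total grew, else "missed".
verdict() {
    if [ "$2" -gt "$1" ]; then
        echo "detected"
    else
        echo "missed"
    fi
}

# crash_and_compare: trigger a test crash, then show what each tool saw.
# Assumes will_segfault (from the will-crash package) is on PATH.
crash_and_compare() {
    before=$(abrt ls 2>/dev/null | total_abrt_count)
    will_segfault >/dev/null 2>&1    # expected to die with SIGSEGV
    sleep 5                          # arbitrary grace period for ABRT
    after=$(abrt ls 2>/dev/null | total_abrt_count)
    echo "systemd-coredump (latest matching entry):"
    coredumpctl list --no-pager will_segfault | tail -n 1
    echo "ABRT: $(verdict "$before" "$after")"
}
```

On a system showing this bug, `crash_and_compare` would be expected to print a coredumpctl entry for will_segfault followed by `ABRT: missed`.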
Thanks! Could you please check the following:

    # cat /var/lib/abrt/abrt-dump-journal-core.state
    # journalctl --cursor-file=/var/lib/abrt/abrt-dump-journal-core.state

The file should contain a valid journal cursor, but in case there is something wrong with it, you can try to remove the file and let abrt recreate it the next time a crash happens.

Hmm, so, I used a VM before (cleanly booted). When I started it again now, ABRT notifications work perfectly. I restarted it a few times, it still works fine. So perhaps some race condition?

But, I also saw this problem on my laptop. That wasn't restarted since then, and I can still reproduce the problem. The journalctl cursor value in /var/lib/abrt/abrt-dump-journal-core.state seems to only change when I run journalctl --cursor-file=/var/lib/abrt/abrt-dump-journal-core.state. Otherwise it stays the same. On the other hand, in the VM where things work now, the value changes when I run journalctl --cursor-file=/var/lib/abrt/abrt-dump-journal-core.state or when a crash happens.

This is great feedback -- thanks! My VM has been restarted as well, so I will create a fresh one and observe. I will keep you guys posted here.

Hello, mine was a fresh installation of the latest compose run in the virtual machine. I am experiencing the same problem with my laptop that runs the latest greatest packages, but is an upgrade from Fedora 36.

Would it be possible to check whether a simple "# systemctl restart abrt-journal-core" fixes the problem on the system?

Of course. I have restarted the abrt-journal-core service and Abrt is working correctly now. Thanks.

Dammit. Thank you :) That explains why it somehow fixes itself after restart. But I still don't know what might be triggering the problem after a clean installation or an upgrade from a previous version...

+4 in https://pagure.io/fedora-qa/blocker-review/issue/915 , marking accepted.

I tried to reproduce this issue today and I have to say I'm currently unable to.
I restarted an existing system numerous times, but ABRT was intercepting crashes on every boot. I also installed a completely fresh Workstation multiple times, and ABRT was again working OK after each installation. I have no clue how to help trigger this (quite possibly) race condition.

(In reply to Kamil Páral from comment #14)
> I tried to reproduce this issue today and I have to say I'm currently unable
> to. I restarted an existing system numerous times, but ABRT was intercepting
> crashes on every boot. I also installed a completely fresh Workstation
> multiple times, and ABRT was again working OK after each installation. I
> have no clue how to help trigger this (quite possibly) race condition.

I did not focus on this specifically, but I used the above workflow to test Abrt on several freshly installed VMs and it also worked fine always.

Maybe this has somehow magically gone away? Maybe some other update fixed the race?

Unfortunately I saw this again yesterday. I caused Nautilus to crash twice, coredumpctl saw the crashes but ABRT did not. After restarting abrtd.service and causing Nautilus to crash again, ABRT finally detected the crash.

Michal, were you able to look into this? Find a probable cause for this? How can we provide better debug info? Thanks!

Yes, I've been looking into this and I found a reproducible scenario. I upgraded 2 laptops and 1 VM from f36 to f37 recently and unfortunately none of the machines suffered from the problem, but I observed the same issue happening in Rawhide (f38). Fix incoming (later today/tomorrow morning).

Thanks! Any chance still for today? We could....kind of...do a candidate compose if we get this fix, I guess.

FEDORA-2022-7e53c89754 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7e53c89754

FEDORA-2022-7cf3cad7c7 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7cf3cad7c7

Sorry! I didn't make it yesterday.
The update is in Bodhi now.

FEDORA-2022-7e53c89754 has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.

Setting back to ON_QA as this is for F37.

FEDORA-2022-7cf3cad7c7 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-7cf3cad7c7` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-7cf3cad7c7 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

According to the patch description [1], this seems to be a race condition, so it's hard to verify. I guess we'll just push the update and reopen this bug if we still see it happening in the future.

[1] https://github.com/abrt/abrt/commit/0cd7e3d7faa0c36e6ef2ecdd1d599fe54e9e87be

FEDORA-2022-7cf3cad7c7 has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.

This happened to me again on Raspberry Pi 4, F37 Final RC 1.1 with abrt-2.15.1-5.fc37

I can confirm that it also happened to me on a fresh install of Fedora WS, F37 Final RC 1.1 when will-crash was freshly installed and used. The problem went away after a system restart.

Could you please share the journal log from the boot when this happened?

(In reply to Michal Srb from comment #30)
> Could you please share the journal log from the boot when this happened?

I am providing all journal logs from the affected machine as I have rebooted the machine several times and I am not sure in which cycle I attempted to simulate the crash. I believe that it was one of the first three boot cycles.

Created attachment 1918739: System journal (new occurrence)
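
The cursor-file check and the service restart suggested earlier in this thread can be combined into a small helper. This is a minimal sketch under stated assumptions: the function names `cursor_looks_valid` and `reset_abrt_cursor` are hypothetical, and testing for a leading `s=` is only a heuristic based on the usual shape of systemd journal cursors, not ABRT's own validation. `reset_abrt_cursor` needs root.

```shell
#!/bin/sh
# Path taken from the comments in this bug.
STATE_FILE=/var/lib/abrt/abrt-dump-journal-core.state

# cursor_looks_valid FILE: succeed if FILE exists, is non-empty, and its
# first line starts with "s=" (heuristic for a journal cursor string).
cursor_looks_valid() {
    [ -s "$1" ] && head -n 1 "$1" | grep -q '^s='
}

# reset_abrt_cursor: remove a suspicious state file so it can be recreated,
# then restart the service, which worked around the problem for the reporter.
reset_abrt_cursor() {
    if ! cursor_looks_valid "$STATE_FILE"; then
        echo "state file missing or malformed, removing it"
        rm -f "$STATE_FILE"
    fi
    systemctl restart abrt-journal-core
}
```

A cursor that passes this check can still be stale, which matches the laptop case described above where the stored value never changed after a crash; comparing the file contents before and after a test crash covers that case.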
Thanks ;) Indeed, I also see the problem on freshly installed f37 RC. It seems like the problem goes away when:

* systemd-coredump reports a crash in journal
* abrt-journal-core.service or the whole system is restarted

I.e. abrt-journal-core.service somehow doesn't see the first crash (or crashes before first restart after them).

I see the same error in journal:

    Oct 18 11:13:22 fedora abrt-dump-journal-core[792]: Cannot save journal watch's position

I am wondering if abrt-journal-core.service is somehow incorrectly initializing the journal cursor, if there are no prior (valid) crashes in journal.

I think I have an idea where the problem might be. Another fix incoming today.

FEDORA-2022-dcd00ab021 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-dcd00ab021

FEDORA-2022-7fb34453de has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7fb34453de

FEDORA-2022-dcd00ab021 has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.

I'm gonna kick this back to proposed blocker as it seems like our experience of it is quite different now from when it was accepted as a blocker (it's intermittent and goes away on a reboot). Please vote again in https://pagure.io/fedora-qa/blocker-review/issue/915

(In reply to Fedora Update System from comment #35)
> FEDORA-2022-7fb34453de has been submitted as an update to Fedora 37.
> https://bodhi.fedoraproject.org/updates/FEDORA-2022-7fb34453de

I tried to install a fresh system and update Abrt to this build before I attempted to induce any crashes or issues. With the updated version, even the very first issue was properly caught and could be reported. So I tend to believe that this fix helped. If anyone can also confirm, @kparal?

The problem was not about the first boot, it happened randomly on any boot (at least for me). So this can't be easily verified to be fixed, only after some time.
I tested the update and it worked, which means it's surely worth a try to push it stable.

FEDORA-2022-7fb34453de has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-7fb34453de` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-7fb34453de See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

We have -7 blocker / +3 FE in https://pagure.io/fedora-qa/blocker-review/issue/915 , marking rejected blocker, accepted FE.

FEDORA-2022-7fb34453de has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.
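
Because the problem was intermittent, one way to tell whether a given boot is affected is to look for the error message quoted in this thread. A minimal sketch, assuming the message text is unchanged in the installed abrt build and that the caller is allowed to read the system journal; the function names `has_cursor_save_error` and `check_current_boot` are hypothetical.

```shell
#!/bin/sh
# has_cursor_save_error: read journal text on stdin and succeed if it contains
# the abrt-dump-journal-core error quoted in this bug.
has_cursor_save_error() {
    grep -q "abrt-dump-journal-core.*Cannot save journal watch's position"
}

# check_current_boot: scan the current boot's messages logged under the
# abrt-dump-journal-core syslog identifier.
check_current_boot() {
    if journalctl -b --no-pager -t abrt-dump-journal-core | has_cursor_save_error; then
        echo "cursor save error seen in this boot; crashes may go unreported"
    else
        echo "no cursor save error in this boot"
    fi
}
```

The absence of the message does not prove that ABRT is catching crashes; it only rules out this one symptom.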