Bug 1508984
Summary: | Journal files corrupted after reboot on ext4 file systems | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Morten Stevens <mstevens> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | airlied, bskeggs, esandeen, ewk, hdegoede, herrold, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, jsynacek, kernel-maint, linville, lnykryn, mchehab, mjg59, msekleta, mstevens, rami_adrees, rds1944, ssahani, s, steved, systemd-maint, zbyszek |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-23 20:02:41 UTC | Type: | Bug |
Description
Morten Stevens
2017-11-02 15:35:37 UTC
(In reply to Morten Stevens from comment #0)
> Actual results:
> [ 3.889532] EXT4-fs (dm-0): Delayed block allocation failed for inode 4325488 at logical offset 649 with max blocks 3 with error 121
> [ 3.889571] EXT4-fs (dm-0): This should not happen!! Data will be lost

That's a kernel problem.

I have the same problem with Fedora 27. Not only were the journal files corrupted after rebooting; at some point system and database files got corrupted too, so the system crashed. I replaced the SCSI controller with LSI Logic Parallel (I used Paravirtual before). That solved my problem and the systems are stable now. The problem still exists when using Paravirtual or LSI Logic SAS.

This is also possible... But with the latest 4.14 and 4.15 I'm not able to reproduce it. Maybe it's already fixed upstream? @Rami: Are you able to reproduce it after updating to the latest 4.14.18 or 4.15.2 kernel?

@Morten Stevens: Yes, I'm still able to reproduce it. I didn't check 4.15.2, only 4.14.18.

Kernel 4.14.18-300.fc27.x86_64

rm -rf /var/log/journal/*
systemctl restart systemd-journald
journalctl --verify
PASS: /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system.journal
ll -i /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system.journal
531930 -rw-r-----+ 1 root systemd-journal 8388608 Feb 14 12:00 /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system.journal
reboot

dmesg
[Wed Feb 14 12:04:07 2018] EXT4-fs (dm-0): Delayed block allocation failed for inode 531930 at logical offset 291 with max blocks 2 with error 121
[Wed Feb 14 12:04:07 2018] EXT4-fs (dm-0): This should not happen!! Data will be lost
[Wed Feb 14 12:04:16 2018] systemd-journald[536]: /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system.journal: Journal file corrupted, rotating.

journalctl --verify
PASS: /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system.journal
123868: Invalid object
File corruption detected at /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system:123868 (of 8388608 bytes, 14%).
FAIL: /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system (Bad message)
ll -i /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system
531930 -rw-r-----+ 1 root systemd-journal 8388608 Feb 14 12:04 /var/log/journal/35fd8595287b4e07b59244474a51a1c4/system

@Rami Very interesting... I'm not able to reproduce it with 4.14.18 or 4.15.x. Could you try to reproduce it with the latest F27 4.15 kernel? And if you don't need the journal files, try deleting the corrupted journal files before updating to 4.15 with rm -rf /var/log/journal/35fd8595287b4e07b59244474a51a1c4/

I, too, have this problem. See Bug 1560149. Also look at Bug 1533620 for an xfs installation.

@Guido, your issue is different, and the xfs issue is different as well. Those bugs are separate issues, related to log replay after a "clean" reboot. @Rami, can you attach an entire dmesg when this happens? And what is the state of the filesystem holding the systemd journal? How full is it?

Sorry, same question for Morten: entire dmesg when this happens, and state of the filesystem at the time ...

(In reply to Eric Sandeen from comment #8)
> Sorry, same question for Morten - entire dmesg when this happens, and state
> of the filesystem at the time ...

Hello Eric, I'm not sure whether this was a kernel or a systemd related issue. But I'm not able to reproduce it with Linux 4.15+ and the latest systemd update (systemd-234-10.git5f8984e.fc27) for F27.

Changelog for systemd-234-10.git5f8984e.fc27
- various fixes for journalctl leaking file descriptors on very quick file rotation (upstream issues #7998, #8198)

Maybe this fixed it?

AFAICT systemd(?) probably provoked the issue by not cleanly unmounting root. The kernel is behaving as expected in response. If you have time to do a little A/B testing with the prior version of systemd to confirm that it was in fact the systemd update that fixed it, it'd be helpful. Otherwise I guess I'll just close this for now as CURRENTRELEASE.

If anyone still sees this with at least the versions Morten noted in comment #9, please re-open with further details.

Thanks,
-Eric
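
For anyone re-checking this on their own system, the comparison made by hand in this report (matching the inode in the EXT4 message against the inode shown by `ll -i` for the journal file) could be scripted. The following is only a rough sketch: the names `DelallocFailure` and `parse_delalloc_failure` are hypothetical, and the pattern is derived solely from the two kernel messages quoted in this report, not from the kernel source, so other kernel versions may word the message differently.

```python
import re
from typing import NamedTuple, Optional


class DelallocFailure(NamedTuple):
    """Fields of one 'Delayed block allocation failed' message (hypothetical helper type)."""
    device: str
    inode: int
    logical_offset: int
    max_blocks: int
    error: int


# Assumption: the message layout matches the lines quoted in this report.
_PATTERN = re.compile(
    r"EXT4-fs \((?P<device>[^)]+)\): Delayed block allocation failed "
    r"for inode (?P<inode>\d+) at logical offset (?P<offset>\d+) "
    r"with max blocks (?P<blocks>\d+) with error (?P<error>\d+)"
)


def parse_delalloc_failure(line: str) -> Optional[DelallocFailure]:
    """Return the parsed fields, or None if the line is not such a message."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return DelallocFailure(
        device=match.group("device"),
        inode=int(match.group("inode")),
        logical_offset=int(match.group("offset")),
        max_blocks=int(match.group("blocks")),
        error=int(match.group("error")),
    )


if __name__ == "__main__":
    # Sample line copied from the dmesg output in this report.
    sample = (
        "[Wed Feb 14 12:04:07 2018] EXT4-fs (dm-0): Delayed block allocation "
        "failed for inode 531930 at logical offset 291 with max blocks 2 "
        "with error 121"
    )
    print(parse_delalloc_failure(sample))
```

The parsed `inode` value can then be compared with the inode number reported for the journal file (531930 in the output above) to confirm that the failed allocation hit the journal file itself.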