Bug 1185640 - After resume from hibernation sometimes ext4 fs corruption [NEEDINFO]
Summary: After resume from hibernation sometimes ext4 fs corruption
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 21
Hardware: i686
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-01-25 13:47 UTC by Klaus Lichtenwalder
Modified: 2015-12-02 17:11 UTC
CC List: 16 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-12-02 07:57:37 UTC
Type: Bug
Embargoed:
kernel-team: needinfo?


Description Klaus Lichtenwalder 2015-01-25 13:47:30 UTC
Description of problem:
After resuming from hibernation, I sometimes see ext4 filesystem corruption.

Version-Release number of selected component (if applicable):
Fedora 21, fully up to date

How reproducible:


Steps to Reproduce:
1. hibernate
2. resume
3. check messages

Actual results:
Jan 24 17:56:08 acer kernel: [ 4957.614541] EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 18, block bitmap and bg descriptor inconsistent: 15551 vs 15534 free clusters
Jan 24 17:56:08 acer kernel: [ 4957.614579] JBD2: Spotted dirty metadata buffer (dev = sda5, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Jan 24 17:56:08 acer kernel: EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 18, block bitmap and bg descriptor inconsistent: 15551 vs 15534 free clusters
Jan 24 17:56:08 acer kernel: JBD2: Spotted dirty metadata buffer (dev = sda5, blocknr = 0). There's a risk of filesystem corruption in case of system crash.


Jan 24 17:56:42 acer kernel: [ 4992.092989] EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 81, block bitmap and bg descriptor inconsistent: 5567 vs 5578 free clusters
Jan 24 17:56:42 acer kernel: [ 4992.093387] EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 82, block bitmap and bg descriptor inconsistent: 8849 vs 8856 free clusters
Jan 24 17:56:42 acer kernel: EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 81, block bitmap and bg descriptor inconsistent: 5567 vs 5578 free clusters
Jan 24 17:56:43 acer kernel: EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 82, block bitmap and bg descriptor inconsistent: 8849 vs 8856 free clusters
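
For anyone comparing logs, here is a minimal Python sketch that pulls the affected device, block group, and the two free-cluster counts out of messages like the ones above. The regular expression is only inferred from the lines pasted in this report, and the names BITMAP_RE and parse_bitmap_errors are made up for illustration. Duplicate lines (syslog shows each message with and without the kernel timestamp) are collapsed.

```python
import re

# Pattern inferred from the log excerpts in this report (an assumption,
# not taken from kernel documentation).
BITMAP_RE = re.compile(
    r"EXT4-fs error \(device (?P<dev>[^)]+)\): ext4_mb_generate_buddy:\d+: "
    r"group (?P<group>\d+), block bitmap and bg descriptor inconsistent: "
    r"(?P<a>\d+) vs (?P<b>\d+) free clusters"
)


def parse_bitmap_errors(lines):
    """Return a sorted list of unique (device, group, first, second) tuples."""
    seen = set()
    for line in lines:
        m = BITMAP_RE.search(line)
        if m:
            seen.add((m["dev"], int(m["group"]), int(m["a"]), int(m["b"])))
    return sorted(seen)


# Three lines copied from the report; the second repeats the first
# without the kernel timestamp.
sample = [
    "Jan 24 17:56:08 acer kernel: [ 4957.614541] EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 18, block bitmap and bg descriptor inconsistent: 15551 vs 15534 free clusters",
    "Jan 24 17:56:08 acer kernel: EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 18, block bitmap and bg descriptor inconsistent: 15551 vs 15534 free clusters",
    "Jan 24 17:56:42 acer kernel: [ 4992.092989] EXT4-fs error (device sda5): ext4_mb_generate_buddy:757: group 81, block bitmap and bg descriptor inconsistent: 5567 vs 5578 free clusters",
]

for dev, group, a, b in parse_bitmap_errors(sample):
    print(f"{dev} group {group}: counts differ by {abs(a - b)} clusters")
```

Feeding it the full journal output should give one entry per affected block group, which makes it easier to see whether the same groups show up after every resume.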


Suspend is not a real alternative, as the system has long periods of downtime.

Expected results:


Additional info:

Comment 1 Justin M. Forbes 2015-01-27 14:58:48 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 21 kernel bugs.

Fedora 21 has now been rebased to 3.18.3-201.fc21.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 2 Andrew Duggan 2015-01-29 20:25:00 UTC
Yes, still very broken in 3.18.3-201.fc21.

Linux apd-hp2 3.18.3-201.fc21.i686+PAE #1 SMP Mon Jan 19 16:09:58 UTC 2015 i686 i686 i386 GNU/Linux


Jan 29 07:47:37 apd-hp2 kernel: [18310.258685] EXT4-fs error (device dm-0): ext4_mb_generate_buddy:757: group 1434, block bitmap and bg descriptor inconsistent: 29375 vs 29376 free clusters
Jan 29 07:47:37 apd-hp2 kernel: [18310.258708] JBD2: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Jan 29 07:47:37 apd-hp2 kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy:757: group 1434, block bitmap and bg descriptor inconsistent: 29375 vs 29376 free clusters
Jan 29 07:47:38 apd-hp2 kernel: JBD2: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.


This ultimately results in cross-linked files and a huge mess.  This is a big deal.  There is no point in running Linux on a laptop if hibernate/resume doesn't work without destroying data.

Comment 3 pawel barabasz 2015-02-21 14:55:13 UTC
Still the same with 3.18.7-200.fc21 (both x86_64 and i686).

All 3 of my machines (2 laptops, 1 desktop) have this problem. It started after the upgrade from F20 to F21 (it was OK in F20 with 3.18.7-100.fc20).

Comment 4 slartibart70 2015-02-21 20:34:51 UTC
Same here on an older Pentium 4 based laptop: since upgrading to fc21/32bit, filesystem corruption has started (mainly on the / partition, all on LVM).
So far, running 'fsck' helps, but the corruption keeps reappearing.
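
To see how bad things are before letting fsck repair anything, a read-only pass can be run first. This is a rough Python sketch that wraps e2fsck with -f (check even if the filesystem is marked clean) and -n (open read-only, answer "no" to every prompt). The function names and the device path are hypothetical, and the check should be run against an unmounted filesystem, for example from a rescue environment.

```python
import subprocess


def readonly_check_command(device):
    """Build an e2fsck command line that forces a check but changes nothing."""
    # -f: check even if the filesystem is marked clean
    # -n: open read-only and answer "no" to all questions
    return ["e2fsck", "-f", "-n", device]


def run_readonly_check(device):
    """Run the check and return e2fsck's exit status."""
    return subprocess.run(readonly_check_command(device), check=False).returncode


# Hypothetical device path; substitute the real (unmounted) logical volume.
print(" ".join(readonly_check_command("/dev/mapper/vg-root")))
```

A nonzero status from run_readonly_check means e2fsck found something it would have wanted to fix.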

Comment 5 Jonas Wielicki 2015-03-04 09:21:42 UTC
I’m experiencing this on my Fedora 21 desktop, with 3.18.7-200.fc21.x86_64.

For what it’s worth, my storage layout is ext4 on top of LUKS on top of an mdraid1 array spread over two SATA hard drives.

Is there any information we could gather to help solve this bug quickly?

Comment 6 Andrew J. Schorr 2015-03-06 14:48:19 UTC
I have seen this twice in the past 2 days.  After completely reinstalling fresh Fedora 21 last night and updating to current versions of all rpms, I hibernated overnight.  When I resumed this morning, I got these ext4 corruption errors:

Mar 06 08:57:58 ajs-t530 kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:757: group 17, block bitmap and bg descriptor inconsistent: 21743 vs 21621 free clusters
Mar 06 08:58:02 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm bluetoothd: failed to insert inode 265044: doubly allocated?
Mar 06 08:59:04 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm systemd: failed to insert inode 265045: doubly allocated?
Mar 06 08:59:11 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm NetworkManager: failed to insert inode 131480: doubly allocated?
Mar 06 09:13:17 ajs-t530 kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:757: group 33, block bitmap and bg descriptor inconsistent: 20009 vs 20007 free clusters
Mar 06 09:41:24 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm yum: failed to insert inode 265046: doubly allocated?
Mar 06 09:41:44 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm yum: failed to insert inode 265047: doubly allocated?
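
To summarize which processes ran into the "doubly allocated" inode errors above, something like the following Python sketch could be used. The pattern is only inferred from the lines pasted here, and INODE_RE and inodes_by_process are names made up for illustration.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines in this comment (an assumption).
INODE_RE = re.compile(
    r"EXT4-fs error \(device (?P<dev>[^)]+)\): __ext4_new_inode:\d+: "
    r"comm (?P<comm>[^:]+): failed to insert inode (?P<ino>\d+): "
    r"doubly allocated\?"
)


def inodes_by_process(lines):
    """Map each process name (comm) to the inode numbers it failed to insert."""
    result = defaultdict(list)
    for line in lines:
        m = INODE_RE.search(line)
        if m:
            result[m["comm"]].append(int(m["ino"]))
    return dict(result)


# Three lines copied from the log above.
sample = [
    "Mar 06 08:58:02 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm bluetoothd: failed to insert inode 265044: doubly allocated?",
    "Mar 06 09:41:24 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm yum: failed to insert inode 265046: doubly allocated?",
    "Mar 06 09:41:44 ajs-t530 kernel: EXT4-fs error (device dm-1): __ext4_new_inode:1010: comm yum: failed to insert inode 265047: doubly allocated?",
]

for comm, inodes in inodes_by_process(sample).items():
    print(comm, inodes)
```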

[schorr@ajs-t530 ~]$ uname -a
Linux ajs-t530 3.18.7-200.fc21.x86_64 #1 SMP Wed Feb 11 21:53:17 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Note that this is on x86_64, so this problem is not specific to i686.

Comment 7 Andrew J. Schorr 2015-03-06 14:52:49 UTC
If this is the same bug addressed in bugzilla #1174945, then a patch appears to be on the way...

Comment 8 Fedora Kernel Team 2015-04-28 18:29:12 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 21 kernel bugs.

Fedora 21 has now been rebased to 3.19.5-200.fc21.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 22, and are still experiencing this issue, please change the version to Fedora 22.

If you experience different issues, please open a new bug report for those.

Comment 9 Andrew Duggan 2015-04-28 18:39:41 UTC
On a side note, changing the way hibernate/resume works in F21 well after its release does not seem to comply with Fedora's past rules.  After the dracut update that removed the code that attempts a resume at the beginning of the boot process, Fedora now requires resume=<swap device> on the kernel command line to resume from hibernation, all because systemd determined that it was a good idea to always fsck.  The real issue is that your init system no longer tolerates hibernate/resume, which is very core functionality.  Windows has been resuming from hibernation since early 1999, when Windows 2000 was still called Windows NT 5.0 and was still in beta.  Windows has never once in those 16 years corrupted a file system on resume, but Fedora has done it many times.

The systemd change that forces the fsck during resume should have been reverted; you should not have changed the resume process by requiring the kernel command-line option again.
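
For anyone who wants to confirm whether their current boot actually carries the resume option described above, here is a small Python sketch that looks for it in a kernel command line. The kernel spells the parameter in lowercase (resume=). The function names and the example command line are made up for illustration.

```python
def resume_device(cmdline):
    """Return the value of the resume= parameter, or None if it is absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "resume" and sep:
            return value
    return None


def current_resume_device(path="/proc/cmdline"):
    """Read the running kernel's command line and look for resume=."""
    with open(path) as f:
        return resume_device(f.read())


# Hypothetical command line, not taken from any machine in this report.
example = (
    "BOOT_IMAGE=/vmlinuz-3.19.5-200.fc21.x86_64 root=/dev/mapper/fedora-root "
    "ro resume=/dev/mapper/fedora-swap rhgb quiet"
)
print(resume_device(example))
```

If current_resume_device() returns None on the affected machine, the boot is not being told where the hibernation image lives.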

Comment 10 Fedora End Of Life 2015-11-04 12:18:13 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 21 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Fedora End Of Life 2015-12-02 07:57:47 UTC
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

