Bug 578674
Summary: | JBD: Spotted dirty metadata buffer
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 14
Hardware: | All
OS: | Linux
Status: | CLOSED ERRATA
Severity: | medium
Priority: | low
Keywords: | Reopened
Reporter: | Norman Gaywood <ngaywood>
Assignee: | Eric Sandeen <esandeen>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | anton, dougsland, esandeen, gansalmon, itamar, jonathan, kernel-maint, kmcmartin, L.Bonnaud, ngaywood, pza
Fixed In Version: | kernel-2.6.35.10-72.fc14
Doc Type: | Bug Fix
Last Closed: | 2010-12-22 19:53:25 UTC

Description
Norman Gaywood
2010-04-01 02:43:16 UTC
I guess I should be more specific in my kernel versions. I've seen this in at least:

kernel-2.6.32.9-70.fc12.x86_64
kernel-2.6.32.10-90.fc12.x86_64

Was this on ext3 or ext4?

I have several ext4 filesystems with user quotas enabled and it happens on all of them. Output from mount looks like:

/dev/mapper/SYSTEM-root on / type ext4 (rw,relatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/mapper/SYSTEM-var on /var type ext4 (rw,relatime,usrquota)
/dev/mapper/SYSTEM-opt on /opt type ext4 (rw,relatime)
/dev/xvda1 on /boot type ext3 (rw,relatime)
/dev/mapper/SYSTEM-tmp on /tmp type ext4 (rw,relatime,usrquota)
/dev/mapper/HOME-home on /.automount/turing/disks/turing/home type ext4 (rw,relatime,usrquota)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)

The syslog messages appear whenever I call this script after adding a new user:

#!/bin/sh
uid=$1
# Must have a uid
if [ -z "${uid}" ]; then
    echo "$0: No uid provided"
    exit 1
fi
# setquota ${uid} ${bsoft} ${bhard} ${isoft} ${ihard} ${filesys}
/usr/sbin/setquota ${uid} 100000 200000 100000 200000 /dev/mapper/SYSTEM-tmp
/usr/sbin/setquota ${uid} 500000 800000 100000 200000 /dev/mapper/SYSTEM-var
/usr/sbin/setquota ${uid} 500000 540000 100000 200000 /dev/mapper/HOME-home

Can you do:

# debugfs -R "icheck 86532501" /dev/dm-0

(or whatever the device corresponding to dm-0 is, probably your root fs device?)

/dev/dm-0 is /dev/mapper/HOME-home

debugfs -R "icheck 86532501" /dev/mapper/HOME-home
debugfs 1.41.9 (22-Aug-2009)
Block       Inode number
86532501    100727

Created attachment 403861: interleaved calls to setquota(8) and JBD messages
I just did some extra checks to see if the messages did occur after setquota, and I'm not so sure now. Attached are the adduser logs where setquota is called and the JBD messages from syslog.
There seems to be a delay if there is a correlation.
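
The lookup used in this report (take the block number from the JBD warning, then ask debugfs which inode owns it) can be scripted. The following is only a rough sketch. Both function names are hypothetical, and the log line format `... (dev = dm-0, blocknr = 86532501) ...` is an assumption, since the exact syslog line is not quoted in this report. The icheck parser is written against the two-column debugfs output shown above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any package.

# Read syslog text on stdin and print each distinct block number that
# appears in a "Spotted dirty metadata buffer" warning.
# ASSUMPTION: the warning contains the text "blocknr = <digits>".
blocks_from_jbd_log() {
    sed -n 's/.*Spotted dirty metadata buffer.*blocknr = \([0-9][0-9]*\).*/\1/p' |
        sort -un
}

# Read the output of `debugfs -R "icheck <block>" <device>` on stdin and
# print the inode number reported for block $1.
# Returns non-zero if no numeric inode is listed for that block
# (for example when debugfs prints "<block not found>").
inode_from_icheck() {
    awk -v blk="$1" '
        $1 == blk && $2 ~ /^[0-9]+$/ { print $2; found = 1 }
        END { exit !found }
    '
}

# Example use (needs root and the real device, so it is left commented out):
#   debugfs -R "icheck 86532501" /dev/mapper/HOME-home 2>/dev/null |
#       inode_from_icheck 86532501
```

The inode number printed by the second helper can then be passed to `find <mountpoint> -xdev -inum <inode>` to name the file that owns the block.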
ok and now:

# find /dev/mapper-HOME-home -inum 100727

but ... if it's not related to setquota after all, then maybe the file containing the block in question is not so interesting. Still, may offer a clue.

100727 aquota.user

and similarly on a smaller filesystem than the home area:

debugfs -R "icheck 5547680" /dev/dm-4
debugfs 1.41.9 (22-Aug-2009)
Block       Inode number
5547680     14050

ls -li /var/aquota.user
14050 -rw------- 1 root root 12028928 2010-04-01 15:19 /var/aquota.user

Ok, that was my guess; excellent clue, I'll see what I can make of this.

Thanks,
-Eric

There's a patch proposed upstream from Jan Kara to address this.

http://marc.info/?l=linux-ext4&m=127548861305393&w=2

[PATCH] ext4: Always journal quota file modifications

When journaled quota options are not specified, we do writes to quota files just in data=ordered mode. This actually causes warnings from JBD2 about dirty journaled buffer because ext4_getblk unconditionally treats a block allocated by it as metadata. Since quota actually is filesystem metadata, the easiest way to get rid of the warning is to always treat quota writes as metadata...

Signed-off-by: Jan Kara <jack>
---
fs/ext4/super.c | 19 +++++--------------
1 files changed, 5 insertions(+), 14 deletions(-)

Ted, this patch fixes some JBD2 warning for me when running XFSQA with quotas enabled. I think this is a move into a direction you are trying to achieve as well. Will you merge the patch or should I do it?

Honza

Thanks Eric, good news. I guess this patch will make its way into 2.6.32.16 and then into koji. I'll wait till it hits there and then test.

Norman, dunno about 2.6.32.16 but hopefully 2.6.32.x eventually. Sorry for the long wait on this, too many irons, only one fire ;)

-Eric

FWIW I still don't see that it's made it upstream. I'll poke Ted on it ... it's not a huge problem unless you crash, in which case you can always rebuild quotafiles.

Eric, thanks for pursuing this.
It so happens that the system I am seeing this problem on, a Xen domU, crashes often! See bug #550724 for the details of the system crash. I've had many crashes but have only seen a corrupted quota file reported maybe 5 times. As you say, rebuilding the quota files allows us to continue.

Still seeing this message in 2.6.32.17-156.fc12.x86_64 so I guess the patch did not make the cut for 2.6.32.17. I also see that 2.6.32.18 is out and I don't see anything in the patch list that indicates the patch is included.

I posted an oops to bug #608770 that could have been caused by corruption of the quota file. I probably should have posted that report here.

I guess I'm in an unlucky state in that I'm seeing a combination of rare problems that not many others are seeing. So there is not much pressure to get this particular issue fixed. I do appreciate the help I am getting though.

Patch just fairly recently made it upstream:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=62d2b5f2dcd3707b070efb16bbfdf6947c38c194

I'll see about getting it into .32.y but can put it in fedora in the meantime...

-Eric

(In reply to comment #17)
> I'll see about getting it into .32.y but can put it in fedora in the
> meantime...

Thanks Eric for the attention. I was wondering though, is this a serious problem? Is it just an ugly warning message? Will there always be corruption if there is a crash or does the crash have to happen within a certain time after the message? I get a lot of crashes and sometimes I get quotafile corruption. Will the patch potentially stop the corruption or will it just hide the error message?

I think it's mostly just a warning. You weren't journaling quota anyway, but bits of code thought you were (because of the way the IO was submitted), so issued this warning when they saw the buffer. It's not a message that gives users a warm fuzzy though, so fixing would/will be good.
Sorry it's been open so long :)

-Eric

This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Just updated the system that is seeing this bug to F14 and the messages are still there.

kernel-2.6.35.9-64.fc14.x86_64

Same event of setting user quotas generates the message.
Thanks, will be in the next update of the F-14 kernel.

kernel-2.6.35.10-68.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.10-68.fc14

kernel-2.6.35.10-69.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.10-69.fc14

kernel-2.6.35.10-72.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.10-72.fc14

I'm seeing this in RHEL6 as well

Ignore my comment. RHEL6 is addressed in:

http://rhn.redhat.com/errata/RHSA-2010-0842.html
https://bugzilla.redhat.com/show_bug.cgi?id=641454

kernel-2.6.35.10-72.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/kernel-2.6.35.10-72.fc14

kernel-2.6.35.10-72.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
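
To check whether a Fedora 14 machine is already running a kernel at or past the fixed build named in this report (kernel-2.6.35.10-72.fc14), something like the following could be used. It is a rough sketch: `kernel_at_least` is a hypothetical helper, and it compares only the numeric fields of the two version strings from left to right, which is a simplification and not the full RPM version comparison.

```shell
#!/bin/sh
# Sketch only: compares the numeric fields of two kernel version-release
# strings. Letters (fc14, x86_64) are ignored, and only as many fields as
# the minimum version has are compared, so a trailing ".x86_64" on the
# running kernel does not affect the result.
# ASSUMPTION: both arguments start with a digit, as `uname -r` output does.
kernel_at_least() {
    # $1: version to test, e.g. "$(uname -r)"
    # $2: minimum version, e.g. 2.6.35.10-72.fc14
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(want, w, /[^0-9]+/)
        split(have, h, /[^0-9]+/)
        for (i = 1; i <= n; i++) {
            if (w[i] == "") continue
            if (h[i] + 0 > w[i] + 0) exit 0   # newer than the minimum
            if (h[i] + 0 < w[i] + 0) exit 1   # older than the minimum
        }
        exit 0                                # equal in every compared field
    }'
}

# Example use (commented out so that sourcing this file has no side effects):
#   if kernel_at_least "$(uname -r)" 2.6.35.10-72.fc14; then
#       echo "running kernel includes the Fedora 14 fix"
#   fi
```

The comparison is only meaningful within one Fedora release; the fixed build above applies to Fedora 14.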