Bug 623656

Summary: [xfs] xfstests 231 232 233 quota fails when testing on xfs
Product: Red Hat Enterprise Linux 6
Reporter: Boris Ranto <branto>
Component: quota
Assignee: Petr Pisar <ppisar>
Status: CLOSED ERRATA
QA Contact: Martin Cermak <mcermak>
Severity: medium
Docs Contact:
Priority: low
Version: 6.1
CC: dchinner, esandeen, fnadge, mcermak, ovasik, rvokal
Target Milestone: rc
Keywords: Patch, RHELNAK
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: quota-3.17-11.el6
Doc Type: Bug Fix
Doc Text:
Cause: A sequence of XFS file system remounts that switches quota enforcement on, off, and on again.
Consequence: Quota utilities do not recognize the file system as having quotas enabled and refuse to operate on it, because the mount(8) command does not update /etc/mtab properly.
Fix: A fix that prefers /proc/mounts as the source of the list of file systems with enabled quotas has been back-ported from upstream. /proc/mounts is managed by the kernel, which updates the quota flags properly.
Result: Quota utilities recognize file systems with enabled quotas properly despite the sequence of remounts and continue operating on them.
Story Points: ---
Clone Of:
: 657379 689822 (view as bug list) Environment:
Last Closed: 2011-05-19 14:09:55 UTC Type: ---
Attachments:
Description                                     Flags
Output of test with newer xfstests(20100812)    none
Fix                                             none

Description Boris Ranto 2010-08-12 13:00:56 UTC
Created attachment 438431 [details]
Output of test with newer xfstests(20100812)

Description of problem:
/kernel/filesystems/xfs/xfstests tests 231, 232 and 233 fail because of additional quota output (see Actual results).

Version-Release number of selected component (if applicable):
xfstests-0.0.0.4-20100709
rhel6-snapshot9-10

How reproducible:
100 %

Steps to Reproduce:
1. Schedule beaker job with /kernel/filesystems/xfs/xfstests task with TEST_PARAM_RUNTESTS="231 232 233"
2. Check the results for tests 231 232 233
  
Actual results:
All the tests fail because of additional quota output.

Expected results:
Tests pass.

Additional info
I've also tried upgrading xfstests (git version 20100812). The result can be seen in the attachment.
Here is an example job for the 20100709 git version of xfstests:
https://beaker.engineering.redhat.com/jobs/11509

Comment 2 RHEL Program Management 2010-08-12 13:17:50 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 Eric Sandeen 2010-08-12 13:49:19 UTC
This may just be a test suite problem, but I will look into it to be sure.

Thanks,
-Eric

Comment 4 Eric Sandeen 2010-08-12 17:24:36 UTC
Ok, I can reproduce too, it's likely really just a test problem, this isn't blocker material or anything.

NB: it shows up on xfs (when using the generic vfs quota tools, which isn't really the normal usecase).  ext4 testing on these same filesystems is fine.

Comment 5 Eric Sandeen 2010-08-12 18:02:50 UTC
Jan tells me this is a quota tools problem, I'll dig up the upstream commit that fixed it...

Comment 6 Eric Sandeen 2010-08-12 18:06:28 UTC
I think the upstream fix was:

Fix kern_quota_on() to work with XFS filesystems (Jan Kara)
Fix quotaon to work correctly with XFS filesystems (Jan Kara)


-Eric

Comment 7 Petr Pisar 2010-08-13 08:17:28 UTC
Could you explain which command fails or behaves differently? Where can I get the content of the test, that is, the commands it runs and not just the output? I guess the problem is with quotaon and repquota failing on XFS, but it's just a guess because I do not know what you are trying to do.

Comment 8 Eric Sandeen 2010-08-13 12:14:58 UTC
Petr, the tests are at git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git

I think the issue is with when quota is on, but not enforcing, for xfs.

It's an odd one since people probably don't often use the vfs quota tools on xfs filesystems.

I can try to find time to just test Jan's change with quota to confirm that it fixes the test, and then you can decide if it's a change worth taking.  It's certainly not critical for this current release I think.

Thanks,
-Eric

Comment 9 Petr Pisar 2010-11-16 17:03:56 UTC
I have gotten to this problem now.

First, note that the XFS tools are very sensitive to the size of the underlying device. xfs_repair cannot operate on a 20 MB device as it cannot figure out the geometry. Running test 231 on a 100 MB device (on Fedora 14) fails with `file system busy or already mounted'. A size of 50 MB makes all the tools happy and I can reproduce the reported problem. Not to mention that the tests need to be run against a bare block device, not a symlink to the device (the tests parse `lstat64' output).

The bad news is that test 231 still fails with the same symptoms even when running against the quota tools from git HEAD.

I hope I'll have more time for this issue in the next few days.

Comment 10 Eric Sandeen 2010-11-16 17:40:12 UTC
Well, xfs_repair can cope with a 20M fs, but since it has only 1 ag we need to override the default action of checking a secondary ag.  Maybe that should be made default but frankly there aren't many 20M xfs filesystems in the world.

You're right that xfstests don't play too well with symlinked devices (i.e. lvm) - I've fiddled with this a little but haven't found great generic solutions.
In a pinch you can just use loopback devices.

Anyway, the 50M vs. 100M sounds like a red herring, but if you can identify some problem I can look into it.  Both of these are really too small to worry much about.

In general for maximum flexibility on these quota tests I'd suggest 2G-4G loopback devices if you don't have simple partitions.

-Eric

Comment 11 Petr Pisar 2010-11-18 18:07:17 UTC
Correction: Test 231 passes with quota tools from git HEAD.

However, there is a bug in the test script. For some reason it does not unmount SCRATCH_MNT after preparation and before the tests. The file system remains mounted with the `noquota' option. Then the script complains that it cannot mount an already mounted file system, and repquota cannot report limits because XFS accounting is off.

Unmounting before the real test fixes the problem in the xfstests suite.

diff --git a/231 b/231
index 115b4c0..b2f7c4a 100755
--- a/231
+++ b/231
@@ -65,6 +65,8 @@ _fsx()
 }
 
 # real QA test starts here
+umount $SCRATCH_MNT
+#set -x
 _supported_fs generic
 _supported_os Linux
 _require_scratch


I will verify later that the patches you pointed to fix the problem.

Comment 12 Eric Sandeen 2010-11-18 18:13:21 UTC
Argh sorry about the test problem.  I'll look into that, in general I think we shouldn't have to manually unmount that fs... maybe it's left from the prior test?

Comment 13 Petr Pisar 2010-11-23 15:56:38 UTC
It's not left over. I call ./check 231 only when the mount point is not mounted, and according to my probes the code in test 231 before the line

# real QA test starts here

(as shown in the diff above) mounts it and keeps it mounted.

Comment 14 Eric Sandeen 2010-11-23 16:09:31 UTC
ok, thanks for checking.

Comment 15 Petr Pisar 2010-11-23 16:18:32 UTC
I've tracked the repquota problem in test 231.

repquota from Fedora-14 parses /etc/mtab. repquota from git HEAD (actually since d271d0a329a5b3e99f4ad209b5270a8648739ff0 (Use /proc/mounts for mountpoint scanning)) parses /proc/mounts.

At the time repquota is executed, the mount point line looks as follows:

$ grep scratch /etc/mtab
/dev/mapper/vg_dhcp0122-test--scratch /mnt/test-scratch xfs rw,context="system_u:object_r:nfs_t:s0",usrquota,grpquota,noquota,usrquota,grpquota 0 0

$ grep scratch /proc/mounts 
/dev/mapper/vg_dhcp0122-test--scratch /mnt/test-scratch xfs rw,context=system_u:object_r:nfs_t:s0,relatime,attr2,usrquota,grpquota 0 0

As you can see, mount(8) edits mtab in a really weird manner and that confuses repquota.
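
To illustrate why the two lines above can be read differently, here is a minimal sketch. It assumes a parser that treats any `noquota' token as disabling quotas, regardless of what follows it. That rule and the helper name quota_enabled are assumptions made for illustration; the actual check in the quota tools is not shown in this report.

```shell
# Hypothetical helper: say whether an option string looks quota-enabled
# to a parser in which any "noquota" token wins (an assumed rule, not
# taken from the quota sources).
quota_enabled() {
    case ",$1," in
        *,noquota,*) echo no; return ;;
    esac
    case ",$1," in
        *,usrquota,*|*,grpquota,*) echo yes ;;
        *) echo no ;;
    esac
}

# Option strings copied from the two grep outputs above.
mtab_opts='rw,context="system_u:object_r:nfs_t:s0",usrquota,grpquota,noquota,usrquota,grpquota'
proc_opts='rw,context=system_u:object_r:nfs_t:s0,relatime,attr2,usrquota,grpquota'

echo "mtab:        $(quota_enabled "$mtab_opts")"
echo "proc/mounts: $(quota_enabled "$proc_opts")"
```

Under that assumed rule the mtab line is reported as `no' and the /proc/mounts line as `yes', which would match the difference in repquota behaviour described here.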

Comment 16 Eric Sandeen 2010-11-23 16:39:58 UTC
yeesh!

Ok so somebody was doing a mount -o remount, and I think that's what led to the really weird mount string?

For what it's worth, quota-4.0.0 on a rhel6 box passes unmodified test 231 for me ... I can't demonstrate a test script bug.

Comment 17 Petr Pisar 2010-11-23 17:24:13 UTC
I cannot see the double mount on RHEL-6 (and it does not complain about a busy file system); however, on Fedora-14 I can see it with strace:

$ < /tmp/231.log  grep execve |grep mount | grep -E '(scratch|dm-7)'
[pid 32155] execve("/bin/umount", ["umount", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32166] execve("/bin/mount", ["/bin/mount", "-t", "xfs", "-o", "context=system_u:object_r:nfs_t:"..., "/dev/dm-7", "/mnt/test-scratch/"], [/* 97 vars */]) = 0
[pid 32409] execve("/bin/mount", ["/bin/mount", "-t", "xfs", "-o", "context=system_u:object_r:nfs_t:"..., "/dev/dm-7", "/mnt/test-scratch/"], [/* 97 vars */]) = 0
[pid 32434] execve("/bin/mount", ["mount", "-o", "remount,noquota", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32435] execve("/bin/mount", ["mount", "-o", "remount,usrquota,grpquota", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32495] execve("/bin/mount", ["mount", "-o", "remount,noquota", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32496] execve("/bin/mount", ["mount", "-o", "remount,usrquota,grpquota", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32528] execve("/bin/mount", ["mount", "-o", "remount,noquota", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32529] execve("/bin/mount", ["mount", "-o", "remount,usrquota,grpquota", "/dev/dm-7"], [/* 97 vars */]) = 0
[pid 32542] execve("/bin/umount", ["umount", "/dev/dm-7"], [/* 97 vars */]) = 0

Here you can see the funny remount sequence that leads to the broken /etc/mtab.

Fedora-14 behaviour is not the subject of this bug report; I just add it as a note that such problems exist.
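
The on/off toggling in the trace above can be pulled out with a small filter. This is only a sketch: remount_sequence is a hypothetical helper, and it assumes the execve lines have exactly the shape shown in the strace excerpt.

```shell
# Hypothetical helper: read strace execve lines on stdin and print the
# -o argument of every "mount -o remount,..." call, one per line.
remount_sequence() {
    sed -n 's/.*execve("\/bin\/mount", \["mount", "-o", "\(remount,[^"]*\)".*/\1/p'
}
```

It could be used as `grep execve /tmp/231.log | remount_sequence'. Lines that do not describe a remount, such as the initial mount and the umount calls, produce no output.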

Comment 18 Eric Sandeen 2010-11-24 14:53:11 UTC
Weird.  Well, thanks.  I'll keep an eye out for it.

Comment 19 Petr Pisar 2010-11-25 16:05:32 UTC
Created attachment 462931 [details]
Fix

Back-ported fix from the 4.00_pre1 upstream version.

This patch makes the quota tools parse /proc/mounts in preference to /etc/mtab, as /etc/mtab gets corrupted by noquota/usrquota remounts.

Comment 20 Petr Pisar 2010-11-25 16:28:15 UTC
How to test:

(1) Create an XFS file system
(2) Mount it with -o usrquota
(3) Remount with -o remount,noquota
(4) Remount with -o remount,usrquota
(5) Run repquota -nu on the file system

The affected version complains:
repquota: Mountpoint (or device) /mnt/test not found or has no quota enabled.
repquota: Not all specified mountpoints are using quota.

The fixed version prints quota details:
*** Report for user quotas on device /dev/mapper/vg_dhcp0122-test
Block grace time: 7days; Inode grace time: 7days
                        Block limits                File limits
User            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
#0        --       0       0       0              3     0     0
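
The five steps above can be sketched as a dry-run script. DEV and MNT are placeholders, and the step and repro helpers are hypothetical names introduced here. By default the script only prints the commands; setting RUN=1 (as root, on a disposable device) would execute them.

```shell
# Dry-run sketch of the five test steps. DEV and MNT are placeholders.
DEV=${DEV:-/dev/sdX1}
MNT=${MNT:-/mnt/test}

# Print a command and execute it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

repro() {
    step mkfs.xfs "$DEV" || return 1                  # (1) create the file system
    step mount -o usrquota "$DEV" "$MNT" || return 1  # (2) mount with user quotas
    step mount -o remount,noquota "$MNT" || return 1  # (3) switch quotas off
    step mount -o remount,usrquota "$MNT" || return 1 # (4) switch them on again
    step repquota -nu "$MNT"                          # (5) ask for the quota report
}

repro
```

Whether step (5) complains or prints a report is the observable difference between the affected and the fixed package.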

Comment 21 Dave Chinner 2011-01-14 09:05:00 UTC
As a different workaround, just adding /mnt/test and /mnt/scratch to /etc/fstab will make the tests work properly with an existing repquota binary. That has always worked for me....

Comment 26 Petr Pisar 2011-02-03 15:35:51 UTC
(In reply to comment #20)

Actually, a better test is to check that the new quota tools parse /proc/mounts instead of /etc/mtab:

Run `strace -eopen repquota -a' (no file system with quotas is needed) and watch stderr.
Before: /etc/mtab is opened and /proc/mounts is not used:
  open("/etc/mtab", O_RDONLY) = 3
After: /proc/mounts is opened instead of /etc/mtab:
  open("/proc/mounts", O_RDONLY) = 3
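
The before/after check can be scripted roughly as follows. mount_table_used is a hypothetical helper. Besides the open() form shown above, it also accepts openat(), which is an assumption about how newer systems might report the same call.

```shell
# Hypothetical helper: read strace output on stdin and name the mount
# table that the traced program opened. /proc/mounts takes precedence
# if both files show up in the trace.
mount_table_used() {
    awk '
        /open(at)?\([^"]*"\/proc\/mounts"/ { proc = 1 }
        /open(at)?\([^"]*"\/etc\/mtab"/    { mtab = 1 }
        END {
            if (proc)      print "/proc/mounts"
            else if (mtab) print "/etc/mtab"
            else           print "none"
        }'
}
```

Since strace writes to stderr, a possible invocation is `strace -eopen repquota -a 2>&1 | mount_table_used'.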

Comment 28 Martin Cermak 2011-03-16 13:53:11 UTC
Verified according to https://beaker.engineering.redhat.com/jobs/62389.

Comment 29 Florian Nadge 2011-03-30 14:40:53 UTC
Hi,
I am reviewing and editing erratum:
http://errata.devel.redhat.com/errata/stateview/10703
and need some more details for this bug in order to state all the necessary
points.

Could you give a few keywords for the points I pasted into the Technical Notes
field? Once I have the text, I can start the approval process.

Thanks

Comment 30 Florian Nadge 2011-03-30 14:40:54 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
    What actions or circumstances cause this bug to present.
Consequence
    What happens when the bug presents.
Fix
    What was done to fix the bug.
Result
    What now happens when the actions or circumstances above occur.
    Note: this is not the same as the bug doesn’t present anymore.

Comment 31 Petr Pisar 2011-03-31 11:36:22 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,9 +1,17 @@
 Cause
-    What actions or circumstances cause this bug to present.
+    Sequence of XFS file system remounts to switch on, off,
+    and on again quota enforcement.
 Consequence
-    What happens when the bug presents.
+    Quota utilities does not recognize the file system
+    as having quotas enabled and refuse to operate on it.
+    This is because mount(8) command does not update
+    /etc/mtab properly.
 Fix
-    What was done to fix the bug.
+    Fix that prefer /proc/mounts to get list of file
+    systems with enabled quotas has been back-ported from
+    upstream. /proc/mounts is managed by kernel that
+    changes quota flags properly.
 Result
-    What now happens when the actions or circumstances above occur.
+    Quota utilities recognize file system with enabled
-    Note: this is not the same as the bug doesn’t present anymore.
+    quotas properly despite sequence of remounts and
+    continue operating on enabled file systems.

Comment 32 errata-xmlrpc 2011-05-19 14:09:55 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0716.html