Bug 1233220

Summary: xfs_admin -c and -U appears not to have any effect
Product: [Fedora] Fedora Reporter: Richard W.M. Jones <rjones>
Component: xfsprogsAssignee: Eric Sandeen <esandeen>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 23CC: esandeen
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-26 19:33:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 910269    

Description Richard W.M. Jones 2015-06-18 13:17:24 UTC
Description of problem:

We observed some strange things happening in the libguestfs
test suite.  It seems as if the following commands have no
effect:

xfs_admin -c 0 /dev/sda1
xfs_admin -U <a_UUID> /dev/sda1

(where /dev/sda1 is an XFS partition)

Previously we ran these commands and then immediately (in
a script) ran xfs_info to read back the settings to see if
they had changed.  But this no longer works in current Rawhide.

Do we need to insert some kind of synchronization command
after the xfs_admin command?

A log demonstrating the problem is here.  You need to search
for the string "FAIL:" to see the two tests that fail.

https://kojipkgs.fedoraproject.org//work/tasks/1211/10111211/build.log

Version-Release number of selected component (if applicable):

xfsprogs-3.2.3-1.fc23.x86_64
kernel 4.1.0-0.rc8.git0.2.fc23

How reproducible:

Happened at least once.

Steps to Reproduce:
1. xfs_admin -c 0 /dev/sda1
2. mount /dev/sda1 /sysroot
3. xfs_info /sysroot
4. Check the lazy-count (should be =0, in fact =1)

Similar for the xfs_admin -U case.

Comment 1 Eric Sandeen 2015-06-18 15:26:50 UTC
uuids got tricky in 3.2.3, which enables V5 / CRC filesystems by default.  These filesystems stamp the filesystem UUID into every piece of metadata, so changing it is nontrivial, and (I thought) is disabled for now.  I have patches pending to re-allow it, albeit with a (reversible) incompatible disk format change.  Those patches require kernel and userspace changes, and are still pending.

Anyway, xfs_admin -U should fail w/ a clear explanation, right now:

# xfs_admin -U generate /dev/sdd
xfs_admin: only 'rewrite' supported on V5 fs
# xfs_admin -U fed6c018-6ecf-4ec1-a3ef-111111111111 /dev/sdd
xfs_admin: only 'rewrite' supported on V5 fs

As for lazy counters, let me look into that.

(Is there some reason you want to turn them off?)

Comment 2 Richard W.M. Jones 2015-06-18 15:51:25 UTC
(In reply to Eric Sandeen from comment #1)
> uuids got tricky in 3.2.3, which enables V5 / CRC filesystems by default. 
> These filesystems stamp the filesystem UUID into every piece of metadata, so
> changing it is nontrivial, and (I thought) is disabled for now.  I have
> patches pending to re-allow it, albeit with a (reversible) incompatible disk
> format change.  Those patches require kernel and userspace changes, and are
> still pending.
> 
> Anyway, xfs_admin -U should fail w/ a clear explanation, right now:
> 
> # xfs_admin -U generate /dev/sdd
> xfs_admin: only 'rewrite' supported on V5 fs
> # xfs_admin -U fed6c018-6ecf-4ec1-a3ef-111111111111 /dev/sdd
> xfs_admin: only 'rewrite' supported on V5 fs

So I guess we might be looking at two different bugs?  I didn't
explain the UUID part of this bug very well, but what libguestfs
does in the test is this sequence of commands:

wipefs -a --force /dev/sda1
mkfs -t xfs -f /dev/sda1
xfs_admin -L newlabel /dev/sda1
blkid -c /dev/null -o value -s LABEL /dev/sda1
# ^^ this returns "newlabel" correctly
xfs_admin -U 01234567-0123-0123-0123-0123456789ab /dev/sda1
blkid -c /dev/null -o value -s UUID /dev/sda1
# ^^ this returns some other UUID, probably the original one

We check the return values of all commands, and none of
them fail.  Also, these commands are executed in rapid
sequence, with no synchronization or udev settle or anything
like that.

> As for lazy counters, let me look into that.
> 
> (Is there some reason you want to turn them off?)

No :-)  I have no particular design to turn this setting on or
off, but it was contributed as an API upstream, and we test that
API, and we've discovered it stopped working :-)

Comment 3 Richard W.M. Jones 2015-06-18 15:54:44 UTC
s/design/desire/

Comment 4 Eric Sandeen 2015-06-18 15:59:22 UTC
(In reply to Richard W.M. Jones from comment #2)

> So I guess we might be looking at two different bugs?  I didn't
> explain the UUID part of this bug very well, but what libguestfs
> does in the test is this sequence of commands:
> 
> wipefs -a --force /dev/sda1
> mkfs -t xfs -f /dev/sda1
> xfs_admin -L newlabel /dev/sda1
> blkid -c /dev/null -o value -s LABEL /dev/sda1
> # ^^ this returns "newlabel" correctly
> xfs_admin -U 01234567-0123-0123-0123-0123456789ab /dev/sda1
> blkid -c /dev/null -o value -s UUID /dev/sda1
> # ^^ this returns some other UUID, probably the original one
> 
> We check the return values of all commands, and none of
> them fail.

Yeah, whoops, our bad:

# xfs_admin -U fed6c018-6ecf-4ec1-a3ef-111111111111 /dev/sdd
xfs_admin: only 'rewrite' supported on V5 fs
# echo $?
0

...

> > As for lazy counters, let me look into that.
> > 
> > (Is there some reason you want to turn them off?)
> 
> No :-)  I have no particular design to turn this setting on or
> off, but it was contributed as an API upstream, and we test that
> API, and we've discovered it stopped working :-)

Yeah, thanks, I'll look into it.  It's probably an undocumented and poorly implemented design decision.  ;)

Comment 5 Eric Sandeen 2015-06-18 16:09:06 UTC
Ugh, xfs_db (called by xfs_admin) exit codes are a disaster.  :/

Comment 6 Eric Sandeen 2015-06-18 16:32:53 UTC
Ok, sigh, on the lazycount front:

static inline int xfs_sb_version_haslazysbcount(xfs_sb_t *sbp)
{
        return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) ||
               (xfs_sb_version_hasmorebits(sbp) &&
                (sbp->sb_features2 & XFS_SB_VERSION2_LAZYSBCOUNTBIT));
}

so even though xfs_admin -c 0 turns it off, un-sets the bit, and writes that to the superblock (the feature bit is indeed unset on disk), the code above reports that it's enabled, and with V5 superblocks always proceeds as if enabled, regardless of the value of the feature bit on disk.

Rawr.

We *are* trying to limit this matrix; it makes sense to not allow disabling the feature on the newer superblocks, but in that case we should also fail the command to unset it, and *certainly* not write the change to disk if we're going to ignore it on read.

Comment 7 Eric Sandeen 2015-06-18 16:43:42 UTC
Brave new world:

# repair/xfs_repair -c lazycount=0 /dev/sdd
Phase 1 - find and verify superblock...
Cannot change lazy-counters on V5 fs
# echo $?
1

Comment 8 Jan Kurik 2015-07-15 13:58:55 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could affect also pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23

Comment 9 Eric Sandeen 2016-02-26 19:33:19 UTC
Ok, both of these are now resolved upstream, both in xfsprogs-4.2.0:

commit c40f279af27c4ecdbc19cd906b022c1326c59d17
Author: Eric Sandeen <sandeen>
Date:   Fri Jul 31 09:04:56 2015 +1000

    xfs_repair: refuse to unset lazycount on V5 filesystems
    
    Running "xfs_admin -c 0" on a V5 fs has unfortunate results.
    
    Before:
    versionnum [0xbca5+0x18a] = ... LAZYSBCOUNT,PROJID32BIT,CRC, ...
    After:
    versionnum [0xbca5+0x188] = ... LAZYSBCOUNT,PROJID32BIT,CRC, ...
    
    The on-disk feature bit is in fact changed, but xfs_db reports that the
    feature is still present.  This is because xfs_sb_version_haslazysbcount()
    ignores the feature bit, and always reports that the lazy-count feature is
    present on a V5 fs.
    
    i.e. the intent is that this feature is not optional on a V5 fs, yet we
    still flip the bit on disk when asked.  Not good.
    
    So, document this, and error out if the user tries to disable lazy-count
    on a version 5 filesystem.
    
    Reported-by: Richard W.M. Jones <rjones>
    Signed-off-by: Eric Sandeen <sandeen>
    Reviewed-by: Brian Foster <bfoster>
    Signed-off-by: Dave Chinner <david>


commit 9c4e12fb60c15dc9c5e54041c9679454b42cb23e
Author: Eric Sandeen <sandeen>
Date:   Mon Aug 3 10:45:00 2015 +1000

    xfsprogs: Add new sb_meta_uuid field, update userspace tools to manipulate it
    
    This adds a new superblock field, sb_meta_uuid.  This allows us to
    change the use-visible UUID on crc-enabled filesytems from userspace
    if desired, by copying the existing UUID to the new location for
    metadata comparisons.  If this is done, an incompat flag must be
    set to prevent older filesystems from mounting the filesystem, but
    the original UUID can be restored, and the incompat flag removed,
    with a new xfs_db / xfs_admin UUID command, "restore."
    
    Much of this patch mirrors the kernel patch in simply renaming
    the field used for metadata uuid comparison; other bits:
    
    * Teach xfs_db to print the new meta_uuid field
    * Allow xfs_db to generate a new UUID for CRC-enabled filesystems
    * Allow xfs_db to revert to the original UUID and clear the flag
    * Fix up xfs_copy to work with CRC-enabled filesystems
    * Update the xfs_admin manpage to show the UUID "restore" command
    
    Signed-off-by: Eric Sandeen <sandeen>
    Reviewed-by: Brian Foster <bfoster>
    Signed-off-by: Dave Chinner <david>