Bug 237538

Summary:	mount.gfs2 doesn't play well with local fs on loopback devices
Product:	[Fedora] Fedora	Reporter:	Eric Sandeen <esandeen>
Component:	gfs2-utils	Assignee:	Chris Feist <cfeist>
Status:	CLOSED DUPLICATE	QA Contact:
Severity:	medium	Docs Contact:
Priority:	medium
Version:	rawhide	CC:	rpeterso, sgrubb, swhiteho, teigland
Target Milestone:	---	Keywords:	Reopened
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2008-03-12 15:41:43 UTC	Type:	---
Bug Depends On:	429769
Bug Blocks:	237544

Description Eric Sandeen 2007-04-23 18:09:31 UTC

make a local gfs2 fs and try to mount it via loopback:

[root@neon tmp]# dd if=/dev/zero of=fsfile bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.239597 seconds, 438 MB/s
[root@neon tmp]# mkfs.gfs2 -p lock_nolock -j 1 fsfile 
This will destroy any data on fsfile.

Are you sure you want to proceed? [y/n] y

Device:                    fsfile
Blocksize:                 4096
Device Size                0.10 GB (25600 blocks)
Filesystem Size:           0.10 GB (25599 blocks)
Journals:                  1
Resource Groups:           1
Locking Protocol:          "lock_nolock"
Lock Table:                ""

[root@neon tmp]# dmesg -c > /dev/null
[root@neon tmp]# mount -o loop fsfile mnt/
/sbin/mount.gfs2: can't find /proc/mounts entry for directory mnt
[root@neon tmp]# dmesg
GFS2: fsid=: Trying to join cluster "lock_nolock", "loop0"
GFS2: fsid=loop0.0: Joined cluster. Now mounting FS...
GFS2: fsid=loop0.0: jid=0, already locked for use
GFS2: fsid=loop0.0: jid=0: Looking at journal...
GFS2: fsid=loop0.0: jid=0: Done

loop0 is still set up though:

[root@neon tmp]# losetup /dev/loop0
/dev/loop0: [0802]:33847258 (fsfile)
[root@neon tmp]# losetup /dev/loop1
loop: can't get info on device /dev/loop1: No such device or address

now bypass mount.gfs2:

[root@neon tmp]# mount -i -o loop fsfile mnt/

mounts and sets up another loopback device:

[root@neon tmp]# losetup /dev/loop1
/dev/loop1: [0802]:33847258 (fsfile)

umount fails too:

[root@neon tmp]# umount mnt/
/sbin/umount.gfs2: file system mounted on /tmp/mnt not found in mtab

works if you bypass umount.gfs2:

[root@neon tmp]# umount -i mnt/

original failed mount never cleaned up loop0:

[root@neon tmp]# losetup /dev/loop0
/dev/loop0: [0802]:33847258 (fsfile)
[root@neon tmp]# losetup /dev/loop1
loop: can't get info on device /dev/loop1: No such device or address

this was all on a reasonably up-to-date FC6 box

Comment 1 Eric Sandeen 2007-04-23 18:10:45 UTC

mount -v output at dct's request:

[root@neon tmp]# mount -v -o loop fsfile mnt/
mount: going to use the loop device /dev/loop0
mount: you didn't specify a filesystem type for /dev/loop0
       I will try type gfs2
/sbin/mount.gfs2: mount /dev/loop0 mnt
/sbin/mount.gfs2: parse_opts: opts = "rw"
/sbin/mount.gfs2:   clear flag 1 for "rw", flags = 0
/sbin/mount.gfs2: parse_opts: flags = 0
/sbin/mount.gfs2: parse_opts: extra = ""
/sbin/mount.gfs2: parse_opts: hostdata = ""
/sbin/mount.gfs2: parse_opts: lockproto = ""
/sbin/mount.gfs2: parse_opts: locktable = ""
/sbin/mount.gfs2: mount(2) ok
/sbin/mount.gfs2: can't find /proc/mounts entry for directory mnt

Comment 2 David Teigland 2007-04-23 18:23:15 UTC

This is caused by the mount point "mnt" being given without a leading "/".
/proc/mounts always shows the mount point with its leading "/", so
mount.gfs2 fails to match "mnt" against "/mnt".
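
A minimal sketch of the comparison described above, assuming the helper
resolves the mount point against the current working directory before it
scans /proc/mounts. It is written in Python for brevity; the function name
find_mount_entry and its arguments are hypothetical and are not taken from
the mount.gfs2 source.

```python
import os


def find_mount_entry(mountpoint, cwd, mounts_text):
    """Return the /proc/mounts line whose mount point matches, or None.

    mountpoint may be relative ("mnt", "mnt/"); it is resolved against cwd
    so it can be compared with the absolute paths listed in /proc/mounts.
    """
    # normpath also drops a trailing slash, so "mnt/" becomes "/tmp/mnt".
    wanted = os.path.normpath(os.path.join(cwd, mountpoint))
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field of each /proc/mounts line is the mount point.
        if len(fields) >= 2 and fields[1] == wanted:
            return line
    return None


# Sample line in /proc/mounts format (illustrative data, not a real capture).
sample = "/dev/loop0 /tmp/mnt gfs2 rw,relatime,localflocks,localcaching 0 0\n"
print(find_mount_entry("mnt/", "/tmp", sample))
```

A real helper would probably also want to resolve symlinks (for example with
realpath(3)) before comparing; that step is left out of this sketch.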

Comment 3 Eric Sandeen 2008-03-05 17:15:41 UTC

Just tested on rawhide:

[root@localhost tmp]# dd if=/dev/zero of=fsfile bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 1.50871 s, 69.5 MB/s
[root@localhost tmp]# mkfs.gfs2 -p lock_nolock -j 1 fsfile 
This will destroy any data on fsfile.

Are you sure you want to proceed? [y/n] y

Device:                    fsfile
Blocksize:                 4096
Device Size                0.10 GB (25600 blocks)
Filesystem Size:           0.10 GB (25599 blocks)
Journals:                  1
Resource Groups:           1
Locking Protocol:          "lock_nolock"
Lock Table:                ""

[root@localhost tmp]# mkdir mnt
[root@localhost tmp]# mount -o loop fsfile mnt/
/sbin/mount.gfs2: error 22 mounting /dev/loop0 on mnt
[root@localhost tmp]# dmesg | tail -n 5
ark3116 5-2:0.0: device disconnected
GFS2 (built Jan 25 2008 13:16:34) installed
GFS2: fsid=: unknown option: loop=/dev/loop0
GFS2: fsid=: invalid mount option(s)
GFS2: can't parse mount arguments

doesn't appear to be fixed.  Or, now there is a new problem.  Or, the fix is not
really upstream...

-Eric

Comment 4 Eric Sandeen 2008-03-05 17:16:53 UTC

ah, was not quite rawhide.  kernel was 2.6.24-2.fc9

will test 2.6.25 shortly.

Comment 5 Steve Whitehouse 2008-03-05 17:19:08 UTC

AFAIK it's due to Fedora having an ancient build of gfs2-utils, which is
supposed to be corrected shortly.

Comment 6 Eric Sandeen 2008-03-05 17:26:06 UTC

sounds good.  I'll retest & close when the new one hits.

-Eric

Comment 7 Steve Whitehouse 2008-03-05 17:33:01 UTC

Btw, bz #429769 is the bz for gfs2-utils being out of date with respect to
upstream. So I still think it's fixed upstream; it just hasn't made it to
Fedora so far.

Comment 8 Eric Sandeen 2008-03-05 17:36:43 UTC

Ok.  If you really want to close this based on it being upstream that's fine;
otherwise I don't mind leaving it in needinfo state from me, and I'll test when
it hits fedora (since that's what it was originally filed against...)

-Eric

Comment 9 Eric Sandeen 2008-03-05 18:30:28 UTC

Ok, for fun, I tried code from git master:

[root@localhost tmp]# dd if=/dev/zero of=fsfile bs=1M count=100
[root@localhost tmp]# mkfs.gfs2 -O -p lock_nolock -j 1 fsfile
mkfs.gfs2: out of space

(hmm that's new behavior; what's the minimum filesystem size now?)

[root@localhost tmp]# dd if=/dev/zero of=fsfile bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 5.6532 s, 37.1 MB/s
[root@localhost tmp]# mkfs.gfs2 -O -p lock_nolock -j 1 fsfile
Device:                    fsfile
Blocksize:                 4096
Device Size                0.20 GB (51200 blocks)
Filesystem Size:           0.20 GB (51197 blocks)
Journals:                  1
Resource Groups:           1
Locking Protocol:          "lock_nolock"
Lock Table:                ""

[root@localhost tmp]# mount -o loop fsfile mnt/
/sbin/mount.gfs2: error mounting /dev/loop3 on /tmp/mnt: Invalid argument
[root@localhost tmp]# dmesg | tail -n 3
GFS2: fsid=: unknown option: loop=/dev/loop3
GFS2: fsid=: invalid mount option(s)
GFS2: can't parse mount arguments
[root@localhost tmp]# mount -i -o loop fsfile mnt/
[root@localhost tmp]# grep gfs2 /proc/mounts
/dev/loop5 /tmp/mnt gfs2 rw,relatime,localflocks,localcaching 0 0
[root@localhost tmp]# umount mnt
/sbin/umount.gfs2: file system mounted on /tmp/mnt not found in mtab
[root@localhost tmp]# grep gfs2 /proc/mounts
[root@localhost tmp]# 

looks like pretty much the same behavior.

[root@localhost tmp]# mkfs.gfs2 -V
gfs2_mkfs DEVEL.1204739061 (built Mar  5 2008 11:46:33)
Copyright (C) Red Hat, Inc.  2004-2007  All rights reserved.
[root@localhost tmp]# mount.gfs2 -V
mount.gfs2 DEVEL.1204739061 (built Mar  5 2008 11:46:36)
[root@localhost tmp]# uname -a
Linux localhost.localdomain 2.6.25-0.90.rc3.git5.fc9 #1 SMP Tue Mar 4 20:37:36
EST 2008 i686 i686 i386 GNU/Linux

Comment 10 Steve Whitehouse 2008-03-05 20:09:30 UTC

Maybe the out of space thing is due to a different default journal size? That's
now 128M and probably should be scaled back for smaller filesystems.
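
As a rough check of that guess, here is the block arithmetic, assuming "128M"
means 128 MiB and using the 4096-byte block size reported by mkfs.gfs2 earlier
in this report. This is only a sketch in Python; it ignores all other metadata
overhead.

```python
BLOCK_SIZE = 4096        # bytes, as reported by mkfs.gfs2 in this report
MIB = 1024 * 1024


def blocks(size_mib):
    """Number of BLOCK_SIZE blocks in size_mib MiB."""
    return size_mib * MIB // BLOCK_SIZE


journal = blocks(128)    # assumed default journal size (128 MiB)
for file_mib in (100, 200):
    device = blocks(file_mib)
    # Prints: image size in MiB, device blocks, journal blocks, fits?
    print(file_mib, device, journal, device > journal)
# 100 25600 32768 False
# 200 51200 32768 True
```

Under those assumptions a single journal alone (32768 blocks) would not fit in
the 100 MB image (25600 blocks) but would fit in the 200 MB one (51200 blocks),
which is consistent with the behavior seen in comment #9.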

Since it seems to be the same problem, I'll put this back on the "input queue"
for looking at again. I've seen a number of similar reports where mount.gfs2
doesn't understand some argument or other. I wonder if we can simply ignore
arguments which are not understood. If we can't then we need to do an audit of
all the likely arguments and ensure that our list is complete.
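
A minimal sketch of the "ignore what is not understood" idea, in Python for
brevity. The function name split_options is hypothetical, and KNOWN_OPTS is an
illustrative, incomplete set made up only of option names that appear in the
logs in this report; it is not the real list accepted by gfs2.

```python
# Illustrative and incomplete: option names seen in the logs in this report.
KNOWN_OPTS = {"rw", "relatime", "lockproto", "locktable", "hostdata",
              "localflocks", "localcaching"}


def split_options(opts):
    """Split a comma-separated option string into (known, unknown) lists."""
    known, unknown = [], []
    for opt in opts.split(","):
        if not opt:
            continue  # skip empty items such as a trailing comma
        # Compare only the name part of "name=value" options.
        name = opt.split("=", 1)[0]
        (known if name in KNOWN_OPTS else unknown).append(opt)
    return known, unknown


known, unknown = split_options("rw,loop=/dev/loop0,lockproto=lock_nolock")
print(known)    # options that would be passed on
print(unknown)  # options that would be dropped (or logged)
```

The audit alternative amounts to making KNOWN_OPTS complete; the trade-off is
that silently dropping unknown options can hide typos in real options.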

Comment 11 Steve Whitehouse 2008-03-10 15:57:50 UTC

I think the problem here is in mount.gfs2. So far as I can tell from a fairly
quick look at this we need to check for "loop=/dev/foo" as a parameter to
mount.gfs2 and then pass /dev/foo to the kernel when mounting gfs2 rather than
the "normal" device field (which will contain the file backing the loopback dev).

So it ought to be a fairly easy fix.
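
A minimal sketch of that fix, in Python for brevity. The function name
apply_loop_option is hypothetical and this is not the mount.gfs2 code; it only
shows the intended transformation of the device and option string.

```python
def apply_loop_option(device, opts):
    """Return (device_for_kernel, opts_for_kernel).

    If opts contains "loop=/dev/foo", use /dev/foo as the device and drop
    that item so the kernel never sees an option it does not understand.
    """
    kept = []
    for opt in opts.split(","):
        if opt.startswith("loop="):
            device = opt[len("loop="):]
        elif opt:
            kept.append(opt)
    return device, ",".join(kept)


print(apply_loop_option("fsfile", "rw,loop=/dev/loop0"))
```

With no "loop=" item present, the device and options pass through unchanged.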

Comment 12 Robert Peterson 2008-03-11 21:15:27 UTC

I can't recreate this problem.  I've tried a lot of combinations and
loopback always seems to be working for me.  Things I tried include:

1. Different kernels: Upstream works.  2.6.18-83.el5 works.
2. Stand-alone (plain mount) vs. cluster infrastructure (service cman).
   Both work.
3. Not specifying any slashes on mount point.  Works.
4. Not specifying any slashes on device.  Works.
5. Different versions of gfs2-utils.  gfs2-utils-0.1.38-1.el5 works as
   well as gfs2-utils-0.1.43-1.el5.

Example output:
[root@roth-01 ~]# mount -o loop -tgfs2 /home/bob/bone_me /mnt/gfs2
[root@roth-01 ~]# ls /mnt/gfs2/
fstab
[root@roth-01 ~]# umount /mnt/gfs2
[root@roth-01 ~]# cd /mnt
[root@roth-01 /mnt]# mount -o loop -tgfs2 /home/bob/bone_me gfs2
[root@roth-01 /mnt]# umount /mnt/gfs2
[root@roth-01 /mnt]# cd /home/bob
[root@roth-01 /home/bob]# mount -o loop -tgfs2 bone_me /mnt/gfs2
[root@roth-01 /home/bob]# ls /mnt/gfs2/
fstab
[root@roth-01 /home/bob]# umount /mnt/gfs2

Now granted, my version of util-linux might be old:
[root@roth-01 /usr/src/redhat/SRPMS]# mount -V
mount (util-linux 2.13-pre7)

The mount helper is being passed "/dev/loopX" as the device, so
comment #11 does not seem applicable.

So what am I missing?

Comment 13 Eric Sandeen 2008-03-11 21:55:53 UTC

I think you are missing fedora ;)

[root@localhost tmp]# uname -a
Linux localhost.localdomain 2.6.25-0.73.rc3.git1.fc9 #1 SMP Wed Feb 27 21:40:05
EST 2008 i686 i686 i386 GNU/Linux
[root@localhost tmp]# rpm -q gfs2-utils
gfs2-utils-0.1.25-2.fc9.i386

it may just be because the fedora packages are still ancient, but these commands
fail for me in rawhide:

  495  cd /tmp/
  496  dd if=/dev/zero of=fsfile bs=1M count=200
  497  mkfs.gfs2 -O -p lock_nolock -j 1 fsfile
  498  mkdir mnt
  499  mount -o loop fsfile mnt/

But, I tested a git pull of gfs2-utils last week, and that failed for me too,
near as I could tell, though it was a little contrived; I didn't actually fully
install it, just mount.gfs2 and mkfs.gfs2

-Eric

Comment 14 Eric Sandeen 2008-03-11 21:59:59 UTC

The difference is in how mount ultimately is invoked.  From strace -v -ff -ofoo
mount -o loop fsfile mnt/ :

on your rhel5 box:

foo.2947:open("/dev/loop1", O_RDONLY)            = 3
foo.2947:mount("/dev/loop1", "/tmp/mnt", "gfs2", 0, "") = 0
foo.2947:write(3, "/dev/loop1 /tmp/mnt gfs2 rw,loca"..., 57) = 57

on my F9 box:

foo.28598:open("/dev/loop1", O_RDONLY|O_LARGEFILE) = 3
foo.28598:mount("/dev/loop1", "mnt", "gfs2", 0, "loop=/dev/loop1") = -1 EINVAL
(Invalid argument)
foo.28598:write(2, "error 22 mounting /dev/loop1 on "..., 36) = 36

note the "loop=/dev/loop1" - so this looks like it must be a gfs2-utils issue.

perhaps all it will take is to finally update gfs2-utils in rawhide.

-Eric

Comment 15 Robert Peterson 2008-03-12 14:25:44 UTC

So I think the Fedora gfs2 and gfs2-utils code just needs to be rebuilt.
Reassigning to Mr. Feist.

Comment 16 Robert Peterson 2008-03-12 15:41:43 UTC

On second thought, I'm closing this as a duplicate of bug #429769.


*** This bug has been marked as a duplicate of 429769 ***