Bug 857463 - Netfs will unmount NFS partitions leaving orphaned loopback devices when mounting /dev/loop devices over iSCSI
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: nfs-maint
QA Contact: Red Hat Kernel QE team
Reported: 2012-09-14 09:48 EDT by Kyle Squizzato
Modified: 2012-11-26 07:05 EST

Doc Type: Bug Fix
Last Closed: 2012-11-26 07:05:59 EST
Type: Bug


Description Kyle Squizzato 2012-09-14 09:48:23 EDT
Description of problem:
When a loopback device is created against a file or image on an NFS share and the netfs service is stopped, the NFS filesystem is unmounted (it is no longer visible via mount or /proc/mounts), but the /dev/loop0 device stays attached to the now-unreachable file and is left orphaned.

Version-Release number of selected component (if applicable):
initscripts-9.03.31-2.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a system with a filesystem mounted over NFS from fstab, for example at /mnt/nfs
2. Create a file on that filesystem (dd if=/dev/zero bs=1024 count=5 of=/mnt/nfs/share.img)
3. Create a loopback device over that file (losetup /dev/loop0 /mnt/nfs/share.img)
4. Manually start tgtd (not from the init script)
5. Use tgtadm to set up an iSCSI device backed by the loopback device
6. Stop netfs (service netfs stop)
7. Run "mount" - observe that the filesystem is unmounted
8. Run "losetup --all" - observe that the loopback device is still present
9. Optional - Reboot the machine; a hang will occur.
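
The steps above can be collected into a small script. This is only a sketch, assuming the NFS share is already mounted at /mnt/nfs (as in the example output below). The tgtadm invocations, the target ID and LUN numbers, and the IQN are illustrative assumptions, since the report does not give the exact tgtadm commands used. The run and reproduce helpers are hypothetical names. By default the commands are only printed; set RUN=1 (as root) to execute them.

```shell
#!/bin/sh
# Reproducer sketch. Defaults mirror the paths in this report.
NFS_MOUNT=${NFS_MOUNT:-/mnt/nfs}
IMG="$NFS_MOUNT/share.img"
LOOP=${LOOP:-/dev/loop0}
IQN=${IQN:-iqn.2012-09.com.example:looptest}   # hypothetical target name

# Print the command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

reproduce() {
    run dd if=/dev/zero bs=1024 count=5 of="$IMG"          # step 2
    run losetup "$LOOP" "$IMG"                             # step 3
    run tgtd                                               # step 4
    # Step 5: assumed tgtadm usage, not taken from the report.
    run tgtadm --lld iscsi --op new --mode target --tid 1 -T "$IQN"
    run tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b "$LOOP"
    run service netfs stop                                 # step 6
    run mount                                              # step 7
    run losetup --all                                      # step 8
}
```

Calling reproduce with RUN unset prints the command sequence for review; RUN=1 reproduce performs it.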
  
Actual results:
/dev/loopX devices are orphaned.

Expected results:
/dev/loopX devices should be cleaned up when the netfs service is stopped.

Additional info:
Here's a quick example of the orphaned devices:

# service netfs stop
Unmounting NFS filesystems:  umount.nfs: /mnt/nfs: device is busy
umount.nfs: /mnt/nfs: device is busy
                                                           [FAILED]
Unmounting NFS filesystems (retry):                        [  OK  ]
# mount
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
# losetup --all
/dev/loop0: [0016]:259301 (/mnt/nfs/share.img)
Comment 3 Kyle Squizzato 2012-09-14 10:17:41 EDT
Here's a resource that may prove useful as well. You might find this strace of vgs helpful:

https://eucalyptus.atlassian.net/secure/attachment/12009/strace-of-vgs-during-shutdown-public.png

It is attached to this bug:

https://eucalyptus.atlassian.net/browse/EUCA-3183
Comment 6 Dave Wysochanski 2012-11-20 16:46:28 EST
That vgs strace in Comment #3 has nothing to do with this problem as far as I know.  That hang is a side-effect simply because LVM was not configured with a proper filter to reject the loop device.  You should always configure an LVM filter which rejects any device that does not contain LVM metadata.  As far as I know, the loop device is associated with an NFS file, and exported via tgtd, but does not have LVM metadata on it - please correct me if I'm wrong.
Comment 7 Dave Wysochanski 2012-11-21 18:27:47 EST
Actually I take that back.  The original reproducer did include LVM, but the one in this bz does not, hence the reference to vgs failing.  I'm not sure it's relevant to the bug though, and I think that is why Kyle removed it from the reproducer to simplify.
Comment 8 Jeff Layton 2012-11-26 06:59:54 EST
I don't think this is a kernel bug at all. If anything it might be an initscripts bug, but looks even more like NOTABUG to me.

Making a loop device on a file essentially just has the kernel hold it open. So it's quite true that the file is busy when the netfs script tries to unmount the filesystem the first time. The (retry) attempt then does a lazy umount. That just detaches the superblock from the filesystem hierarchy, but leaves it in place until its refcount drops. Once you destroy the loop device, then the superblock should get torn down.
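
The state described here can be observed from a shell: after the lazy unmount the mount point no longer appears in /proc/mounts, while losetup still reports the backing file. A minimal sketch, assuming the /mnt/nfs mount point from the reproducer; is_mounted and report_state are hypothetical helper names, not part of initscripts.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT is the second field of any line in MOUNTS_FILE
# (default: /proc/mounts).
is_mounted() {
    awk -v m="$1" '$2 == m { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Show whether the mount point is still visible, then list loop devices.
report_state() {
    if is_mounted "$1"; then
        echo "$1: still mounted"
    else
        echo "$1: not in /proc/mounts (unmounted, possibly lazily)"
    fi
    losetup -a
}
```

Running report_state /mnt/nfs after "service netfs stop" should show the mount point as gone while /dev/loop0 is still listed, which matches the output in the description.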

So, this all looks like it's working as expected. If you create a loop device manually, why should the netfs script or kernel be responsible for cleaning that up? You should ensure that you do that before running "netfs stop".
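
One way to do that cleanup is to detach every loop device whose backing file lives under the NFS mount point before stopping netfs. The following is a sketch only, assuming the "losetup -a" output format shown in the description; loops_backed_under and detach_loops_under are hypothetical helper names. If the loop device is exported through tgtd, as in the reproducer, the LUN would presumably have to be removed (or tgtd stopped) first, otherwise the detach is likely to fail because the device is busy.

```shell
# loops_backed_under DIR
# Reads `losetup -a` style lines on stdin, e.g.
#   /dev/loop0: [0016]:259301 (/mnt/nfs/share.img)
# and prints the loop devices whose backing file is located under DIR.
loops_backed_under() {
    awk -v dir="${1%/}/" '{
        dev = $1
        sub(/:$/, "", dev)                       # strip trailing colon
        file = substr($0, index($0, "(") + 1)    # text after the first "("
        sub(/\)$/, "", file)                     # strip closing ")"
        if (index(file, dir) == 1) print dev
    }'
}

# Detach all loop devices backed by files under DIR (needs root).
detach_loops_under() {
    losetup -a | loops_backed_under "$1" | while read -r dev; do
        losetup -d "$dev"
    done
}
```

A possible use is to run detach_loops_under /mnt/nfs and then "service netfs stop", so that the first unmount attempt no longer finds the filesystem busy.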

I'm going to go ahead and NAK this bug, but feel free to reopen it if you want to discuss it further.
Comment 9 RHEL Product and Program Management 2012-11-26 07:05:59 EST
Development Management has reviewed and declined this request.
You may appeal this decision by reopening this request.
