Bug 452438 - kernel-2.6.26-0.81.rc7.fc10.i686 hangs with ntfs-3g mounts ...
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All Linux
Priority: low, Severity: low
Assigned To: Eric Paris
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-06-22 19:20 EDT by Tom London
Modified: 2008-09-08 19:40 EDT

Doc Type: Bug Fix
Last Closed: 2008-09-08 19:40:31 EDT
Attachments
output of 'echo "t">/proc/sysrq-trigger' (218.04 KB, text/plain)
2008-06-27 13:29 EDT, Tom London
output of 'echo "t" >/proc/sysrq-trigger' with sync hanging (450.14 KB, text/plain)
2008-07-01 13:23 EDT, Tom London

Description Tom London 2008-06-22 19:20:34 EDT
Description of problem:
Downloaded kernel-2.6.26-0.81.rc7.fc10.i686 from koji.

After installing and restarting, it hung at "Mounting local filesystems", with
SELinux in either permissive or enforcing mode.

kernel-2.6.26-0.74.rc6.git4.fc10.i686 boots fine.

For quite a while, I have had the following line in /etc/fstab to mount my
Windows partition:

/dev/sda1		/mnt/windows		ntfs-3g	rw		0 0

Removing this line from /etc/fstab allows the system to boot.

With the system booted up in gnome, a mount on this partition hangs. (I
Ctrl-C'ed it after about 2 minutes): "mount -t ntfs-3g /dev/sda1 /mnt/windows"

Afterwards running "ntfs-3g /dev/sda1 /mnt/windows" reported /dev/sda1 being
"temporarily unavailable" (don't have the exact text).

There are no messages in dmesg or /var/log/messages.

Version-Release number of selected component (if applicable):
kernel-2.6.26-0.81.rc7.fc10.i686

How reproducible:
Every boot

Steps to Reproduce:
1. Add line in /etc/fstab for ntfs-3g partition
2. reboot
3. hang at "Mounting local filesystems"
  
Actual results:


Expected results:


Additional info:
Comment 1 Tom London 2008-06-24 13:05:21 EDT
Running 0.82, with ntfs partition omitted from /etc/fstab, I get "stuck" mounts
of ntfs-3g:

 3219 ?        S      0:00 /usr/libexec/gvfsd-trash --spawner :1.6 /org/gtk/gvfs/exec_spaw/0
 3240 ?        S      0:00 /usr/libexec/gvfsd-burn --spawner :1.6 /org/gtk/gvfs/exec_spaw/1
 3244 ?        S      0:00 gnome-mount -b -d /dev/sda1 -n
 3263 ?        S      0:00 /usr/libexec/hal-storage-mount
 3266 ?        S      0:00 /bin/mount -t ntfs-3g -o nosuid,nodev,uhelper=hal,locale=en_US.UTF-8 /dev/sda1 /media/IBM_PRELOAD_
 3267 ?        S      0:00 /sbin/mount.ntfs-3g /dev/sda1 /media/IBM_PRELOAD_ -o rw,nosuid,nodev,uhelper=hal,locale=en_US.UTF-8

These never complete or die.

Believe I get new mount points created in /media (IBM_PRELOAD, IBM_PRELOAD_,
etc.), but nothing mounted.
Comment 2 Tom London 2008-06-24 18:12:21 EDT
Continues to happen with 0.87.

Noticed this in /var/log/messages from session with 0.82:

Jun 24 10:53:06 localhost ntfs-3g[6178]: Version 1.2506 integrated FUSE 27
Jun 24 10:53:06 localhost ntfs-3g[6178]: Mounted /dev/sda1 (Read-Write, label "IBM_PRELOAD", NTFS 3.1)
Jun 24 10:53:06 localhost ntfs-3g[6178]: Cmdline options: (null)
Jun 24 10:53:06 localhost ntfs-3g[6178]: Mount options: silent,allow_other,nonempty,relatime,fsname=/dev/sda1,blkdev,blksize=4096
Jun 24 10:53:06 localhost ntfs-3g[6178]: Unmounting /dev/sda1 (IBM_PRELOAD)
Jun 24 10:53:45 localhost init: tty4 main process (2708) killed by TERM signal
Jun 24 10:53:45 localhost init: tty6 main process (2713) killed by TERM signal
Jun 24 10:53:45 localhost init: tty5 main process (2709) killed by TERM signal
Jun 24 10:53:45 localhost init: tty2 main process (2710) killed by TERM signal
Jun 24 10:53:45 localhost init: tty3 main process (2711) killed by TERM signal
Jun 24 10:53:45 localhost smartd[2702]: smartd received signal 15: Terminated
Jun 24 10:53:45 localhost smartd[2702]: smartd is exiting (exit status 0)

So it looks like the "mount" completed during shutdown, followed immediately by
the "unmount".

I tried doing an "strace ntfs-3g /dev/sda1 /mnt/windows" after rebooting into
single user mode.

Here are the last few lines of strace output:

open("/dev/fuse", O_RDWR|O_LARGEFILE)   = 4
getegid32()                             = 0
getgid32()                              = 0
getegid32()                             = 0
setresgid32(-1, 0, 0)                   = 0
getegid32()                             = 0
geteuid32()                             = 0
getuid32()                              = 0
geteuid32()                             = 0
setresuid32(-1, 0, 0)                   = 0
geteuid32()                             = 0
getuid32()                              = 0
lstat64("/mnt/windows", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getuid32()                              = 0
getuid32()                              = 0
getuid32()                              = 0
getuid32()                              = 0
getgid32()                              = 0
getuid32()                              = 0
geteuid32()                             = 0
getegid32()                             = 0
mount("/dev/sda1", "/mnt/windows", "fuseblk", 0, "allow_other,blksize=4096,fd=4,ro"...

Hanging on the "mount" .....
Comment 3 Chuck Ebbert 2008-06-27 12:15:59 EDT
Try removing 'quiet' and adding this to the kernel options in /etc/grub.conf:

  ignore_loglevel sysrq_always_enabled

Then, when mount hangs, run this command:

  echo "t" >/proc/sysrq-trigger

Look in /var/log/messages for the output of that and post it as an attachment.
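
Before reproducing the hang, it may help to confirm that the boot options above actually took effect. The following is a minimal sketch, not part of the original request: the function name check_debug_cmdline is hypothetical, and it only inspects a command-line string such as the contents of /proc/cmdline.

```python
def check_debug_cmdline(cmdline):
    """Return a list of changes still needed for the sysrq debugging setup.

    `cmdline` is a kernel command line string, e.g. the contents of
    /proc/cmdline. The helper name is hypothetical (illustration only).
    """
    opts = cmdline.split()
    problems = []
    # 'quiet' lowers console verbosity, so it should be removed.
    if "quiet" in opts:
        problems.append("remove 'quiet'")
    # Both options named in the comment above must be present.
    for required in ("ignore_loglevel", "sysrq_always_enabled"):
        if required not in opts:
            problems.append("add '%s'" % required)
    return problems
```

Passing it the text read from /proc/cmdline after rebooting should return an empty list once grub.conf has been edited as described.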
Comment 4 Tom London 2008-06-27 13:28:48 EDT
OK. Believe I did as requested:

I rebooted with the above options.  At the gdm screen, I pressed Ctrl-Alt-F1 and logged in as root.

"ps agx" showed no processes doing "mounts".

I entered "mount /dev/sda1 /mnt&".  "ps agx" showed mount hung.

I ran 'echo "t" >/proc/sysrq-trigger', and watched the text flow by.

I rebooted, copied the output from /var/log/messages to /tmp/sysrq.txt.  I attach
it below.

Let me know if I didn't do this right, and I will rerun.
Comment 5 Tom London 2008-06-27 13:29:24 EDT
Created attachment 310463
output of 'echo "t">/proc/sysrq-trigger'
Comment 6 Tom London 2008-06-27 14:31:12 EDT
Got this comment on fedora-test (just archiving here for completeness):
	
Tom London <selinux <at> gmail.com> writes:

> Kernel versions since about 0.81 can no longer mount ntfs-3g filesystems
> for me.
> I've BZ'ed this here: https://bugzilla.redhat.com/show_bug.cgi?id=452438
>
> The symptoms appear that the call to mount just hangs.

This is probably the kernel Smack problem fixed here:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e97dcb0eadbb821eccd549d4987b653cf61e2374

If not then please send the stack traceback of the hanging
mount process (echo t > /proc/sysrq-trigger).

Regards,   Szaka

--
NTFS-3G: http://ntfs-3g.org
Comment 7 Dave Jones 2008-06-27 15:06:03 EDT
Unlikely. We don't build smack.
Comment 8 Miklos Szeredi 2008-06-27 17:25:53 EDT
Yeah, this time it's fuse vs. selinux, but the issue is similar.  This is what
happens:

sys_mount
  vfs_kern_mount
    fuse_get_sb
    security_sb_kern_mount
      selinux_sb_kern_mount
        fuse_getxattr

The mount syscall won't return until fuse_getxattr() finishes.  But
fuse_getxattr() cannot finish until the mount syscall returns -> deadlock.  The
reason fuse_getxattr cannot finish is because fuse userspace only starts request
processing after the mount has succeeded.  Since this is part of the userspace
ABI, it cannot easily be changed, and so selinux will probably have to work
around it in some way.

Looking at selinux_set_mnt_opts() in mainline, I don't actually see it calling
->getxattr().  Is this perhaps a recent addition only in fedora kernels?  That
would explain why this hang wasn't reported earlier.
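
The circular wait described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: Python threads stand in for the kernel mount path and the fuse daemon, the name simulate_mount is hypothetical, and the timeout exists solely so the model terminates (the real mount syscall waits indefinitely).

```python
import queue
import threading


def simulate_mount(selinux_calls_getxattr, timeout=0.2):
    """Model of mount() vs. a fuse daemon that serves requests only after mount()."""
    requests = queue.Queue()             # stands in for the /dev/fuse request channel
    mount_returned = threading.Event()   # set once "mount()" has returned

    def fuse_daemon():
        # Userspace starts request processing only after the mount succeeded.
        mount_returned.wait()
        while True:
            reply = requests.get()
            if reply is None:            # sentinel: stop the model
                return
            reply.set()                  # answer the getxattr request

    daemon = threading.Thread(target=fuse_daemon, daemon=True)
    daemon.start()

    result = "mounted"
    if selinux_calls_getxattr:
        # The security hook issues getxattr while still inside mount().
        reply = threading.Event()
        requests.put(reply)
        if not reply.wait(timeout):      # nobody is serving requests yet
            result = "deadlock"

    # Clean up the model so the daemon thread can exit in either case.
    mount_returned.set()
    requests.put(None)
    daemon.join()
    return result
```

With selinux_calls_getxattr set to True the request is never answered before the timeout, which corresponds to the hang seen in the strace output; with False the mount path completes without needing the daemon.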
Comment 9 Chuck Ebbert 2008-06-30 16:14:40 EDT
Apparently caused by linux-2.6-selinux-ecryptfs-support.patch
Comment 10 Chuck Ebbert 2008-06-30 18:40:22 EDT
Patch disabled for now. Leaving bug open.
Comment 11 Tom London 2008-07-01 13:20:38 EDT
running 0.98 (believe the patch is still in):

In addition to the above, even without attempting to mount ntfs-3g partition,
"sync" command hangs when running in runlevel 5.  "sync" completes in runlevel 3.

Could this be related to the above (and fuse)?

Here is output of "strace sync":
execve("/bin/sync", ["sync"], [/* 29 vars */]) = 0
brk(0)                                  = 0x8404000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=104008, ...}) = 0
mmap2(NULL, 104008, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb8050000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000wz\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1511052, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb804f000
mmap2(0x791000, 1513040, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x791000
mmap2(0x8fd000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16c) = 0x8fd000
mmap2(0x900000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x900000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb804e000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb804e6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0x8fd000, 8192, PROT_READ)     = 0
mprotect(0x789000, 4096, PROT_READ)     = 0
munmap(0xb8050000, 104008)              = 0
brk(0)                                  = 0x8404000
brk(0x8425000)                          = 0x8425000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=79736512, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7e4e000
close(3)                                = 0
sync(

I attach below the output from 'echo "t" >/proc/sysrq-trigger'.
Comment 12 Tom London 2008-07-01 13:23:22 EDT
Created attachment 310695
output of 'echo "t" >/proc/sysrq-trigger' with sync hanging

Obtained by booting with ignore_loglevel sysrq_always_enabled, booting to
runlevel 5, running "sync&" in terminal window, and running 'echo
"t">/proc/sysrq-trigger'
Comment 13 Miklos Szeredi 2008-07-01 14:07:08 EDT
(In reply to comment #12)
> Created an attachment (id=310695)
> output of 'echo "t" >/proc/sysrq-trigger' with sync hanging
> 
> Obtained by booting with ignore_loglevel sysrq_always_enabled, booting to
> runlevel 5, running "sync&" in terminal window, and running 'echo
> "t">/proc/sysrq-trigger'

That's exactly the same issue: gvfs trying to mount some fuse filesystem, which
hangs in sys_mount() holding the s_umount semaphore for write, and sys_sync()
trying to acquire s_umount for read.
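
The interaction between the hung mount and sync can be sketched with a toy reader/writer semaphore. This is an illustrative model only: the RWSem class and sync_completes function are hypothetical names, this is not the kernel's rwsem implementation, and the timeout is there just so the example returns instead of hanging.

```python
import threading


class RWSem:
    """Tiny reader/writer semaphore model (illustration, not kernel code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def down_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def up_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

    def down_read(self, timeout):
        # Readers must wait while a writer holds the semaphore.
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def up_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()


def sync_completes(mount_is_hung, timeout=0.2):
    """Return True if the modelled sync can take s_umount for read."""
    s_umount = RWSem()
    if mount_is_hung:
        # The hung mount took s_umount for write and never releases it.
        s_umount.down_write()
    got = s_umount.down_read(timeout)
    if got:
        s_umount.up_read()
    return got
```

In this model sync_completes(True) fails to take the read side, matching the hang in runlevel 5 where gvfs attempts a fuse mount, while sync_completes(False) succeeds, matching runlevel 3.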
Comment 14 Tom London 2008-07-02 11:14:03 EDT
Works for me with kernel-2.6.26-0.104.rc8.git2.fc10.i686
Comment 15 Tom London 2008-08-23 18:22:59 EDT
Close this? Been working for me since 2 July.....

Issue with linux-2.6-selinux-ecryptfs-support.patch resolved?
