Bug 454368 - /bin/sync hangs every time on Fedora 9 (hey, that rhymes! :)
Summary: /bin/sync hangs every time on Fedora 9 (hey, that rhymes! :)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2008-07-08 00:24 UTC by James Hunt
Modified: 2008-08-01 16:22 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-07-24 22:25:35 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description James Hunt 2008-07-08 00:24:17 UTC
Description of problem:

Running /bin/sync as any user (root or otherwise) hangs indefinitely.

Version-Release number of selected component (if applicable):
coreutils-6.10-25.fc9.i386
kernel-2.6.25.9-76.fc9.i686
selinux-policy-3.3.1-74.fc9.noarch
selinux-policy-targeted-3.3.1-74.fc9.noarc

How reproducible:

Every time.

Steps to Reproduce:
1. Run /bin/sync
  
Actual results:

Hang.

Expected results:

Sync!

Additional info:

SysReq sync does appear to work. I'm guessing SELinux might be the culprit...?

I have no remote filesystems mounted - output of "mount":

/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/mapper/VolGroup00-LogVol04 on /var type ext3 (rw)
/dev/mapper/VolGroup00-LogVol05 on /usr type ext3 (rw)
/dev/mapper/VolGroup00-LogVol06 on /usr/src type ext3 (rw)
/dev/mapper/VolGroup00-LogVol03 on /home type ext3 (rw)
/dev/mapper/VolGroup00-LogVol02 on /tmp type ext3 (rw)
/dev/mapper/VolGroup00-LogVol07 on /backup type ext3 (rw)
/dev/mapper/VolGroup00-LogVol08 on /.snapshots type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
gvfs-fuse-daemon on /home/jamesh/.gvfs type fuse.gvfs-fuse-daemon
(rw,nosuid,nodev,user=jamesh)

FWIW, I have enabled PV encryption.

Comment 1 Ondrej Vasik 2008-07-08 07:43:13 UTC
Thanks for report, could you please provide strace (I know that it will be
incomplete, just want to see which one command is hanging) ? Does the problem
occur even for permissive SELinux mode? As you mentioned SELinux as possible
culprit, adding Dan Walsh to CC.

Comment 2 Daniel Walsh 2008-07-08 19:12:41 UTC
Does it happen in permissive mode? 

Any AVC messages?

Comment 3 James Hunt 2008-07-08 21:15:37 UTC
Hi Ondrej,

It appears to be the sync system call itself that's the culprit...

--------------------------------
> strace sync
execve("/bin/sync", ["sync"], [/* 60 vars */]) = 0
brk(0)                                  = 0x9bd6000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0xb8072000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("tls/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
open("tls/i686/libc.so.6", O_RDONLY)    = -1 ENOENT (No such file or directory)
open("tls/sse2/libc.so.6", O_RDONLY)    = -1 ENOENT (No such file or directory)
open("tls/libc.so.6", O_RDONLY)         = -1 ENOENT (No such file or directory)
open("i686/sse2/libc.so.6", O_RDONLY)   = -1 ENOENT (No such file or directory)
open("i686/libc.so.6", O_RDONLY)        = -1 ENOENT (No such file or directory)
open("sse2/libc.so.6", O_RDONLY)        = -1 ENOENT (No such file or directory)
open("libc.so.6", O_RDONLY)             = -1 ENOENT (No such file or directory)
open("/home/jamesh/lib/tls/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such
file or directory)
stat64("/home/jamesh/lib/tls/i686/sse2", 0xbfe70a20) = -1 ENOENT (No such file
or directory)
open("/home/jamesh/lib/tls/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file
or directory)
stat64("/home/jamesh/lib/tls/i686", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/home/jamesh/lib/tls/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file
or directory)
stat64("/home/jamesh/lib/tls/sse2", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/home/jamesh/lib/tls/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/home/jamesh/lib/tls", 0xbfe70a20) = -1 ENOENT (No such file or directory)
open("/home/jamesh/lib/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file
or directory)
stat64("/home/jamesh/lib/i686/sse2", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/home/jamesh/lib/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/home/jamesh/lib/i686", 0xbfe70a20) = -1 ENOENT (No such file or directory)
open("/home/jamesh/lib/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/home/jamesh/lib/sse2", 0xbfe70a20) = -1 ENOENT (No such file or directory)
open("/home/jamesh/lib/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/home/jamesh/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/usr/local/lib/tls/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such
file or directory)
stat64("/usr/local/lib/tls/i686/sse2", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/usr/local/lib/tls/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/usr/local/lib/tls/i686", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/usr/local/lib/tls/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/usr/local/lib/tls/sse2", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/usr/local/lib/tls/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/usr/local/lib/tls", 0xbfe70a20) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file
or directory)
stat64("/usr/local/lib/i686/sse2", 0xbfe70a20) = -1 ENOENT (No such file or
directory)
open("/usr/local/lib/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/usr/local/lib/i686", 0xbfe70a20) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or
directory)
stat64("/usr/local/lib/sse2", 0xbfe70a20) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=120705, ...}) = 0
mmap2(NULL, 120705, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb8054000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\7\273\0004\0\0\0"...,
512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1758468, ...}) = 0
mmap2(0xb9a000, 1476176, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) =
0xb9a000
mmap2(0xcfd000, 12288, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x163) = 0xcfd000
mmap2(0xd00000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
-1, 0) = 0xd00000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0xb8053000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb80536c0, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0,
useable:1}) = 0
mprotect(0xcfd000, 8192, PROT_READ)     = 0
mprotect(0xb92000, 4096, PROT_READ)     = 0
munmap(0xb8054000, 120705)              = 0
brk(0)                                  = 0x9bd6000
brk(0x9bf7000)                          = 0x9bf7000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=79366688, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7e53000
close(3)                                = 0
sync(
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=120705, ...}) = 0
mmap2(NULL, 120705, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb8054000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\7\273\0004\0\0\0"...,
512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1758468, ...}) = 0
mmap2(0xb9a000, 1476176, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) =
0xb9a000
mmap2(0xcfd000, 12288, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x163) = 0xcfd000
mmap2(0xd00000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
-1, 0) = 0xd00000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0xb8053000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb80536c0, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0,
useable:1}) = 0
mprotect(0xcfd000, 8192, PROT_READ)     = 0
mprotect(0xb92000, 4096, PROT_READ)     = 0
munmap(0xb8054000, 120705)              = 0
brk(0)                                  = 0x9bd6000
brk(0x9bf7000)                          = 0x9bf7000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=79366688, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7e53000
close(3)                                = 0
sync(
--------------------------------

I tried setting SELinux to permissive mode, but sync still hangs. Also, nothing
I can see about sync in audit.log. "rpm -qV coreutils" reports no problems. I
also compiled a noddy C program that just calls sync(2) and that too hangs.
"restorecon -v /bin/sync" doesn't make any changes:

-rwxr-xr-x  root root system_u:object_r:bin_t:s0       /bin/sync

I'm kinda stumped at the moment :)

Comment 4 Ondrej Vasik 2008-07-09 20:53:23 UTC
Removing Dan Walsh from cc(most probably not SELinux related), sorry for false
alarm... could be something with kernel, as the sync() system call is hanging. I
guess it could have something to do with one of the filesystem types you are
using in /etc/fstab. Could you please try to reduce mounted filesystems in
/etc/fstab to minimal possible level and retry sync? Then maybe with adding
additional filesystems we can find out the culprit. I will try to reproduce it
on my machine,  but it could take some time.

Comment 5 James Hunt 2008-07-13 20:35:09 UTC
OK - after some further investigations, I've worked out what is triggering this,
but it's bizarre...

- At home, I'm using NetworkManager with a DHCP wireless connection.

- At work, although I've set up a static IP wired connection using
nm-connection-editor, every time I boot, NM creates an "Auto eth0" connection
(see bug 454381 I raised on this issue) and defaults to this rather than
selecting the existing wired connection I created.

Every day at work, I boot, and then when nm-applet is running, I have to
left-click and select the connection *I* created (ie I deselect "Auto eth0").

Here's the bizarre bit: I've noticed that /bin/sync works 100% reliably using
the "Auto eth0" connection, but hangs when I shift over to using my static IP
connection!?! I've checked this twice now. What on earth is going on??



Comment 6 James Hunt 2008-07-13 20:37:28 UTC
Oh, and /bin/sync works fine at home too :)

To be manifest, I'm using a laptop (Toshiba Tecra P5).



Comment 7 James Hunt 2008-07-14 00:17:49 UTC
Doh! I'd forgotten to look at the system log...
__________________________________________________

Jul 14 10:14:11 jamesh-ll kernel: INFO: task sync:18076 blocked for more than
120 seconds.
Jul 14 10:14:11 jamesh-ll kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 14 10:14:11 jamesh-ll kernel: sync          D c079123c     0 18076      1
Jul 14 10:14:11 jamesh-ll kernel:        eff64f24 00200082 eff64f00 c079123c
c0794280 c0794280 c0794280 f5eea000
Jul 14 10:14:11 jamesh-ll kernel:        f5eea248 c1ffb280 00000001 c1ffb280
00000001 0000000e 0000c413 f5eea248
Jul 14 10:14:11 jamesh-ll kernel:        005135be 00000000 00000000 ffffffff
00000000 00000000 00000000 00000002
Jul 14 10:14:11 jamesh-ll kernel: Call Trace:
Jul 14 10:14:11 jamesh-ll kernel:  [<c062ab64>] rwsem_down_failed_common+0x13e/0x15a
Jul 14 10:14:11 jamesh-ll kernel:  [<c062abc0>] rwsem_down_read_failed+0x1d/0x25
Jul 14 10:14:11 jamesh-ll kernel:  [<c062ac43>] call_rwsem_down_read_failed+0x7/0xc
Jul 14 10:14:11 jamesh-ll kernel:  [<c062a21a>] ? down_read+0x26/0x29
Jul 14 10:14:11 jamesh-ll kernel:  [<c049aa49>] __sync_inodes+0x3b/0x8d
Jul 14 10:14:11 jamesh-ll kernel:  [<c049aad7>] sync_inodes+0x3c/0x7c
Jul 14 10:14:11 jamesh-ll kernel:  [<c049d11d>] do_sync+0x14/0x5a
Jul 14 10:14:11 jamesh-ll kernel:  [<c049d170>] sys_sync+0xd/0x11
Jul 14 10:14:11 jamesh-ll kernel:  [<c0405bf2>] syscall_call+0x7/0xb
Jul 14 10:14:11 jamesh-ll kernel:  [<c0620000>] ? agp_amd64_probe+0x240/0x3ee
Jul 14 10:14:11 jamesh-ll kernel:  =======================

__________________________________________________

Comment 8 James Hunt 2008-07-14 00:18:50 UTC
Changed component from coreutils to kernel.

Comment 9 James Hunt 2008-07-17 20:38:28 UTC
The latest kernel (kernel-2.6.25.10-86.fc9.i686) fixes this very annoying problem.



Note You need to log in before you can comment on or make changes to this bug.