Bug 178558

Summary: processes get stuck when creating new files
Product: [Fedora] Fedora
Reporter: Ulrich Drepper <drepper>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED RAWHIDE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: rawhide
CC: ask, k.georgiou, pfrields, redhat-bugzilla, reuben-redhatbugzilla, wtogami
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-02-19 01:20:36 UTC
Attachments:
  add missing unlock (flags: none)
  sysrq-t output after the system hangs (flags: none)

Description Ulrich Drepper 2006-01-21 17:51:06 UTC
Description of problem:
I'm seeing this consistently, one to at most 5 hours after every reboot.  It
started in the .185?_FC5 kernels.  The FC5t2 kernel didn't show the problem; it
ran for 3 days before running out of memory due to a leak.

When the problem starts, not all processes freeze at the same time.  I was able,
one time, to start a new process under strace.  It was just dd, which would have
created a file on /tmp.  strace showed the process got stuck in the
open(O_CREAT) call.  I could stop the process with Ctrl-C and retry with the
same result.  It might be interesting to know that /tmp has its own partition
on my system, mounted with noexec.  I wiped the partition and re-ran mkfs; this
found no problems with the disk and did not cure the hang.

After the first process gets stuck the load skyrockets.  top doesn't show
any running processes, though.  After a few minutes all processes are affected.
I cannot say whether this is due to the load or whether they all want to create
files.  The machine becomes unusable.
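
The dd-under-strace check described above can be approximated with a small
probe.  This is only a sketch in Python, not what was run on the affected
machine; the name can_create_file and the timeout value are hypothetical.  It
performs the open(O_CREAT) call in a child process so that the caller does not
block if the call never returns.

```python
import os
import subprocess
import sys

# Code run in the child: create the file with O_CREAT, then close it.
_CHILD = (
    "import os, sys\n"
    "fd = os.open(sys.argv[1], os.O_CREAT | os.O_WRONLY, 0o600)\n"
    "os.close(fd)\n"
)


def can_create_file(directory, timeout=5.0):
    """Return True if a file could be created in `directory` within `timeout` seconds."""
    path = os.path.join(directory, "probe-%d" % os.getpid())
    proc = subprocess.Popen([sys.executable, "-c", _CHILD, path])
    try:
        ok = proc.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a process in uninterruptible sleep does not act
        # on the signal until it wakes up, so do not wait for it again.
        proc.kill()
        return False
    if ok:
        os.unlink(path)
    return ok
```

Popen with wait(timeout=...) is used instead of subprocess.run(timeout=...)
because run() waits for the child again after killing it, which could block
the caller if the child is stuck in D state.  The probe has to be running
before the hang begins, since starting new programs may itself stop working.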

Version-Release number of selected component (if applicable):
2.6.15-1.185?_FC5 up to 2.6.15-1.1863_FC5

How reproducible:
not on demand, but reliably

Steps to Reproduce:
1. get my system:
Intel ICH6 925X chipset, two SATA drives
two RAID volumes: one RAID0, one RAID1 (see below)
NVidia card which requires the binary driver due to twinview

2. work a bit
3. see system come to a halt
  
Actual results:
nothing works after some time

Expected results:
400+ days uptime (instead of on average 3 hours)

Additional info:
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Dec  7 15:49:17 2004
     Raid Level : raid1
     Array Size : 51199040 (48.83 GiB 52.43 GB)
    Device Size : 51199040 (48.83 GiB 52.43 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sat Jan 21 09:42:55 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : cdaf33a1:d6834d33:92004050:4d985a0b
         Events : 0.10907486

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       17        1      active sync   /dev/sdb1


/dev/md1:
        Version : 00.90.03
  Creation Time : Tue Dec  7 15:55:10 2004
     Raid Level : raid0
     Array Size : 117876480 (112.42 GiB 120.71 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Tue Dec  7 15:55:10 2004
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 4a608e21:71b9722f:452d727c:17e95e5d
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8        8        0      active sync   /dev/sda8
       1       8       22        1      active sync   /dev/sdb6


rootfs / rootfs rw 0 0
/dev/root / ext3 rw,data=ordered 0 0
/dev /dev tmpfs rw 0 0
/proc /proc proc rw 0 0
/sys /sys sysfs rw 0 0
none /selinux selinuxfs rw 0 0
/proc/bus/usb /proc/bus/usb usbfs rw 0 0
none /dev/pts devpts rw 0 0
/dev/sda1 /boot ext3 rw,noexec,data=ordered 0 0
none /dev/shm tmpfs rw,noexec 0 0
/dev/md0 /home ext3 rw,data=ordered 0 0
/dev/sdb2 /tmp ext3 rw,noexec,data=ordered 0 0
/dev/sda5 /usr ext3 rw,data=ordered 0 0
/dev/sda6 /var ext3 rw,data=ordered 0 0
/dev/sdb3 /var/tmp ext3 rw,noexec,data=ordered 0 0
/dev/md1 /work ext3 rw,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
automount(pid2537) /misc autofs rw 0 0
automount(pid2539) /net autofs rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0


00:00.0 Host bridge: Intel Corporation 925X/XE Memory Controller Hub (rev 04)
00:01.0 PCI bridge: Intel Corporation 925X/XE PCI Express Root Port (rev 04)
00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 2 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
00:1e.2 Multimedia audio controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) AC'97 Audio Controller (rev 03)
00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
00:1f.2 IDE interface: Intel Corporation 82801FR/FRW (ICH6R/ICH6RW) SATA Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
01:00.0 VGA compatible controller: nVidia Corporation NV37GL [Quadro FX 330/Quadro NVS280] (rev a2)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 01)

Comment 1 Ulrich Drepper 2006-01-21 19:57:48 UTC
A few more bits of information:

- the machine remains ping-able

- I can switch to the text console (at least the times I tried it)

- logins are never possible, neither locally nor via ssh

- sometimes top can start, even multiple times.  Sometimes it hangs before
  printing anything

- the test of running dd under strace that I mentioned in the report is also
  not reliable.  Sometimes nothing is shown at all (no message from strace)

- often the processes cannot be killed (stuck in D state).  No Ctrl-C,
  no Ctrl-Z, no kill command

- I may simply not have noticed this before, because my machine has never
  crashed this often in recent years:

  the RAID code seems slow.  My md0 device (RAID1, for /home, which is 50G in
  size of which only about 900M are filled) recovers very slowly.  The
  reconstruction in the background takes 30 mins or more.  During this time
  the load is > 2 and the system practically crawls.  I hadn't noticed this
  before the .185? kernels.

  smartd isn't running because it cannot work with sata but the drives do not
  report any problems.

- I cannot really go back to an older kernel because I need to compile the
  nvidia module which in that case would require gcc 4.0.  That's too much
  to download for my sucky connection.
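
The D-state observation above can be checked without top by reading procfs
directly.  The following is a minimal sketch in Python, assuming a Linux /proc
layout; the helper names parse_state and uninterruptible_tasks are
hypothetical and were not part of this report.  It would have to be started
before the hang, since launching new programs may not work afterwards.

```python
import os


def parse_state(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen])
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_state(f.read())
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable; skip it
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)
```

A growing list from uninterruptible_tasks() while top shows nothing running
would match the symptom of high load with no runnable processes.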

Comment 2 Ulrich Drepper 2006-02-04 17:01:07 UTC
On another machine (x86, P4 HT) I experience hangs as well.  I cannot say
whether they are the same but from the outside they sure look like it.  There is
even a possibility that I can provide a reproducer (will try this later).

Anyhow, on that machine I got a backtrace (don't remember which kernel version
it was):

BUG: write_lock lockup
_raw_write_lock+0x7d
eax A5F88A00  EBX C13EDAF8  ECX 00000001  EDX CFFB1988
ESI 09F3B049  EDI 00000000  EBP CFFB1988
c017b466 set_fd_pwd+0x18
c016f1fd permission+0x8e
c0162df2 sys_fchdir+0x5f
c0103d21 syscall_call+0x7

The reproducer uses some of the new *at syscalls.  Hopefully more later.

Comment 3 Ulrich Drepper 2006-02-04 17:02:13 UTC
That's of course set_fs_pwd in the backtrace.

Comment 4 Ulrich Drepper 2006-02-04 17:09:01 UTC
Created attachment 124171 [details]
add missing unlock

This patch should fix at least my second machine's hangs.  A lock wasn't
unlocked in case of an error.  The patch is against the current upstream
kernel.
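
The class of bug described here, a lock left held on an error path, can be
illustrated in user space.  This is not the attached patch, whose contents are
not reproduced in this report; it is a Python sketch with hypothetical names
(fs_lock, set_pwd_buggy, set_pwd_fixed) that only shows the pattern.

```python
import threading

# Stand-in for the kernel lock; the name is hypothetical.
fs_lock = threading.Lock()


def set_pwd_buggy(path_ok):
    """Error path returns without releasing the lock."""
    fs_lock.acquire()
    if not path_ok:
        return -1  # BUG: lock is still held on this exit
    fs_lock.release()
    return 0


def set_pwd_fixed(path_ok):
    """The lock is released on every exit path."""
    with fs_lock:
        if not path_ok:
            return -1
        return 0
```

After one failing call to set_pwd_buggy, every later caller that needs the
lock waits forever, which is the user-space analogue of the lockup reported in
the backtrace above.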

Comment 5 Ulrich Drepper 2006-02-04 21:01:50 UTC
As expected, the 1909 kernel which includes the patch I attached here fixes the
problem for my x86 machines.

I'll now investigate whether it fixes my x86-64 machines as well.

Comment 6 Ulrich Drepper 2006-02-04 22:04:45 UTC
It didn't take long to find out this patch does not fix the original problem. 
I.e., the x86 freeze was a separate problem.  Not really surprising as I never
thought the x86-64 machine was using the new syscalls at the time the machine
was freezing.

Anyway, I got the null modem cable connected now.

Comment 7 Ulrich Drepper 2006-02-05 17:21:53 UTC
Well, when the machine gets stuck there is no output on the serial console.  The
machine is still pingable but not even the screen saver password widget comes up
when I press a key.  Switching to the text console is also not working.

I couldn't get the sysrq magic to work over the serial console, even when the
machine was working normally.  Maybe the problem is that I'm using minicom at
the other end?  Any hints on what to do?

Comment 8 Kostas Georgiou 2006-02-06 10:32:52 UTC
I have a similar (same?) problem as well with the latest kernels on x86_64. It
looks like all disk IO in the machine gets stuck. No new commands succeed for
me, even dmesg fails, so it might be a different issue.
Existing ssh connections to remote systems keep working so the system isn't
totally dead. I am using an md device for my lvm volume group (see below). My
other x86_64 system (without an md device) seems quite stable so far, which is
weird. I wonder if it's md causing the problem.

minicom should be ok AFAIK, you are using CTRL-a f sysrqcommand right?
Unfortunately I can not try with a serial connection but I'll try to reproduce
the problem while on console to see if I can get a backtrace.

BTW you can use smartd on SATA drives with recent kernels now.

/dev/root / ext3 rw,data=ordered 0 0
/dev /dev tmpfs rw 0 0
/proc /proc proc rw 0 0
/sys /sys sysfs rw 0 0
none /selinux selinuxfs rw 0 0
/dev/devpts /dev/pts devpts rw 0 0
/dev/md1 /boot ext3 rw,data=ordered 0 0
/dev/shm /dev/shm tmpfs rw 0 0
/dev/rootvg/optvol /opt ext3 rw,data=ordered 0 0
/dev/rootvg/tmpvol /tmp ext3 rw,data=ordered 0 0
/dev/rootvg/usrvol /usr ext3 rw,data=ordered 0 0
/dev/rootvg/varvol /var ext3 rw,data=ordered 0 0
/dev/datavg/scratchvol /srv/data/scratch ext3 rw,noatime,data=writeback 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
automount(pid2380) /home autofs rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
nfssrv.example.com:/home/username /home/username nfs4 rw,nosuid,v4,rsize=32768,wsize=32768,hard,intr,lock,proto=tcp,addr=example.com 0 0




Comment 9 Ulrich Drepper 2006-02-06 16:21:13 UTC
The problem in comment 8 certainly sounds like the same one.  Whether ssh works
or not probably depends on the devices used.  If it has to access a file on a
device which blocks, it's game over.

The commonality is x86-64 (probably 64-bit-ness).  I'm not using LVM but two
partitions are RAID.

And yes, I used the minicom sequence to send break but it has no result.  No
idea why.  I surely enabled sysrq handling.

As for smartd, no, seems not to work.

Comment 10 Kostas Georgiou 2006-02-06 22:00:11 UTC
My LVM's physical volume is on RAID as well, that is why I suspect the md code.
My cpu is an athlon64 with an ATI graphics card (I use the xorg driver) so I
think it's unlikely that this is a hardware driver problem.
I am back to the 1863 kernel at the moment and the machine has been up for 13
hours without any problems btw.

I really have no idea why sysrq over serial isn't working, I am afraid :(

I can not find the original bugzilla entry about the smart passthru in
libata but have a look at #174095. smartctl -d ata -a /dev/sd[ab] works fine for me.

Comment 11 Ulrich Drepper 2006-02-07 00:40:28 UTC
Created attachment 124297 [details]
sysrq-t output after the system hangs

I managed to get sysrq working over the serial console.  The attached file is
the output of the entire system run.
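
For reference, the same task dump can also be requested through procfs rather
than a serial break, provided a root process is still able to run.  This is a
sketch in Python and not how the attached output was captured; the function
name request_task_dump and its proc_root parameter are hypothetical, the
latter existing only so the function can be pointed at a scratch directory.

```python
import os


def request_task_dump(proc_root="/proc"):
    """Enable SysRq and ask the kernel for a task state dump (SysRq 't')."""
    # Writing 1 to kernel.sysrq enables all SysRq functions.
    with open(os.path.join(proc_root, "sys", "kernel", "sysrq"), "w") as f:
        f.write("1")
    # Writing 't' to sysrq-trigger requests the task dump, which goes to
    # the kernel log and any attached console.
    with open(os.path.join(proc_root, "sysrq-trigger"), "w") as f:
        f.write("t")
```

On a real system this needs root, and the output lands in the kernel log, so
a serial or network console is still needed to capture it once the disks stop
responding.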

Comment 12 Reuben Farrelly 2006-02-07 12:42:57 UTC
I'm on 

Comment 13 Ask Bjørn Hansen 2006-02-11 04:29:13 UTC
The same thing happens on my system.  (i386, Pentium-D "dual processor" system).

Without looking very closely I too have suspected the md driver.  Reading
/proc/mdstat always hangs when it's happening.

Comment 14 Ulrich Drepper 2006-02-12 16:36:58 UTC
I have had the 1928 kernel running on my machine for almost 2 days now.  It
seems the problem is fixed.  Ingo mentioned some RAID1 problem which has been
fixed upstream recently.  My bet is that this is the solution.

Comment 15 Ask Bjørn Hansen 2006-02-12 20:27:46 UTC
FWIW 1928 fixed it for me too.  (Also running RAID1)

Comment 16 Kostas Georgiou 2006-02-15 08:36:20 UTC
1939+ seems fine for me as well, time to close the bug I guess.