Bug 190756

Summary: gfs knows of directories which it chooses not to display
Product: [Retired] Red Hat Cluster Suite Reporter: Corey Marthaler <cmarthal>
Component: gfs Assignee: Robert Peterson <rpeterso>
Status: CLOSED ERRATA QA Contact: GFS Bugs <gfs-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 4 CC: nobody+wcheng, rkenna, rohara
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0142 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-05-10 21:13:27 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Bug Depends On:    
Bug Blocks: 222299    
Attachments:
Proposed patch to fix the problem (flags: none)

Description Corey Marthaler 2006-05-04 21:57:54 UTC
Description of problem:
I saw this after setting up NFS/GFS/rgmanager testing. I have a few GFS
filesystems that are part of a couple of NFS rgmanager services and are
exported to NFS clients.

[root@taft-04 taft0]# clustat
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  taft-01                                  Online, rgmanager
  taft-02                                  Online, rgmanager
  taft-03                                  Online, rgmanager
  taft-04                                  Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  nfs1                 taft-04                        started
  nfs2                 taft-04                        started

On the service owner (taft-04), directories exist that ls does not display:
[root@taft-04 taft0]# df -h /mnt/taft0
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/TAFT_CLUSTER-TAFT_CLUSTER0
                       91G   22M   91G   1% /mnt/taft0

[root@taft-04 taft0]# pwd
/mnt/taft0
[root@taft-04 taft0]# ls
flea-02
[root@taft-04 taft0]# ls flea-01
[root@taft-04 taft0]# ls flea-03
[root@taft-04 taft0]# ls flea-04
[root@taft-04 taft0]# ls flea-05
[root@taft-04 taft0]# ls flea-99
ls: flea-99: No such file or directory
[root@taft-04 taft0]# cd flea-01
[root@taft-04 flea-01]# ls
[root@taft-04 flea-01]# cd ..

From an NFS client (flea-05), the directories are not listed either:
[root@flea-05 taft0]# pwd
/mnt/taft0
[root@flea-05 taft0]# ls
flea-02
[root@flea-05 taft0]# touch flea-01
touch: cannot touch `flea-01': Permission denied
[root@flea-05 taft0]# touch flea-99
[root@flea-05 taft0]# ls
flea-02  flea-99
[root@flea-05 taft0]# ls -a
.  ..  flea-02  flea-99
[root@flea-05 taft0]# ls -ld
drwxrwxrwx  3 root root 3864 May  4 10:57 .
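
The symptom in the sessions above is that some names resolve by path (ls flea-01 succeeds, and touch flea-01 is refused rather than creating a file) even though a plain ls of the parent directory does not list them. A minimal sketch of a script that checks for this mismatch follows. It is not part of the original report: the function name find_hidden_entries and its list_fn parameter are hypothetical, and the candidate names must be supplied by the caller, since the script cannot discover names the directory listing omits.

```python
import os


def find_hidden_entries(directory, candidates, list_fn=os.listdir):
    """Return candidate names that resolve by lookup but are absent
    from the directory listing.

    directory  -- path of the directory to inspect (e.g. /mnt/taft0)
    candidates -- names to probe; these are guesses from the caller
    list_fn    -- hypothetical hook for the listing call, defaults to
                  os.listdir (swappable so the check can be exercised
                  on a healthy filesystem)
    """
    listed = set(list_fn(directory))
    hidden = []
    for name in candidates:
        path = os.path.join(directory, name)
        # lexists() performs a lookup by name without following
        # symlinks, which is the path that still worked in the report.
        if name not in listed and os.path.lexists(path):
            hidden.append(name)
    return hidden
```

On the affected mount, a call such as find_hidden_entries("/mnt/taft0", ["flea-01", "flea-03", "flea-04", "flea-05", "flea-99"]) would be expected to report the names that ls omitted, assuming the behaviour shown in the sessions above; on a healthy filesystem it returns an empty list.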

Version-Release number of selected component (if applicable):
[root@taft-04 taft0]# rpm -q GFS
GFS-6.1.5-0

Comment 1 Robert Peterson 2006-05-15 15:12:13 UTC
Corey recreated this problem for me on 11 May 2006.  System taft-04 had
scsi_eh_0     S 00000100dfd4a7c0     0   228      1           275   148 (L-TLB)
000001000e23fdf8 0000000000000046 000001000e06f400 000001021f850bd2
       0000000000000018 0000000000000006 0000000000001b4e 0000000200000246
       000001021f8507f0 00000000000026f7
Call Trace:<ffffffff80303ea7>{__down_interruptible+203}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff803057ba>{__down_failed_interruptible+53}
       <ffffffffa0006816>{:scsi_mod:.text.lock.scsi_error+45}
       <ffffffff80139eaf>{do_exit+3179} <ffffffff80110e17>{child_rip+8}
       <ffffffffa00059ce>{:scsi_mod:scsi_error_handler+0}
       <ffffffff80110e0f>{child_rip+0}
kmirrord      S ffffffff8014aabc     0   258     12          3632       (L-TLB)
000001021f9cfe68 0000000000000046 ffffffff8014aabc 00000100087706e0
       00000101fffbe7f0 0000000000008040 000001021f9cff18 00000002dfd356c0
       00000101fffbe7f0 00000000000024f1
Call Trace:<ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff80146c7b>{worker_thread+0} <ffffffff80146d5d>{worker_thread+226}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffff80146c7b>{worker_thread+0}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

kmir_mon      S ffffffff8014aabc     0   259     10          3631    75 (L-TLB)
000001021f9c9e68 0000000000000046 0000000100000001 0000000000000002
       000001021f8a9210 0000000000000009 0000000000002706 0000000000000001
       000001021f8a9030 0000000000000a70
Call Trace:<ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffff80146c7b>{worker_thread+0}
       <ffffffff80146d5d>{worker_thread+226}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff80146c7b>{worker_thread+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
scsi_eh_1     S 0000000000529c60     0  1087      1          1091   275 (L-TLB)
000001021e743df8 0000000000000046 000001021e750000 000001021f9e7412
       000001021f9e7030 0000000000000006 0000000000001b4e 0000000100000246
       000001021f9e7030 0000000000001530
Call Trace:<ffffffff80303ea7>{__down_interruptible+203}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff803057ba>{__down_failed_interruptible+53}
       <ffffffffa0006816>{:scsi_mod:.text.lock.scsi_error+45}
       <ffffffff80139eaf>{do_exit+3179} <ffffffff80110e17>{child_rip+8}
       <ffffffffa00059ce>{:scsi_mod:scsi_error_handler+0}
       <ffffffff80110e0f>{child_rip+0}
udevd         S ffffffff80304190     0  1091      1          1127  1087 (NOTLB)
000001021e749d78 0000000000000002 000000d0803cca80 0000000000000246
       0000000000000246 0000010000012780 000001021e749e78 0000000000000000
       000001021f8c17f0 000000000000cd0e
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff802a847a>{datagram_poll+39}
       <ffffffff80189c5f>{do_select+939} <ffffffff801897f9>{__pollwait+0}
       <ffffffff80189fde>{sys_select+820} <ffffffff801101c6>{system_call+126}

qla2300_1_dpc S 000001021e750000     0  1127      1          1417  1091 (L-TLB)
000001021f5d9e28 0000000000000046 0000000000000202 0000000000000000
       000001021e7503c8 0000000000000000 000001021f5d9e18 0000000200660d00
       000001021f850030 00000000000004e2
Call Trace:<ffffffffa00b1d6a>{:qla2xxx:qla2x00_get_retry_cnt+67}
       <ffffffff80303ea7>{__down_interruptible+203}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff803057ba>{__down_failed_interruptible+53}
       <ffffffffa00aa74f>{:qla2xxx:.text.lock.qla_os+15}
<ffffffff80110e17>{child_rip+8}
       <ffffffffa00a9a52>{:qla2xxx:qla2x00_do_dpc+0} <ffffffff80110e0f>{child_rip+0}

shpchpd_event S ffffffff80304190     0  1417      1          1983  1127 (L-TLB)
000001021e161c88 0000000000000046 00000000006bcc70 00000000006bcc90
       000001021e94a030 0000000000000246 0000000000000000 0000000100000246
       000001021e94a030 0000000000003433
Call Trace:<ffffffff8014061d>{switch_uid+63}
<ffffffff80138a70>{reparent_to_init+484}
       <ffffffff801391f2>{daemonize+418}
<ffffffff80303ea7>{__down_interruptible+203}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff803057ba>{__down_failed_interruptible+53}
       <ffffffffa013044a>{:shpchp:.text.lock.shpchp_ctrl+335}
       <ffffffff802a1b90>{pci_mmcfg_read+0} <ffffffff8013212e>{schedule_tail+55}
       <ffffffff80110e17>{child_rip+8} <ffffffffa012f523>{:shpchp:event_thread+0}
       <ffffffff80110e0f>{child_rip+0}
kauditd       S ffffffff8014aabc     0  1837     13          3732       (L-TLB)
000001021dd1fea8 0000000000000046 0000000e4151e402 000001000e1bb030
       00000100087786e0 0000000000000003 0000000000000000 000000034154069a
       000001000e1007f0 0000000000000b06
Call Trace:<ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffff80153874>{kauditd_thread+0}
       <ffffffff801539ee>{kauditd_thread+378}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

kjournald     S 0000000000000000     0  1983      1          2196  1417 (L-TLB)
000001021dadbe78 0000000000000046 000001021dadbdc8 000001021dae1af8
       000001000e1ce498 0000000000000005 0000000000000000 00000003000003e8
       000001021ea6b7f0 0000000000003ff8
Call Trace:<ffffffffa00539e8>{:jbd:kjournald+506}
<ffffffff80134df2>{autoremove_wake_function+0}
       <ffffffff80134df2>{autoremove_wake_function+0}
<ffffffffa00537e8>{:jbd:commit_timeout+0}
       <ffffffff80110e17>{child_rip+8} <ffffffffa00537ee>{:jbd:kjournald+0}
       <ffffffff80110e0f>{child_rip+0}
dhclient      S 0000000000000006     0  2196      1          2538  1983 (NOTLB)
000001021d067d78 0000000000000002 000000d01eb12c80 0000000000000246
       0000000000000246 0000010000012780 000001021d067e78 0000000000000000
       000001021ea6b030 0000000000000a19
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff80191570>{dnotify_parent+34} <ffffffff801101c6>{system_call+126}

rpc.idmapd    S ffffffff80304190     0  2612      1          2698  2563 (NOTLB)
000001021ce45e78 0000000000000002 0000010037e4b7f0 0000001900000074
       000001021c4a77f0 0000000000000074 0000010008771940 0000000200000246
       000001021e94a7f0 0000000000000994
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffff8019bc4b>{sys_epoll_wait+403}
       <ffffffff801439e4>{sys_rt_sigaction+97}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801101c6>{system_call+126}
acpid         S 0000000000000005     0  2698      1          2707  2612 (NOTLB)
000001021d67fd78 0000000000000006 000001021c899240 0000000000000000
       000001021d320cc0 000001021d67fd30 000000d000000000 0000000300000246
       000001021d5c6030 000000000004256f
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff80191570>{dnotify_parent+34} <ffffffff801101c6>{system_call+126}

xinetd        S ffffffff80304190     0  2792      1          2812  2773 (NOTLB)
000001021c537d78 0000000000000006 0000000000000047 ffffffff802a220b
       000000d000000000 0000000000000246 0000000000000246 0000000300012780
       000001021c607030 000000000008ae6e
Call Trace:<ffffffff802a220b>{sock_sendmsg+271}
<ffffffff80305491>{schedule_timeout+101}
       <ffffffff802ca87f>{tcp_poll+44} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff801101c6>{system_call+126}

gpm           S 0000000000000006     0  2834      1          2879  2820 (NOTLB)
000001021c2b3d78 0000000000000006 000001021c2b3d28 ffffffff80184032
       0000010037e59c80 ffffffff801cc704 000000d0fffffdfd 0000000300000246
       000001021cb5b7f0 0000000000009413
Call Trace:<ffffffff80184032>{do_lookup+44} <ffffffff801cc704>{capable+24}
       <ffffffff8013f378>{__mod_timer+293} <ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff8018d475>{dput+56} <ffffffff801101c6>{system_call+126}

xfs           S 0000000000000004     0  2901      1          2918  2879 (NOTLB)
000001021beddd78 0000000000000002 0000010037e4b030 00000010801ea925
       0000000000000046 0000000000000010 000000d01d4569a8 0000000300000246
       000001021c4b3030 00000000000008c5
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff801101c6>{system_call+126}

dbus-daemon-1 S 7fffffffffffffff     0  2927      1          2938  2918 (NOTLB)
000001021b8f3e88 0000000000000002 000001021b8f3e18 00000100087686e0
       000001000e23b240 0000000000000000 000000d01b8f3f50 0000000100000246
       000001021b85d7f0 0000000000003340
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff80134c2a>{add_wait_queue+18}
       <ffffffff8018a399>{sys_poll+610} <ffffffff801897f9>{__pollwait+0}
       <ffffffff801101c6>{system_call+126}
ccsd          S 0000000000000008     0  3545      1          3546  2951 (NOTLB)
000001021b839d78 0000000000000002 0000000000000246 ffffffff802a549a
       000000d003b12200 0000000000000246 0000000000000246 0000000000012780
       000001021d5c67f0 0000000000000f3e
Call Trace:<ffffffff802a549a>{sock_def_readable+52}
<ffffffff80305491>{schedule_timeout+101}
       <ffffffff80189c5f>{do_select+939} <ffffffff801897f9>{__pollwait+0}
       <ffffffff80189fde>{sys_select+820} <ffffffff8018d475>{dput+56}
       <ffffffff801101c6>{system_call+126}
ccsd          S 000000000000000b     0  3546      1          3584  3545 (NOTLB)
000001021af7fd78 0000000000000002 000001021af7fe80 000001021e0655c0
       000000d01e065630 0000000000000246 0000000000000246 0000000000012780
       000001021e0d77f0 000000000001de32
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff802a847a>{datagram_poll+39}
       <ffffffff80189c5f>{do_select+939} <ffffffff801897f9>{__pollwait+0}
       <ffffffff80189fde>{sys_select+820} <ffffffff801101c6>{system_call+126}

cman_comms    S ffffffff80304190     0  3584      1          3585  3546 (L-TLB)
000001021b0e9d98 0000000000000046 ffffffff803cca80 000001021baf97f0
       0000000000000000 0000000000000001 000001021b0e9d38 000000008013346f
       000001021baf97f0 000000000000018e
Call Trace:<ffffffffa0222a9e>{:cman:send_to_user_port+758}
<ffffffffa0224679>{:cman:cluster_kthread+299}
       <ffffffff801333c8>{default_wake_function+0} <ffffffff80110e17>{child_rip+8}
       <ffffffffa022454e>{:cman:cluster_kthread+0} <ffffffff80110e0f>{child_rip+0}

cman_serviced S 00000100087786e0     0  3586     11          3733       (L-TLB)
000001021b591f08 0000000000000046 0000010037e48030 0000001900000069
       000001021a8fd030 0000000000000069 0000010008769940 0000000100000000
       000001021ca85030 0000000000000091
Call Trace:<ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffffa022d61d>{:cman:serviced+0}
       <ffffffffa022d745>{:cman:serviced+296} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

cman_memb     S ffffffff80304190     0  3585      1          3587  3584 (L-TLB)
000001021bc3fe38 0000000000000046 0000000037e4b7f0 000001021b4c9080
       000001021c3357f0 0000000000000074 0000010008771940 0000000200000014
       000001021c24d030 000000000000035a
Call Trace:<ffffffffa022c0ea>{:cman:membership_kthread+2891}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0} <ffffffff80139eaf>{do_exit+3179}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff8013212e>{schedule_tail+55}
       <ffffffff80110e17>{child_rip+8}
<ffffffffa022b59f>{:cman:membership_kthread+0}
       <ffffffff80110e0f>{child_rip+0}
cman_hbeat    S 0000000000000003     0  3587      1          3617  3585 (L-TLB)
000001021b07df18 0000000000000046 0000000000000000 ffffffff801317ef
       00000100087706e0 000001021b4c9080 ffffffffa0242600 000000011e019d00
       000001021c969030 0000000000001032
Call Trace:<ffffffff801317ef>{activate_task+124}
<ffffffffa02295b1>{:cman:hello_kthread+478}
       <ffffffff80110e17>{child_rip+8} <ffffffffa02293d3>{:cman:hello_kthread+0}
       <ffffffff80110e0f>{child_rip+0}
fenced        S ffffffff801101c6     0  3617      1          3630  3587 (NOTLB)
000001021c371f28 0000000000000002 0000000000000e02 ffffff9100000001
       000000000050767d 0000000000000000 000001021c371ec8 0000000200000010
       000001021cb5b030 0000000000001e8a
Call Trace:<ffffffff801101c6>{system_call+126}
<ffffffff8010fd5d>{sys_rt_sigsuspend+199}
       <ffffffff8011053b>{ptregscall_common+103}
clvmd         S 0000000000000007     0  3630      1          3637  3617 (NOTLB)
000001021b3bdd78 0000000000000002 0000010037e4b7f0 0000000000000001
       000000d01b3bdcf8 0000000000000246 0000000000000246 0000000200012780
       000001021c3357f0 0000000000000732
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff801101c6>{system_call+126}
clvmd         S ffffffff80304190     0  3637      1          3638  3630 (NOTLB)
000001021b213e58 0000000000000002 0000000000000246 0000000000000003
       000001021b213de8 ffffffff8013346f ffffffff8047faf0 0000000300000202
       000001021c4a7030 0000000000002713
Call Trace:<ffffffff8013346f>{__wake_up+54} <ffffffffa0246cbe>{:dlm:dlm_read+259}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff80177a83>{vfs_read+207} <ffffffff80177cda>{sys_read+69}
       <ffffffff801101c6>{system_call+126}
clvmd         S 000000000000002b     0  3638      1          3744  3637 (NOTLB)
000001021b215d98 0000000000000002 0000010037e48030 0000000000000003
       0000010214f3b030 0000000000000002 0000000000000012 0000000180131d1d
       000001021b85d030 00000000000046be
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff8016a7f6>{find_extend_vma+22}
       <ffffffff80134c2a>{add_wait_queue+18} <ffffffff8014bdce>{do_futex+413}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8014c1b7>{sys_futex+203} <ffffffff801101c6>{system_call+126}

dlm_astd      S 00000100dfdcac00     0  3631     10          3633   259 (L-TLB)
000001021b4f7ec8 0000000000000046 0000010037e48030 0000001900000069
       000001021c335030 0000000000000069 0000010008769940 0000000180133419
       000001021c335030 0000000000000149
Call Trace:<ffffffff8013f378>{__mod_timer+293} <ffffffffa0245a43>{:dlm:dlm_astd+314}
       <ffffffffa0245909>{:dlm:dlm_astd+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
dlm_recvd     S ffffffff8014aabc     0  3632     12          3634   258 (L-TLB)
000001021ae95ef8 0000000000000046 ffffffffffffffff 0000000000000000
       0000000000000246 ffffffffa024d68d 0000010008761940 0000000000000048
       000001021a8fd7f0 00000000000002f5
Call Trace:<ffffffffa024d68d>{:dlm:receive_from_sock+866}
<ffffffff8013bed1>{local_bh_enable+30}
       <ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffffa024db31>{:dlm:dlm_recvd+170}
       <ffffffffa024da87>{:dlm:dlm_recvd+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

dlm_sendd     S ffffffff802cac99     0  3633     10          3752  3631 (L-TLB)
000001021a801e98 0000000000000046 0000000000000001 000001021a90c3b0
       0000000000000069 ffffffff802a4cf8 0000000000000069 0000000100000246
       000001021b4317f0 000000000000089b
Call Trace:<ffffffff802a4cf8>{release_sock+16} <ffffffff802cac99>{tcp_sendpage+0}
       <ffffffffa024de3c>{:dlm:dlm_sendd+218} <ffffffffa024dd62>{:dlm:dlm_sendd+0}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

dlm_recoverd  S 00000100087786e0     0  3634     12          3754  3632 (L-TLB)
000001021ac21ea8 0000000000000046 0000000000000000 0000001900000069
       00000102159997f0 0000000000000069 0000010008769940 0000000180131d1d
       000001021a8fd030 0000000000000344
Call Trace:<ffffffffa02458a7>{:dlm:wake_astd+27}
<ffffffffa0254e84>{:dlm:dlm_recoverd+60}
       <ffffffffa0254e48>{:dlm:dlm_recoverd+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
dlm_recoverd  S 0000000000000000     0  3732     13                1837 (L-TLB)
000001021ac01ea8 0000000000000046 0000000000000000 0000001900000074
       000001021d0437f0 0000000000000074 0000010008779940 0000000380131d1d
       000001021f8a97f0 00000000000006b3
Call Trace:<ffffffffa02458a7>{:dlm:wake_astd+27}
<ffffffffa0254e84>{:dlm:dlm_recoverd+60}
       <ffffffffa0254e48>{:dlm:dlm_recoverd+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
lock_dlm1     S 0000000000000000     0  3733     11          3734  3586 (L-TLB)
0000010216211e58 0000000000000046 000001021b4317f0 0000001900000069
       000001021af777f0 0000000000000069 0000010008779940 00000003a028197c
       000001021af777f0 000000000000008d
Call Trace:<ffffffffa02be2f2>{:lock_dlm:dlm_async+218}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffffa02be218>{:lock_dlm:dlm_async+0}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

lock_dlm2     S 0000000000000000     0  3734     11                3733 (L-TLB)
0000010215fdfe58 0000000000000046 0000000000000012 0000001900000069
       0000010215a0b030 0000000000000069 0000010008769940 0000000100000000
       0000010215a0b030 00000000000001b0
Call Trace:<ffffffff8013353c>{complete+53}
<ffffffffa02be2f2>{:lock_dlm:dlm_async+218}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffffa02be218>{:lock_dlm:dlm_async+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_scand     S 0000000000000000     0  3744      1          3745  3638 (L-TLB)
000001021bb41ec8 0000000000000046 0000010037e48030 0000000000000000
       00000101fff29400 0000010215ee3974 0000000000000000 0000000100000000
       0000010215e187f0 0000000000011c8c
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffffa026b6d1>{:gfs:gfs_scand+137}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026b648>{:gfs:gfs_scand+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_glockd    S 00000100087686e0     0  3745      1          3746  3744 (L-TLB)
0000010217063ed8 0000000000000046 0000000000000001 0000001900000069
       000001021c335030 0000000000000069 0000010008779940 00000003140036a8
       000001021bba07f0 0000000000000690
Call Trace:<ffffffffa0276dd0>{:gfs:unlock_on_glock+37}
<ffffffffa026b7d6>{:gfs:gfs_glockd+185}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026b71d>{:gfs:gfs_glockd+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_recoverd  S 0000000000000000     0  3746      1          3747  3745 (L-TLB)
0000010215e87ec8 0000000000000046 ffffffff803cca80 0000000000000000
       00000000000000c2 0000000000000000 dead4ead00000001 0000000015e87e30
       000001021baf9030 00000000000001ad
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa026b897>{:gfs:gfs_recoverd+137}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026b80e>{:gfs:gfs_recoverd+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_logd      S ffffffff80304190     0  3747      1          3748  3746 (L-TLB)
0000010215e99e58 0000000000000046 0000010037e4b7f0 0000000000000246
       0000010218cb7e54 ffffffffa02b6ec0 0000010215e99de8 000000028013353c
       000001021bbb97f0 00000000000001a1
Call Trace:<ffffffffa02768d3>{:gfs:lock_on_glock+112}
<ffffffff8013f378>{__mod_timer+293}
       <ffffffff80305520>{schedule_timeout+244}
<ffffffff8013fda2>{process_timeout+0}
       <ffffffffa026b9df>{:gfs:gfs_logd+252} <ffffffff80110e17>{child_rip+8}
       <ffffffff801cccff>{dummy_d_instantiate+0} <ffffffffa026b8e3>{:gfs:gfs_logd+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_quotad    S ffffffff80304190     0  3748      1          3749  3747 (L-TLB)
0000010215e9beb8 0000000000000046 0000010037e4b7f0 0000001900000074
       000001021c4a77f0 0000000000000074 0000010008771940 0000000200000246
       00000102162007f0 0000000000000238
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffffa026bb42>{:gfs:gfs_quotad+272}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026ba32>{:gfs:gfs_quotad+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_inoded    S 0000000000000000     0  3749      1          3756  3748 (L-TLB)
0000010215e9dec8 0000000000000046 0000010037e4b030 ffffff0010081000
       ffffff00100b97b0 ffffff00100b97b4 0000000000000058 0000000300000058
       0000010218caf030 000000000000064c
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffffa026bc1c>{:gfs:gfs_inoded+137}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026bb93>{:gfs:gfs_inoded+0}
       <ffffffff80110e0f>{child_rip+0}
dlm_recoverd  S 00000100087786e0     0  3752     10          3755  3633 (L-TLB)
000001021550dea8 0000000000000046 0000000000000000 0000001900000069
       0000010214fe87f0 0000000000000069 0000010008769940 0000000100000246
       00000102159997f0 0000000000000336
Call Trace:<ffffffffa02458a7>{:dlm:wake_astd+27}
<ffffffffa0254e84>{:dlm:dlm_recoverd+60}
       <ffffffffa0254e48>{:dlm:dlm_recoverd+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
lock_dlm1     S 0000000000000000     0  3754     12          4273  3634 (L-TLB)
0000010215543e58 0000000000000046 000001021c335030 0000001900000069
       0000010215e18030 0000000000000069 0000010008769940 0000000100000000
       0000010215e18030 0000000000000066
Call Trace:<ffffffffa02be2f2>{:lock_dlm:dlm_async+218}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffffa02be218>{:lock_dlm:dlm_async+0}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

lock_dlm2     S 0000000000000000     0  3755     10                3752 (L-TLB)
0000010215545e58 0000000000000046 0000000000000012 0000001900000069
       0000010215999030 0000000000000069 0000010008761940 0000000000000000
       0000010215999030 000000000000020c
Call Trace:<ffffffff8013353c>{complete+53}
<ffffffffa02be2f2>{:lock_dlm:dlm_async+218}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffffa02be218>{:lock_dlm:dlm_async+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_scand     S 0000000000000000     0  3756      1          3757  3749 (L-TLB)
0000010215549ec8 0000000000000046 0000010037e48030 ffffff00101da000
       0000010214003bb8 ffffffffffffff00 0000000000000000 0000000100000000
       00000100dfdb5030 0000000000038cfa
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffffa026b6d1>{:gfs:gfs_scand+137}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026b648>{:gfs:gfs_scand+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_glockd    S ffffffff80304190     0  3757      1          3758  3756 (L-TLB)
000001021554bed8 0000000000000046 0000010037e4b7f0 ffffffffa02783d9
       0000010213c2b980 0000010204e9ee28 ffffffffa02b6ec0 0000000204e9ef30
       00000102159cd7f0 000000000000049e
Call Trace:<ffffffffa02783d9>{:gfs:gfs_glock_drop_th+290}
<ffffffffa026b7d6>{:gfs:gfs_glockd+185}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026b71d>{:gfs:gfs_glockd+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_recoverd  S 0000000000000000     0  3758      1          3759  3757 (L-TLB)
00000102156bdec8 0000000000000046 0000010037e48030 000000000000000d
       0000000000000002 0000000000000000 dead4ead00000001 00000001156bde30
       000001021f5d4030 00000000000001bf
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa026b897>{:gfs:gfs_recoverd+137}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026b80e>{:gfs:gfs_recoverd+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_logd      S ffffffff80304190     0  3759      1          3760  3758 (L-TLB)
00000102156cfe58 0000000000000046 0000010037e4b7f0 0000000000000246
       00000102159fde54 ffffffffa02b6ec0 00000102156cfde8 000000028013353c
       000001021bbb9030 00000000000001ab
Call Trace:<ffffffffa02768d3>{:gfs:lock_on_glock+112}
<ffffffff8013f378>{__mod_timer+293}
       <ffffffff80305520>{schedule_timeout+244}
<ffffffff8013fda2>{process_timeout+0}
       <ffffffffa026b9df>{:gfs:gfs_logd+252} <ffffffff80110e17>{child_rip+8}
       <ffffffff801cccff>{dummy_d_instantiate+0} <ffffffffa026b8e3>{:gfs:gfs_logd+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_quotad    S 0000000000000000     0  3760      1          3761  3759 (L-TLB)
00000102156d1eb8 0000000000000046 0000010037e48030 0000001900000074
       000001021c4a77f0 0000000000000074 0000010008779940 00000001055542d5
       0000010216200030 0000000000000245
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffffa026bb42>{:gfs:gfs_quotad+272}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026ba32>{:gfs:gfs_quotad+0}
       <ffffffff80110e0f>{child_rip+0}
gfs_inoded    S 0000000000000000     0  3761      1          3780  3760 (L-TLB)
00000102156d5ec8 0000000000000046 0000010037e48030 0000001900000074
       000001021d0437f0 0000000000000074 0000000000000000 0000000110212620
       000001021f8c1030 000000000000073d
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffffa026bc1c>{:gfs:gfs_inoded+137}
       <ffffffff80110e17>{child_rip+8} <ffffffffa026bb93>{:gfs:gfs_inoded+0}
       <ffffffff80110e0f>{child_rip+0}
clurgmgrd     S 000000000000000b     0  3780      1          4038  3761 (NOTLB)
0000010215249d78 0000000000000006 ffffffffa02649d0 0000000000000000
       000000d0a02649d8 0000000000000246 0000000000000246 0000000300012780
       000001021bba0030 0000000000001654
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffffa0246f6d>{:dlm:dlm_poll+86}
       <ffffffff80189c5f>{do_select+939} <ffffffff801897f9>{__pollwait+0}
       <ffffffff80189fde>{sys_select+820} <ffffffff80191570>{dnotify_parent+34}
       <ffffffff801101c6>{system_call+126}
clurgmgrd     S ffffffff80304190     0  4267      1          4400  4204 (NOTLB)
00000102153e1d78 0000000000000006 0000010037e4b7f0 0000001900000074
       000000d01d0437f0 0000000000000246 0000000000000246 0000000200012780
       0000010218caf7f0 0000000000000762
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff8018d475>{dput+56} <ffffffff801101c6>{system_call+126}

rpc.statd     S 0000000000000007     0  4038      1          4082  3780 (NOTLB)
0000010214dd3d78 0000000000000002 000001021d695980 ffffffff802a2390
       000000d000000001 0000000000000246 0000000000000246 0000000100012780
       0000010214c087f0 0000000000005f4a
Call Trace:<ffffffff802a2390>{sock_recvmsg+284}
<ffffffff80305491>{schedule_timeout+101}
       <ffffffff802ca87f>{tcp_poll+44} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff8018d475>{dput+56} <ffffffff801101c6>{system_call+126}

rpc.statd     S 00000100087706e0     0  4082      1          4145  4038 (NOTLB)
0000010214f83d78 0000000000000002 0000000000000000 0000001900000073
       000001021f5d47f0 0000000000000073 0000010008761940 0000000000012780
       0000010214cc2030 0000000000029b38
Call Trace:<ffffffff80134df2>{autoremove_wake_function+0}
<ffffffff80305491>{schedule_timeout+101}
       <ffffffff802ca87f>{tcp_poll+44} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff8018d4fa>{dput+189} <ffffffff801101c6>{system_call+126}

nfsd          S 000000000036ee80     0  4145      1          4146  4082 (L-TLB)
0000010214f89de8 0000000000000046 0000010037e4b7f0 0000001900000073
       000001021f5d47f0 0000000000000073 0000010008771940 00000002dfdaa800
       0000010214dc97f0 00000000000001e1
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa01868ea>{:sunrpc:svc_recv+786}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e0479>{:nfsd:nfsd+381}
       <ffffffff8013212e>{schedule_tail+55} <ffffffff80110e17>{child_rip+8}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffff80110e0f>{child_rip+0}
nfsd          S 000000000036ee80     0  4146      1          4147  4145 (L-TLB)
000001021483fde8 0000000000000046 ffffffff803cca80 0000000000000001
       00000000000005a8 ffffffff802d7074 0000000100000000 00000000f835fbb0
       00000102152557f0 00000000000002c6
Call Trace:<ffffffff802d7074>{tcp_write_xmit+132}
<ffffffff802d802b>{tcp_send_fin+522}
       <ffffffff8013f378>{__mod_timer+293} <ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa01868ea>{:sunrpc:svc_recv+786}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e0479>{:nfsd:nfsd+381}
       <ffffffff8013212e>{schedule_tail+55} <ffffffff80110e17>{child_rip+8}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffff80110e0f>{child_rip+0}
nfsd          S 000000000036ee80     0  4147      1          4148  4146 (L-TLB)
0000010214e27de8 0000000000000046 0000010037e4b7f0 0000000000000074
       0000000000007530 00000100082bf9f8 0000000800000003 00000002dfd60400
       000001021f5d47f0 00000000000001ab
Call Trace:<ffffffffa0185c1f>{:sunrpc:svc_sendto+264}
<ffffffff8013f378>{__mod_timer+293}
       <ffffffff80305520>{schedule_timeout+244}
<ffffffff8013fda2>{process_timeout+0}
       <ffffffffa01868ea>{:sunrpc:svc_recv+786}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffffa02e0479>{:nfsd:nfsd+381} <ffffffff8013212e>{schedule_tail+55}
       <ffffffff80110e17>{child_rip+8} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffff80110e0f>{child_rip+0}

nfsd          S 000000000036ee80     0  4148      1          4149  4147 (L-TLB)
000001021480fde8 0000000000000046 0000010037e4b030 0000001900000073
       000001021d594030 0000000000000073 0000010008779940 00000003dfd60800
       00000102159f2030 000000000000028d
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa01868ea>{:sunrpc:svc_recv+786}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e0479>{:nfsd:nfsd+381}
       <ffffffff8013212e>{schedule_tail+55} <ffffffff80110e17>{child_rip+8}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffff80110e0f>{child_rip+0}
nfsd          S 00000100087706e0     0  4149      1          4150  4148 (L-TLB)
0000010214fd5de8 0000000000000046 ffffffff803cca80 0000001900000073
       00000102159cd030 0000000000000073 0000010008761940 0000000000000002
       0000010214d777f0 0000000000000274
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa01868ea>{:sunrpc:svc_recv+786}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e0479>{:nfsd:nfsd+381}
       <ffffffff8013212e>{schedule_tail+55} <ffffffff80110e17>{child_rip+8}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffff80110e0f>{child_rip+0}
nfsd          S 00000100087606e0     0  4150      1          4151  4149 (L-TLB)
0000010214f65de8 0000000000000046 0000010037e4b7f0 0000001900000073
       000001021d594030 0000000000000073 0000010008771940 00000002dfdb8400
       00000102159f27f0 000000000000028b
Call Trace:<ffffffff8013f378>{__mod_timer+293}
<ffffffff80305520>{schedule_timeout+244}
       <ffffffff8013fda2>{process_timeout+0}
<ffffffffa01868ea>{:sunrpc:svc_recv+786}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e0479>{:nfsd:nfsd+381}
       <ffffffff8013212e>{schedule_tail+55} <ffffffff80110e17>{child_rip+8}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffff80110e0f>{child_rip+0}
nfsd          S 000000000036ee80     0  4151      1          4152  4150 (L-TLB)
0000010214e39de8 0000000000000046 00000102159f27f0 0000000000000028
       0000000000007530 00000100081d4890 00000100dfdb8800 00000002dfdb8800
       000001021d594030 000000000000017d
Call Trace:<ffffffffa0185c1f>{:sunrpc:svc_sendto+264}
<ffffffff8013f378>{__mod_timer+293}
       <ffffffff80305520>{schedule_timeout+244}
<ffffffff8013fda2>{process_timeout+0}
       <ffffffffa01868ea>{:sunrpc:svc_recv+786}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffffa02e0479>{:nfsd:nfsd+381} <ffffffff8013212e>{schedule_tail+55}
       <ffffffff80110e17>{child_rip+8} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffff80110e0f>{child_rip+0}

nfsd          S 000000000036ee80     0  4152      1          4154  4151 (L-TLB)
0000010214f8fde8 0000000000000046 0000010214d777f0 0000000000001000
       0000000000007530 0000010008407d80 dead4ead00000001 0000000000001000
       00000102159cd030 000000000000017a
Call Trace:<ffffffffa0185c89>{:sunrpc:svc_sendto+370}
<ffffffff8013f378>{__mod_timer+293}
       <ffffffff80305520>{schedule_timeout+244}
<ffffffff8013fda2>{process_timeout+0}
       <ffffffffa01868ea>{:sunrpc:svc_recv+786}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffffa02e0479>{:nfsd:nfsd+381} <ffffffff8013212e>{schedule_tail+55}
       <ffffffff80110e17>{child_rip+8} <ffffffffa02e02fc>{:nfsd:nfsd+0}
       <ffffffffa02e02fc>{:nfsd:nfsd+0} <ffffffff80110e0f>{child_rip+0}

lockd         S 7fffffffffffffff     0  4154      1          4155  4152 (L-TLB)
0000010214f59e18 0000000000000046 00000000000005a8 ffffffff802a55c3
       000001021a90cfb0 ffffffff802d727c 0000000100000001 000000001a90cfb0
       0000010214c08030 0000000000000a0a
Call Trace:<ffffffff802a55c3>{sk_reset_timer+15}
<ffffffff802d727c>{tcp_write_xmit+652}
       <ffffffff802d802b>{tcp_send_fin+522} <ffffffff802a4cf8>{release_sock+16}
       <ffffffff80305491>{schedule_timeout+101}
<ffffffff80134c2a>{add_wait_queue+18}
       <ffffffffa01868ea>{:sunrpc:svc_recv+786}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffffa02cb088>{:lockd:lockd+0}
       <ffffffffa02cb1fa>{:lockd:lockd+370} <ffffffff80110e17>{child_rip+8}
       <ffffffffa02cb088>{:lockd:lockd+0} <ffffffffa02cb088>{:lockd:lockd+0}
       <ffffffff80110e0f>{child_rip+0}
rpciod        S ffffffffa02cb088     0  4155      1          4158  4154 (L-TLB)
0000010214c7bec8 0000000000000046 0000010204d94e00 0000000000000073
       000001021c449400 ffffffffa0182ebb 00004874cb160e8e 000000020e1bb7f0
       0000010214f3b7f0 00000000000082ce
Call Trace:<ffffffffa0182ebb>{:sunrpc:__rpc_execute+867}
<ffffffffa02cb088>{:lockd:lockd+0}
       <ffffffffa0183489>{:sunrpc:rpciod+483}
<ffffffff80134df2>{autoremove_wake_function+0}
       <ffffffff80134df2>{autoremove_wake_function+0}
<ffffffff80110e17>{child_rip+8}
       <ffffffffa02cb088>{:lockd:lockd+0} <ffffffffa01832a6>{:sunrpc:rpciod+0}
       <ffffffff80110e0f>{child_rip+0}
rpc.mountd    S 0000000000000008     0  4158      1          4197  4155 (NOTLB)
0000010214801d78 0000000000000002 00000100087606e0 ffffffff80304a85
       000000d014801d98 0000000000000246 0000000000000246 0000000200012780
       0000010214fe8030 0000000000006fc9
Call Trace:<ffffffff80304a85>{thread_return+0}
<ffffffff80305491>{schedule_timeout+101}
       <ffffffff802ca87f>{tcp_poll+44} <ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff8018d4fa>{dput+189} <ffffffff801101c6>{system_call+126}

rpc.rquotad   S 000001021bb17b80     0  4197      1          4204  4158 (NOTLB)
00000102153a7e88 0000000000000002 000000552abc1516 0000001900000073
       000001021524c030 0000000000000073 0000010008771940 0000000200012780
       0000010215a0b7f0 000000000001b3f2
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff80134c2a>{add_wait_queue+18}
       <ffffffff802ca87f>{tcp_poll+44} <ffffffff8018a399>{sys_poll+610}
       <ffffffff801897f9>{__pollwait+0} <ffffffff801101c6>{system_call+126}

rpc.mountd    S 0000000000000008     0  4204      1          4267  4197 (NOTLB)
0000010214e09d78 0000000000000002 0000010214dc97f0 00000102144b0680
       000000d000000000 0000000000000246 0000000000000246 0000000000012780
       0000010214cc27f0 0000000000002cd0
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff80189c5f>{do_select+939}
       <ffffffff801897f9>{__pollwait+0} <ffffffff80189fde>{sys_select+820}
       <ffffffff80191570>{dnotify_parent+34} <ffffffff801101c6>{system_call+126}

dlm_recoverd  S ffffffff80304190     0  4273     12          5373  3754 (L-TLB)
0000010214f75ea8 0000000000000046 0000000000000000 0000000000000026
       00000000fffffffb 0000000000000000 00000100087686e0 0000000100000002
       0000010214fe87f0 0000000000001a84
Call Trace:<ffffffffa02458a7>{:dlm:wake_astd+27}
<ffffffffa0254e84>{:dlm:dlm_recoverd+60}
       <ffffffffa0254e48>{:dlm:dlm_recoverd+0}
<ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014aa93>{kthread+200} <ffffffff80110e17>{child_rip+8}
       <ffffffff8014aabc>{keventd_create_kthread+0} <ffffffff8014a9cb>{kthread+0}
       <ffffffff80110e0f>{child_rip+0}
rdisc         S 0000000000000000     0  4400      1          5794  4267 (NOTLB)
0000010214eafb28 0000000000000006 0000010037e48030 0000001900000074
       000001021c4a77f0 0000000000000074 0000010008779940 0000000114eafc5c
       000001021b5f87f0 00000000000009a0
Call Trace:<ffffffff80305491>{schedule_timeout+101}
<ffffffff80134d4a>{prepare_to_wait_exclusive+21}
       <ffffffff802a7d55>{skb_recv_datagram+373}
<ffffffff80134df2>{autoremove_wake_function+0}
       <ffffffff80134df2>{autoremove_wake_function+0}
<ffffffff802e291f>{raw_recvmsg+134}
       <ffffffff802a586d>{sock_common_recvmsg+48}
<ffffffff802a2390>{sock_recvmsg+284}
       <ffffffff80140ca6>{dequeue_signal+58}
<ffffffff80134df2>{autoremove_wake_function+0}
       <ffffffff802a1f93>{sockfd_lookup+16} <ffffffff802a37c3>{sys_recvfrom+182}
       <ffffffff8013f378>{__mod_timer+293} <ffffffff801101c6>{system_call+126}

pdflush       S ffffffff8014aabc     0  5374     12                5373 (L-TLB)
000001013c917ec8 0000000000000046 00000101fffddc88 00000000000120e9
       0000000105781523 0000010149af1d98 00000000fffffffc 0000000149af1d88
       0000010214d77030 00000000000005fc
Call Trace:<ffffffff8014aabc>{keventd_create_kthread+0}
<ffffffff8015ed30>{pdflush+191}
       <ffffffff8015ec71>{pdflush+0} <ffffffff8014aa93>{kthread+200}
       <ffffffff80110e17>{child_rip+8} <ffffffff8014aabc>{keventd_create_kthread+0}
       <ffffffff8014a9cb>{kthread+0} <ffffffff80110e0f>{child_rip+0}

bash          S 0000007fbffff7b4     0  5484   5482  5793               (NOTLB)
00000101250fbeb8 0000000000000002 0000000102d92000 0000000000000001
       0000000000000016 0000010202d93e48 0000010215ce3600 0000000173ed7018
       000001000e100030 0000000000003a86
Call Trace:<ffffffff801e86b9>{__up_read+16} <ffffffff8012370b>{do_page_fault+575}
       <ffffffff8013af47>{do_wait+3298} <ffffffff801333c8>{default_wake_function+0}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffff801101c6>{system_call+126}

mkdir         D 0000000000000000     0  5793   5484                     (NOTLB)
0000010202d93b58 0000000000000006 00000100dfddfae8 000001021ed2a040
       00000100dfddfae8 00000100087786e0 ffffffff80482480 000000000b7d0a10
       000001021d709030 000000000000534a
Call Trace:<ffffffffa02ba313>{:lock_dlm:queue_delayed+30}
<ffffffffa02ba8d7>{:lock_dlm:do_dlm_lock+93}
       <ffffffff80178ad2>{bh_wake_function+0}
<ffffffff80304cbd>{wait_for_completion+167}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffffa028193f>{:gfs:gfs_lm_lock+50}
       <ffffffff801333c8>{default_wake_function+0}
<ffffffffa02774b1>{:gfs:glock_wait_internal+350}
       <ffffffffa0277ce6>{:gfs:gfs_glock_nq+961}
<ffffffffa027bef9>{:gfs:gfs_createi+107}
       <ffffffffa027b8bc>{:gfs:gfs_lookupi+443} <ffffffff8013353c>{complete+53}
       <ffffffffa02768d3>{:gfs:lock_on_glock+112}
<ffffffffa028ed71>{:gfs:gfs_mkdir+103}
       <ffffffffa028fc65>{:gfs:gfs_permission+475} <ffffffff80186722>{vfs_mkdir+200}
       <ffffffff80186806>{sys_mkdir+154} <ffffffff80110c61>{error_exit+0}
       <ffffffff801101c6>{system_call+126}
agetty        S 0000000000000001     0  5794      1                4400 (NOTLB)
000001021f213d88 0000000000000002 ffffffff801333c8 0000000000000000
       0000000000000000 0000000000000246 000001021f213d38 0000000200000206
       000001021fa077f0 000000000000ffbc
Call Trace:<ffffffff801333c8>{default_wake_function+0}
<ffffffff8023db19>{uart_start+38}
       <ffffffff80305491>{schedule_timeout+101}
<ffffffff80134c2a>{add_wait_queue+18}
       <ffffffff80229768>{read_chan+1059}
<ffffffff801333c8>{default_wake_function+0}
       <ffffffff8013346f>{__wake_up+54} <ffffffff801333c8>{default_wake_function+0}
       <ffffffff802243dd>{tty_read+230} <ffffffff80177a83>{vfs_read+207}
       <ffffffff80177cda>{sys_read+69} <ffffffff801101c6>{system_call+126}

(some cut due to space limitations)

That shows the mkdir is hung in do_dlm_lock, so there must also be some
locking issues related to this problem.  Unfortunately, I couldn't
do any more analysis because the machine wouldn't let me touch the fs.
Therefore, I asked Corey to reboot it.  I think we need to add some kind
of debug code, possibly to vfs's dcache code, before recreating it again.
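
Picking the blocked task out of a long task dump like the one above is easier with a small filter. The following is only a sketch: it assumes each task header line has the layout seen in this dump (name, one-letter state, 16 hex digits, a numeric column, then the PID), and `find_blocked_tasks` is a hypothetical helper name, not part of any existing tool.

```python
import re

# Task header layout assumed from the dump above:
# name, state letter, 16 hex digits, one numeric column, then the PID.
TASK_HEADER = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]{16}\s+\d+\s+(?P<pid>\d+)\b"
)

def find_blocked_tasks(dump_text):
    """Return (name, pid) for every task in uninterruptible sleep (state D)."""
    blocked = []
    for line in dump_text.splitlines():
        match = TASK_HEADER.match(line)
        if match and match.group("state") == "D":
            blocked.append((match.group("name"), int(match.group("pid"))))
    return blocked

sample = """\
bash          S 0000007fbffff7b4     0  5484   5482  5793               (NOTLB)
mkdir         D 0000000000000000     0  5793   5484                     (NOTLB)
agetty        S 0000000000000001     0  5794      1                4400 (NOTLB)
"""
print(find_blocked_tasks(sample))
```

On the three header lines copied from the dump, only the mkdir task is reported, which matches the observation that mkdir is the hung process.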


Comment 7 Robert Peterson 2006-09-27 16:23:46 UTC
I finally recreated this problem by running the nfs_try test case
using three servers (roth-01,02,03) and four nfs clients 
(trin-13,14,15,16).  This was recreated at RHEL4U4.  Here's what I
observed:

The "incorrect" node was roth-01 and it did not see two files
/mnt/roth0/trin-16/lock1_small and /mnt/roth0/trin-16/rwranbuflarge.
The other two server nodes could see these files correctly.

Once again, I verified that the inodes and directory entries 
were okay in the gfs file system on disk.

The "incorrect" node reported this error from strace ls:
stat("/mnt/roth0/trin-16/lock1_small", 0x5164e8) = -1 ENOENT (No such file or directory)
lstat("/mnt/roth0/trin-16/lock1_small", 0x5164e8) = -1 ENOENT (No such file or directory)

An attempt to open the file on the "incorrect" node produced:
open("/mnt/roth0/trin-16/lock1_small", O_WRONLY|O_NONBLOCK|O_CREAT|O_NOCTTY, 0666) = -1 EPERM (Operation not permitted)

I did a gfs_tool lockdump on the fs from all nodes in the cluster
and discovered two glock entries for the "missing" file's inode
that had the same values on all nodes in the cluster:

Glock (2, 90145100)
  gl_flags = 
  gl_count = 2
  gl_state = 0
  req_gh = no
  req_bh = no
  lvb_count = 0
  object = yes
  new_le = no
  incore_le = no
  reclaim = no
  aspace = 0
  ail_bufs = no
  Inode:
    num = 90145100/90145100
    type = 1
    i_count = 1
    i_flags = 
    vnode = yes

and

Glock (5, 90145100)
  gl_flags = 
  gl_count = 2
  gl_state = 3
  req_gh = no
  req_bh = no
  lvb_count = 0
  object = yes
  new_le = no
  incore_le = no
  reclaim = no
  aspace = no
  ail_bufs = no
  Holder
    owner = -1
    gh_state = 3
    gh_flags = 5 7 
    error = 0
    gh_iflags = 1 6 7 

The glocks for the subdirectory in question were also identical
on all nodes:

Glock (2, 90047399)
  gl_flags = 
  gl_count = 3
  gl_state = 3
  req_gh = no
  req_bh = no
  lvb_count = 0
  object = yes
  new_le = no
  incore_le = no
  reclaim = no
  aspace = 1
  ail_bufs = no
  Inode:
    num = 90047399/90047399
    type = 2
    i_count = 1
    i_flags = 
    vnode = yes

and

Glock (5, 90047399)
  gl_flags = 
  gl_count = 2
  gl_state = 3
  req_gh = no
  req_bh = no
  lvb_count = 0
  object = yes
  new_le = no
  incore_le = no
  reclaim = no
  aspace = no
  ail_bufs = no
  Holder
    owner = -1
    gh_state = 3
    gh_flags = 5 7 
    error = 0
    gh_iflags = 1 6 7 

I had earlier thought that this might be locking related, but
I would have expected different glock states if that were the case.
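
Comparing lockdump output between nodes by eye is error-prone, so here is a rough sketch of doing it mechanically. It assumes the dump text has the shape shown above (a `Glock (type, number)` header followed by indented `key = value` lines, with bare words such as `Inode:` or `Holder` opening a sub-section). `parse_glocks` and `differing_fields` are hypothetical helpers, and a glock with several holders would need more care than this.

```python
import re

GLOCK_HEADER = re.compile(r"^Glock \((\d+), (\d+)\)")

def parse_glocks(dump_text):
    """Map (type, number) -> {field: value} for each glock in a dump."""
    glocks = {}
    current = None
    section = ""
    for line in dump_text.splitlines():
        header = GLOCK_HEADER.match(line)
        if header:
            glock_id = (int(header.group(1)), int(header.group(2)))
            current = glocks.setdefault(glock_id, {})
            section = ""
        elif current is None or not line.strip():
            continue
        elif "=" in line:
            key, _, value = line.partition("=")
            current[section + key.strip()] = value.strip()
        else:
            # A bare word such as "Inode:" or "Holder" opens a sub-section.
            section = line.strip().rstrip(":") + "."
    return glocks

def differing_fields(dump_a, dump_b, glock_id):
    """Return {field: (value_a, value_b)} for fields that differ."""
    fields_a = parse_glocks(dump_a).get(glock_id, {})
    fields_b = parse_glocks(dump_b).get(glock_id, {})
    return {
        key: (fields_a.get(key), fields_b.get(key))
        for key in sorted(fields_a.keys() | fields_b.keys())
        if fields_a.get(key) != fields_b.get(key)
    }

roth_01 = """\
Glock (2, 90145100)
  gl_flags =
  gl_count = 2
  gl_state = 0
  Inode:
    num = 90145100/90145100
    type = 1
"""
# Made-up difference, only to show what a mismatch would look like.
other = roth_01.replace("gl_state = 0", "gl_state = 3")

print(differing_fields(roth_01, roth_01, (2, 90145100)))
print(differing_fields(roth_01, other, (2, 90145100)))
```

An empty result for a glock corresponds to the "same values on all nodes" observation above.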

An ls -lad on the directory that contains the "missing" files showed
an odd discrepancy between the nodes.  The link count was zero on the
incorrect node:

roth-01 (incorrect):
drwxr-xr-x    0 root root 2048 Sep 26 09:14 trin-16

roth-02 (correct):
drwxr-xr-x    2 root root 2048 Sep 26 09:17 trin-16

I did stat on the "missing" file from both good and bad nodes:
Incorrect:
[root@roth-01 /mnt/roth0/trin-16]# stat /mnt/roth0/trin-16/lock1_small
stat: cannot stat `/mnt/roth0/trin-16/lock1_small': No such file or directory

Correct:
[root@roth-02 /mnt/roth0/trin-16]# stat /mnt/roth0/trin-16/lock1_small
  File: `/mnt/roth0/trin-16/lock1_small'
  Size: 512000          Blocks: 1008       IO Block: 4096   regular file
Device: fc00h/64512d    Inode: 90145100    Links: 1
Access: (0666/-rw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2006-09-26 09:17:10.000000000 -0500
Modify: 2006-09-26 09:17:16.000000000 -0500
Change: 2006-09-26 09:17:16.000000000 -0500

Next I did the stat command on the directory from the nodes.
Incorrect:

[root@roth-01 /mnt/roth0/trin-16]# stat /mnt/roth0/trin-16
  File: `/mnt/roth0/trin-16'
  Size: 2048            Blocks: 24         IO Block: 4096   directory
Device: fc00h/64512d    Inode: 104775835   Links: 0
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2006-09-27 09:21:06.000000000 -0500
Modify: 2006-09-26 09:14:23.000000000 -0500
Change: 2006-09-26 09:14:23.000000000 -0500

Note that the inode 104775835 listed for the directory is wrong.
That inode (0x63ec09b) is a directory inode, but not a directory 
off of the root.  I suspect it may be a deleted directory, but
I'm still investigating.

The correct information from stat of the directory from a good 
node shows:
[root@roth-02 /mnt/roth0/trin-16]# stat /mnt/roth0/trin-16
  File: `/mnt/roth0/trin-16'
  Size: 2048            Blocks: 24         IO Block: 4096   directory
Device: fc00h/64512d    Inode: 90047399    Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2006-09-27 10:25:39.000000000 -0500
Modify: 2006-09-26 09:17:10.000000000 -0500
Change: 2006-09-26 09:17:10.000000000 -0500

This shows the correct directory inode, 90047399 (0x055e03a7),
as shown on disk.
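
Since stat prints inode numbers in decimal while on-disk structures are usually read in hex, converting between the two is a recurring step in this kind of comparison. A minimal sketch using only Python built-ins (the variable names are just labels for the two values reported above):

```python
# Inode numbers as printed by stat in this comment (decimal).
stale_inode = 104775835   # reported by roth-01
good_inode = 90047399     # reported by roth-02

for label, inum in (("stale", stale_inode), ("good", good_inode)):
    print(f"{label}: {inum} = {inum:#x}")

# Going the other way, from a hex value seen on disk back to decimal:
print(int("055e03a7", 16))
```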

I suspect I'll have to temporarily add debug code to the gfs lookup
code and recreate this again to figure out what's going on.
It took many runs of the test to recreate this problem.


Comment 8 Robert Peterson 2007-01-03 18:03:02 UTC
I recreated this problem again today after trying for many weeks.
This is exactly what I did to recreate it:

1. I've got a three-node NFS server cluster (roth-01/2/3), all 
   serving the same GFS file system, mounted as /mnt/roth0.
2. On the same cluster, there is a virtual IP, xx.xx.xx.250.
   I occasionally move the virtual IP between two of the three
   nfs servers by running the following script on roth-01:

#!/bin/bash
for ((a=1; a>0; a++))
do
        clusvcadm -r xx.xx.xx.250 -m roth-02
        sleep 20
        clusvcadm -r xx.xx.xx.250 -m roth-01
        sleep 20
done
exit 0

3. I've got a five-node NFS client cluster (trin-12,13,14,15,16), 
   all of which are mounting the xx.xx.xx.250:/mnt/roth0 mount point.
   This mounting happens automatically when I run the nfs_try test
   case from trin-12.  I'm using the following command to run the test:

./nfs_try -R /usr/tests/sts/var/share/resource_files/roth.xml -n nfs1 \
        -C /usr/tests/sts/var/share/resource_files/roth.xml -S TRIVIAL

The "TRIVIAL" test case boils down to a script that runs a bunch
of the test suite programs for 1 second, but the genesis program is
run with -d 100 to create 100 directories.

When this iteration of the test finished, roth-01 exhibited the 
symptoms and couldn't see anything inside directories /mnt/roth0/trin-12
and /mnt/roth0/trin-14, even though the test case ended
successfully.

A stat on the "incorrect" directory indicated the wrong inode, as before:

[root@roth-01 ~]# stat /mnt/roth0/trin-12
  File: `/mnt/roth0/trin-12'
  Size: 2048            Blocks: 24         IO Block: 4096   directory
Device: fc00h/64512d    Inode: 179012524   Links: 0
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2007-01-03 10:50:23.000000000 -0600
Modify: 2007-01-03 11:00:50.000000000 -0600
Change: 2007-01-03 11:00:50.000000000 -0600

179012524 in hex is 0xaab83ac, which corresponds to a directory that 
was deleted by the gfs kernel code running on roth-02.  This is output
from some debug code I added to roth-02 for this problem:

GFS mkdir(trin-15) inum=aac9a57 rc=0
GFS rmdir(trin-16) inum=aab83ab rc=0
GFS rmdir(trin-14) inum=aab83a9 rc=0
GFS mkdir(trin-16) inum=aac9a58 rc=0
GFS mkdir(trin-14) inum=aac9a59 rc=0
GFS rmdir(trin-12) inum=aab83ac rc=0
GFS mkdir(trin-12) inum=aac9a5a rc=0

The last rmdir shown is for the directory that roth-01 still thinks
is valid.  The trin-12 directory created immediately thereafter on 
roth-02, 0xaac9a5a (179083866 decimal) shows up as the good inode on
the other nodes:

[root@roth-02 ~]# stat /mnt/roth0/trin-12
  File: `/mnt/roth0/trin-12'
  Size: 2048            Blocks: 40         IO Block: 4096   directory
Device: fc00h/64512d    Inode: 179083866   Links: 101
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2007-01-03 11:00:50.000000000 -0600
Modify: 2007-01-03 11:02:00.000000000 -0600
Change: 2007-01-03 11:02:00.000000000 -0600

So we've definitely got a problem with the GFS kernel rmdir (when done
through NFS?) not being seen from the other node when a mkdir is done
afterward with the same name in its place.
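
The cross-check between the stale inode and the debug output can also be done mechanically. The sketch below assumes the debug lines have exactly the form shown above (`GFS <op>(<name>) inum=<hex> rc=<n>`); that format comes from the temporary debug code, not from stock GFS, and `parse_debug_log` and `classify_inode` are hypothetical helpers.

```python
import re

DEBUG_LINE = re.compile(
    r"GFS (?P<op>mkdir|rmdir)\((?P<name>[^)]+)\) "
    r"inum=(?P<inum>[0-9a-f]+) rc=(?P<rc>-?\d+)"
)

def parse_debug_log(text):
    """Return (op, name, inode_number) for every operation with rc=0."""
    events = []
    for match in DEBUG_LINE.finditer(text):
        if int(match.group("rc")) == 0:
            events.append((match.group("op"), match.group("name"),
                           int(match.group("inum"), 16)))
    return events

def classify_inode(events, name, inum):
    """Say whether inum is the live inode for name, a removed one, or unknown."""
    live = None
    seen = False
    for op, event_name, event_inum in events:
        if event_name != name:
            continue
        seen = seen or event_inum == inum
        # mkdir makes its inode the live one; rmdir leaves the name empty.
        live = event_inum if op == "mkdir" else None
    if live == inum:
        return "live"
    return "removed" if seen else "unknown"

log = """\
GFS mkdir(trin-15) inum=aac9a57 rc=0
GFS rmdir(trin-16) inum=aab83ab rc=0
GFS rmdir(trin-14) inum=aab83a9 rc=0
GFS mkdir(trin-16) inum=aac9a58 rc=0
GFS mkdir(trin-14) inum=aac9a59 rc=0
GFS rmdir(trin-12) inum=aab83ac rc=0
GFS mkdir(trin-12) inum=aac9a5a rc=0
"""
events = parse_debug_log(log)
print(classify_inode(events, "trin-12", 179012524))  # inode seen on roth-01
print(classify_inode(events, "trin-12", 179083866))  # inode seen on roth-02
```

Feeding it the decimal inode numbers from the two stat outputs classifies the roth-01 value as removed and the roth-02 value as live.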


Comment 9 Robert Peterson 2007-01-05 22:11:42 UTC
Adding Wendy Cheng to the cc list because I'd like her to review my
findings and code changes.

I was able to determine through debug code that the dentry
revalidate code was not getting called for the "missing" directory
by VFS.  Since revalidate wasn't getting called, VFS never asked 
GFS if the dentry was still valid.  Instead, VFS thought the dentry
was okay and used the inode value it found for the dentry in memory.

In fact, the inode had been changed on disk by a process on a 
different node (the genesis program from the test case), which had 
done rmdir/mkdir through nfs.  So a simple revalidate would have
corrected the problem.

The first big question in my mind was: How did GFS manage to get a
VFS directory entry (dentry) that refused to call revalidate?
The revalidate callback is set whenever a GFS lookup is done
through gfs_lookup, which is pretty much all the time.  To be more
specific, it happens whenever an inode is looked up by name (and
thus the directory path is searched).

The obvious case where lookups don't need to be done by name is in 
ops_export.c, which has the hooks for exporting GFS to NFS.
Because of its nature, NFS can look up inodes by a "cookie" 
identifier rather than a file name.

Inside ops_export.c, there are two places where it can create a
dentry by calling d_alloc_anon.  A dentry created that way never goes
through gfs_lookup, so it never gets the revalidate callback.  So my
theory is that the missing revalidate callback was causing the problem.
It also explained 
why the error only showed up when exporting GFS through NFS.  
The problem is, it's a hard theory to test.
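
The suspected mechanism can be illustrated with a toy model. This is not kernel code: `Dentry`, `DentryCache`, `add_without_callback`, and the dictionary standing in for the on-disk directory are all invented for illustration, and the model ignores locking entirely. It only shows why a cached entry with no revalidate hook keeps returning the old inode after another node has done rmdir/mkdir.

```python
class Dentry:
    def __init__(self, name, inode, revalidate=None):
        self.name = name
        self.inode = inode
        self.revalidate = revalidate   # None models a dentry with no callback


class DentryCache:
    """Per-node name cache in front of a shared 'disk' directory (a dict)."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def _still_valid(self, dentry):
        return self.disk.get(dentry.name) == dentry.inode

    def add_without_callback(self, name, inode):
        # Models a dentry created outside the by-name lookup path.
        self.cache[name] = Dentry(name, inode, revalidate=None)

    def lookup_by_name(self, name):
        dentry = self.cache.get(name)
        if dentry is not None:
            # Only dentries that carry a callback are checked against the disk.
            if dentry.revalidate is None or dentry.revalidate(dentry):
                return dentry.inode
            del self.cache[name]
        inode = self.disk[name]
        self.cache[name] = Dentry(name, inode, revalidate=self._still_valid)
        return inode


disk = {"trin-12": 0xaab83ac}
no_callback_node = DentryCache(disk)
no_callback_node.add_without_callback("trin-12", disk["trin-12"])
callback_node = DentryCache(disk)
callback_node.lookup_by_name("trin-12")   # cached with a callback

disk["trin-12"] = 0xaac9a5a               # another node: rmdir, then mkdir

stale = no_callback_node.lookup_by_name("trin-12")
fresh = callback_node.lookup_by_name("trin-12")
print(hex(stale), hex(fresh))
```

In this model the node whose entry lacks the callback keeps answering with the removed inode, while the node whose entry was created by name notices the change and picks up the new one.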

The only way I could think of to test the theory was to make a
code change and see if the problem went away, but that assumed I
had a baseline for the failure in the first place.

So armed with my recreation method documented in comment #8, I
tried to figure out how often I could recreate the failure.  It
turns out I could recreate the problem almost half the time while
I had debugging code in GFS:

Try Result            Time the test ran
=== ================= =================
 1. No failure        1m30s
 2. Failure recreated 8m
 3. No failure        8m
 4. roth-02 kernel panic in NFS (unrelated to GFS)
 5. No failure        
 6. Failure recreated 1m43s
 7. No failure        12m30s
 8. Failure recreated 10m0s
 9. No failure        7m20s
10. Failure recreated 1m39s

The kernel panic I saw in try #4 seemed unrelated to GFS.
So this established a baseline: the failure was recreated in 4 of 10
tries, or nearly half of the 9 runs that completed.

Next, I patched my kernel to set the correct callback from both
locations in ops_export.c.  Then I reran my test thirty more times:

Try Result            Time the test ran
=== ================= =================
 1. No failure        12m
 2. No failure        6m40s
 3. No failure        1m9s
 4. roth-02 kernel panic in NFS (unrelated to GFS)
 5. No failure        6m48s
 6. No failure        9m2s
 7. roth-02 kernel panic in NFS (unrelated to GFS)
 8. No failure        4m12s
 9. No failure        5m4s
10. No failure        5m42s
11. roth-02 kernel panic in NFS (unrelated to GFS)
12. No failure        4m16s
13. No failure        9m15s
14. No failure        4m50s
15. No failure        3m50s
16. No failure        5m7s
17. No failure        4m46s
18. No failure        5m11s
19. No failure        5m46s
20. roth-02 kernel panic in NFS (unrelated to GFS)
21. No failure        7m8s
22. No failure        6m34s
23. No failure        8m0s
24. roth-01 kernel panic in NFS (unrelated to GFS)
25. No failure        5m32s
26. No failure        8m52s
27. roth-02 kernel panic in NFS (unrelated to GFS)
28. No failure        8m29s
29. roth-01 kernel panic in NFS (unrelated to GFS)
30. No failure        6m36s

I got the panic 7 times out of 30, which is 23 percent of the time.
Just for fun, I took the lines of code back out and reran it a
dozen more times without any kind of debug code:

Try Result            Time the test ran
=== ================= =================
 1. No failure        7m20s
 2. No failure        6m14s
 3. No failure        7m49s
 4. No failure        6m20s
 5. No failure        12m3s
 6. Failed on roth-01
 7. No failure
 8. Failed on roth-01
 9. Same kernel panic on roth-02 (without my patch)
10. No failure
11. Failed on roth-01
12. No failure

Conclusions:

1. The same kernel panic in NFS happens with or without my patch
   so my patch doesn't seem to change that.  I'll be opening another 
   bugzilla for that problem against NFS.
2. With my patch, I got no failures other than the unrelated NFS bug.
   The problem seemed to be fixed by the patch.
3. Without my patch and debug code, I still had 3 out of 12 
   attempts fail (25% failure rate).
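
As a rough sanity check on conclusion 2, one can ask how likely a streak of clean runs would be if the bug were still present. The sketch below assumes the runs are independent with a fixed failure probability, which is a simplification, and it leaves the runs that ended in the NFS panic out of the count as inconclusive (my assumption, not something established above). `chance_of_clean_streak` is a hypothetical helper.

```python
def chance_of_clean_streak(failure_rate, clean_runs):
    """Probability of seeing only clean runs if the bug were still present."""
    return (1.0 - failure_rate) ** clean_runs

patched_runs = 30
panics = 7                        # inconclusive runs, left out of the count
clean_runs = patched_runs - panics

for label, rate in (("with debug code (4 of 10)", 4 / 10),
                    ("without debug code (3 of 12)", 3 / 12)):
    probability = chance_of_clean_streak(rate, clean_runs)
    print(f"{label}: {probability:.2e}")
```

Under either observed failure rate, the chance of that many consecutive clean runs comes out well below one percent, which is consistent with the patch having fixed the problem rather than the streak being luck.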

I'd like Wendy to look over these results and conclusions, and if she
doesn't see a problem, I'll commit it to CVS.


Comment 10 Robert Peterson 2007-01-05 22:14:28 UTC
Created attachment 144940 [details]
Proposed patch to fix the problem

Comment 11 Robert Peterson 2007-01-05 22:37:39 UTC
I opened bz #221666 regarding the NFS kernel panic above.


Comment 12 Robert Peterson 2007-01-11 18:48:44 UTC
I did more nfs testing on the patch today; everything worked as expected.
I also received positive feedback from Wendy Cheng and Steve Whitehouse,
who were kind enough to review my patch (thanks much).
So I'll be committing the patch to CVS for RHEL4 U5 (4.5) today.

I also opened bugs to crosswrite this problem to RHEL5 GFS1:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=222299 

and RHEL5 GFS2:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=222302


Comment 13 Robert Peterson 2007-01-11 19:07:38 UTC
Code checked into the RHEL4 branch of CVS.  Changing status to Modified.


Comment 16 Red Hat Bugzilla 2007-05-10 21:13:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0142.html