710539 – NFS sometimes causes suspend to hang

Summary: NFS sometimes causes suspend to hang

Keywords:
Status:	CLOSED DUPLICATE of bug 717735
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	15
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Jeff Layton
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2011-06-03 16:28 UTC by Kieran Clancy
Modified:	2014-06-18 07:41 UTC
CC List:	12 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2011-11-01 13:55:30 UTC
Type:	---
Embargoed:
Dependent Products:

Description Kieran Clancy 2011-06-03 16:28:06 UTC

Description of problem:

Sometimes when I suspend, the system doesn't power down and instead comes back after 20-30 seconds as if nothing had happened. dmesg shows that a umount task blocked in an NFS call refused to freeze.

Version-Release number of selected component (if applicable):
2.6.38.6-27.fc15.i686.PAE

How reproducible:
Maybe 25%

Steps to Reproduce:
1. Suspend
  
Actual results:
System does not suspend.

Expected results:
System suspends.

Additional info:
This seems to happen regardless of whether I've recently been accessing an NFS mount. But I think it shouldn't really matter in any case -- even if I am currently writing to 10 files on NFS, it should still suspend because I told it to.

dmesg backtrace:

[  736.627054] PM: Syncing filesystems ... done.
[  736.630879] PM: Preparing system for mem sleep
[  736.694583] Freezing user space processes ... 
[  756.705123] Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
[  756.705299] umount          D e74abd48     0  3298   3272 0x00800084
[  756.705307]  e74abd58 00000086 00000002 e74abd48 00000286 e74abce8 eeabe000 c0b05180
[  756.705318]  e764283c c0b05180 3489d693 000000ab eea3b200 00000000 e76425b0 00000246
[  756.705328]  e74abd10 c0446b49 e74abd18 c0446baf e74abd20 c07e9149 e74abd44 f87e7ce7
[  756.705339] Call Trace:
[  756.705352]  [<c0446b49>] ? _local_bh_enable_ip+0x1d/0x76
[  756.705357]  [<c0446baf>] ? local_bh_enable_ip+0xd/0xf
[  756.705364]  [<c07e9149>] ? _raw_spin_unlock_bh+0x12/0x14
[  756.705388]  [<f87e7ce7>] ? rpc_wake_up_next+0x130/0x13a [sunrpc]
[  756.705393]  [<c0446b49>] ? _local_bh_enable_ip+0x1d/0x76
[  756.705400]  [<c0469b90>] ? arch_local_irq_save+0x12/0x17
[  756.705404]  [<c07e8ffb>] ? _raw_spin_unlock_irqrestore+0x13/0x15
[  756.705422]  [<f87e7627>] rpc_wait_bit_killable+0x2e/0x32 [sunrpc]
[  756.705427]  [<c07e81de>] __wait_on_bit+0x39/0x60
[  756.705444]  [<f87e75f9>] ? rpc_wait_bit_killable+0x0/0x32 [sunrpc]
[  756.705449]  [<c07e8263>] out_of_line_wait_on_bit+0x5e/0x66
[  756.705466]  [<f87e75f9>] ? rpc_wait_bit_killable+0x0/0x32 [sunrpc]
[  756.705471]  [<c0459056>] ? wake_bit_function+0x0/0x4a
[  756.705488]  [<f87e7f7c>] __rpc_execute+0xdb/0x23c [sunrpc]
[  756.705493]  [<c0458e63>] ? wake_up_bit+0x1c/0x20
[  756.705509]  [<f87e7961>] ? rpc_make_runnable+0x72/0x78 [sunrpc]
[  756.705526]  [<f87e81f4>] ? rpc_new_task+0xba/0x122 [sunrpc]
[  756.705543]  [<f87e8137>] rpc_execute+0x36/0x39 [sunrpc]
[  756.705558]  [<f87e2b7c>] rpc_run_task+0xb1/0xb8 [sunrpc]
[  756.705573]  [<f87e2c6b>] rpc_call_sync+0x40/0x5b [sunrpc]
[  756.705602]  [<f8cd8a87>] _nfs4_call_sync+0x22/0x25 [nfs]
[  756.705623]  [<f8cd71f0>] _nfs4_proc_getattr+0x79/0x81 [nfs]
[  756.705646]  [<f8cd9733>] nfs4_proc_getattr+0x2d/0x47 [nfs]
[  756.705666]  [<f8cc8140>] __nfs_revalidate_inode+0xa8/0x1c4 [nfs]
[  756.705672]  [<c04e0e50>] ? kmem_cache_free+0x67/0x94
[  756.705691]  [<f8cc843b>] nfs_getattr+0x72/0xb7 [nfs]
[  756.705697]  [<c04f10c3>] vfs_getattr+0x3c/0x53
[  756.705716]  [<f8cc83c9>] ? nfs_getattr+0x0/0xb7 [nfs]
[  756.705721]  [<c04f1131>] vfs_fstatat+0x57/0x6b
[  756.705725]  [<c04f1183>] vfs_stat+0x1e/0x20
[  756.705729]  [<c04f136c>] sys_stat64+0x16/0x27
[  756.705734]  [<c04835c9>] ? audit_syscall_entry+0x128/0x14a
[  756.705741]  [<c041202e>] ? syscall_trace_enter+0x10f/0x121
[  756.705746]  [<c04120e9>] ? syscall_trace_leave+0xa9/0xbc
[  756.705751]  [<c07e93fc>] syscall_call+0x7/0xb
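
For anyone comparing logs, the freezer failure and the task that refused to freeze can be picked out of a saved dmesg capture with a short script. This is only a rough sketch: the function name freeze_failures and the regular expressions are my own, written against the two line formats visible in the trace above, and other kernel versions may print these lines differently.

```python
import re

# Freezer summary line, e.g.
# "Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):"
FAIL_RE = re.compile(
    r"Freezing of tasks failed after (?P<secs>[\d.]+) seconds "
    r"\((?P<count>\d+) tasks refusing to freeze"
)

# Task header line, e.g.
# "umount          D e74abd48     0  3298   3272 0x00800084"
TASK_RE = re.compile(
    r"^(?P<comm>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]+\s+\d+\s+"
    r"(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]+"
)

# Optional "[  756.705123] " timestamp prefix added by dmesg.
STAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def freeze_failures(log_text):
    """Return one dict per freezer failure, listing the tasks reported after it."""
    results = []
    current = None
    for raw in log_text.splitlines():
        line = STAMP_RE.sub("", raw)
        failure = FAIL_RE.search(line)
        if failure:
            current = {
                "seconds": float(failure["secs"]),
                "count": int(failure["count"]),
                "tasks": [],
            }
            results.append(current)
            continue
        if current is not None:
            task = TASK_RE.match(line)
            if task:
                # (command name, scheduler state, pid)
                current["tasks"].append(
                    (task["comm"], task["state"], int(task["pid"]))
                )
    return results
```

Stack dump and "Call Trace:" lines are skipped because they do not match the task header pattern, so only the command name, state and pid of each reported task are collected.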

Comment 1 Chris 2011-06-20 15:24:55 UTC

I can report a similar problem here on FC15 2.6.38.8-32.fc15.x86_64, except on hibernate. Mine is 100% reproducible whenever I have NFS mounts, whether I've just booted or have been using NFS all day.

If I umount the NFS shares before I hibernate, it succeeds with no delay. Even the umount succeeds immediately.

Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 9639.882196] umount          D ffff8800cfeafa30     0 12969  12949 0x00800084
[ 9639.882294]  ffff880210317b68 0000000000000086 0000000000000246 ffff88021f594590
[ 9639.882447]  ffff880210317fd8 ffff880210317fd8 0000000000013840 0000000000013840
[ 9639.882598]  ffff88021f595cc0 ffff88021f594590 ffff880210317b38 ffffffff814759c4
[ 9639.882749] Call Trace:
[ 9639.882788]  [<ffffffff814759c4>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[ 9639.882841]  [<ffffffffa0eec07b>] ? rpc_wait_bit_killable+0x0/0x38 [sunrpc]
[ 9639.882886]  [<ffffffffa0eec0af>] rpc_wait_bit_killable+0x34/0x38 [sunrpc]
[ 9639.882925]  [<ffffffff8147493d>] __wait_on_bit+0x48/0x7b
[ 9639.882964]  [<ffffffff8106acc7>] ? queue_work_on+0x37/0x45
[ 9639.883009]  [<ffffffff814749e2>] out_of_line_wait_on_bit+0x72/0x7d
[ 9639.883053]  [<ffffffffa0eec07b>] ? rpc_wait_bit_killable+0x0/0x38 [sunrpc]
[ 9639.883092]  [<ffffffff8106f2ab>] ? wake_bit_function+0x0/0x31
[ 9639.883137]  [<ffffffffa0eecc80>] __rpc_execute+0xf2/0x295 [sunrpc]
[ 9639.883175]  [<ffffffff8106f021>] ? wake_up_bit+0x25/0x2a
[ 9639.883218]  [<ffffffffa0eece90>] rpc_execute+0x3f/0x43 [sunrpc]
[ 9639.883261]  [<ffffffffa0ee6dde>] rpc_run_task+0xeb/0xf7 [sunrpc]
[ 9639.883303]  [<ffffffffa0ee6ed7>] rpc_call_sync+0x45/0x66 [sunrpc]
[ 9639.883471]  [<ffffffffa0f50acc>] ? nfs_alloc_fattr+0x28/0x46 [nfs]
[ 9639.883639]  [<ffffffffa0f5db0a>] nfs3_rpc_wrapper.constprop.7+0x2c/0x64 [nfs]
[ 9639.883926]  [<ffffffffa0f5ebdf>] nfs3_proc_getattr+0x5d/0x83 [nfs]
[ 9639.884095]  [<ffffffffa0f50c14>] __nfs_revalidate_inode+0xb4/0x1a2 [nfs]
[ 9639.884260]  [<ffffffffa0f50e52>] nfs_revalidate_inode+0x4a/0x51 [nfs]
[ 9639.884424]  [<ffffffffa0f50f31>] nfs_getattr+0x92/0xc4 [nfs]
[ 9639.884581]  [<ffffffff81124fab>] vfs_getattr+0x45/0x63
[ 9639.884736]  [<ffffffff81125016>] vfs_fstatat+0x4d/0x63
[ 9639.884892]  [<ffffffff81125067>] vfs_stat+0x1b/0x1d
[ 9639.885052]  [<ffffffff81125166>] sys_newstat+0x1a/0x33
[ 9639.885208]  [<ffffffff81129e21>] ? path_put+0x1f/0x23
[ 9639.885364]  [<ffffffff8109fa68>] ? audit_syscall_entry+0x145/0x171
[ 9639.885521]  [<ffffffff81009bc2>] system_call_fastpath+0x16/0x1b

Comment 2 Daniel 2011-07-03 00:17:50 UTC

Same problem here, using Fedora 15, 2.6.38.8-32.fc15.x86_64.

I have 5 different NFS shares from the same server mounted via /etc/fstab.
Only one of them fails to umount.
Hibernate works if I umount that share manually.
The remaining four shares are handled correctly when hibernating.

I see no differences between the shares.
The NFS share is accessible and a manual umount is successful.
There is no stale NFS handle or anything like that.

[  814.380086] Freezing of tasks failed after 20.00 seconds (1 tasks refusing to freeze, wq_busy=0):
[  814.380186] umount          D 0000000000000000     0  3215   3194 0x00800084
[  814.380190]  ffff88011b6dbb68 0000000000000082 ffff880100000000 ffff880133a21730
[  814.380195]  ffff88011b6dbfd8 ffff88011b6dbfd8 0000000000013840 0000000000013840
[  814.380198]  ffff880137761730 ffff880133a21730 ffff88011b6dbb38 00000001814759c4
[  814.380202] Call Trace:
[  814.380223]  [<ffffffffa051507b>] ? rpc_wait_bit_killable+0x0/0x38 [sunrpc]
[  814.380233]  [<ffffffffa05150af>] rpc_wait_bit_killable+0x34/0x38 [sunrpc]
[  814.380238]  [<ffffffff8147493d>] __wait_on_bit+0x48/0x7b
[  814.380243]  [<ffffffff8105a527>] ? _local_bh_enable_ip+0x25/0x8e
[  814.380246]  [<ffffffff814749e2>] out_of_line_wait_on_bit+0x72/0x7d
[  814.380255]  [<ffffffffa051507b>] ? rpc_wait_bit_killable+0x0/0x38 [sunrpc]
[  814.380259]  [<ffffffff8106f2ab>] ? wake_bit_function+0x0/0x31
[  814.380268]  [<ffffffffa0515c80>] __rpc_execute+0xf2/0x295 [sunrpc]
[  814.380271]  [<ffffffff8106f021>] ? wake_up_bit+0x25/0x2a
[  814.380280]  [<ffffffffa0515e90>] rpc_execute+0x3f/0x43 [sunrpc]
[  814.380288]  [<ffffffffa050fdde>] rpc_run_task+0xeb/0xf7 [sunrpc]
[  814.380295]  [<ffffffffa050fed7>] rpc_call_sync+0x45/0x66 [sunrpc]
[  814.380312]  [<ffffffffa05a1b0a>] nfs3_rpc_wrapper.constprop.7+0x2c/0x64 [nfs]
[  814.380325]  [<ffffffffa05a2bdf>] nfs3_proc_getattr+0x5d/0x83 [nfs]
[  814.380335]  [<ffffffffa0594c14>] __nfs_revalidate_inode+0xb4/0x1a2 [nfs]
[  814.380344]  [<ffffffffa0594e52>] nfs_revalidate_inode+0x4a/0x51 [nfs]
[  814.380354]  [<ffffffffa0594f31>] nfs_getattr+0x92/0xc4 [nfs]
[  814.380357]  [<ffffffff81124fab>] vfs_getattr+0x45/0x63
[  814.380360]  [<ffffffff81125016>] vfs_fstatat+0x4d/0x63
[  814.380363]  [<ffffffff81125067>] vfs_stat+0x1b/0x1d
[  814.380366]  [<ffffffff81125166>] sys_newstat+0x1a/0x33
[  814.380369]  [<ffffffff81129e21>] ? path_put+0x1f/0x23
[  814.380373]  [<ffffffff8109fa68>] ? audit_syscall_entry+0x145/0x171
[  814.380376]  [<ffffffff81009bc2>] system_call_fastpath+0x16/0x1b

Comment 3 starfall 2011-09-15 07:09:00 UTC

Same problem here on 2.6.40.4-5.fc15.x86_64. I can reproduce this 100% when the NFS shares are mounted.

Comment 4 Lukas Bezdicka 2011-09-22 09:14:58 UTC

I have the same problem. For now I'm working around it with:

http://palebluedot.nl/jml/computer-stuff/26-umountnfs.html

I edited it to add the -flr options to umount, which mostly works.
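
To make the idea concrete, here is a rough sketch of the "unmount NFS before sleeping" step. It is not the linked script. The function names nfs_mount_points and umount_commands are made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options, ...). It only builds the umount -f -l -r command lines and does not run anything.

```python
import re

# Filesystem types treated as NFS (assumption: these two cover NFSv3 and NFSv4).
NFS_TYPES = {"nfs", "nfs4"}


def _unescape(field):
    # /proc/mounts writes spaces and similar characters as octal escapes (e.g. \040).
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def nfs_mount_points(mounts_text):
    """Return the mount points of all NFS entries in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            points.append(_unescape(fields[1]))
    return points


def umount_commands(mounts_text):
    """Build one 'umount -f -l -r <mount point>' argument list per NFS mount."""
    # Longest path first, so a nested mount comes before its parent.
    points = sorted(nfs_mount_points(mounts_text), key=len, reverse=True)
    return [["umount", "-f", "-l", "-r", point] for point in points]
```

A sleep hook could read /proc/mounts, pass its contents to umount_commands, and run each resulting list (for example with subprocess.run) before suspending, then remount on resume. Forced and lazy unmounts can discard unwritten data, so this is a workaround and not a fix.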

Comment 5 Jeff Layton 2011-11-01 13:55:30 UTC


*** This bug has been marked as a duplicate of bug 717735 ***
