Bug 1005737

Summary: Warning messages when using fscache with cifs
Product: [Fedora] Fedora Reporter: Sachin Prabhu <sprabhu>
Component: kernelAssignee: Sachin Prabhu <sprabhu>
Status: CLOSED UPSTREAM QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rawhideCC: dhowells, gansalmon, itamar, jlayton, jonathan, kernel-maint, madhu.chinakonda, marcelo.barbosa, skottler
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-13 20:05:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sachin Prabhu 2013-09-09 10:12:12 UTC
Warning messages are seen when using cifs mount with fscache.

To reproduce, 
1) Make sure you have setup cachefilesd and it is running.
2) Mount cifs share with the fsc mount option to enable it to use fscache.
3) run the command on the mounted share. find .|xargs cat

The following warnings indicating that 

------------[ cut here ]------------
WARNING: CPU: 2 PID: 1013 at mm/filemap.c:608 unlock_page+0x41/0x50()
Modules linked in: arc4 md4 nls_utf8 cifs dns_resolver nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables cachefiles fscache microcode serio_raw virtio_balloon virtio_net i2c_piix4 cirrus drm_kms_helper ttm virtio_blk drm ata_generic i2c_core pata_acpi
CPU: 2 PID: 1013 Comm: kworker/u8:1 Tainted: G        W    3.12.0-0.rc0.git11.1.fc21.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
Workqueue: fscache_operation fscache_op_work_func [fscache]
 0000000000000009 ffff880038807c48 ffffffff817376ab 0000000000000000
 ffff880038807c80 ffffffff8107675d ffffea00009c8240 0000000000000000
 ffffea0000bbbe80 ffff88003ba0eb58 0000000000000000 ffff880038807c90
Call Trace:
 [<ffffffff817376ab>] dump_stack+0x54/0x74
 [<ffffffff8107675d>] warn_slowpath_common+0x7d/0xa0
 [<ffffffff8107683a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff8117a1f1>] unlock_page+0x41/0x50
 [<ffffffffa01d1d2c>] cifs_readpage_from_fscache_complete+0x2c/0x60 [cifs]
 [<ffffffffa00b2a66>] cachefiles_read_copier+0x116/0x3d0 [cachefiles]
 [<ffffffffa0146095>] fscache_op_work_func+0x25/0x50 [fscache]
 [<ffffffff81098f01>] process_one_work+0x211/0x6e0
 [<ffffffff81098e95>] ? process_one_work+0x1a5/0x6e0
 [<ffffffff810994ed>] worker_thread+0x11d/0x3a0
 [<ffffffff810993d0>] ? process_one_work+0x6e0/0x6e0
 [<ffffffff810a22fd>] kthread+0xed/0x100
 [<ffffffff810a2210>] ? insert_kthread_work+0x80/0x80
 [<ffffffff8174a52c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810a2210>] ? insert_kthread_work+0x80/0x80
---[ end trace 83a4cc70b3944ff2 ]---

Comment 1 Sachin Prabhu 2013-09-09 13:05:03 UTC
I have narrowed down the cause to 0 byte files.

To reproduce:
1) Mount the cifs share with the fsc mount option.
2) Cat a 0 byte file. This file can be created by a simple touch command to create a new file.

If you continue to cat the file, the machine hangs after 4-5 attempts with the following message



---[ end trace 99afa34f9aad29b6 ]---
note: kworker/u8:0[6] exited with preempt_count 2
BUG: unable to handle kernel paging request at ffffffffffffff98
IP: [<ffffffff810a2c80>] kthread_data+0x10/0x20
PGD 1c0f067 PUD 1c11067 PMD 0 
Oops: 0000 [#2] SMP 
Modules linked in: arc4 md4 nls_utf8 cifs dns_resolver nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables cachefiles fscache microcode serio_raw virtio_balloon virtio_net i2c_piix4 cirrus drm_kms_helper ttm drm virtio_blk i2c_core ata_generic pata_acpi
CPU: 1 PID: 6 Comm: kworker/u8:0 Tainted: G      D W    3.12.0-0.rc0.git11.1.fc21.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
task: ffff88003d378000 ti: ffff88003d376000 task.ti: ffff88003d376000
RIP: 0010:[<ffffffff810a2c80>]  [<ffffffff810a2c80>] kthread_data+0x10/0x20
RSP: 0018:ffff88003d377828  EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000000000d
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88003d378000
RBP: ffff88003d377828 R08: ffff88003d378090 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003e1d5240
R13: 0000000000000001 R14: ffff88003d3774d8 R15: 0000000000000020
FS:  0000000000000000(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 000000002ee6a000 CR4: 00000000000006e0
Stack:
 ffff88003d377840 ffffffff8109a421 ffff88003d378000 ffff88003d3778a0
 ffffffff8173e06a 00000000001d5240 ffff88003d377fd8 ffff88003d377fd8
 00000000001d5240 ffff88003d378000 ffff88003d378730 ffff88003d3774d8
Call Trace:
 [<ffffffff8109a421>] wq_worker_sleeping+0x11/0x90
 [<ffffffff8173e06a>] __schedule+0x63a/0x930
 [<ffffffff8173e389>] schedule+0x29/0x70
 [<ffffffff81079b80>] do_exit+0x7b0/0xcd0
 [<ffffffff81742302>] oops_end+0xa2/0xe0
 [<ffffffff81731e87>] no_context+0x268/0x275
 [<ffffffff81731f12>] __bad_area_nosemaphore+0x7e/0x1d7
 [<ffffffff8173207e>] bad_area_nosemaphore+0x13/0x15
 [<ffffffff817451be>] __do_page_fault+0x23e/0x5a0
 [<ffffffff811348f7>] ? rcu_irq_exit+0x77/0xc0
 [<ffffffff817413b3>] ? retint_restore_args+0x13/0x13
 [<ffffffff8137ea0d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff8174552e>] do_page_fault+0xe/0x10
 [<ffffffff81741688>] page_fault+0x28/0x30
 [<ffffffffa00e2b6b>] ? __cachefiles_printk_object+0xfb/0x171 [cachefiles]
 [<ffffffffa00e2c3c>] cachefiles_printk_object+0x5b/0x41f [cachefiles]
 [<ffffffffa00e0551>] cachefiles_walk_to_object+0x871/0x880 [cachefiles]
 [<ffffffff810b99cd>] ? sched_clock_local+0x1d/0x80
 [<ffffffff810e57bd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810b9c9f>] ? local_clock+0x5f/0x70
 [<ffffffff810aa152>] ? creds_are_invalid.part.8+0x12/0x50
 [<ffffffffa00deab4>] cachefiles_lookup_object+0x54/0x100 [cachefiles]
 [<ffffffffa0125275>] fscache_look_up_object+0xd5/0x1f0 [fscache]
 [<ffffffffa012550c>] fscache_object_work_func+0xbc/0x230 [fscache]
 [<ffffffff81098f01>] process_one_work+0x211/0x6e0
 [<ffffffff81098e95>] ? process_one_work+0x1a5/0x6e0
 [<ffffffff810994ed>] worker_thread+0x11d/0x3a0
 [<ffffffff810993d0>] ? process_one_work+0x6e0/0x6e0
 [<ffffffff810a22fd>] kthread+0xed/0x100
 [<ffffffff810a2210>] ? insert_kthread_work+0x80/0x80
 [<ffffffff8174a52c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810a2210>] ? insert_kthread_work+0x80/0x80
Code: 00 48 89 e5 5d 48 8b 40 88 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 b8 04 00 00 55 48 89 e5 <48> 8b 40 98 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 
RIP  [<ffffffff810a2c80>] kthread_data+0x10/0x20
 RSP <ffff88003d377828>
CR2: ffffffffffffff98
---[ end trace 99afa34f9aad29b7 ]---
Fixing recursive fault but reboot is needed!

Comment 2 Sachin Prabhu 2013-09-13 11:39:55 UTC
To recreate the problem:

1) Enable fscache using the 'fsc' mount option.
2) This particular code path is hit on files having the read oplock.
3) Touch a new file and then subsequently cat it. You may need to do this a few times. I used the following command to reproduce the issue.
# for i in {1..30}; do touch t$i; done
# cat *
You may need to run cat a few times before you hit the problem.

Comment 4 Sachin Prabhu 2013-09-16 10:54:16 UTC
http://thread.gmane.org/gmane.linux.kernel.cifs/8844

Comment 5 Sachin Prabhu 2014-01-13 20:05:32 UTC
Committed upstream:
466bd31bbda9e1dd2ace1d72c8de5045bf6f3bf6

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=466bd31bbda9e1dd2ace1d72c8de5045bf6f3bf6