Red Hat Bugzilla – Bug 1144128
FUSE: Scheduling while atomic OOPSes when using inval_entry
Last modified: 2015-07-22 04:18:03 EDT
Created attachment 939006
The patch to avoid scheduling while atomic in fuse.ko

Description of problem:

I am seeing this OOPS reasonably frequently after I added a call to fuse_lowlevel_notify_inval_entry:

Jun 23 11:53:24 localhost kernel: BUG: scheduling while atomic: fuse.hf/13976/0x10000001
Jun 23 11:53:24 localhost kernel: Modules linked in: nls_utf8 fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs autofs4 8021q garp stp llc vboxpci(U) vboxnetadp(U) vboxnetflt(U) vboxdrv(U) cpufreq_ondemand acpi_cpufreq freq_table mperf ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support microcode serio_raw r8169 mii i2c_i801 sg lpc_ich mfd_core snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci xhci_hcd wmi radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Jun 23 11:53:24 localhost kernel: Pid: 13976, comm: fuse.hf Not tainted 2.6.32-431.20.3.el6.x86_64 #1
Jun 23 11:53:24 localhost kernel: Call Trace:
Jun 23 11:53:24 localhost kernel: [<ffffffff8105da66>] ? __schedule_bug+0x66/0x70
Jun 23 11:53:24 localhost kernel: [<ffffffff81528e30>] ? thread_return+0x640/0x760
Jun 23 11:53:24 localhost kernel: [<ffffffff810695fa>] ? __cond_resched+0x2a/0x40
Jun 23 11:53:24 localhost kernel: [<ffffffff81529220>] ? _cond_resched+0x30/0x40
Jun 23 11:53:24 localhost kernel: [<ffffffff8116f405>] ? kmem_cache_alloc_trace+0xe5/0x1b0
Jun 23 11:53:24 localhost kernel: [<ffffffffa03d095e>] ? fuse_notify+0x9e/0x2a0 [fuse]
Jun 23 11:53:24 localhost kernel: [<ffffffffa03d1580>] ? fuse_dev_write+0x0/0x3e0 [fuse]
Jun 23 11:53:24 localhost kernel: [<ffffffffa03d0593>] ? fuse_copy_one+0x53/0x70 [fuse]
Jun 23 11:53:24 localhost kernel: [<ffffffffa03d16ea>] ? fuse_dev_write+0x16a/0x3e0 [fuse]
Jun 23 11:53:24 localhost kernel: [<ffffffff8105a323>] ? enqueue_pushable_task+0x73/0x90
Jun 23 11:53:24 localhost kernel: [<ffffffffa03d1580>] ? fuse_dev_write+0x0/0x3e0 [fuse]
Jun 23 11:53:24 localhost kernel: [<ffffffff811889bb>] ? do_sync_readv_writev+0xfb/0x140
Jun 23 11:53:24 localhost kernel: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
Jun 23 11:53:24 localhost kernel: [<ffffffff810b0f10>] ? do_futex+0x100/0xb50
Jun 23 11:53:24 localhost kernel: [<ffffffff81226c06>] ? security_file_permission+0x16/0x20
Jun 23 11:53:24 localhost kernel: [<ffffffff81189946>] ? do_readv_writev+0xd6/0x1f0
Jun 23 11:53:24 localhost kernel: [<ffffffff81189aa6>] ? vfs_writev+0x46/0x60
Jun 23 11:53:24 localhost kernel: [<ffffffff81189bd1>] ? sys_writev+0x51/0xb0
Jun 23 11:53:24 localhost kernel: [<ffffffff810e1c7e>] ? __audit_syscall_exit+0x25e/0x290
Jun 23 11:53:24 localhost kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

Version-Release number of selected component (if applicable):

This seems to exist in all versions of RHEL 6.x, because there is a kzalloc after a kmap_atomic call. If memory is low, the allocation can sleep, and we schedule while atomic.

How reproducible:

Add code to your fuse file system to call fuse_lowlevel_notify_inval_entry.

Steps to Reproduce:
1. As above
2. Consume lots of memory
3. Create lots of files and exercise the code path where fuse_lowlevel_notify_inval_entry is called.

Actual results:

Lots of OOPS messages in /var/log/messages. These seem to be benign, but they scare admins.

Expected results:

No such messages.

Additional info:

I have a patch which I will attach, but there is a better one at: https://lkml.org/lkml/2014/7/4/361
However, that one will need a little massaging for 2.6.32.
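For admins who want to know how often the warning fires, the following is a minimal sketch that counts the "scheduling while atomic" header lines in a syslog file. It assumes the header format shown in the report above (comm/pid/preempt count after the colon); the function names and the default log path are illustrative choices, not part of any Red Hat tooling.

```python
import re
from collections import Counter

# Header line format as seen in the report:
#   BUG: scheduling while atomic: <comm>/<pid>/<preempt count in hex>
SCHED_ATOMIC_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>.+)/(?P<pid>\d+)/(?P<preempt>0x[0-9a-fA-F]+)"
)


def count_sched_while_atomic(lines):
    """Return a Counter keyed by (comm, pid) for every matching log line."""
    hits = Counter()
    for line in lines:
        match = SCHED_ATOMIC_RE.search(line)
        if match:
            hits[(match.group("comm"), int(match.group("pid")))] += 1
    return hits


def summarize(path="/var/log/messages"):
    """Print one summary line per offending task (path is an assumption)."""
    with open(path, errors="replace") as fh:
        for (comm, pid), n in count_sched_while_atomic(fh).most_common():
            print(f"{comm} (pid {pid}): {n} warning(s)")
```

Calling `summarize()` as root on an affected machine should list each task that triggered the warning; on a patched kernel the output would be expected to be empty.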
Brian, Eric: Richard has hit this on all versions of RHEL 6.x without the attached patch. Can we get this nominated for a RHEL 6.x kernel build? We might eventually see this in Red Hat Storage. Thanks!
I did some brief legwork on this a few days ago and managed to pull enough back to compile successfully on latest rhel6. I'll need to get back to it, review it more carefully and test, but I think we have plenty of time to get this fixed for 6.7. Set devel ack.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
Patch(es) available on kernel-2.6.32-556.el6
Hi Richard,

I can't reproduce this bug easily. Could you give some more detail about "lots of memory" and "lots of files"? At a minimum, how much memory should I consume, and how many files should I create?

Here is what I tried:
1) I mounted a glusterfs volume.
2) I mmapped 2G of memory and kept writing 2G of random data to it. This consumes nearly 100% of the memory on my machine.
3) I created 10000 files in the glusterfs mount and invalidated their inodes (setfattr -n 'inode-invalidate' testfile${num}) one by one.

I still cannot reproduce it. Is there something I missed?

Thanks,
Zorro
I no longer work on that stuff, but here is my memory of what we were doing.

1. A long create run where we were creating anywhere between 4M and 20M files.
2. The create run used 10 threads, so at least 10 sub-directories, but could be more (like 100).
3. File sizes varied from 8 KiB to 64 KiB of random data.

However, the problem might also only occur because the FUSE file system I was working on was not properly cleaning up inodes etc.
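The create run described above can be approximated with a short script. This is only a sketch under stated assumptions: the function name, directory and file naming, and the mount point in the usage note are hypothetical, and the script only generates the file-creation load. It does not call fuse_lowlevel_notify_inval_entry itself; the FUSE file system under test has to do that, and memory pressure has to be applied separately.

```python
import os
import random
from concurrent.futures import ThreadPoolExecutor


def create_files(root, total_files, threads=10,
                 min_size=8 * 1024, max_size=64 * 1024):
    """Create total_files files of random data under root.

    Each worker thread writes into its own sub-directory, and each file
    gets a random size between min_size and max_size bytes (inclusive).
    Returns the number of files created.
    """
    def worker(idx, count):
        subdir = os.path.join(root, f"dir{idx:03d}")
        os.makedirs(subdir, exist_ok=True)
        for n in range(count):
            size = random.randint(min_size, max_size)
            with open(os.path.join(subdir, f"file{n:08d}"), "wb") as fh:
                fh.write(os.urandom(size))
        return count

    # Spread the files as evenly as possible across the worker threads.
    base, extra = divmod(total_files, threads)
    counts = [base + (1 if i < extra else 0) for i in range(threads)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(worker, range(threads), counts))
```

For example, `create_files("/mnt/fuse-under-test", 4_000_000)` would approximate the low end of the run described in the comment, where `/mnt/fuse-under-test` stands in for wherever the file system is mounted.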
I ran the FUSE regression tests on kernel 567 and saw no regression failures. This bug is still hard for me to reproduce, so I will verify it as SanityOnly first.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1272.html