Description of problem: problem occurs probably in case I try to access indexed projects data in Eclipse Additional info: libreport version: 2.0.14 abrt_version: 2.0.13 cmdline: BOOT_IMAGE=/vmlinuz-3.6.1-1.fc17.i686 root=/dev/mapper/vg_krecek-lv_root ro rd.md=0 rd.dm=0 SYSFONT=True KEYTABLE=us rd.luks=0 rd.lvm.lv=vg_krecek/lv_swap rd.lvm.lv=vg_krecek/lv_root LANG=en_US.UTF-8 rhgb quiet kernel: 3.6.1-1.fc17.i686 backtrace: :kernel BUG at fs/dcache.c:2138! :invalid opcode: 0000 [#1] SMP :Modules linked in: fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat des_generic xt_CHECKSUM iptable_mangle md4 bridge stp llc nls_utf8 cifs dns_resolver fscache lockd sunrpc rfcomm bnep be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack snd_hda_codec_conexant btusb bluetooth snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm thinkpad_acpi snd_page_alloc arc4 snd_timer iwldvm mac80211 iwlwifi cfg80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev snd media rfkill soundcore i2c_i801 mei r592 memstick iTCO_wdt iTCO_vendor_support lpc_ich tg3 coretemp microcode ppdev parport_pc parport vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc firewire_oh :ci firewire_core crc_itu_t sdhci_pci sdhci mmc_core yenta_socket wmi video radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core :Pid: 2219, comm: eclipse Not tainted 3.6.1-1.fc17.i686 #1 LENOVO 27326FG/27326FG :EIP: 0060:[<c0557577>] EFLAGS: 00010282 CPU: 1 :EIP is at __d_rehash+0x57/0x60 :EAX: ee641c80 EBX: 0000096a ECX: f7338208 EDX: f7338208 :ESI: fffffffe EDI: f6339f60 EBP: f0481d84 ESP: f0481d80 : DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 :CR0: 8005003b CR2: 06567f84 CR3: 30710000 CR4: 000027d0 :DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 :DR6: ffff0ff0 DR7: 00000400 :Process eclipse (pid: 2219, ti=f0480000 task=f0490c90 task.ti=f0480000) :Stack: : 0000096a f0481d90 c05575b2 ee641c80 f0481d9c c05575da 00000000 f0481dd8 : f9c01e32 f0734000 00004213 00000000 f0734000 f0481df4 f11ce600 00004213 : f6b07c60 00000000 00000000 f0481ef8 ee641c80 ef1781e0 f0481e20 f9c01f52 :Call Trace: : [<c05575b2>] _d_rehash+0x32/0x40 : [<c05575da>] d_rehash+0x1a/0x30 : [<f9c01e32>] cifs_lookup+0x212/0x300 [cifs] : [<f9c01f52>] cifs_atomic_open+0x32/0x200 [cifs] : [<c05584bf>] ? d_lookup+0x2f/0x50 : [<c055074c>] do_last+0x42c/0xbd0 : [<c054dd8c>] ? link_path_walk+0x1dc/0x750 : [<c0550f8d>] path_openat+0x9d/0x3b0 : [<c046d47f>] ? wake_up_state+0xf/0x20 : [<c0490ed2>] ? wake_futex+0x32/0x50 : [<c046fcb5>] ? __dequeue_entity+0x25/0x40 : [<c0551481>] do_filp_open+0x31/0x80 : [<c055bdfc>] ? alloc_fd+0x3c/0xe0 : [<c054c970>] ? getname_flags+0x40/0xd0 : [<c0542885>] do_sys_open+0xe5/0x1c0 : [<c05429d0>] sys_openat+0x30/0x40 : [<c095addf>] sysenter_do_call+0x12/0x28 :Code: 48 08 74 03 89 59 04 f6 c3 01 89 50 0c 75 20 f6 02 01 74 19 83 cb 01 89 1a 0f ba 32 00 5b 5d c3 f3 90 8b 0a 80 e1 01 75 f7 eb b9 <0f> 0b 0f 0b 0f 0b 8d 76 00 55 89 e5 53 66 66 66 66 90 8b 0d fc :EIP: [<c0557577>] __d_rehash+0x57/0x60 SS:ESP 0068:f0481d80
*** Bug 867319 has been marked as a duplicate of this bug. ***
Looks like: BUG_ON(!d_unhashed(entry)); ...and the d_rehash was called from d_add. Perhaps we should be doing something different to add this thing to the cache? I'll need to look more closely at how the atomic_open code works.
cc'ing Al, in case he has any ideas while I look at this...
Created attachment 629053 [details] patch -- don't call cifs_lookup on hashed, negative dentry Vit, does this patch fix it?
I'm just reboot the system Package: kernel Architecture: x86_64 OS Release: Fedora release 17 (Beefy Miracle)
upgrade to 3.6.2-4.fc17.x86_64 Package: kernel Architecture: x86_64 OS Release: Fedora release 17 (Beefy Miracle)
This exact error seemed to occur to me this AM on 3.6.2-4.fc17.x86_64 That is of course not the first time it has happened. In my case I seem to hit this whenever attempting to save a .ods file (in libreoffice calc) which is mounted via CIFS.
Could any of you test the patch in comment #4 and let me know if it helps?
i have not built a linux kernel in > 10 years so it would probably take me a little while to do that unless you could build it for me.
(In reply to comment #8) > Could any of you test the patch in comment #4 and let me know if it helps? Thanks for your immediate reaction. I tested the patch and the reported problem does not appear anymore. Good work!
Thanks for testing it. I'll send it upstream ASAP!
*** Bug 867954 has been marked as a duplicate of this bug. ***
when is the eta on pushing this out via yum? thanks, got 10 desktops borked..
Josh, can you apply the patch that's attached to this email to current 3.6 kernels? Apparently a number of people are hitting this. It should go into stable kernels eventually, but it's not in mainline yet and I'm not sure when Steve F will get around to putting it there.
Applied to F16-F18. Thanks Jeff!
does this mean the updates via yum will show up now? or is there a way for me to know the patch is on, ie patchlist for the kernel or cifs modules.
It's not in yet. It'll take some time to get the patch into an actual build and then it'll need to sit in updates-testing for a bit. If you need it sooner, you'll need to build your own kernel. Alternately you can revert to something pre-3.6.
I have seen the fs/dcache.c:2138 error under several other circumstances. The latest kernel update you pushed let's me start eclipse correctly a few times, but the error is not yet gone completly.
I was able to reproduce this by deleting .eclipse before starting it again.
kernel-3.6.3-3.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/kernel-3.6.3-3.fc18
Package kernel-3.6.3-3.fc18: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing kernel-3.6.3-3.fc18' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-16787/kernel-3.6.3-3.fc18 then log in and leave karma (feedback).
Sorry, noticed that kernel-3.6.3-1.fc17 does not even seem to contain the patch. Could we please prepare a F17 build for that, too?
(In reply to comment #22) > Sorry, noticed that kernel-3.6.3-1.fc17 does not even seem to contain the > patch. Could we please prepare a F17 build for that, too? The patch is in Fedora git. It will be included in the next F17 build.
Thanks Josh.
@Jeff The patch seems to have introduced a new strange behaviour. We are mounting our user homes using CIFS and autofs. During login .bashrc does not seem to be read correctly, so PS1 is displayed as -bash-4.2$ When doing a ls -al .bashrc && bash PS1 is displayed correctly, again. This does not occur on a NFS mounted home.
(In reply to comment #4) > Created attachment 629053 [details] > patch -- don't call cifs_lookup on hashed, negative dentry > > Vit, does this patch fix it? Jeff, This patch is wrong. The dentry used is not associated with a d_inode at this stage in the atomic_open process. lookup_open is called on the last patch component of an open file. We first lookup the dcache and if we do find a positive dentry, we return the dentry and let the f_op->open complete the open path. The next stage in case we intend on opening the file after the lookup is to use atomic_open to open the file. At this stage we do not have an inode associated with the dentry. static int lookup_open(struct nameidata *nd, struct path *path, struct file *file, const struct open_flags *op, bool got_write, int *opened) { .. dentry = lookup_dcache(&nd->last, dir, nd->flags, &need_lookup); if (IS_ERR(dentry)) return PTR_ERR(dentry); /* Cached positive dentry: will open in f_op->open */ if (!need_lookup && dentry->d_inode) goto out_no_open; if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) { return atomic_open(nd, dentry, path, file, op, got_write, need_lookup, opened); .. } In atomic_open, we do an explicit check to ensure that there is no inode associated with the dentry. At this stage we have fresh dentry which needs to be initialised. static int atomic_open(struct nameidata *nd, struct dentry *dentry, struct path *path, struct file *file, const struct open_flags *op, bool got_write, bool need_lookup, int *opened) { .. BUG_ON(dentry->d_inode); .. file->f_path.dentry = DENTRY_NOT_SET; file->f_path.mnt = nd->path.mnt; error = dir->i_op->atomic_open(dir, dentry, file, open_flag, mode, opened); .. } Here we will always hit the if condition since the inode is not set in this code path. We will always fail with an ENOENT in this code path. int cifs_atomic_open(struct inode *inode, struct dentry *direntry, struct file *file, unsigned oflags, umode_t mode, int *opened) { .. if (!(oflags & O_CREAT)) { struct dentry *res; if (!direntry->d_inode) return -ENOENT; *res = cifs_lookup(inode, direntry, 0); if (IS_ERR(res)) return PTR_ERR(res); return finish_no_open(file, res); } .. } This will however work when we do a lookup on the file and then attempt to open the file as done by Marcus in the last comment since the lookup would have populated the dcache with the required values. We find this cached value in lookup_open which would then follow a different code path and instead use f_ops->open to open the file instead of using cifs_atomic_open. So NAK on the patch. I am continuing investigation on the original bug report.
I was able to reproduce this problem easily with 2 terminals open on the client. To reproduce 1) Mount a cifs share on the client # mount -t cifs -o <options> //share/export /mnt 2) Open a terminal and run the following command # while :;do cat /mnt/b >/dev/null 2>&1; done 3) On another terminal, run the following command # while :; do touch /mnt/b; rm /mnt/b; done The client crashes with [ 95.475073] ------------[ cut here ]------------ [ 95.476973] kernel BUG at fs/dcache.c:2138! [ 95.478611] invalid opcode: 0000 [#1] SMP [ 95.480230] Modules linked in: md4 nls_utf8 cifs dns_resolver fscache lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables virtio_net virtio_balloon i2c_piix4 microcode virtio_blk cirrus drm_kms_helper ttm drm i2c_core [ 95.491625] CPU 1 [ 95.492565] Pid: 4757, comm: cat Not tainted 3.7.0-0.rc1.git1.1.sprabhu.fc17.x86_64 #1 Red Hat KVM [ 95.497119] RIP: 0010:[<ffffffff811a8b3f>] [<ffffffff811a8b3f>] __d_rehash+0x5f/0x70 [ 95.499711] RSP: 0018:ffff880036273b58 EFLAGS: 00010286 [ 95.502549] RAX: 000000000001f29b RBX: ffff88003cfc5840 RCX: 0000000000000011 [ 95.504972] RDX: ffffc90000000000 RSI: ffffc900000f94d8 RDI: ffff88003cfc5840 [ 95.507366] RBP: ffff880036273b58 R08: 0000000000016a40 R09: ffff88003699f400 [ 95.509785] R10: ffff88003db30508 R11: ffff8800368cf000 R12: ffff8800367f1400 [ 95.512523] R13: 0000000000000000 R14: 00000000fffffffe R15: 000000000000000d [ 95.515224] FS: 00007fd6f90d2740(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000 [ 95.517673] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 95.519393] CR2: 0000000000430000 CR3: 000000003c9ea000 CR4: 00000000000006e0 [ 95.521938] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 95.525328] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 95.528883] Process cat (pid: 4757, threadinfo ffff880036272000, task ffff8800368b1720) [ 95.532794] Stack: [ 95.533590] ffff880036273b68 ffffffff811a8b86 ffff880036273b88 ffffffff811a8bb2 [ 95.535976] ffff88003cfc5840 0000000000000000 ffff880036273be8 ffffffffa016b793 [ 95.538735] ffff88003db30508 ffff880036a74a00 ffff88003a062680 0000000000000000 [ 95.541496] Call Trace: [ 95.542410] [<ffffffff811a8b86>] _d_rehash+0x36/0x40 [ 95.544133] [<ffffffff811a8bb2>] d_rehash+0x22/0x30 [ 95.546104] [<ffffffffa016b793>] cifs_lookup+0x263/0x360 [cifs] [ 95.548148] [<ffffffffa016b8ea>] cifs_atomic_open+0x5a/0x300 [cifs] [ 95.550398] [<ffffffff811a9d05>] ? d_lookup+0x35/0x60 [ 95.552317] [<ffffffff8119c6c0>] ? lookup_dcache+0x80/0xd0 [ 95.554213] [<ffffffff811a1129>] do_last+0x4a9/0xe00 [ 95.556139] [<ffffffff8119dfc8>] ? inode_permission+0x18/0x50 [ 95.558675] [<ffffffff811741b0>] ? alloc_pages_current+0xb0/0x120
Another requirement which is needed for the reproducer above to work is that the user doesn't have permission to write onto that directory. I seem to have blundered onto the reproducer above by mistake. Given below is another reproducer which can also be run against a share you can write to. To reproduce: 1) Mount a cifs share on the client # mount -t cifs -o <options> //share/export /mnt 2) Open a terminal and run the following command # while :;do cat /mnt/b >/dev/null 2>&1; done 3) On another terminal and run the same command from step 2. # while :;do cat /mnt/b >/dev/null 2>&1; done It will crash in a similar manner. The problem occurs because lookup_open can pass a dentry which is a hashed negative lookup dentry. When we open the file with the O_CREAT flag, cifs_atomic_open() attempts to do a lookup on that dentry. It doesn't expect to be passed a hashed dentry and so ends up crashing. When we have a hashed negative lookup dentry in cifs_atomic_lookup() we have already revalidated the dentry and found it to be fine. We should return a -ENOENT in that case.
Created attachment 635685 [details] cifs: Do not lookup hashed negative dentry in cifs_atomic_open cifs: Do not lookup hashed negative dentry in cifs_atomic_open We do not need to lookup a hashed negative directory since we have already revalidated it before and have found it to be fine. This also prevents a crash in cifs_lookup() when it attempts to rehash the already hashed negative lookup dentry. Signed-off-by: Sachin Prabhu <sprabhu>
In the reproducer in c#28, the file /mnt/b must not actually exist to be able to reproduce this issue.
(In reply to comment #29) > Created attachment 635685 [details] > cifs: Do not lookup hashed negative dentry in cifs_atomic_open Pushed this to F16-F18 (just updated the existing patch).
Is there a new build scheduled already?
3.6.5 is supposed to release this evening, so I would expect a build in the morning most likely.
*** Bug 872090 has been marked as a duplicate of this bug. ***
kernel-3.6.5-2.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.6.5-2.fc16
Package kernel-3.6.5-2.fc16: * should fix your issue, * was pushed to the Fedora 16 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing kernel-3.6.5-2.fc16' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-17479/kernel-3.6.5-2.fc16 then log in and leave karma (feedback).
*** Bug 872794 has been marked as a duplicate of this bug. ***
This should be fixed at this point. Bodhi is having trouble closing bugs.