Description of problem:
A Fedora 15 box running the kernel listed below will panic, kill the UI and lock up when this command is issued:
sudo mount -t cifs //192.168.0.5/photos /mnt/tigger5/pictures -o user=timali,password=zzzzzz,uid=502,gid=100
Version-Release number of selected component (if applicable):
Installed kernel-2.6.38.8-32.fc15.x86_64 The Linux kernel
Installed kernel-2.6.38.8-35.fc15.x86_64 The Linux kernel
Installed kernel-2.6.40-4.fc15.x86_64
When the machine is booted with kernel-2.6.40-4.fc15.x86_64 it will always panic and lock up.
How reproducible:
Always; it took 6 reboots to find the source of the problem.
Steps to Reproduce:
1. Boot with the kernel and issue the mount command
2.
3.
Actual results:
Locked machine with no UI, panic trace on screen and a hard reboot necessary.
Expected results:
Working desktop with drive mounted.
Additional info:
Unable to get any trace info. I have looked, sorry.
Can you try a mount from a tty (Ctrl-Alt-F2)? That might get you a trace you can capture with a camera.
I've not seen this sort of problem with a similar mount command on 2.6.40. I'll need a stack trace or something to go on in order to pursue this.
Created attachment 516577 [details] Video of the crash if it helps
Created attachment 516578 [details] Best picture of the dump
The trace will start with 5 umount commands, then the crash. As far as I can read it, the message says: CIFS NFS default security mechanism required. The default security mechanism will be .....................
I'm afraid that doesn't really help as I can't read any of the oops message.
I can reliably reproduce with 2.6.40-4.fc15.x86_64 (but not with 3.0.0-1.fc16.x86_64) and get a stack trace.
$ sudo mount -v -t cifs //server2/data/mydata data -o user=cifsuser
Password:
mount.cifs kernel mount options: ip=10.0.0.2,unc=\\server2\data,,ver=1,user=cifsuser,prefixpath=mydata,pass=********
[ 41.012855] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[ 41.013117] IP: [<ffffffff814b636e>] mutex_lock+0x2c/0x4a
[ 41.013358] PGD 37c00067 PUD 37b1f067 PMD 0
[ 41.013608] Oops: 0002 [#1] SMP
[ 41.013798] CPU 0
[ 41.013875] Modules linked in: des_generic md4 nls_utf8 cifs fscache sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables ppdev parport_pc parport e1000 microcode vmw_balloon shpchp i2c_piix4 i2c_core vmw_pvscsi [last unloaded: speedstep_lib]
[ 41.018285]
[ 41.018354] Pid: 963, comm: mount.cifs Not tainted 2.6.40-4.fc15.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 41.018945] RIP: 0010:[<ffffffff814b636e>] [<ffffffff814b636e>] mutex_lock+0x2c/0x4a
[ 41.019408] RSP: 0018:ffff88003d4cbd08 EFLAGS: 00010246
[ 41.019660] RAX: 0000000000000000 RBX: 0000000000000038 RCX: 000000000000005c
[ 41.019943] RDX: 0000000000000000 RSI: 0000000000000055 RDI: 0000000000000038
[ 41.020226] RBP: ffff88003d4cbd28 R08: ffffea0000cd91c8 R09: ffffffffa01441bf
[ 41.020509] R10: ffff88003d4cb998 R11: ffff88003d4cb998 R12: ffff88003cb08600
[ 41.020791] R13: ffff88003abbfcc0 R14: ffff88003552f240 R15: ffff88003abbfcef
[ 41.021087] FS: 00007fd5faaf8740(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 41.021520] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 41.021781] CR2: 0000000000000038 CR3: 000000003744f000 CR4: 00000000000006f0
[ 41.022093] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 41.022402] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 41.022688] Process mount.cifs (pid: 963, threadinfo ffff88003d4ca000, task ffff880037eb0000)
[ 41.023113] Stack:
[ 41.023308] ffff88003d4cbd28 ffffffff81137c49 ffff88003b1f7000 ffff88003b1f7000
[ 41.023889] ffff88003d4cbda8 ffffffffa0133d2f ffff88003abbfcda ffffffff811f845c
[ 41.024475] 0000000000000038 00000004378fb000 ffff88003a88f000 ffff88003aed8400
[ 41.025055] Call Trace:
[ 41.025280] [<ffffffff81137c49>] ? dput+0x42/0xea
[ 41.025542] [<ffffffffa0133d2f>] cifs_do_mount+0x396/0x466 [cifs]
[ 41.025834] [<ffffffff811f845c>] ? selinux_sb_copy_data+0x148/0x1ab
[ 41.026113] [<ffffffff81129858>] mount_fs+0x69/0x155
[ 41.026382] [<ffffffff810f528a>] ? __alloc_percpu+0x10/0x12
[ 41.026646] [<ffffffff8113d5e9>] vfs_kern_mount+0x63/0x9d
[ 41.026903] [<ffffffff8113df6c>] do_kern_mount+0x4d/0xdf
[ 41.027158] [<ffffffff8113f5f1>] do_mount+0x63c/0x69f
[ 41.027417] [<ffffffff8113f8d6>] sys_mount+0x88/0xc2
[ 41.027675] [<ffffffff814bd7c2>] system_call_fastpath+0x16/0x1b
[ 41.027939] Code: 48 89 e5 53 48 83 ec 18 66 66 66 66 90 31 d2 be 55 00 00 00 48 89 fb 48 c7 c7 82 36 7b 81 e8 9c 10 b9 ff e8 43 f7 ff ff 48 89 df <3e> ff 0f 79 05 e8 51 00 00 00 65 48 8b 04 25 80 cd 00 00 48 89
[ 41.031502] RIP [<ffffffff814b636e>] mutex_lock+0x2c/0x4a
[ 41.031801] RSP <ffff88003d4cbd08>
[ 41.032020] CR2: 0000000000000038
[ 41.032256] ---[ end trace 3bc5d4d0271d1502 ]---
(cc'ing Al since he wrote this code...)
Thanks, that helps somewhat:
(gdb) list *(cifs_do_mount+0x396)
0xd53 is in cifs_do_mount (fs/cifs/cifsfs.c:580).
575             /* next separator */
576             while (*s && *s != sep)
577                 s++;
578
579             mutex_lock(&dir->i_mutex);
580             child = lookup_one_len(p, dentry, s - p);
581             mutex_unlock(&dir->i_mutex);
582             dput(dentry);
583             dentry = child;
584         } while (!IS_ERR(dentry));
...which is actually inlined cifs_get_root. So the problem is likely that dir is NULL here, which would imply that we hit a negative dentry while walking down to the root of the vfsmount.
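The faulting address fits that reading: if dir is NULL, &dir->i_mutex evaluates to the byte offset of i_mutex inside the inode, and the oops reports 0x38 in both CR2 and RDI. Here is a minimal userspace sketch of that arithmetic. The mock_inode type and its 0x38 bytes of padding are hypothetical, chosen to match the oops; they are not taken from the real 2.6.40 struct inode layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel mutex; only its position matters here. */
struct mock_mutex {
    int count;
};

/* Hypothetical stand-in for struct inode. The padding size is an assumption
 * picked so that i_mutex lands at offset 0x38, the address seen in the oops. */
struct mock_inode {
    unsigned char before_mutex[0x38]; /* placeholder for earlier fields */
    struct mock_mutex i_mutex;
};

/* Address that &dir->i_mutex would evaluate to, computed with offsetof so
 * that dir itself is never dereferenced. */
static uintptr_t i_mutex_address(const struct mock_inode *dir)
{
    return (uintptr_t)dir + offsetof(struct mock_inode, i_mutex);
}
```

On common platforms, where a null pointer converts to integer 0, i_mutex_address(NULL) yields 0x38, which is the pointer mutex_lock() would then try to write through.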
Created attachment 516711 [details]
patch -- cope with negative dentries in cifs_get_root
Here's a possible patch (untested, but I don't seem to be able to reproduce this). If lookup_one_len returns a negative dentry, then put it and set the dentry pointer to an error of -ENOENT. I'd like Al to weigh in on this before I propose it upstream, but it might be worth testing in the meantime if you feel brave.
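The idea described in the patch can be illustrated with a small userspace mock-up of the path walk. This is only a sketch and not the attached patch: mock_dentry, mock_lookup and mock_get_root are hypothetical names, reference counting (dput) and locking are left out, and a name that is not known at all is modelled as a NULL return instead of a freshly allocated negative dentry.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct mock_inode {
    int placeholder;
};

struct mock_dentry {
    const char *name;
    struct mock_inode *d_inode;   /* NULL marks a negative dentry */
    struct mock_dentry *children; /* array of known child entries */
    size_t nchildren;
};

/* Find the child named p[0..len); NULL if the name is not known at all. */
static struct mock_dentry *mock_lookup(struct mock_dentry *parent,
                                       const char *p, size_t len)
{
    for (size_t i = 0; i < parent->nchildren; i++) {
        struct mock_dentry *c = &parent->children[i];

        if (strlen(c->name) == len && memcmp(c->name, p, len) == 0)
            return c;
    }
    return NULL;
}

/* Walk path below root (assumed positive). Returns 0 and stores the final
 * dentry in *out on success, or -ENOENT if any component is missing or
 * negative. */
static int mock_get_root(struct mock_dentry *root, const char *path, char sep,
                         struct mock_dentry **out)
{
    struct mock_dentry *dentry = root;
    const char *s = path;

    while (*s) {
        const char *p;
        struct mock_dentry *child;

        while (*s == sep)       /* skip separators */
            s++;
        if (!*s)
            break;
        p = s;
        while (*s && *s != sep) /* next separator */
            s++;

        child = mock_lookup(dentry, p, (size_t)(s - p));
        /* The guard: stop before a negative dentry becomes the next parent,
         * whose d_inode would otherwise be used on the following pass. */
        if (!child || !child->d_inode)
            return -ENOENT;
        dentry = child;
    }
    *out = dentry;
    return 0;
}
```

With a tree in which "data" is a negative entry, walking "data/mydata" returns -ENOENT at the first component and never reaches a step that would need the missing inode.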
The patch certainly avoids the oops for me with no negative side-effects that I've noticed. (But then I'm still stuck with bug 727834, of course).
Ok, patch sent upstream. I have a feeling we'll be ripping and replacing this code with something that doesn't require access to top-level directories in order to mount a lower one, but this patch should at least stop the oopses in the interim.
*** Bug 728684 has been marked as a duplicate of this bug. ***
FYI: I'm also having the same issue on 2.6.40-4.fc15.i686.
Yep. Looks like 3.0.2 got released upstream in the last day or two, and the relevant patches should be in there. You may want to test this build out of koji: http://koji.fedoraproject.org/koji/buildinfo?buildID=258790
The kernel-2.6.40.3-0.fc15.i686.rpm from the link above does not improve things for me. The kernel halts a few seconds after mounting cifs.
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] Oops: 0000 [#1] SMP
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] Process sshd (pid: 1654, ti=f658c000 task=f6545860 task.ti=f658c000)
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] Stack:
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] Call Trace:
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] Code: 08 85 c9 89 4d f0 75 17 8b 7d d8 8b 55 ec 89 04 24 89 f0 89 f9 e8 aa 5d 30 00 89 45 f0 eb 2d 8b 5d f0 8b 46 14 8b 7d e4 8b 55 e4 <8b> 04 03 47 89 7d dc 89 f9 8b 3e 89 45 e0 89 c3 8b 45 f0 64 0f
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] EIP: [<c04de862>] kmem_cache_alloc_trace+0x78/0xd8 SS:ESP 0068:f658dd10
Message from syslogd@test at Aug 16 13:37:07 ...
kernel:[ 134.876632] CR2: 00000000800a8115
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.537968] Oops: 0000 [#2] SMP
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.538260] Process sshd (pid: 1656, ti=f6588000 task=f6544bc0 task.ti=f6588000)
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.538260] Stack:
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.538260] Call Trace:
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.538260] Code: 08 85 c9 89 4d f0 75 17 8b 7d d8 8b 55 ec 89 04 24 89 f0 89 f9 e8 aa 5d 30 00 89 45 f0 eb 2d 8b 5d f0 8b 46 14 8b 7d e4 8b 55 e4 <8b> 04 03 47 89 7d dc 89 f9 8b 3e 89 45 e0 89 c3 8b 45 f0 64 0f
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.538260] EIP: [<c04de862>] kmem_cache_alloc_trace+0x78/0xd8 SS:ESP 0068:f6589f0c
Message from syslogd@test at Aug 16 13:37:10 ...
kernel:[ 137.538260] CR2: 00000000800a8115
That's almost certainly a different bug from this one. Could you open a new bug for this, and include the entire stack trace if you're able? Please cc me on the bug too.
Jeff: unfortunately I don't have easy access to that server to reset it and every time it locks I need to contact support / create ticket, etc. :(
(In reply to comment #14)
> Yep. Looks like 3.0.2 got released upstream in the last day or two, and the
> relevant patches should be in there. You may want to test this build out of
> koji:
>
> http://koji.fedoraproject.org/koji/buildinfo?buildID=258790
Tested with 2.6.40.3-0.fc15 and still fails
Could you test with the kernel here and let me know if it fails? https://koji.fedoraproject.org/koji/taskinfo?taskID=3298071
Jeff: 1 hour runtime - so far good. Before it would crash within seconds. -- vlad
Fixes are in 2.6.40.4-3
*** Bug 731278 has been marked as a duplicate of this bug. ***
I can confirm that 2 days of testing have been successful and the original bug is fixed.
[16:54][timali@tigger3] bug-fixes $map_drives
umount: /mnt/tigger5/pictures: not mounted
umount: /mnt/tigger5/backups: not mounted
umount: /mnt/tigger5/share: not mounted
umount: /mnt/tigger5/music: not mounted
umount: /mnt/tigger5/sarah: not mounted
umount: /mnt/tigger5/peter: not mounted
[16:54][timali@tigger3] bug-fixes $uname -s
Linux
[17:14][timali@tigger3] bug-fixes $uname -r
2.6.40.3-1.cifs.1.fc15.x86_64
kernel-2.6.40.4-5.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-2.6.40.4-5.fc15
kernel-2.6.40.4-5.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.
(In reply to comment #25)
> kernel-2.6.40.4-5.fc15 has been pushed to the Fedora 15 stable repository. If
> problems still persist, please make note of it in this bug report.
I see an immediate kernel panic with this. It is OK with the previous kernel, 2.6.40-3.0.fc15.x86_64.
Here is an abbreviated version of the dump (nothing in the logs, so I wrote it down):
----
Kernel Panic - not syncing: VFS Unable to mount root fs or unknown block(0,0)
Pid: 1 comm: swapper Not tainted 2.6.40.4-5.fc15.x86_64 #1
Call trace: Panic mount block root mount root prepare namespace ? release_tgcred kernel init ? sched_tail kernel thread helper ? sfmt kernel ? gs change
------
Note: When I did the yum update that installed the new kernel, yum hung at the end of cleanup for a long time. Finally I pressed Ctrl-C to get back to the prompt.
Also, I ran lsinitrd on the new initramfs for the new kernel and it shows strange errors:
gzip: initramfs-2.6.40.4-5.fc15.x86_64.img: unexpected end of file
gzip: initramfs-2.6.40.4-5.fc15.x86_64.img: unexpected end of file
cpio: premature end of file
Other kernels still in /boot don't show this error from lsinitrd. I rebuilt it with dracut --force and see the same results. Possibly the RAM image file is corrupt?
System: HP pavilion dv7-6195 laptop (i7/sandybridge)
That sounds like a different problem entirely, unrelated to the issue here. I'd recommend opening a new bug for that.
(In reply to comment #27)
> That sounds like a different problem entirely, unrelated to the issue here. I'd
> recommend opening a new bug for that.
Just for the record, my problem was apparently user-induced: I killed yum update before the new kernel was completely installed. Removing the partially installed kernel and running yum update again fixed the problem.