Bug 682829

Summary: mounting of DFS share causes kernel oops
Product: [Fedora] Fedora
Component: kernel
Version: 14
Hardware: Unspecified
OS: Unspecified
Status: CLOSED UPSTREAM
Severity: unspecified
Priority: unspecified
Reporter: Yogesh Sharma <ysharma>
Assignee: Jeff Layton <jlayton>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, steved
Doc Type: Bug Fix
Last Closed: 2011-03-15 17:46:53 UTC
Attachments:
Description                                            Flags
patch -- always do is_path_accessible check in mount   none
extract from /var/log/messages                         none
dmesg output                                           none
Bugzilla Screenshot                                    none
cifs traffic                                           none

Description Yogesh Sharma 2011-03-07 17:45:52 UTC
I am trying to mount our corporate Windows shared drive using cifs, with all the latest patches applied for FC14.

As soon as I enter the password, the kernel cifs module crashes.


Mar  7 07:48:46 ysdev kernel: [ 2191.684396] ------------[ cut here ]------------
Mar  7 07:48:46 ysdev kernel: [ 2191.684407] kernel BUG at fs/cifs/cifs_dfs_ref.c:318!
Mar  7 07:48:46 ysdev kernel: [ 2191.684414] invalid opcode: 0000 [#5] SMP 
Mar  7 07:48:46 ysdev kernel: [ 2191.684423] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
Mar  7 07:48:46 ysdev kernel: [ 2191.684429] CPU 1 
Mar  7 07:48:46 ysdev kernel: [ 2191.684433] Modules linked in: nls_utf8 cifs tcp_lp ipt_LOG iptable_nat iptable_raw xt_comment ipt_addrtype bridge stp llc xt_multiport xt_mark iptable_mangle nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ipv6 vmnet vmblock vmci vmmon cpufreq_ondemand acpi_cpufreq freq_table mperf uinput nvidia(P) snd_hda_codec_analog arc4 ecb snd_hda_intel snd_hda_codec iwlagn iwlcore snd_hwdep snd_seq r852 sm_common snd_seq_device nand mac80211 snd_pcm e1000e nand_ids nand_ecc ppdev snd_timer thinkpad_acpi i2c_i801 cfg80211 iTCO_wdt parport_pc mtd snd wmi i2c_core snd_page_alloc rfkill iTCO_vendor_support parport microcode soundcore joydev sdhci_pci sdhci yenta_socket 
Mar  7 07:48:46 ysdev kernel: mmc_core firewire_ohci firewire_core crc_itu_t video output [last unloaded: scsi_wait_scan]
Mar  7 07:48:46 ysdev kernel: [ 2191.684626] 
Mar  7 07:48:46 ysdev kernel: [ 2191.684634] Pid: 4539, comm: umount Tainted: P      D     2.6.35.11-83.fc14.x86_64 #1 64575KU/64575KU
Mar  7 07:48:46 ysdev kernel: [ 2191.684642] RIP: 0010:[<ffffffffa0c97489>]  [<ffffffffa0c97489>] cifs_dfs_follow_mountpoint+0x57/0x472 [cifs]
Mar  7 07:48:46 ysdev kernel: [ 2191.684675] RSP: 0018:ffff8800b8927c68  EFLAGS: 00010246
Mar  7 07:48:46 ysdev kernel: [ 2191.684682] RAX: ffffffffa0c9c560 RBX: ffff8800b8927e58 RCX: 0000000000001638
Mar  7 07:48:46 ysdev kernel: [ 2191.684689] RDX: 0000000000001639 RSI: ffff8800b8927e58 RDI: ffff88013bbfea80
Mar  7 07:48:46 ysdev kernel: [ 2191.684696] RBP: ffff8800b8927ce8 R08: 0000000000000000 R09: 0000000000000000
Mar  7 07:48:46 ysdev kernel: [ 2191.684702] R10: 0000000000000670 R11: 0000000000000002 R12: ffff88013bbfea80
Mar  7 07:48:46 ysdev kernel: [ 2191.684709] R13: ffff8800b8927d78 R14: ffff88013bbfea80 R15: ffff880114540000
Mar  7 07:48:46 ysdev kernel: [ 2191.684717] FS:  00007f64fffe6760(0000) GS:ffff880002100000(0000) knlGS:0000000000000000
Mar  7 07:48:46 ysdev kernel: [ 2191.684725] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar  7 07:48:46 ysdev kernel: [ 2191.684732] CR2: 00007f64ff6ff730 CR3: 000000005dcfc000 CR4: 00000000000006e0
Mar  7 07:48:46 ysdev kernel: [ 2191.684738] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar  7 07:48:46 ysdev kernel: [ 2191.684745] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar  7 07:48:46 ysdev kernel: [ 2191.684753] Process umount (pid: 4539, threadinfo ffff8800b8926000, task ffff880114540000)
Mar  7 07:48:46 ysdev kernel: [ 2191.684759] Stack:
Mar  7 07:48:46 ysdev kernel: [ 2191.684763]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
Mar  7 07:48:46 ysdev kernel: [ 2191.684773] <0> ffff880133ae6500 ffff880133ae65c8 ffff8800b8927cb8 ffffffff8112d219
Mar  7 07:48:46 ysdev kernel: [ 2191.684785] <0> 0000000000000000 00000000b8927e58 ffff8800b8927cc8 ffff8800b8927e58
Mar  7 07:48:46 ysdev kernel: [ 2191.684788] Call Trace:
Mar  7 07:48:46 ysdev kernel: [ 2191.684794]  [<ffffffff8112d219>] ? mntput_no_expire+0x25/0xa4
Mar  7 07:48:46 ysdev kernel: [ 2191.684798]  [<ffffffff81120982>] do_follow_link+0xf7/0x246
Mar  7 07:48:46 ysdev kernel: [ 2191.684801]  [<ffffffff81120e68>] link_path_walk+0x397/0x4a0
Mar  7 07:48:46 ysdev kernel: [ 2191.684805]  [<ffffffff810d43ca>] ? filemap_fault+0x1bb/0x30a
Mar  7 07:48:46 ysdev kernel: [ 2191.684808]  [<ffffffff81121066>] path_walk+0x4f/0x9f
Mar  7 07:48:46 ysdev kernel: [ 2191.684811]  [<ffffffff811205cd>] ? path_init+0x8d/0x159
Mar  7 07:48:46 ysdev kernel: [ 2191.684814]  [<ffffffff8112118f>] do_path_lookup+0x2f/0x7b
Mar  7 07:48:46 ysdev kernel: [ 2191.684817]  [<ffffffff81121efd>] user_path_at+0x54/0x91
Mar  7 07:48:46 ysdev kernel: [ 2191.684820]  [<ffffffff8112d219>] ? mntput_no_expire+0x25/0xa4
Mar  7 07:48:46 ysdev kernel: [ 2191.684823]  [<ffffffff8111aee8>] sys_readlinkat+0x30/0x98
Mar  7 07:48:46 ysdev kernel: [ 2191.684825]  [<ffffffff8111af6b>] sys_readlink+0x1b/0x1d
Mar  7 07:48:46 ysdev kernel: [ 2191.684829]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
Mar  7 07:48:46 ysdev kernel: [ 2191.684830] Code: 45 cc 00 00 00 00 74 1c 48 c7 c2 20 c6 c9 a0 48 c7 c6 1d 6a ca a0 48 c7 c7 c4 6a ca a0 31 c0 e8 96 0d 7d e0 4d 3b 64 24 28 75 02 <0f> 0b e8 73 53 ff ff f6 05 39 11 01 00 01 89 45 a8 74 33 65 48 
Mar  7 07:48:46 ysdev kernel: [ 2191.684858] RIP  [<ffffffffa0c97489>] cifs_dfs_follow_mountpoint+0x57/0x472 [cifs]
Mar  7 07:48:46 ysdev kernel: [ 2191.684866]  RSP <ffff8800b8927c68>
Mar  7 07:48:46 ysdev kernel: [ 2191.684869] ---[ end trace 2ca9a767db3aabce ]---

Fedora: 14 + Latest patches
Kernel: 2.6.35.11-83.fc14.x86_64
Desktop: KDE
Mount CMD: mount -t cifs -o username=<user> //<server>/<folder> /home/w_drive/

Comment 1 Chuck Ebbert 2011-03-08 07:38:55 UTC
In cifs_dfs_follow_mountpoint():
        BUG_ON(IS_ROOT(dentry));

Comment 2 Jeff Layton 2011-03-08 12:01:10 UTC
Interesting. I'm actually not that familiar with this code, but we do have code to chase DFS referrals at mount time. In this situation, it looks like we somehow ended up not chasing the referral chain all the way down and ended up with a DFS referral as a mountpoint. We then tripped over the BUG_ON in that path.

IIRC (and it has been a while) Igor added that BUG_ON to ensure that we could always unmount the filesystem.
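For readers unfamiliar with the check quoted in comment 1: in the kernel, IS_ROOT(dentry) is true when a dentry is its own parent, which is the case for the root dentry of a mounted filesystem. The following is a toy Python model of that invariant, not kernel code. The class and function names are made up for illustration, and the real function does much more than this entry check.

```python
class Dentry:
    """Toy stand-in for the kernel's struct dentry (illustration only)."""

    def __init__(self, name, parent=None):
        self.name = name
        # A root dentry points at itself as its parent.
        self.parent = parent if parent is not None else self


def is_root(dentry):
    """Mirror of the kernel's IS_ROOT(): the dentry is its own parent."""
    return dentry.parent is dentry


def follow_dfs_mountpoint(dentry):
    """Hypothetical model of the entry check in cifs_dfs_follow_mountpoint()."""
    if is_root(dentry):
        # The kernel uses BUG_ON() here, which produces the oops in this report.
        raise AssertionError("BUG_ON(IS_ROOT(dentry)) would fire")
    return "would submount the referral target at " + dentry.name


share_root = Dentry("/")
subdir = Dentry("projects", parent=share_root)
```

In this model, calling follow_dfs_mountpoint(share_root) raises, which corresponds to the situation in this bug: the root of the mount itself was treated as a DFS referral.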

I assume that this is always reproducible? If so, would it be possible to turn up debugging when you do this and attach a log to this bug? Instructions on how to do that are here:

    http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging
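As a rough sketch of what enabling debugging amounts to, assuming the cifs module exposes its debug level through the /proc/fs/cifs/cifsFYI control file and that the script runs as root with the module loaded. The linked page remains the authoritative set of instructions; the helper name below is hypothetical.

```python
# Assumed location of the cifs debug control file; needs root and a loaded cifs module.
CIFSFYI_PATH = "/proc/fs/cifs/cifsFYI"


def set_cifs_debug(level, path=CIFSFYI_PATH):
    """Write the requested debug level to the cifsFYI control file."""
    with open(path, "w") as handle:
        handle.write(f"{level}\n")
```

Calling set_cifs_debug(7) would correspond to the "cifsFYI = 7" setting the reporter uses later in this thread, and set_cifs_debug(0) would turn the extra logging off again.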

Comment 3 Yogesh Sharma 2011-03-08 14:26:25 UTC
==Domain,Host,UserID removed==

I was not able to unmount the share.


[80177.963917] fs/cifs/cifsfs.c: Devname: //<...sharedServer...>/Workgroup/ flags: 0 
[80177.964115] fs/cifs/connect.c: CIFS VFS: in cifs_mount as Xid: 2 with uid: 0
[80177.964135] fs/cifs/connect.c: Username: <...myusername...>
[80177.964144] fs/cifs/connect.c: UNC: \\<...sharedServer...>\Workgroup ip: 172.16.1.154
[80177.964173] fs/cifs/connect.c: Socket created
[80177.964935] fs/cifs/connect.c: sndbuf 16384 rcvbuf 87380 rcvtimeo 0x1b58
[80177.965427] fs/cifs/connect.c: CIFS VFS: in cifs_get_smb_ses as Xid: 3 with uid: 0
[80177.965434] fs/cifs/connect.c: Existing smb sess not found
[80177.965450] fs/cifs/connect.c: Demultiplex PID: 11983
[80177.965456] fs/cifs/cifssmb.c: secFlags 0x7
[80177.965464] fs/cifs/transport.c: For smb_command 114
[80177.965470] fs/cifs/transport.c: Sending smb:  total_len 82
[80177.966217] fs/cifs/connect.c: rfc1002 length 0x75
[80177.966243] fs/cifs/cifssmb.c: Dialect: 2
[80177.966251] fs/cifs/cifssmb.c: negprot rc 0
[80177.966287] fs/cifs/connect.c: Security Mode: 0x3 Capabilities: 0xf3fd TimeAdjust: 28800
[80177.966295] fs/cifs/sess.c: sess setup type 2
[80177.966487] fs/cifs/transport.c: For smb_command 115
[80177.966493] fs/cifs/transport.c: Sending smb:  total_len 262
[80178.041476] fs/cifs/connect.c: rfc1002 length 0x83
[80178.041536] fs/cifs/misc.c: Null buffer passed to cifs_small_buf_release
[80178.041546] fs/cifs/sess.c: ssetup rc from sendrecv2 is 0
[80178.041551] fs/cifs/sess.c: UID = 10242 
[80178.041556] fs/cifs/sess.c: bleft 85
[80178.041565] fs/cifs/sess.c: serverOS=Windows 5.0
[80178.041573] fs/cifs/sess.c: serverNOS=Windows 2000 LAN Manager
[80178.041580] fs/cifs/sess.c: serverDomain=CYMER
[80178.041587] fs/cifs/sess.c: ssetup freeing small buf ffff88004ea796c0
[80178.041593] fs/cifs/connect.c: CIFS Session Established successfully
[80178.041601] fs/cifs/connect.c: CIFS VFS: leaving cifs_get_smb_ses (xid = 3) rc = 0
[80178.041609] fs/cifs/connect.c: file mode: 0x1ed  dir mode: 0x1ed
[80178.041619] fs/cifs/connect.c: CIFS VFS: in cifs_get_tcon as Xid: 4 with uid: 0
[80178.041636] fs/cifs/transport.c: For smb_command 117
[80178.041642] fs/cifs/transport.c: Sending smb:  total_len 102
[80178.042597] fs/cifs/connect.c: rfc1002 length 0x42
[80178.042622] fs/cifs/connect.c: disk share connection
[80178.042630] fs/cifs/connect.c: nativeFileSystem=NTFS
[80178.042636] fs/cifs/connect.c: Tcon flags: 0x3 
[80178.042642] fs/cifs/connect.c: CIFS VFS: leaving cifs_get_tcon (xid = 4) rc = 0
[80178.042648] fs/cifs/connect.c: CIFS Tcon rc = 0
[80178.042654] fs/cifs/cifssmb.c: In QFSDeviceInfo
[80178.042661] fs/cifs/transport.c: For smb_command 50
[80178.042667] fs/cifs/transport.c: Sending smb:  total_len 72
[80178.043324] fs/cifs/connect.c: rfc1002 length 0x44
[80178.043361] fs/cifs/cifssmb.c: In QFSAttributeInfo
[80178.043370] fs/cifs/transport.c: For smb_command 50
[80178.043376] fs/cifs/transport.c: Sending smb:  total_len 72
[80178.044078] fs/cifs/connect.c: rfc1002 length 0x50
[80178.044122] fs/cifs/connect.c: CIFS VFS: leaving cifs_mount (xid = 2) rc = 0
[80178.044133] fs/cifs/inode.c: CIFS VFS: in cifs_root_iget as Xid: 5 with uid: 0
[80178.044140] fs/cifs/inode.c: Getting info on 
[80178.044148] fs/cifs/transport.c: For smb_command 50
[80178.044154] fs/cifs/transport.c: Sending smb:  total_len 78
[80178.044767] fs/cifs/connect.c: rfc1002 length 0x27
[80178.044777] fs/cifs/connect.c: invalid transact2 word count
[80178.044814] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
[80178.044825] fs/cifs/netmisc.c: Mapping smb error code 3 to POSIX err -66
[80178.044832] fs/cifs/cifssmb.c: Send error in QPathInfo = -66
[80178.044838] fs/cifs/inode.c: creating fake fattr for DFS referral
[80178.044844] fs/cifs/cifssmb.c: In GetSrvInodeNum for 
[80178.044852] fs/cifs/transport.c: For smb_command 50
[80178.044857] fs/cifs/transport.c: Sending smb:  total_len 78
[80178.045543] fs/cifs/connect.c: rfc1002 length 0x27
[80178.045552] fs/cifs/connect.c: invalid transact2 word count
[80178.045593] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
[80178.045602] fs/cifs/netmisc.c: Mapping smb error code 3 to POSIX err -66
[80178.045610] fs/cifs/cifssmb.c: error -66 in QueryInternalInfo
[80178.045616] fs/cifs/inode.c: GetSrvInodeNum rc -66
[80178.045624] CIFS VFS: Autodisabling the use of server inode numbers on \\<...sharedServer...>\Workgroup. This server doesn't seem to support them properly. Hardlinks will not be recognized on this mount. Consider mounting with the "noserverino" option to silence this message.
[80178.045637] fs/cifs/inode.c: looking for uniqueid=3
[80178.045659] fs/cifs/inode.c: cifs_revalidate_cache: revalidating inode 3
[80178.045665] fs/cifs/inode.c: cifs_revalidate_cache: inode 3 is new
[80178.045672] fs/cifs/inode.c: inode 0xffff880110f0c048 old_time=0 new_time=4374845341
[80178.045924] SELinux: initialized (dev cifs, type cifs), uses genfs_contexts
[80178.065996] fs/cifs/cifs_dfs_ref.c: in cifs_dfs_follow_mountpoint
[80178.066053] ------------[ cut here ]------------
[80178.066061] kernel BUG at fs/cifs/cifs_dfs_ref.c:318!
[80178.066067] invalid opcode: 0000 [#1] SMP 
[80178.066075] last sysfs file: /sys/devices/virtual/bdi/cifs-2/uevent
[80178.066081] CPU 1 
[80178.066085] Modules linked in: nls_utf8 cifs vfat fat usb_storage tcp_lp ipt_LOG iptable_nat iptable_raw xt_comment ipt_addrtype bridge stp llc xt_multiport xt_mark iptable_mangle nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ipv6 vmnet vmblock vmci vmmon cpufreq_ondemand acpi_cpufreq freq_table mperf uinput nvidia(P) snd_hda_codec_analog arc4 ecb snd_hda_intel snd_hda_codec iwlagn iwlcore r852 snd_hwdep sm_common snd_seq nand snd_seq_device nand_ids mac80211 nand_ecc snd_pcm e1000e mtd ppdev thinkpad_acpi snd_timer i2c_i801 cfg80211 iTCO_wdt parport_pc snd wmi i2c_core snd_page_alloc rfkill iTCO_vendor_support parport soundcore joydev microcode sdhci_pci sdhci mmc_core firewire_ohci yenta_socket firewire_core crc_itu_t video output [last unloaded: scsi_wait_scan]
[80178.066265] 
[80178.066265] Pid: 2781, comm: ksysguardd Tainted: P            2.6.35.11-83.fc14.x86_64 #1 64575KU/64575KU
[80178.066265] RIP: 0010:[<ffffffffa0e76489>]  [<ffffffffa0e76489>] cifs_dfs_follow_mountpoint+0x57/0x472 [cifs]
[80178.066265] RSP: 0018:ffff8801336cbc08  EFLAGS: 00010246
[80178.066265] RAX: 000000000000004b RBX: ffff8801336cbdf8 RCX: 000000000000ff29
[80178.066265] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000246
[80178.066265] RBP: ffff8801336cbc88 R08: 0000000000000002 R09: 00000000fffffffe
[80178.066265] R10: ffff8801b36cbb27 R11: 0000000000000000 R12: ffff880012ee90c0
[80178.066265] R13: ffff8801336cbd18 R14: ffff880012ee90c0 R15: ffff88011af11740
[80178.066265] FS:  00007ff9cbc3f720(0000) GS:ffff880002100000(0000) knlGS:0000000000000000
[80178.066265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[80178.066265] CR2: 00007ff9cbc61000 CR3: 0000000119c2f000 CR4: 00000000000006e0
[80178.066265] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[80178.066265] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[80178.066265] Process ksysguardd (pid: 2781, threadinfo ffff8801336ca000, task ffff88011af11740)
[80178.066265] Stack:
[80178.066265]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[80178.066265] <0> ffff8801338d1b00 ffff8801338d1bc8 ffff8801336cbc58 ffffffff8112d219
[80178.066265] <0> 0000000000000000 00000000336cbdf8 ffff8801336cbc68 ffff8801336cbdf8
[80178.066265] Call Trace:
[80178.066265]  [<ffffffff8112d219>] ? mntput_no_expire+0x25/0xa4
[80178.066265]  [<ffffffff81120982>] do_follow_link+0xf7/0x246
[80178.066265]  [<ffffffff81120e68>] link_path_walk+0x397/0x4a0
[80178.066265]  [<ffffffff81121066>] path_walk+0x4f/0x9f
[80178.066265]  [<ffffffff811205cd>] ? path_init+0x8d/0x159
[80178.066265]  [<ffffffff8112118f>] do_path_lookup+0x2f/0x7b
[80178.066265]  [<ffffffff81121efd>] user_path_at+0x54/0x91
[80178.066265]  [<ffffffff8112d219>] ? mntput_no_expire+0x25/0xa4
[80178.066265]  [<ffffffff8111f7c7>] ? mntput+0x1d/0x1f
[80178.066265]  [<ffffffff811371d1>] sys_statfs+0x2e/0x89
[80178.066265]  [<ffffffff8111f9b5>] ? path_put+0x22/0x27
[80178.066265]  [<ffffffff81099b75>] ? audit_syscall_entry+0x11c/0x148
[80178.066265]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[80178.066265] Code: 45 cc 00 00 00 00 74 1c 48 c7 c2 20 b6 e7 a0 48 c7 c6 1d 5a e8 a0 48 c7 c7 c4 5a e8 a0 31 c0 e8 96 1d 5f e0 4d 3b 64 24 28 75 02 <0f> 0b e8 73 53 ff ff f6 05 39 11 01 00 01 89 45 a8 74 33 65 48 
[80178.066265] RIP  [<ffffffffa0e76489>] cifs_dfs_follow_mountpoint+0x57/0x472 [cifs]
[80178.066265]  RSP <ffff8801336cbc08>
[80178.066746] ---[ end trace eeb7ac6b2ebf296e ]---
[80178.414753] fs/cifs/cifs_dfs_ref.c: in cifs_dfs_follow_mountpoint
[80178.414781] ------------[ cut here ]------------
[80178.414784] kernel BUG at fs/cifs/cifs_dfs_ref.c:318!
[80178.414786] invalid opcode: 0000 [#2] SMP 
[80178.414789] last sysfs file: /sys/devices/virtual/bdi/cifs-2/uevent
[80178.414791] CPU 1 
[80178.414792] Modules linked in: nls_utf8 cifs vfat fat usb_storage tcp_lp ipt_LOG iptable_nat iptable_raw xt_comment ipt_addrtype bridge stp llc xt_multiport xt_mark iptable_mangle nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ipv6 vmnet vmblock vmci vmmon cpufreq_ondemand acpi_cpufreq freq_table mperf uinput nvidia(P) snd_hda_codec_analog arc4 ecb snd_hda_intel snd_hda_codec iwlagn iwlcore r852 snd_hwdep sm_common snd_seq nand snd_seq_device nand_ids mac80211 nand_ecc snd_pcm e1000e mtd ppdev thinkpad_acpi snd_timer i2c_i801 cfg80211 iTCO_wdt parport_pc snd wmi i2c_core snd_page_alloc rfkill iTCO_vendor_support parport soundcore joydev microcode sdhci_pci sdhci mmc_core firewire_ohci yenta_socket firewire_core crc_itu_t video output [last unloaded: scsi_wait_scan]
[80178.414861] 
[80178.414864] Pid: 11984, comm: ksysguardd Tainted: P      D     2.6.35.11-83.fc14.x86_64 #1 64575KU/64575KU
[80178.414866] RIP: 0010:[<ffffffffa0e76489>]  [<ffffffffa0e76489>] cifs_dfs_follow_mountpoint+0x57/0x472 [cifs]
[80178.414879] RSP: 0018:ffff880112b71c08  EFLAGS: 00010246
[80178.414881] RAX: 000000000000004b RBX: ffff880112b71df8 RCX: 0000000000000105
[80178.414884] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000246
[80178.414886] RBP: ffff880112b71c88 R08: 0000000000000002 R09: 00000000fffffffe
[80178.414888] R10: ffff880192b71b27 R11: 0000000000000000 R12: ffff880012ee90c0
[80178.414891] R13: ffff880112b71d18 R14: ffff880012ee90c0 R15: ffff88000dc6c5c0
[80178.414893] FS:  00007ff9e4359720(0000) GS:ffff880002100000(0000) knlGS:0000000000000000
[80178.414896] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[80178.414898] CR2: 000000000195d1c8 CR3: 000000010f7ef000 CR4: 00000000000006e0
[80178.414901] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[80178.414903] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[80178.414906] Process ksysguardd (pid: 11984, threadinfo ffff880112b70000, task ffff88000dc6c5c0)
[80178.414907] Stack:
[80178.414909]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[80178.414912] <0> ffff8801338d1b00 ffff8801338d1bc8 ffff880112b71c58 ffffffff8112d219
[80178.414916] <0> 0000000000000000 0000000012b71df8 ffff880112b71c68 ffff880112b71df8
[80178.414920] Call Trace:
[80178.414926]  [<ffffffff8112d219>] ? mntput_no_expire+0x25/0xa4
[80178.414931]  [<ffffffff81120982>] do_follow_link+0xf7/0x246
[80178.414934]  [<ffffffff81120e68>] link_path_walk+0x397/0x4a0
[80178.414938]  [<ffffffff81121066>] path_walk+0x4f/0x9f
[80178.414941]  [<ffffffff811205cd>] ? path_init+0x8d/0x159
[80178.414944]  [<ffffffff8112118f>] do_path_lookup+0x2f/0x7b
[80178.414947]  [<ffffffff81121efd>] user_path_at+0x54/0x91
[80178.414950]  [<ffffffff8112d219>] ? mntput_no_expire+0x25/0xa4
[80178.414953]  [<ffffffff8111f7c7>] ? mntput+0x1d/0x1f
[80178.414956]  [<ffffffff811371d1>] sys_statfs+0x2e/0x89
[80178.414959]  [<ffffffff8111f9b5>] ? path_put+0x22/0x27
[80178.414964]  [<ffffffff81099b75>] ? audit_syscall_entry+0x11c/0x148
[80178.414968]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[80178.414970] Code: 45 cc 00 00 00 00 74 1c 48 c7 c2 20 b6 e7 a0 48 c7 c6 1d 5a e8 a0 48 c7 c7 c4 5a e8 a0 31 c0 e8 96 1d 5f e0 4d 3b 64 24 28 75 02 <0f> 0b e8 73 53 ff ff f6 05 39 11 01 00 01 89 45 a8 74 33 65 48 
[80178.414999] RIP  [<ffffffffa0e76489>] cifs_dfs_follow_mountpoint+0x57/0x472 [cifs]
[80178.415008]  RSP <ffff880112b71c08>
[80178.415011] ---[ end trace eeb7ac6b2ebf296f ]---

Comment 4 Jeff Layton 2011-03-08 14:56:09 UTC
Created attachment 482916
patch -- always do is_path_accessible check in mount

Thanks. Ok, it looks like it fell down in cifs_root_iget, which tries to fetch information for the inode at the root of the tree. The interesting bit is how the mount got that far.

We have code in cifs_mount to loop around if the TREE_CONNECT returns NT_STATUS_PATH_NOT_COVERED. That didn't happen here though; it returned 0 (success). We also check to see if the path is accessible, but only when there is a prefixpath (the part of the path after the sharename). That didn't trigger either, because your mount didn't have one.

I think the right thing to do is probably to have cifs_mount always do an is_path_accessible check, even when the prefixpath isn't set. This patch makes that change. It's against current cifs-2.6 HEAD, but should also apply to 2.6.35.11.
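The difference between the old and the proposed behaviour can be pictured with a simplified model. This is a Python sketch of the control flow only, not the attached patch: the function names, the callback, and the returned strings are hypothetical stand-ins, and the only value taken from this report is the -66 (EREMOTE) mapping seen in the debug log.

```python
EREMOTE = 66  # Linux errno the client maps NT_STATUS_PATH_NOT_COVERED to (see the log above)


def mount_check(prefixpath, path_is_accessible, always_check):
    """Return the action a simplified cifs_mount takes after TREE_CONNECT succeeds.

    path_is_accessible: callable returning 0 or a negative errno (hypothetical).
    always_check: False models the old behaviour, True the patched one.
    """
    if always_check or prefixpath:
        rc = path_is_accessible(prefixpath)
        if rc == -EREMOTE:
            return "chase DFS referral"
        if rc != 0:
            return "fail mount"
    return "finish mount"


def root_is_referral(_path):
    """Model the server in this report: the share root itself is a DFS referral."""
    return -EREMOTE
```

With an empty prefixpath, the old behaviour (always_check=False) finishes the mount without ever noticing the referral, while the patched behaviour (always_check=True) notices it and goes on to chase it.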

Yogesh, would you be able to test this patch and let me know if it fixes the issue?

Comment 5 Yogesh Sharma 2011-03-08 21:48:18 UTC
I have installed the kernel source and started a build after applying your patch. Maybe tomorrow I will reboot my laptop and try the new kernel.

Comment 6 Yogesh Sharma 2011-03-10 00:50:47 UTC
The first attempt failed during the kernel compile. It may take a couple of days (due to a heavy workload at work).

Comment 7 Jeff Layton 2011-03-10 00:58:36 UTC
Understood. Thanks for testing it and let me know as soon as you can. I think this is probably a stable candidate too.

Comment 8 Jeff Layton 2011-03-11 14:56:34 UTC
Hi Yogesh, any progress on testing this? Sorry to be pushy, but I was hoping to get this into 2.6.38 if possible, and in order to do that I need to send it out soon.

Comment 9 Yogesh Sharma 2011-03-11 15:02:49 UTC
Do you have a test build of the latest kernel which I can download, install and test? It would really help, and I could get you the results in no time.

Comment 10 Jeff Layton 2011-03-11 18:38:53 UTC
Ok, I patched a kernel and had koji build it:

    http://koji.fedoraproject.org/koji/taskinfo?taskID=2904668

I don't have a great place to test this at the moment, so if you can do so and report back that would be great.

Comment 11 Yogesh Sharma 2011-03-14 14:01:42 UTC
Jeff: The kernel oops is gone, but I am still not able to mount the DFS share due to error -40.

Mar 14 06:51:17 ysdev kernel: [  171.318971] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.348063] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.373821] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.398791] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.425587] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.452401] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.481566] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.506886] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.532139] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.562223] Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
Mar 14 06:51:18 ysdev kernel: [  171.564214] CIFS VFS: cifs_mount failed w/return code = -40

Comment 12 Jeff Layton 2011-03-14 15:29:08 UTC
-40 is -ELOOP. The mount-time DFS referral chasing code limits the number of nested links to 8:

                if (referral_walks_count > MAX_NESTED_LINKS) {
                        /*
                         * BB: when we implement proper loop detection,
                         *     we will remove this check. But now we need it
                         *     to prevent an indefinite loop if 'DFS tree' is
                         *     misconfigured (i.e. has loops).
                         */
                        rc = -ELOOP;
                        goto mount_fail_check;
                }

...so it seems like your server's DFS configuration has problems, or we have a bug in the referral chasing code. Can you turn up cifsFYI again and collect the log from that?
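For anyone decoding the numeric return codes that appear in these logs, here is a small Python sketch. The table is hard-coded with the Linux values for the two codes seen in this bug (-40 and -66) so that it gives the same answer on any platform; the helper name is made up for illustration.

```python
# Linux errno values for the two codes that appear in this report.
LINUX_ERRNO_NAMES = {
    40: "ELOOP",    # too many levels of symbolic links
    66: "EREMOTE",  # object is remote
}


def decode_kernel_rc(rc):
    """Translate a kernel-style return code (0 or negative errno) into a name."""
    if rc == 0:
        return "success"
    if rc > 0:
        return str(rc)
    return LINUX_ERRNO_NAMES.get(-rc, f"errno {-rc}")
```

So the "cifs_mount failed w/return code = -40" line decodes to ELOOP, and the "POSIX err -66" lines in the earlier debug log decode to EREMOTE.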

This time, please attach a file with the log rather than pasting it into a comment. The line wrapping makes that harder to parse.

Comment 13 Yogesh Sharma 2011-03-14 16:15:57 UTC
Created attachment 484243
extract from /var/log/messages

cifsFYI = 7 log

Comment 14 Jeff Layton 2011-03-14 16:50:06 UTC
/var/log/messages doesn't generally get KERN_DEBUG messages, which is what most of these get logged at.

Can you do something like:

    # dmesg > /tmp/cifsfyi.txt

...and then attach that to the case?
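Once such a dump exists, a quick tally of the NT status lines can show whether the same error keeps repeating. A minimal Python sketch, assuming the "Status code returned 0x... NAME" line format seen in the logs in this bug; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   Status code returned 0xc0000257 NT_STATUS_PATH_NOT_COVERED
STATUS_RE = re.compile(r"Status code returned (0x[0-9a-fA-F]+) (\w+)")


def tally_status_codes(lines):
    """Count the NT status names reported by the cifs module in a dmesg dump."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

It could be run against the dump with something like `with open("/tmp/cifsfyi.txt") as f: print(tally_status_codes(f))`.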

Comment 15 Yogesh Sharma 2011-03-14 17:22:53 UTC
Created attachment 484255
dmesg output

cifsFYI.txt with cifsFYI=7

Comment 16 Jeff Layton 2011-03-14 17:45:30 UTC
The debug info certainly looks like your server is set up to refer to itself in a loop. The best way to confirm that, however, would be a capture of the data between client and server.

Would you be able to do that and attach it here? You can mark it private if you don't want to make that public. Instructions for doing the capture are here:

    http://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Wire_Captures

Comment 17 Jeff Layton 2011-03-14 17:46:08 UTC
In the meantime, I'll go ahead and send this patch to the maintainer upstream since it at least fixes the oops.

Comment 18 Yogesh Sharma 2011-03-14 18:05:59 UTC
"Wire captures can also contain sensitive data like addresses, password hashes, filenames and data" What is the best way to share this data? Maybe I need approval from our IT department.

Comment 19 Jeff Layton 2011-03-14 19:42:15 UTC
If the capture is fairly small, you can make it a "private" attachment which will limit access to employees of Red Hat. Alternately, you can send it to me privately in some other fashion if you wish. Let me know what would work best for you.

Consulting with your IT department seems prudent if you're worried about it.

Comment 20 Yogesh Sharma 2011-03-14 19:47:08 UTC
Do you know how the cifs code is going to behave when I don't have access to some of the servers in the DFS? Will it skip them or error out?

Comment 22 Jeff Layton 2011-03-14 19:50:06 UTC
It will likely error out.

Comment 23 Yogesh Sharma 2011-03-14 20:31:02 UTC
I have a tcpdump capture. Is there an SFTP location where I can upload it?

I didn't see any option to make a "private attachment".

Comment 24 Jeff Layton 2011-03-14 21:20:09 UTC
If you go to "add an attachment" there should be a checkbox just above the submit button that says:

"Privacy: Make attachment and comment private".

Comment 25 Yogesh Sharma 2011-03-14 22:08:30 UTC
Created attachment 484318
Bugzilla Screenshot

Bugzilla screenshot showing no option to make the attachment private.

Comment 26 Jeff Layton 2011-03-15 00:37:54 UTC
My mistake, that must not be available on Fedora bugs. In that case, I'm afraid I don't have a secure mechanism for you to send me the capture.

Another option would be to gpg encrypt it using the key that I use to sign cifs-utils updates:

ftp://ftp.samba.org/pub/linux-cifs/cifs-utils/cifs-utils-pubkey_70F3B981.asc

...if you do that, then you can just attach it in the clear here.
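As a hedged sketch of that workflow, the Python helper below only builds the two gpg command lines (import the public key, then encrypt to it) without running them. The key ID is an assumption taken from the key's filename, the file names are placeholders, and gpg may additionally ask for confirmation before encrypting to a key that has not been verified.

```python
def gpg_commands(keyfile, key_id, capture, output):
    """Build (but do not run) gpg invocations to import a key and encrypt a file to it."""
    import_cmd = ["gpg", "--import", keyfile]
    encrypt_cmd = [
        "gpg",
        "--recipient", key_id,  # encrypt so that only the key owner can decrypt
        "--output", output,
        "--encrypt", capture,
    ]
    return import_cmd, encrypt_cmd


# Placeholder file names; the key ID is assumed from the .asc filename above.
import_cmd, encrypt_cmd = gpg_commands(
    "cifs-utils-pubkey_70F3B981.asc", "70F3B981", "capture.pcap", "capture.pcap.gpg"
)
```

Each list can be passed to subprocess.run() as-is, or joined with spaces and typed at a shell prompt.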

Comment 27 Yogesh Sharma 2011-03-15 12:39:25 UTC
Created attachment 484452
cifs traffic

Comment 28 Jeff Layton 2011-03-15 17:46:53 UTC
Thanks for the capture. As I suspected, this DFS configuration seems to be dysfunctional. It's very similar to making a symlink that points to itself.

You're connecting to \\<hostname>\share. When we do a QPathInfo for the root, the server sends back NT_STATUS_PATH_NOT_COVERED. That triggers the client to go see if there's a DFS referral for this path. The server responds that there is one, and that it has the exact same path that the client tried to query. After doing this MAX_NESTED_LINKS times (which I think is 8), the client gives up and returns ELOOP back to userspace.
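The loop can be illustrated with a small simulation. This is a Python sketch, not the kernel code: the referral table is a hypothetical stand-in for asking the server, the server and share names are placeholders, and the limit check mirrors the `referral_walks_count > MAX_NESTED_LINKS` test quoted in comment 12.

```python
ELOOP = 40            # Linux errno value, as reported in this bug
MAX_NESTED_LINKS = 8  # limit mentioned in this thread


def chase_referrals(start, referrals):
    """Follow DFS referrals until a path with no referral is reached.

    referrals: dict mapping a path to its referral target (hypothetical
    stand-in for querying the server).
    Returns (rc, final_path), where rc is 0 or -ELOOP.
    """
    path = start
    walks = 0
    while path in referrals:
        if walks > MAX_NESTED_LINKS:
            # Give up rather than loop forever on a misconfigured DFS tree.
            return -ELOOP, path
        path = referrals[path]
        walks += 1
    return 0, path


# A share whose referral points back at itself, like a self-referencing symlink.
self_referral = {r"\\server\share": r"\\server\share"}
# A healthy single-hop referral for comparison.
healthy = {r"\\server\share": r"\\other\share"}
```

With the self-referencing table the walk ends in -ELOOP, while the healthy table resolves to the target path after one hop.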

You may want to show this capture to the folks who maintain your servers; it really looks like they have something misconfigured.

Since we have a patch to fix the oops, I'm going to close this with a resolution of upstream. The patch should make it into kernel releases soon. Please reopen the bug if you have more questions or wish to discuss this further.