Bug 501588 - Kernel Oops caused by kerberos and cifs upon boot
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i686
OS: Linux
low
high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-05-19 20:57 UTC by Josh Lange
Modified: 2009-06-03 08:55 UTC (History)

Doc Type: Bug Fix
Last Closed: 2009-06-03 08:55:56 UTC


Attachments:
messages with 2 Oopses (1.59 MB, application/octet-stream), 2009-05-19 20:57 UTC, Josh Lange

Description Josh Lange 2009-05-19 20:57:07 UTC
Created attachment 344707
messages with 2 Oopses

Description of problem:
We get a kernel Oops when we boot our test machines.

Our setup:
* Windows Active Directory (LDAP + Kerberos)
* Windows home directory servers (shared over SMB)
* automount is used to mount users' home directories under /home

When the new version of gdm starts, it looks at the files of recently logged-in users. This file activity forces automount to attempt to mount their home directories. When this happens, I get the kernel Oops.

Version-Release number of selected component (if applicable): 2.6.29.2-126.fc11.i686.PAE


How reproducible: Always


Steps to Reproduce:
1. Set up a system in a Kerberos realm, with networked home directories over SMB.
2. Install autofs and set it to start with the machine.
3. Edit /etc/auto.master and add the line "/home /etc/auto.home.test.sh".
4. Create /etc/auto.home.test.sh (make sure it's executable!):
#!/bin/bash
HOMEDIR=$1
echo "-fstype=cifs,sec=krb5,user=$HOMEDIR \\\\\\\\sambaserver.domain.com\\\\home\\\\$HOMEDIR"

5. Log in as one of the users via Kerberos auth. It should work, and the home dir should mount.
6. Reboot and check the system log.
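
For step 6, one rough way to check the log is to count Oops header lines. This is only a sketch: it assumes the system log lives at /var/log/messages and that each Oops produces a single "kernel: Oops:" line, as in the trace included in this report. `count_oopses` is a hypothetical helper name, not part of any tool.

```bash
#!/bin/bash
# Hypothetical helper: count kernel Oops headers in a syslog-style file.
# Assumes one "kernel: Oops:" line per Oops, as in the trace in this report.
count_oopses() {
    local logfile=${1:-/var/log/messages}   # default path is an assumption
    # grep -c prints the number of matching lines (0 if none match).
    grep -c 'kernel: Oops:' "$logfile"
}
```

Running `count_oopses /var/log/messages` after the reboot would be expected to print a non-zero count on an affected kernel.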

Actual results upon boot:
Automount attempts to mount the home dirs and the kernel Oops happens. After this, /home is stuck in a bad state and the system is pretty much unusable.

Expected results:
The filesystems should not be mounted, since no Kerberos tokens exist yet.


Additional info (the attached log file contains 2 Oopses):


May 18 13:58:32 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 00000018
May 18 13:58:32 localhost kernel: IP: [<c07156d3>] down_write+0x26/0x36
May 18 13:58:32 localhost kernel: *pdpt = 0000000033b2f001 *pde = 000000003f9fc067
May 18 13:58:32 localhost kernel: Oops: 0002 [#1] SMP
May 18 13:58:32 localhost kernel: last sysfs file: /sys/devices/virtual/backlight/dell_backlight/brightness
May 18 13:58:32 localhost kernel: Modules linked in: nls_utf8 cifs autofs4 rpcsec_gss_krb5 auth_rpcgss des_generic sunrpc ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath uinput arc4 ecb snd_intel8x0 b43 snd_intel8x0m iTCO_wdt snd_ac97_codec ac97_bus mac80211 iTCO_vendor_support snd_pcm snd_timer cfg80211 input_polldev snd soundcore ppdev parport_pc video output parport dell_laptop dcdbas pcspkr ssb tg3 yenta_socket rsrc_nonstatic snd_page_alloc joydev ata_generic pata_acpi radeon drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
May 18 13:58:32 localhost kernel:
May 18 13:58:32 localhost kernel: Pid: 1948, comm: mount.cifs Not tainted (2.6.29.2-126.fc11.i686.PAE #1) Latitude D610
May 18 13:58:32 localhost kernel: EIP: 0060:[<c07156d3>] EFLAGS: 00010246 CPU: 0
May 18 13:58:32 localhost kernel: EIP is at down_write+0x26/0x36
May 18 13:58:32 localhost kernel: EAX: 00000018 EBX: 00000018 ECX: c07f7cae EDX: ffff0001
May 18 13:58:32 localhost kernel: ESI: f4196600 EDI: f884c330 EBP: f3b3bd58 ESP: f3b3bd54
May 18 13:58:32 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
May 18 13:58:32 localhost kernel: Process mount.cifs (pid: 1948, ti=f3b3a000 task=f41b5860 task.ti=f3b3a000)
May 18 13:58:32 localhost kernel: Stack:
May 18 13:58:32 localhost kernel: 00000000 f3b3bd8c c05345c7 ffffffff ffffffff f884551d f4196d80 c0899e2c
May 18 13:58:32 localhost kernel: 00000018 f3af2a80 c0899e38 f884c330 f41bba00 f3aeaa00 f3b3bdb4 c05347ef
May 18 13:58:32 localhost kernel: 00000000 00000000 00000000 00000000 f884551d f4196d80 f4196d80 f41bba00
May 18 13:58:32 localhost kernel: Call Trace:
May 18 13:58:32 localhost kernel: [<c05345c7>] ? request_key_and_link+0x168/0x2e5
May 18 13:58:32 localhost kernel: [<c05347ef>] ? request_key+0x36/0x61
May 18 13:58:32 localhost kernel: [<f8835b41>] ? cifs_get_spnego_key+0x123/0x146 [cifs]
May 18 13:58:32 localhost kernel: [<f8834a54>] ? CIFS_SessSetup+0x39f/0x903 [cifs]
May 18 13:58:32 localhost kernel: [<c04818c4>] ? mempool_free_slab+0x13/0x15
May 18 13:58:32 localhost kernel: [<c0481a7a>] ? mempool_free+0x67/0x6e
May 18 13:58:32 localhost kernel: [<f882217d>] ? cifs_setup_session+0x105/0x985 [cifs]
May 18 13:58:32 localhost kernel: [<c0421bc7>] ? default_spin_lock_flags+0x8/0xd
May 18 13:58:32 localhost kernel: [<f8825b39>] ? cifs_mount+0x14dc/0x1bcd [cifs]
May 18 13:58:32 localhost kernel: [<f8818766>] ? cifs_get_sb+0xdb/0x20b [cifs]
May 18 13:58:32 localhost kernel: [<c04aa939>] ? vfs_kern_mount+0x82/0xfb
May 18 13:58:32 localhost kernel: [<c04aaa01>] ? do_kern_mount+0x38/0xbe
May 18 13:58:32 localhost kernel: [<c04bbe71>] ? do_mount+0x662/0x69c
May 18 13:58:32 localhost kernel: [<c04ba537>] ? copy_mount_options+0x77/0xea
May 18 13:58:32 localhost kernel: [<c04bbf16>] ? sys_mount+0x6b/0xa5
May 18 13:58:32 localhost kernel: [<c040945f>] ? sysenter_do_call+0x12/0x34
May 18 13:58:32 localhost kernel: Code: 5b 5e 5f 5d c3 55 89 e5 53 0f 1f 44 00 00 ba 30 00 00 00 89 c3 b8 ae 7c 7f c0 e8 5e 94 d1 ff e8 6a f6 ff ff ba 01 00 ff ff 89 d8 <3e> 0f c1 10 85 d2 74 05 e8 9c 0a 00 00 5b 5d c3 55 89 e5 53 0f
May 18 13:58:32 localhost kernel: EIP: [<c07156d3>] down_write+0x26/0x36 SS:ESP 0068:f3b3bd54
May 18 13:58:32 localhost kernel: ---[ end trace 550961bb086eb70c ]---

Comment 1 Josh Lange 2009-05-19 21:00:45 UTC
On Windows AD, the user for the CIFS mount is "realm/username", not just "username".
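
A sketch of how the map script from the description might look with that change. This is an illustration only: `REALM` and its value "EXAMPLE" are placeholders, `map_entry` is a hypothetical function name, and the server path is copied unchanged from the original script.

```bash
#!/bin/bash
# Sketch of the auto.home.test.sh map from the description, with the user
# given as "realm/username" per this comment. REALM is a placeholder value.
REALM="EXAMPLE"

map_entry() {
    local key=$1   # autofs passes the lookup key (the username) as $1
    # Server path is copied unchanged from the original script.
    printf '%s\n' "-fstype=cifs,sec=krb5,user=${REALM}/${key} \\\\\\\\sambaserver.domain.com\\\\home\\\\${key}"
}

# Only emit an entry when called with a key, as autofs would do.
if [ -n "${1:-}" ]; then
    map_entry "$1"
fi
```

This has no bearing on the Oops itself; it only changes which user name is handed to the CIFS mount.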

Comment 2 Chuck Ebbert 2009-05-21 00:17:16 UTC
Almost certainly fixed by commit 34574dd10b6d0697b86703388d6d6af9cbf4bb48:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=34574dd10b6d0697b86703388d6d6af9cbf4bb48


Quote:

When request_key() is called, without there being any standard process
keyrings on which to fall back if a destination keyring is not specified, an
oops is liable to occur when construct_alloc_key() calls down_write() on
dest_keyring's semaphore.

Due to function inlining this may be seen as an oops in down_write() as called
from request_key_and_link().

Comment 3 Jeff Layton 2009-05-23 19:50:06 UTC
Agreed, that's almost assuredly the problem. Recent kernels have this fix, but I'm not sure if it made it into the stable series.

Comment 4 Kyle McMartin 2009-05-25 15:46:33 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=1375900

^- Scratch build chugging away (should be done in about 3 hours). Can you please test it and report back so we can merge the fix for F-11?

thanks!
  kyle

Comment 5 Chuck Ebbert 2009-05-26 05:44:13 UTC
The fix went into 2.6.29.4-162.
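
To tell whether a given machine is already on a build with the fix, something like the following could be used. It is a rough sketch, not an authoritative check: it assumes GNU `sort -V` is available and that its version ordering is close enough to RPM's for these kernel release strings. `kernel_has_fix` and `FIXED_VERSION` are names made up for this example.

```bash
#!/bin/bash
# Build named in this comment as containing the fix.
FIXED_VERSION="2.6.29.4-162"

# Hypothetical helper: succeeds (exit status 0) if the given kernel release
# sorts at or after FIXED_VERSION under GNU sort's version ordering.
kernel_has_fix() {
    local release=$1 oldest
    # Sort the two strings by version and take the older one.
    oldest=$(printf '%s\n%s\n' "$FIXED_VERSION" "$release" | sort -V | head -n 1)
    # If the fixed build is the older (or equal) one, the release has the fix.
    [ "$oldest" = "$FIXED_VERSION" ]
}
```

For example, `kernel_has_fix "$(uname -r)" && echo "fix present"` checks the running kernel. The kernel from the original report, 2.6.29.2-126.fc11.i686.PAE, sorts before the fixed build, so the check fails for it.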

