Bug 414981

Summary: [RHEL5.2] Kernel panic upon unmounting ecryptfs overlay
Product: Red Hat Enterprise Linux 5
Reporter: Jarod Wilson <jarod>
Component: kernel
Assignee: Eric Sandeen <esandeen>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Jenner <mjenner>
Severity: high
Docs Contact:
Priority: high
Version: 5.2
CC: dzickus, karsten, lwang, mhalcrow
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-08-04 20:01:43 UTC
Bug Blocks: 448732    

Description Jarod Wilson 2007-12-06 22:33:38 UTC
Description of problem:
The following kernel panic occurred after an attempt to unmount an ecryptfs
overlay. Possibly of interest: on the command line I requested twofish
encryption, but since the tools still insist on prompting for the cipher, I
chose AES-256 at the prompt on the second mount instead.

Version-Release number of selected component (if applicable):
kernel 2.6.18-58.el5 + Eric's ecryptfs backport

How reproducible:

[root@xw4400-01 .ecryptfs]# mount -t ecryptfs -o key=openssl:keyfile=/root/.ecryptfs/pki/openssl/key.pem:ecryptfs_cipher=twofish:ecryptfs_passthrough=no:passthrough=no /secret /secret
Passphrase: 
Cipher
1) Twofish
2) AES-128
3) AES-192
4) AES-256
5) CAST6
6) Triple-DES
7) Blowfish
8) CAST5
Selection [AES-128]: 1
Attempting to mount with the following options:
  ecryptfs_cipher=twofish
  ecryptfs_key_bytes=16
  ecryptfs_sig=94733f578c4b75f6
Mounted eCryptfs
[root@xw4400-01 .ecryptfs]# cat /secret/twofish.txt 
twofish encryption in the house
[root@xw4400-01 .ecryptfs]# umount
Usage: umount [-hV]
       umount -a [-f] [-r] [-n] [-v] [-t vfstypes] [-O opts]
       umount [-f] [-r] [-n] [-v] special | node...
[root@xw4400-01 .ecryptfs]# umount /secret/
[root@xw4400-01 .ecryptfs]# mount -t ecryptfs -o key=openssl:keyfile=/root/.ecryptfs/pki/openssl/key.pem:ecryptfs_cipher=twofish:ecryptfs_passthrough=no:passthrough=no /secret /secret
Passphrase: 
Cipher
1) Twofish
2) AES-128
3) AES-192
4) AES-256
5) CAST6
6) Triple-DES
7) Blowfish
8) CAST5
Selection [AES-128]: 4
Attempting to mount with the following options:
  ecryptfs_cipher=aes
  ecryptfs_key_bytes=32
  ecryptfs_sig=94733f578c4b75f6
Mounted eCryptfs
[root@xw4400-01 .ecryptfs]# cat /secret/twofish.txt 
twofish encryption in the house
[root@xw4400-01 .ecryptfs]# umount /secret/

BUG: Dentry ffff810019a20df8{i=7ff37,n=twofish.txt} still in use (-1) [unmount of ecryptfs ecryptfs]
Unable to handle kernel NULL pointer dereference at 0000000000000098 RIP: 
 [<ffffffff885b58b8>] :ecryptfs:ecryptfs_show_options+0x1c/0x83
PGD 2a2c7067 PUD 2a2c8067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /fs/ecryptfs/version
CPU 1 
Modules linked in: twofish ecryptfs(U) md5 aes ipt_MASQUERADE iptable_nat
ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter
ip_tables x_tables bridge ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc
netxen_nic cpufreq_ondemand dm_multipath video sbs backlight i2c_ec i2c_core
button battery asus_acpi acpi_memhotplug ac lp snd_hda_intel snd_hda_codec
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss snd_mixer_oss sg snd_pcm snd_timer ata_piix snd shpchp floppy
soundcore parport_pc ide_cd snd_page_alloc firewire_ohci cdrom parport
firewire_core tg3 serio_raw pcspkr dm_snapshot dm_zero dm_mirror dm_mod ahci
libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 2997, comm: hald Not tainted 2.6.18-58.el5 #1
RIP: 0010:[<ffffffff885b58b8>]  [<ffffffff885b58b8>] :ecryptfs:ecryptfs_show_options+0x1c/0x83
RSP: 0018:ffff81002a2cde68  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff810022db76c0 RCX: 0000000000000000
RDX: 0000000000000040 RSI: 0000000000000000 RDI: 00000000000000d0
RBP: ffff81003ff8a580 R08: ffff810023fb7259 R09: ffff810022db76c0
R10: ffffffffffffffff R11: 0000000000000000 R12: 0000000000000000
R13: ffff810022db76c0 R14: 0000000000000000 R15: 00002aaaaaab4000
FS:  00002aaaaaac9b00(0000) GS:ffff810037ca17c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000098 CR3: 000000002a2c6000 CR4: 00000000000006e0
Process hald (pid: 2997, threadinfo ffff81002a2cc000, task ffff81002a0147e0)
Stack:  ffff81003ff8a580 ffff810022db76c0 ffff81003ff8a580 0000000000000000
 0000000000001000 ffffffff8003076a ffff810022db76c0 ffff81003ff8a580
 0000000000000241 ffffffff8003ebb9 ffff81002a2cdf50 ffff81002a5524c0
Call Trace:
 [<ffffffff8003076a>] show_vfsmnt+0x10f/0x129
 [<ffffffff8003ebb9>] seq_read+0x1b8/0x28c
 [<ffffffff8000b270>] vfs_read+0xcb/0x171
 [<ffffffff80011508>] sys_read+0x45/0x6e
 [<ffffffff8005b28d>] tracesys+0xd5/0xe0


Code: 48 8b 80 98 00 00 00 4c 8b 20 48 8b 68 08 e8 b3 62 a8 f7 48 
RIP  [<ffffffff885b58b8>] :ecryptfs:ecryptfs_show_options+0x1c/0x83
 RSP <ffff81002a2cde68>
CR2: 0000000000000098
 <0>Kernel panic - not syncing: Fatal exception

Comment 1 Jarod Wilson 2007-12-06 22:50:22 UTC
Thus far, I've been unable to reproduce this one, but will keep an eye out for it...

Comment 2 Eric Sandeen 2007-12-18 19:49:02 UTC
Based on the size of the function (0x83), it looks like this build predates the
patch I did that shows the actual mount options. We also know that we had some
bad pointer manipulation when bad mount options were given, so this may just be
random corruption.

The real problem here is likely:

BUG: Dentry ffff810019a20df8{i=7ff37,n=twofish.txt} still in use (-1) [unmount of ecryptfs ecryptfs]

The "-1" is atomic_read(&dentry->d_count).  Hm, refcounting problems?

It then looks like the old show_options tried to manipulate some of the
dentries while sb->s_root was NULL... hm, interesting, it was the hald thread
that oopsed.  But after the above BUG message, I think all bets are off.  I'd
like to know why the dentry was in use.

If we don't see it again I'll close it and chalk it up to the memory corruption
problems we fixed, but I'll leave it open a while to see if we see it again.
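
A note on the fault address, as a rough illustration only: the first bytes on
the oops "Code:" line (48 8b 80 98 00 00 00) decode as mov 0x98(%rax),%rax, and
the register dump shows RAX = 0, which is consistent with CR2 = 0x98. In other
words, a field at offset 0x98 was loaded through a NULL pointer. The sketch
below reproduces only that arithmetic. struct fake_dentry, its padding, and
fsdata_load_address() are hypothetical names invented for this example; the
layout is chosen to match the faulting offset and is not taken from the real
2.6.18 structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the structure being dereferenced. Only the
 * offset of the member matters; the padding size is an assumption picked
 * to match the 0x98 seen in CR2, not the real kernel layout.
 */
struct fake_dentry {
	char pad[0x98];
	void *d_fsdata;
};

/*
 * Address the CPU would touch when loading the d_fsdata member through a
 * pointer whose numeric value is `base`. No memory is accessed here; the
 * function only does the address arithmetic.
 */
uintptr_t fsdata_load_address(uintptr_t base)
{
	/* Sanity check that the toy layout puts the member where we claim. */
	assert(offsetof(struct fake_dentry, d_fsdata) == 0x98);
	return base + offsetof(struct fake_dentry, d_fsdata);
}
```

With base = 0 (a NULL pointer) the computed address is 0x98, matching the
"NULL pointer dereference at 0000000000000098" line in the oops.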

Comment 3 Eric Sandeen 2007-12-19 16:34:11 UTC
Ah... any chance you had a read-only mount underneath, or some other file open
failure?  As phro pointed out:

Index: ecryptfs-kernel-2.6.24-rc3/main.c
===================================================================
--- ecryptfs-kernel-2.6.24-rc3.orig/main.c
+++ ecryptfs-kernel-2.6.24-rc3/main.c
@@ -138,11 +138,14 @@ int ecryptfs_init_persistent_file(struct
 		inode_info->lower_file = dentry_open(lower_dentry,
 						     lower_mnt,
 						     (O_RDWR | O_LARGEFILE));
-		if (IS_ERR(inode_info->lower_file))
+		if (IS_ERR(inode_info->lower_file)) {
+			dget(lower_dentry);
+			mntget(lower_mnt);
 			inode_info->lower_file = dentry_open(lower_dentry,
 							     lower_mnt,
 							     (O_RDONLY
 							      | O_LARGEFILE));
+		}
 		if (IS_ERR(inode_info->lower_file)) {
 			printk(KERN_ERR "Error opening lower persistent file "
 			       "for lower_dentry [0x%p] and lower_mnt [0x%p]\n",

This might explain the -1 refcount.
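
To make the reasoning concrete, here is a minimal userspace model of how a
retried open can drive a count to -1. It assumes, as the patch implies, that a
failed open releases the references it was handed, so retrying without taking
them again leaves the count one short. struct toy_ref, toy_open() and
toy_lifecycle() are hypothetical names made up for this sketch; none of this is
kernel code, and real dentry counts have other holders as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reference counter standing in for dentry->d_count. */
struct toy_ref {
	int count;
};

/*
 * Models an open that consumes the caller's reference when it fails.
 * This is an assumption about the behavior the patch compensates for,
 * not a copy of the kernel's dentry_open().
 */
bool toy_open(struct toy_ref *ref, bool succeed)
{
	if (!succeed) {
		ref->count--;	/* failure path drops the reference */
		return false;
	}
	return true;		/* success: the open file now owns the reference */
}

/*
 * Read-write open fails, read-only retry succeeds, then the file is
 * closed. Returns the final count. `regrab_before_retry` selects the
 * patched behavior (take a fresh reference before retrying).
 */
int toy_lifecycle(bool regrab_before_retry)
{
	struct toy_ref ref = { .count = 1 };	/* reference held for the open */

	if (!toy_open(&ref, false)) {		/* O_RDWR attempt fails */
		if (regrab_before_retry)
			ref.count++;		/* role of dget()/mntget() in the patch */
		toy_open(&ref, true);		/* O_RDONLY retry succeeds */
	}
	ref.count--;				/* closing the file releases its reference */
	return ref.count;
}
```

In this model toy_lifecycle(false) ends at -1, the same value reported in the
"still in use" message, while toy_lifecycle(true) ends at 0.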

Comment 4 Mike Gahagan 2008-04-09 21:32:41 UTC
I'm not able to reproduce this so far using the -88 kernel.


Comment 6 Eric Sandeen 2008-08-04 20:01:43 UTC
We've never seen this again, and I think the bug addressed by the patch in comment #3 (which is now in both the RHEL 5.2 and upstream code) was the likely culprit.