This error exists upstream too, but I'm capturing it here. Test case, on an ecryptfs filesystem:
    # mknod foo c 0 0; rm -f foo
The device file gets created in the lower fs but cannot be opened; the same is true on any filesystem, where the open returns "No such device or address". Because of this, the lower persistent file cannot be opened. Apparently the refcounts then get out of sync after the error condition, because we eventually hit a BUG at:
    BUG_ON(!atomic_read(&lower_dentry->d_count));
if we try to remove the file, for example.
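The "unopenable on any filesystem" part of the test case can be checked on its own, away from ecryptfs. The following is a minimal sketch, assuming Linux and enough privilege to create device nodes; the helper name `open_null_device_node` is made up for illustration. It creates a character device node with major 0, minor 0 in a scratch directory and reports the errno from the failed open.

```python
import errno
import os
import stat
import tempfile


def open_null_device_node(directory):
    """Create a char device node (major 0, minor 0) and try to open it.

    Returns the errno of the failed open, 0 if the open unexpectedly
    succeeds, or None if the node could not be created at all (mknod of
    a device node normally needs root / CAP_MKNOD).
    """
    path = os.path.join(directory, "foo")
    try:
        os.mknod(path, stat.S_IFCHR | 0o600, os.makedev(0, 0))
    except OSError:
        return None
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return exc.errno
    else:
        os.close(fd)
        return 0
    finally:
        os.unlink(path)  # always remove the node again


with tempfile.TemporaryDirectory() as scratch:
    result = open_null_device_node(scratch)

if result is None:
    print("could not create the device node (not privileged?)")
elif result == errno.ENXIO:
    print("open failed with ENXIO:", os.strerror(result))
else:
    print("open returned errno", result)
```

Run this on an ordinary filesystem, not on an affected ecryptfs mount, since the final unlink is exactly the operation that triggers the BUG there. Containers or `nodev` mounts may report a different errno than ENXIO.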
Test case "mknod-foo" in RHTS. Might crash the system (but that's a general kernel-side issue, see bug 435115).
Hm, FWIW I'm not actually seeing a crash now, on 2.6.18-88.el5, though I do see the error message.
I'm able to panic the -88 xen kernel by running the RHTS mknod-foo test, modified so that it does not check the return code of mknod before umounting. After running the test twice I get this panic:
    test111.test.redhat.com login: ecryptfs_parse_options: eCryptfs: unrecognized option 'verbosity=0'
    ecryptfs_parse_options: You must supply at least one valid auth tok signature as a mount parameter; see the eCryptfs README
    Error parsing options; rc = [-22]
    ecryptfs_parse_options: eCryptfs: unrecognized option 'verbosity=0'
    padlock: VIA PadLock not detected.
    Error opening lower persistent file for lower_dentry [0xd32c8c40] and lower_mnt [0xc0e35cc0]
    ecryptfs_interpose: Error attempting to initialize the persistent file for the dentry with name [foo]; rc = [-6]
    ecryptfs_lookup: Error interposing
    ecryptfs_parse_options: eCryptfs: unrecognized option 'verbosity=0'
    ------------[ cut here ]------------
    kernel BUG at fs/ecryptfs/inode.c:299!
    invalid opcode: 0000 [#1] SMP
    last sysfs file: /fs/ecryptfs/version
    Modules linked in: aes_generic aes_i586 ecryptfs autofs4 hidp rfcomm l2cap bluetooth sunrpc xennet ipv6 xfrm_nalgo crypto_api dm_multipath parport_pc lp parport pcspkr dm_snapshot dm_zero dm_mirror dm_mod xenblk ext3 jbd uhci_hcd ohci_hcd ehci_hcd
    CPU: 0
    EIP: 0061:[<e143a0b8>] Not tainted VLI
    EFLAGS: 00010246 (2.6.18-88.el5xen #1)
    EIP is at ecryptfs_lookup+0x1d3/0x4a2 [ecryptfs]
    eax: 00000000 ebx: d2702888 ecx: c0eb8080 edx: d32c8c40
    esi: c0e35cc0 edi: d2109698 ebp: d201ecc0 esp: d7327e5c
    ds: 007b es: 007b ss: 0069
    Process mknod (pid: 3974, ti=d7327000 task=d3603aa0 task.ti=d7327000)
    Stack: 00000000 d7327eb0 c0469812 d32c8c40 d2a452b0 d32c8c40 c04816bb d2702f70
    dd0cb9a0 e144afe0 d2702888 d2702f70 d7327eb0 c0479b0a 00000000 d201ecc0
    d3604403 0024db2a d2702f70 d201ea40 c0479f39 0024db2a 00000003 d3604400
    Call Trace:
    [<c0469812>] kmem_cache_alloc+0x54/0x5e
    [<c04816bb>] d_alloc+0x14f/0x17d
    [<c0479b0a>] __lookup_hash+0xb1/0xe1
    [<c0479f39>] lookup_one_len+0x4a/0x58
    [<e143a01f>] ecryptfs_lookup+0x13a/0x4a2 [ecryptfs]
    [<c0469812>] kmem_cache_alloc+0x54/0x5e
    [<c04816bb>] d_alloc+0x14f/0x17d
    [<c0479b0a>] __lookup_hash+0xb1/0xe1
    [<c0479b80>] lookup_create+0x38/0x68
    [<c047b769>] sys_mknodat+0x69/0x164
    [<c060a37a>] do_page_fault+0x6de/0xbf1
    [<c0444349>] audit_syscall_entry+0x14b/0x17d
    [<c047b877>] sys_mknod+0x13/0x17
    [<c0405413>] syscall_call+0x7/0xb
    =======================
    Code: b2 0d 00 00 8b 44 24 1c 8b 54 24 20 8b 78 0c 8b 42 0c 8b 50 48 8b 40 44 89 55 48 89 45 44 8b 54 24 1c 83 c4 10 8b 02 85 c0 75 08 <0f> 0b 2b 01 72 15 44 e1 a1 80 c6 44 e1 ba d0 00 00 00 e8 ef f6
    EIP: [<e143a0b8>] ecryptfs_lookup+0x1d3/0x4a2 [ecryptfs] SS:ESP 0069:d7327e5c
    <0>Kernel panic - not syncing: Fatal exception
The error messages seen before the "cut here" are from the first run.
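The two return codes in the log, rc = [-22] and rc = [-6], are negated errno values. As a small aid for reading them, here is a sketch that maps such codes to their symbolic names using Python's standard `errno` module. The helper name `decode_rc` is made up for illustration, and the mapping assumes the script runs on Linux, where the numbering matches the kernel's.

```python
import errno
import os


def decode_rc(rc):
    """Map a negative kernel-style return code to (symbolic name, message)."""
    code = -rc
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


for rc in (-22, -6):
    name, message = decode_rc(rc)
    print(f"rc = [{rc}] -> {name}: {message}")
```

On Linux, -22 decodes to EINVAL (the rejected mount options) and -6 to ENXIO, which is the "No such device or address" failure from opening the device node.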
FWIW the current path to the oops seems to be:
    mount ecryptfs
    mknod foo c 0 0
    umount ecryptfs
    remount ecryptfs
    rm foo; rm foo (if 2nd is needed...) :)
-Eric
It's actually not the creation failure so much as the failure to open the file. Having failed to open, it goes down an error path that dputs the dentry, and the dentry's count drops to 0. The second time it tries to unlink (and therefore to open the lower file), it finds the dentry with a 0 refcount and hits the explicit BUG().
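To make that sequence concrete, here is a toy model in Python. It is only an illustration of the pattern described in this comment, not the ecryptfs code: the `Dentry` class, the `cache` dict and the `lookup` function are all invented for the sketch, and real dentry lifetime rules are considerably more involved.

```python
ENXIO = 6  # "No such device or address"; matches rc = [-6] in the log


class BugOn(Exception):
    """Stands in for the kernel's BUG_ON() firing."""


class Dentry:
    def __init__(self, name):
        self.name = name
        self.d_count = 1  # one reference held while the dentry is cached


def dput(dentry):
    """Drop one reference."""
    dentry.d_count -= 1


def lookup(cache, name, can_open):
    """Toy lookup that opens the lower file and mishandles the error path."""
    dentry = cache.setdefault(name, Dentry(name))
    if dentry.d_count == 0:
        # Mirrors BUG_ON(!atomic_read(&lower_dentry->d_count))
        raise BugOn(f"lower dentry {name!r} has refcount 0")
    if not can_open:
        # Error path: the reference is dropped, but the stale dentry is
        # still reachable, so the next lookup finds it with count 0.
        dput(dentry)
        return -ENXIO
    return 0


cache = {}
print("first lookup:", lookup(cache, "foo", can_open=False))
try:
    lookup(cache, "foo", can_open=False)  # e.g. the later rm
except BugOn as exc:
    print("BUG:", exc)
```

The first call fails the open and leaves the cached entry at count 0; the second call trips the check before it gets as far as opening anything, which is the shape of the failure described above.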
So Eric and I spent the day sorting out what was what in our local ecryptfs trees, found out we weren't actually running the same thing, thus some of the differences in behaviour we were seeing... After that, Eric got things finally not oopsing and/or panicking, at least not on this attack vector... :) Patches sent off to Mike for sanity-checking, but they do indeed help things out a lot here.
So, the following 2 patches submitted upstream seem to resolve the issue. Unfortunately they don't yet seem to be in -mm or elsewhere. I'll try to give them a nudge, because this is a simple way to oops the kernel w/ ecryptfs. :(
    [PATCH] eCryptfs: Do not try to open device files on mknod
    [PATCH] eCryptfs: Make all persistent file opens delayed
-Eric
Actually the patches in comment #9 are upstream, I was just missing a refresh :)
in kernel-2.6.18-115.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-0225.html