Bug 804345
Summary: | init_module sometimes fails with "Bad message" (EBADMSG) | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Tom <thomasbelvin> | ||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | high | ||||||
Version: | rawhide | CC: | agk, bmarzins, dhowells, dwysocha, extras-orphan, gansalmon, heinzm, itamar, jonathan, kernel-maint, madhu.chinakonda, mbroz, prockai, rjones | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2012-03-28 13:33:43 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Tom
2012-03-18 00:13:54 UTC
That error "crypt: unknown target type" means you do not have the dm-crypt kernel module available. If this is during boot, check your initrd, dracut etc. If later, check where the module is and why it's not loading (lsmod, modprobe, /usr issue maybe etc.) I see that I have /initrd/lib/modules/3.3.0-0.rc7.git2.1.fc18.x86_64/kernel/drivers/md/dm-crypt.ko in both kernels, but it isn't loading for this one during boot. I tried recreating initramfs with dracut, but it didn't help. I can unlock drive normally again with the next kernel-3.3.0-0.rc7.git2.3.fc17 *** Bug 804365 has been marked as a duplicate of this bug. *** According to comment #3, it works with kernel-3.3.0-0.rc7.git2.3.fc17/fc18 again? Seems like a glitch in the kernel build then. I have kicked off another libguestfs build against the git2.3 kernel, so we'll see if this is resolved. (In reply to comment #6) > I have kicked off another libguestfs build against the > git2.3 kernel, so we'll see if this is resolved. ... or at least, I would do if the git2.3 kernel was available in F18, which it isn't. It seems like comment 3 "fixed" things just because Tom tried a F17 kernel, not an F18 kernel. I disabled LUKS tests in libguestfs and rebuilt, and it failed again with a strange module loading bug: md: personality for level 5 is not loaded! mdadm: RUN_ARRAY failed: Invalid argument Since this is the same git2.1 kernel, I wonder if there is a generic module loading bug? This was all working 2 days ago ... Full log (huge) is here: http://koji.fedoraproject.org/koji/getfile?taskID=3905826&name=build.log Something is broken in this kernel build... # modprobe dm-crypt ERROR: could not insert 'dm_crypt': Bad message strace insmod dm-crypt.ko ... stat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 stat("/lib/modules/3.3.0-0.rc7.git2.1.fc18.x86_64/kernel/drivers/md", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 stat("/lib/modules/3.3.0-0.rc7.git2.1.fc18.x86_64/kernel/drivers/md/./dm-crypt.ko", {st_mode=S_IFREG|0744, st_size=34864, ...}) = 0 open("/lib/modules/3.3.0-0.rc7.git2.1.fc18.x86_64/kernel/drivers/md/./dm-crypt.ko", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1", 6) = 6 lseek(3, 0, SEEK_SET) = 0 fstat(3, {st_mode=S_IFREG|0744, st_size=34864, ...}) = 0 mmap(NULL, 34864, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fdd4584a000 init_module(0x7fdd4584a000, 34864, "") = -1 EBADMSG (Bad message) munmap(0x7fdd4584a000, 34864) = 0 close(3) = 0 write(2, "Error: could not insert module ."..., 58Error: could not insert module ./dm-crypt.ko: Bad message ) = 58 exit_group(1) = ? And kernel-3.3.0-2.fc18.x86_64 works again... Are you sure? libguestfs builds fail, different module, but same error message: [ 0.000000] Linux version 3.3.0-2.fc18.x86_64 (mockbuild.fedoraproject.org) (gcc version 4.7.0 20120308 (Red Hat 4.7.0-0.19) (GCC) ) #1 SMP Mon Mar 19 03:14:14 UTC 2012 [...] mount -o /dev/vda2 /sysroot/ ERROR: could not insert 'fuse': Bad message ntfs-3g-mount: fuse device is missing, try 'modprobe fuse' as root guestfsd: error: /dev/vda2 on /: ERROR: could not insert 'fuse': Bad message ntfs-3g-mount: fuse device is missing, try 'modprobe fuse' as root (Note that ntfs-3g worked a few days ago) Full log is here: http://koji.fedoraproject.org/koji/getfile?taskID=3909368&name=build.log Rawhide is using signed modules now. EBADMSG is the error returned with there is a corrupted signature in the module. I get the same results for fuse.ko on an actual machine so it seems occasionally either something is corrupting the signature on random modules, or stripping portions out, etc. [jwboyer@vader fuse]$ uname -a Linux vader 3.3.0-2.fc18.x86_64 #1 SMP Mon Mar 19 03:14:14 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux [jwboyer@vader fuse]$ sudo modprobe fuse ERROR: could not insert 'fuse': Bad message [jwboyer@vader fuse]$ David, any ideas on how to track this down? Whenever someone mentions "something" "random" "corrupting" and "stripping" in the same sentence, my mind turns to prelink. (In reply to comment #13) > Whenever someone mentions "something" "random" "corrupting" > and "stripping" in the same sentence, my mind turns to prelink. prelink doesn't touch kernel modules AFAIK. They don't link against anything. prelink has nothing to do with it here, I can reproduce it on my local build (on different module though) and prelink is not even installed on this machine. Is there anyone who know how to debug it except adding some printk in kernel? It is not good idea to switch on module verification and keep it broken for everyone. You can ask for debugging information by doing: echo 1 >/sys/module/kernel/parameters/modsign_debug It is possible to get more debugging information, but you have to alter and recompile your kernel. In kernel/module-verify-sig.c there's a line: #undef MODSIGN_DEBUG Switch that to #define and it'll produce checksums. This is debug for some wrong module: [ 3388.252317] looking for sig section '.note.module.sig' [ 3388.252323] sig in section 27 (size 286) [ 3388.252328] ffffff89011b0400010800 [ 3388.252357] Trying parser 'pgp' [ 3388.252371] Parser recognised the format (ret -131936347604992) [ 3388.252377] 0000003d 56 digested the .altinstr_replacement section, size 12 [ 3388.252386] 000000b1 88 digested the .altinstructions section, size 72 [ 3388.252391] 000000d1 97 digested the .bss section, size 4356 [ 3388.252396] 000000f2 d0 digested the .data section, size 0 [ 3388.252414] 00000387 60 digested the .gnu.linkonce.this_module section, size 608 [ 3388.252428] 00000581 55 digested the .init.text section, size 468 [ 3388.252435] 0000061f 2c digested the .modinfo section, size 122 [ 3388.252440] 00000671 ae digested the .note.gnu.build-id section, size 36 [ 3388.252450] 0000075a 76 digested the .parainstructions section, size 188 [ 3388.252463] 000008f7 5a digested the .rela.altinstructions section, nrels 9 [ 3388.252472] 00000985 8e digested the .rela.gnu.linkonce.this_module section, nrels 2 [ 3388.252496] 00000c84 05 digested the .rela.init.text section, nrels 16 [ 3388.252512] 00000e9a 11 digested the .rela.parainstructions section, nrels 12 [ 3388.252539] 00001286 06 digested the .rela.rodata section, nrels 24 [ 3388.252614] 00001dc5 81 digested the .rela.text section, nrels 60 [ 3388.252628] 00001f83 bc digested the .rela__ksymtab section, nrels 10 [ 3388.252638] 000020a5 b6 digested the .rela__ksymtab_gpl section, nrels 6 [ 3388.253917] 000125b8 75 digested the .rodata section, size 66800 [ 3388.253925] 0001262e 0b digested the .rodata.str1.1 section, size 76 [ 3388.253934] 00012730 d7 digested the .rodata.str1.8 section, size 216 [ 3388.254042] 00013ca9 f1 digested the .text section, size 5464 [ 3388.254049] 00013d1e 1c digested the __ksymtab section, size 80 [ 3388.254055] 00013d77 9e digested the __ksymtab_gpl section, size 48 [ 3388.254063] 00013e19 5d digested the __ksymtab_strings section, size 117 [ 3388.254066] Contributed 81433 bytes to the digest (csum 0xa0) [ 3388.254077] verify-sig : -74 The interesting difference with working module is: sig in section 25 (size 287) vs sig in section 27 (size 286) sig_size = note->n_descsz; _debug("sig in section %d (size %d)\n", mvdata->sig_index, sig_size); So why is the signature length -1 here? Is it normal? Ah... this is the Fedora rawhide kernel, not RHEL? Yes, this is rawhide. Someone had the idea to enable it there :-) # ls -l dm-raid.ko.digest.sig -rw-r--r-- 1 root root 287 Mar 20 14:29 dm-raid.ko.digest.sig # modprobe dm-raid ERROR: could not insert 'dm_raid': Bad message # dmesg -c [ 8828.854588] looking for sig section '.note.module.sig' [ 8828.854595] sig in section 27 (size 286) And perhaps someone could patch kmod as well to print something more useful than "Bad message" :) Created attachment 571759 [details]
Proposed patch
Signature contains stored using MPI (multiprecise integers).
According to RFC 4880 sec 3.2, length of MPI is computed starting from most significant bit.
So length can be shorter than checked size in current code.
RFC 3447 sec 8.2.2 RSA size check must be relaxed to conform with shorter signatures if we want to use signatures produced by GPG.
See attached patch where I tried to fix it.
An example of such signature (note data size != 2048)
# gpg --list-packets test.sig
:signature packet: algo 1, keyid B63419B399B11B0E
version 4, created 1332279490, md5len 0, sigclass 0x00
digest algo 8, begin of digest fc 8f
hashed subpkt 2 len 4 (sig created 2012-03-20)
subpkt 16 len 8 (issuer key ID B63419B399B11B0E)
data: [2037 bits]
David, is the patch ok? If so, please can anyone rebuild rawhide kernel with the patch applied? building kernel-3.4.0-0.rc0.git1.2.fc18 with that patch (munged to have file headers. What happened there?) New kernel has fixed libguestfs builds. The patch is okay. This should be fixed in rawhide now. |