Bug 1195899
Summary: | Live boot sometimes fails with a NULL pointer dereference when mounting /run/initramfs/squashfs | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Adam Williamson <awilliam> |
Component: | kernel | Assignee: | Mikuláš Patočka <mpatocka> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Storage QE <storage-qe> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 22 | CC: | admiller, agk, bugzilla, gansalmon, harald, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, mfabian, mschmidt, msnitzer, robatino, samuel-rhbugs, satellitgo, walters |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2015-10-20 20:38:34 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Attachments: |
Description
Adam Williamson
2015-02-24 20:13:10 UTC
Created attachment 994828 [details]
Full journal from a successful boot with the same image (for comparison)
There are some suspicious messages from systemd just before the crash, but there are similar messages in a successful boot too. See the attached log from a successful boot with the same image. The attached journal shows some issues with the squashfs unit failing to mount but dm probably shouldn't be oopsing in any case. Feb 24 20:03:30 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 Feb 24 20:03:30 localhost kernel: IP: [<ffffffff8160735a>] dm_exception_store_set_chunk_size+0x6a/0x140 Feb 24 20:03:30 localhost kernel: PGD 0 Feb 24 20:03:30 localhost kernel: Oops: 0000 [#1] SMP Feb 24 20:03:30 localhost kernel: Modules linked in: isofs squashfs 8021q garp stp llc mrp 8139too virtio_scsi virtio_blk virtio_balloon qxl drm_kms_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel sym53c8xx 8139cp scsi_transport_spi virtio_pci virtio_ring virtio mii ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua sunrpc loop Feb 24 20:03:30 localhost kernel: CPU: 1 PID: 548 Comm: dmsetup Not tainted 4.0.0-0.rc1.git0.1.fc22.x86_64 #1 Feb 24 20:03:30 localhost kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Feb 24 20:03:30 localhost kernel: task: ffff88007fa9cf00 ti: ffff88007ea1c000 task.ti: ffff88007ea1c000 Feb 24 20:03:30 localhost kernel: RIP: 0010:[<ffffffff8160735a>] [<ffffffff8160735a>] dm_exception_store_set_chunk_size+0x6a/0x140 Feb 24 20:03:30 localhost kernel: RSP: 0018:ffff88007ea1fb08 EFLAGS: 00010246 Feb 24 20:03:30 localhost kernel: RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 Feb 24 20:03:30 localhost kernel: RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88007a6a7e00 Feb 24 20:03:30 localhost kernel: RBP: ffff88007ea1fb28 R08: 0000000000000000 R09: 0000000000000001 Feb 24 20:03:30 localhost kernel: R10: 000000000000000a R11: f000000000000000 R12: ffff88007faf9180 Feb 24 20:03:30 localhost kernel: R13: ffffc900044f6088 R14: ffff88007ea1fbc4 R15: ffff88007faf9180 Feb 24 20:03:30 localhost kernel: FS: 00007f4ced679800(0000) GS:ffff88007d700000(0000) knlGS:0000000000000000 Feb 24 20:03:30 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Feb 24 20:03:30 localhost kernel: CR2: 0000000000000098 CR3: 000000007ea2d000 CR4: 00000000000007e0 Feb 24 20:03:30 localhost kernel: Stack: Feb 24 20:03:30 localhost kernel: 0000000000000008 ffffc900044f6040 ffff88007a6a7e00 ffff88007faf9550 Feb 24 20:03:30 localhost kernel: ffff88007ea1fb98 ffffffff816075ef 0000000000000168 000000007ac01600 Feb 24 20:03:30 localhost kernel: ffffffff81ccbb20 ffff88007a6a7f30 000000087faf9258 00000000fef9ff77 Feb 24 20:03:30 localhost kernel: Call Trace: Feb 24 20:03:30 localhost kernel: [<ffffffff816075ef>] dm_exception_store_create+0x1bf/0x230 Feb 24 20:03:30 localhost kernel: [<ffffffff816052fa>] snapshot_ctr+0x12a/0x650 Feb 24 20:03:30 localhost kernel: [<ffffffff815f9448>] ? dm_split_args+0x68/0x170 Feb 24 20:03:30 localhost kernel: [<ffffffff815f96ab>] dm_table_add_target+0x15b/0x3b0 Feb 24 20:03:30 localhost kernel: [<ffffffff815f9207>] ? dm_table_create+0x87/0x140 Feb 24 20:03:30 localhost kernel: [<ffffffff815fceab>] table_load+0x14b/0x370 Feb 24 20:03:30 localhost kernel: [<ffffffff815fcd60>] ? retrieve_status+0x1c0/0x1c0 Feb 24 20:03:30 localhost kernel: [<ffffffff815fdb52>] ctl_ioctl+0x232/0x520 Feb 24 20:03:30 localhost kernel: [<ffffffff81317818>] ? SYSC_semtimedop+0x2c8/0xea0 Feb 24 20:03:30 localhost kernel: [<ffffffff815fde53>] dm_ctl_ioctl+0x13/0x20 Feb 24 20:03:30 localhost kernel: [<ffffffff81235056>] do_vfs_ioctl+0x2c6/0x4d0 Feb 24 20:03:30 localhost kernel: [<ffffffff812352e1>] SyS_ioctl+0x81/0xa0 Feb 24 20:03:30 localhost kernel: [<ffffffff81786029>] system_call_fastpath+0x12/0x17 Feb 24 20:03:30 localhost kernel: Code: 8b 80 98 00 00 00 48 8b 80 78 03 00 00 48 85 c0 74 0c 0f b7 80 28 06 00 00 66 85 c0 75 73 49 8b 7c 24 08 e8 59 cf ff ff 48 8b 00 <48> 8b 80 98 00 00 00 48 8b 80 78 03 00 00 48 85 c0 74 0c 0f b7 Feb 24 20:03:30 localhost kernel: RIP [<ffffffff8160735a>] dm_exception_store_set_chunk_size+0x6a/0x140 Feb 24 20:03:30 localhost kernel: RSP <ffff88007ea1fb08> Feb 24 20:03:30 localhost kernel: CR2: 0000000000000098 Feb 24 20:03:30 localhost kernel: ---[ end trace c82c3f5c1b456124 ]--- This is the only recent change to dm-snapshot (which went in during the 4.0 merge window): http://git.kernel.org/linus/22aa66a3ee5 Not clear why/if it'd pose a problem on snapshot_ctr's call to dm_exception_store_set_chunk_size though... Assigning to Mikulas to triage further. *** Bug 1196236 has been marked as a duplicate of this bug. *** It is caused by the bdev field of struct dm_dev being NULL. But I don't know what causes it. I couldn't find how device mapper could set this field to NULL. Maybe there is some general memory corruption in other kernel component and you need to bisect it. So, we're trying to pin it down a bit. We're doing a few more tests, but so far, it looks like the 02-18 nightly live images do not have this bug, while 02-21 nightly lives do. Unfortunately nightly generation failed on 02-19 and 02-20, so that's as tightly as we can pin it down. The 02-18 nightlies have kernel 3.20.0-0.rc0.git7.3.fc22 . The 02-21 nightlies have 3.20.0-0.rc0.git10.1.fc22 . So that looks to be the delta so far as the kernel goes. Note, 02-18 nightly boots do *not* have these systemd messages: Feb 24 20:03:27 localhost systemd[1]: Reached target Basic System. Feb 24 20:03:27 localhost systemd[1]: Starting Basic System. Feb 24 20:03:30 localhost kernel: squashfs: version 4.0 (2009/01/31) Phillip Lougher Feb 24 20:03:30 localhost kernel: ISO 9660 Extensions: Microsoft Joliet Level 3 Feb 24 20:03:30 localhost kernel: ISO 9660 Extensions: RRIP_1991A Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too. Feb 24 20:03:30 localhost systemd[1]: Unmounting /run/initramfs/squashfs... Feb 24 20:03:30 localhost umount[526]: umount: /run/initramfs/squashfs: target is busy Feb 24 20:03:30 localhost umount[526]: (In some cases useful info about processes that Feb 24 20:03:30 localhost umount[526]: use the device is found by lsof(8) or fuser(1).) Feb 24 20:03:30 localhost systemd[1]: run-initramfs-squashfs.mount mount process exited, code=exited status=32 Feb 24 20:03:30 localhost systemd[1]: Failed unmounting /run/initramfs/squashfs. Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too. Feb 24 20:03:30 localhost systemd[1]: Unmounting /run/initramfs/squashfs... Feb 24 20:03:30 localhost umount[529]: umount: /run/initramfs/squashfs: target is busy Feb 24 20:03:30 localhost umount[529]: (In some cases useful info about processes that Feb 24 20:03:30 localhost umount[529]: use the device is found by lsof(8) or fuser(1).) Feb 24 20:03:30 localhost systemd[1]: run-initramfs-squashfs.mount mount process exited, code=exited status=32 Feb 24 20:03:30 localhost systemd[1]: Failed unmounting /run/initramfs/squashfs. Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too. Feb 24 20:03:30 localhost systemd[1]: Unmounting /run/initramfs/squashfs... Feb 24 20:03:30 localhost systemd[1]: Unmounted /run/initramfs/squashfs. Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount entered failed state. Feb 24 20:03:30 localhost systemd[1]: run-initramfs-squashfs.mount failed to run 'mount' task: No such file or directory Feb 24 20:03:30 localhost systemd[1]: Failed to mount /run/initramfs/squashfs. Feb 24 20:03:30 localhost systemd[1]: Mounting /run/initramfs/squashfs... The 02-18 journal around that point just looks like this: Feb 25 15:48:28 localhost systemd[1]: Starting Basic System. Feb 25 15:48:28 localhost systemd[1]: Reached target Basic System. Feb 25 15:48:28 localhost unknown[1]: <audit-1130> pid=1 uid=0 auid=4294967295 s Feb 25 15:48:30 localhost kernel: squashfs: version 4.0 (2009/01/31) Phillip Lou Feb 25 15:48:30 localhost kernel: ISO 9660 Extensions: Microsoft Joliet Level 3 Feb 25 15:48:30 localhost kernel: ISO 9660 Extensions: RRIP_1991A Feb 25 15:48:30 localhost systemd[1]: Started dracut initqueue hook. Feb 25 15:48:30 localhost kernel: audit_printk_skb: 6 callbacks suppressed Feb 25 15:48:30 localhost kernel: audit: type=1130 audit(1424897310.802:13): pid Feb 25 15:48:30 localhost unknown[1]: <audit-1130> pid=1 uid=0 auid=4294967295 s Feb 25 15:48:30 localhost systemd[1]: Starting Remote File Systems (Pre). Feb 25 15:48:30 localhost systemd[1]: Reached target Remote File Systems (Pre). so perhaps there's some kind of dracut or systemd issue involved here too. A dracut build landed on 02-19: http://koji.fedoraproject.org/koji/buildinfo?buildID=612988 which has two 'interesting' changelog entries: - fixed shutdown routine for dm devices - Include squashfs support for rd.live.fsimg We *also* went from systemd 218 to systemd 219 between 02-18 (the working 02-18 nightly has systemd-218-5, 02-21 has 219-4). systemd changelog: 219-4 - Revive fedora-disable-resolv.conf-symlink.patch to unbreak composes. 219-3 - arm: disabling gold did not help; disable lto instead (#1193212) 219-2 - Update 90-default.present for dbxtool. 219-1 - New upstream release - This removes the sysctl/bridge hack, a different solution needs to be found for this (see #634736) - This removes the /etc/resolv.conf hack, anaconda needs to fix their handling of /etc/resolv.conf as symlink - This enables "%check" - disable gold on arm, as that is broken (see #1193212) 218-6 - aarch64 now has seccomp support dracut looks like a more likely suspect than systemd to me, but obviously it'd be a good idea to look into both changes. CCing harald and michal - can you guys see if any of your changes could be related to this bug? Especially harald. It will be a proposed Alpha blocker so it's a priority issue. Proposing as an Alpha blocker, conditional violation of "All release-blocking images must boot in their supported configurations." Exactly how serious the bug is seems to vary from image to image and possibly system to system, but satellit says 02-25 nightly always fails to boot for him, and I found the 02-21 32-bit Workstation nightly always fails to boot for me in a KVM. 02-24 is more intermittent (sometimes works, sometimes fails). We'll see how badly TC5 is affected when it lands. Created attachment 995347 [details]
rd.debug log (up to switch root) from an 02-21 (buggy) boot
This is the log of a boot of the 02-21 32-bit Workstation nightly with 'rd.debug' added. Interestingly, this boot succeeded - the first time I've ever got that image to boot.
This should provide more detail about what's going on with those /run/initramfs/squashfs mount messages. I will attach an equivalent log from 02-18 next for comparison.
Created attachment 995348 [details]
rd.debug log (up to switch root) from an 02-18 (not buggy) boot
Created attachment 995351 [details]
rd.debug log (up to switch root) from an 02-18 (not buggy) boot
It's kernel memory corruption, so it can't be caused by dracut or systemd. I suggest that you disable kernel modules, until you find out which one caused that crash in dm-snapshot. Maybe it's bug in the squashfs kernel driver. Mikulas: sorry, I should clarify. The kernel crasher *per se* can't be caused by dracut or systemd, no. But our theory is that something has changed in dracut or systemd which is causing us to hit a codepath that triggers an underlying kernel bug, whereas we didn't before. As I mentioned above, there's a bunch of stuff happening to a mount - /run/initramfs/squashfs - in the 02-18 logs *right before the crash is triggered* which does not happen at all in the 02-21 logs. It'd be good to fix the kernel crasher in any case, clearly, but we suspect there are actually two bugs here: i) the kernel crasher ii) whatever changed in the whole 'init the live environment' area which leads to us hitting the kernel crasher and right now I'm trying to debug ii), not i). On that path, here's a systemd commit which looks rather like it might be relevant: http://cgit.freedesktop.org/systemd/systemd/commit/?id=06e97888883e2cc12eb6514e80c7f0014295f59b So, turns out Colin Walters had already done a systemd build with that commit reverted in order to fix *another* bug: http://koji.fedoraproject.org/koji/buildinfo?buildID=614715 https://bugzilla.redhat.com/show_bug.cgi?id=1195761 I just built a live image with that build of systemd, and confirmed that it does not hit any of the "Unit run-initramfs-squashfs.mount is bound to inactive unit" messages. So I believe that reversion also resolves this bug. Of course, upstream might not want to simply revert the change, presumably they'll want to find a way to keep the benefit without causing the breakage. So, I'm thinking we can keep this bug for the kernel crash - because presumably the kernel shouldn't crash even when systemd is trying to unmount then remount /run/initramfs/squashfs - but drop the blocker status, and I'll file a new bug on systemd for the systemd part of it. Mikulas: so for your purposes, the new information is that this seems to be triggered by systemd doing something it didn't do before - repeatedly trying to unmount /run/initramfs/squashfs , then try to mount it again. That may help you track it down, maybe? the underlying bug may have been in the kernel for some time - we can't say it only appeared in the *kernel* recently, as what changed to make us start seeing this bug was *systemd's* behaviour. Mikulas, I'm familiar at all with the dm code, but is there a possibility that the bdev pointer is freed out from under dm if the block device goes away? The live image seems to do all kinds of weird things with loop devices, so I'm curious if the destruction of a loopdev after dm code is called could cause the NULL pointer error. Looking in dm_get_device I see: } else { /* convert the path to a device */ struct block_device *bdev = lookup_bdev(path); if (IS_ERR(bdev)) return PTR_ERR(bdev); dev = bdev->bd_dev; bdput(bdev); } It gets bdev, and then assigns the local dev based on bdev->bd_dev. But then it does a bdput on bdev, which I think essentially frees the reference count dm had on that block device. Or is there another way that bdev is pinned and stored by dm? Ugh, my previous comment should have started: "I'm NOT (as in not at all) familiar with..." The problem is that there is no clear code path that could set dm_dev->bdev to NULL. dm_get_device calls dm_get_table_device and dm_get_table_device fills the dm_dev structure. dm_get_table_device performs td->dm_dev.bdev = NULL and then it calls open_table_device. Could open_table_device succeed and leave td->dm_dev.bdev NULL? If you look at the code, it couldn't. That would happen if blkdev_get_by_dev returned NULL, but it can't return NULL. That's why I assume that the bug is not in dm, but somewhere else in the kernel - something is probably corrupting random memory. I suggest: compile the kernel with debugging (CONFIG_DEBUG_PAGEALLOC, CONFIG_SLUB_DEBUG_ON and others) and hope that it catches the memory corruption sooner. Or - try removing modules from the initrd. Remove the module for squashfs (or replace it with a dummy that fails at the beginning of squashfs_fill_super without doing anything) and see if the crash goes away. If it won't help, try removing other modules from the initrd image and find the one that causes it. It seems unusual for a bug that involves 'corrupting random memory' to always cause the *exact same thing to get set to NULL*, though? Usually when there's a bug along those lines we get a variety of somewhat different backtraces, whereas in this case we seem to keep getting this same one. It looks like our OpenQA box actually hit this backtrace in a run *from the x86_64 network install image*, btw - I tried booting it in a VM several times myself and couldn't hit it there, though. It's certainly more common with live images. The 2015-02-21 i686 Workstation nightly has a debugging kernel (3.20.0-0.rc0.git10.1.fc22.i686), I tried booting that with _slub_debug=1 but it doesn't crash or show any warnings any earlier, still just hits the same trace. *** Bug 1197352 has been marked as a duplicate of this bug. *** This is still affecting Fedora 23 Atomic: Aug 07 16:27:26 localhost.localdomain systemd[1]: Unmounted /sysroot. Aug 07 16:27:25 localhost.localdomain systemd[1]: Unmounting /sysroot... Aug 07 16:27:25 localhost.localdomain systemd[1]: sysroot.mount: Unit is bound to inactive unit dev-mapper-atomicos\x2droot.device. Stopping, too. Aug 07 16:27:15 localhost ostree-prepare-root[541]: Resolved OSTree target to: /sysroot/ostree/deploy/fedora-atomic/deploy/d69a7e73c4ccd9aba218f408791e721e46e1924bc2d55884d333a13bf229424e.0 Aug 07 16:27:15 localhost ostree-prepare-root[541]: Examining /sysroot//ostree/boot.0/fedora-atomic/c9dd3ab6bbf7cfc1971ce2ec7b3a436653a1a73c638a03213b65095e46e36ddb/0 Aug 07 16:27:14 localhost systemd[1]: Mounted /sysroot. Aug 07 16:27:12 localhost systemd[1]: Mounting /sysroot... *********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs. Fedora 22 has now been rebased to 4.2.3-200.fc22. Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel. If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23. If you experience different issues, please open a new bug report for those. I think we actually fixed this one through a different bug. We bisected that bug back to 3.20 kernels and Mike and company were able to provide a fix. I'm closing this out. If we see it again, we can reopen or just open a new bug. |