Bug 1433899
| Summary: | Workstation Live panics during boot | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Kamil Páral <kparal> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 26 | CC: | awilliam, cz172638, gansalmon, gmarr, ichavero, itamar, jkurik, jonathan, jsedlak, kernel-maint, madhu.chinakonda, mchehab, mruckman, robatino, sumukher |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-23 21:14:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1349184 | | |
Attachments: |
|
Created attachment 1264737 [details]
another panic screenshot, maybe better
Both I and jsedlak reproduced this (F25 host). However, after trying a few times, it started working (the mediacheck is performed and completed fine, image boots) and we can't reproduce this anymore, even when trying many times. So perhaps this is a race condition?

Proposing as a conditional blocker under:
"All release-blocking images must boot in their supported configurations."
https://fedoraproject.org/wiki/Fedora_26_Alpha_Release_Criteria#Release-blocking_images_must_boot

Let's see how many people can reproduce this, and how often (please try multiple times).

So, after a few minutes, jsedlak reproduced this again. Also, this sometimes seems to happen before mediacheck is started, and sometimes after it reaches 100%.

I was able to reproduce it using serial console. The VM running this has two CPU cores. This is its output:

[jsedlak@dhcp-28-124 ~]$ sudo virsh console fedora25
Connected to domain fedora25
Escape character is ^]
[ 3.380815] dracut-pre-udev[364]: rpcbind: /run/rpcbind/rpcbind.lock: No such file or directory
[ 3.742400] general protection fault: 0000 [#1] SMP
[ 3.743016] Modules linked in: garp stp llc mrp virtio_blk virtio_net virtio_console crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg sunrpc scsi_transport_iscsi loop
[ 3.744636] CPU: 1 PID: 21 Comm: rcuos/1 Not tainted 4.11.0-0.rc2.git2.2.fc26.x86_64 #1
[ 3.745218] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[ 3.745839] task: ffff9fe97a14a580 task.stack: ffffc1aac03c0000
[ 3.746307] RIP: 0010:rcu_nocb_kthread+0x15d/0x500
[ 3.746692] RSP: 0018:ffffc1aac03c3e78 EFLAGS: 00010282
[ 3.747088] RAX: ff0074757074756f RBX: ffff9fe97d51a3c0 RCX: ffff9fe97a14a580
[ 3.747623] RDX: 0000000080000000 RSI: 0000000000000200 RDI: ffff9fe97ed0c000
[ 3.748161] RBP: ffffc1aac03c3ef8 R08: ffff9fe97ed0c600 R09: 000000018010000d
[ 3.748702] R10: fffff43041fdec40 R11: 0000000000003d00 R12: 000000000000007c
[ 3.749219] R13: 000000000000007c R14: ffff9fe97ed0c000 R15: 2d316f6974726976
[ 3.749740] FS: 0000000000000000(0000) GS:ffff9fe97d500000(0000) knlGS:0000000000000000
[ 3.750314] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.750738] CR2: 000056183737d2f8 CR3: 000000007f9cc000 CR4: 00000000003406e0
[ 3.751433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3.752551] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3.753220] Call Trace:
[ 3.753517] ? get_state_synchronize_sched+0x20/0x20
[ 3.753974] kthread+0x11e/0x140
[ 3.754196] ? kthread_park+0x90/0x90
[ 3.754453] ret_from_fork+0x2c/0x40
[ 3.754803] Code: 01 00 00 00 e8 85 63 75 00 4d 8b 3e 4d 85 ff 74 ee 65 81 05 b2 a9 ef 47 00 02 00 00 49 8b 46 08 4c 89 f7 48 3d ff 0f 00 00 76 2e <ff> d0 be 00 02 00 00 48 c7 c7 ef 28 11 b8 45 8d 6c 24 01 e8 0b
[ 3.757637] RIP: rcu_nocb_kthread+0x15d/0x500 RSP: ffffc1aac03c3e78
[ 3.758500] ---[ end trace 68270651d3d36818 ]---
[ 3.759157] Kernel panic - not syncing: Fatal exception in interrupt
[ 3.760101] Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 3.761565] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

I have the same problem while booting Fedora 26 Alpha 1.1 in Virtual Machine Manager.

Petr Schindler has hit this as well. Even though it seems like a race, it's clearly very common. https://fedoraproject.org/wiki/Fedora_26_Final_Release_Criteria#Media_consistency_verification seems like the most relevant criterion here, and is for Final.

This didn't happen on my latest bare metal installation.

Discussed during the 2017-03-20 blocker review meeting: [1]

The decision was made to classify this bug as an AcceptedBlocker (Final) as it violates the following criteria:
"Validation of install media must work correctly for all release-blocking images."
[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2017-03-20/f26-blocker-review.2017-03-20-16.06.txt

See bug 1434462 comment 11, which was supposed to be present here (Stephen talking about booting Live).

Re-proposing for Alpha, it's happening even without mediacheck. My suspicion is that this is the same bug as bug 1434462.

Does it happen only in VM? If so, we might use the same criterion as in https://bugzilla.redhat.com/show_bug.cgi?id=1434462#c14 and leave this blocker for Beta, instead of blocking Alpha.

It's probably the same bug, as Kamil said. I was expecting us to wind up marking them as dupes. I also suspect https://bugzilla.redhat.com/show_bug.cgi?id=1430297 is the same bug, and they're all the same as https://bugzilla.kernel.org/show_bug.cgi?id=194911 .

As 1430297 is the earliest report, and we're fairly sure these are all the same problem, marking as a dupe of that. A kernel build with a potential fix is currently running, we will ask all affected people to test with that build once it's done. We can un-dupe reports later if there turn out to be separate bugs.

*** This bug has been marked as a duplicate of bug 1430297 ***

This got fixed before Alpha, so doesn't need commonbugs.
Created attachment 1264736 [details]
kernel panic screenshot

Description of problem:
If media test is attempted, Workstation Live panics on boot (even before mediacheck is started). If I don't attempt media test, the image boots fine.

Version-Release number of selected component (if applicable):
Fedora-Workstation-Live-x86_64-26_Alpha-1.1.iso
dracut-044-177.fc26.x86_64
kernel-4.11.0-0.rc2.git2.2.fc26.x86_64

How reproducible:
always

Steps to Reproduce:
1. use a default virt-manager VM
2. mount Workstation Live Alpha RC1.1 and try to boot it with media check performed
3. immediate kernel panic
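
A side note on reading the serial console output quoted in the comments: two of the register values in the oops (RAX and R15) consist mostly of printable ASCII bytes, which is unusual for pointers. The Python sketch below is not part of the report. It only decodes register values copied from that log as little-endian byte strings. The function name `decode_ascii_registers` and the `min_printable` threshold are made up for illustration. Whether the decoded text says anything about the root cause is an assumption that this report does not confirm; the actual analysis lives in bug 1430297 and the upstream kernel bug.

```python
import re

# Register lines copied from the serial console output in this report.
OOPS_REGISTERS = """
RAX: ff0074757074756f RBX: ffff9fe97d51a3c0 RCX: ffff9fe97a14a580
R13: 000000000000007c R14: ffff9fe97ed0c000 R15: 2d316f6974726976
"""


def decode_ascii_registers(text, min_printable=6):
    """Return {register: text} for 64-bit values that are mostly printable ASCII.

    Hypothetical helper for illustration: min_printable is an arbitrary
    threshold chosen so that ordinary kernel pointers are not reported.
    """
    found = {}
    for name, value in re.findall(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b", text):
        # x86-64 is little-endian, so reverse the bytes to get memory order.
        raw = bytes.fromhex(value)[::-1]
        printable = [b for b in raw if 0x20 <= b < 0x7F]
        if len(printable) >= min_printable:
            # Non-printable bytes are shown as '.' to keep the width at 8.
            found[name] = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)
    return found


if __name__ == "__main__":
    for reg, decoded in decode_ascii_registers(OOPS_REGISTERS).items():
        print(reg, repr(decoded))
```

For the values above it reports RAX as 'output..' and R15 as 'virtio1-'. A faulting call target that looks like string data would usually hint at memory having been overwritten or reused, but that reading is a guess and is not something the commenters state.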