Bug 1433899 - Workstation Live panics during boot
Summary: Workstation Live panics during boot
Keywords:
Status: CLOSED DUPLICATE of bug 1430297
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F26AlphaBlocker
Reported: 2017-03-20 10:46 UTC by Kamil Páral
Modified: 2019-01-09 12:54 UTC
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-23 21:14:21 UTC
Type: Bug


Attachments
kernel panic screenshot (15.18 KB, image/png)
2017-03-20 10:46 UTC, Kamil Páral
no flags
another panic screenshot, maybe better (15.93 KB, image/png)
2017-03-20 10:50 UTC, Kamil Páral
no flags


Links
System            ID       Priority  Status  Summary  Last Updated
Red Hat Bugzilla  1430297  None      None    None     Never
Red Hat Bugzilla  1434462  None      None    None     Never

Internal Links: 1430297 1434462

Description Kamil Páral 2017-03-20 10:46:14 UTC
Created attachment 1264736
kernel panic screenshot

Description of problem:
If the media test is attempted, Workstation Live panics on boot (even before mediacheck is started). If I don't attempt the media test, the image boots fine.

Version-Release number of selected component (if applicable):
Fedora-Workstation-Live-x86_64-26_Alpha-1.1.iso
dracut-044-177.fc26.x86_64
kernel-4.11.0-0.rc2.git2.2.fc26.x86_64

How reproducible:
always

Steps to Reproduce:
1. Use a default virt-manager VM.
2. Attach the Workstation Live Alpha RC1.1 image and try to boot it with the media check performed.
3. The kernel panics immediately.

Comment 1 Kamil Páral 2017-03-20 10:50:52 UTC
Created attachment 1264737
another panic screenshot, maybe better

Comment 2 Kamil Páral 2017-03-20 10:53:03 UTC
Both jsedlak and I reproduced this (F25 host). However, after a few attempts it started working (the mediacheck ran and completed fine, and the image booted), and we can't reproduce it anymore, even after many tries. So perhaps this is a race condition?

Comment 3 Kamil Páral 2017-03-20 10:55:27 UTC
Proposing as a conditional blocker under:
"All release-blocking images must boot in their supported configurations."
https://fedoraproject.org/wiki/Fedora_26_Alpha_Release_Criteria#Release-blocking_images_must_boot

Let's see how many people can reproduce this, and how often (please try multiple times).

Comment 4 Kamil Páral 2017-03-20 11:07:41 UTC
So, after a few minutes, jsedlak reproduced this again. Also, the panic sometimes seems to happen before mediacheck starts, and sometimes after it reaches 100%.

Comment 5 Jan Sedlák 2017-03-20 11:12:52 UTC
I was able to reproduce it using a serial console. The VM running this has two CPU cores. This is its output:

[jsedlak@dhcp-28-124 ~]$ sudo virsh console fedora25
Connected to domain fedora25
Escape character is ^]
[    3.380815] dracut-pre-udev[364]: rpcbind: /run/rpcbind/rpcbind.lock: No such file or directory
[    3.742400] general protection fault: 0000 [#1] SMP
[    3.743016] Modules linked in: garp stp llc mrp virtio_blk virtio_net virtio_console crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg sunrpc scsi_transport_iscsi loop
[    3.744636] CPU: 1 PID: 21 Comm: rcuos/1 Not tainted 4.11.0-0.rc2.git2.2.fc26.x86_64 #1
[    3.745218] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[    3.745839] task: ffff9fe97a14a580 task.stack: ffffc1aac03c0000
[    3.746307] RIP: 0010:rcu_nocb_kthread+0x15d/0x500
[    3.746692] RSP: 0018:ffffc1aac03c3e78 EFLAGS: 00010282
[    3.747088] RAX: ff0074757074756f RBX: ffff9fe97d51a3c0 RCX: ffff9fe97a14a580
[    3.747623] RDX: 0000000080000000 RSI: 0000000000000200 RDI: ffff9fe97ed0c000
[    3.748161] RBP: ffffc1aac03c3ef8 R08: ffff9fe97ed0c600 R09: 000000018010000d
[    3.748702] R10: fffff43041fdec40 R11: 0000000000003d00 R12: 000000000000007c
[    3.749219] R13: 000000000000007c R14: ffff9fe97ed0c000 R15: 2d316f6974726976
[    3.749740] FS:  0000000000000000(0000) GS:ffff9fe97d500000(0000) knlGS:0000000000000000
[    3.750314] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.750738] CR2: 000056183737d2f8 CR3: 000000007f9cc000 CR4: 00000000003406e0
[    3.751433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    3.752551] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    3.753220] Call Trace:
[    3.753517]  ? get_state_synchronize_sched+0x20/0x20
[    3.753974]  kthread+0x11e/0x140
[    3.754196]  ? kthread_park+0x90/0x90
[    3.754453]  ret_from_fork+0x2c/0x40
[    3.754803] Code: 01 00 00 00 e8 85 63 75 00 4d 8b 3e 4d 85 ff 74 ee 65 81 05 b2 a9 ef 47 00 02 00 00 49 8b 46 08 4c 89 f7 48 3d ff 0f 00 00 76 2e <ff> d0 be 00 02 00 00 48 c7 c7 ef 28 11 b8 45 8d 6c 24 01 e8 0b 
[    3.757637] RIP: rcu_nocb_kthread+0x15d/0x500 RSP: ffffc1aac03c3e78
[    3.758500] ---[ end trace 68270651d3d36818 ]---
[    3.759157] Kernel panic - not syncing: Fatal exception in interrupt
[    3.760101] Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    3.761565] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
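
One detail in the register dump above is worth a closer look. The faulting instruction bytes (`<ff> d0`) are an indirect call through RAX, and the values in RAX and R15 look like text rather than kernel pointers. The sketch below decodes both registers as little-endian bytes and checks whether RAX is a canonical x86_64 address, assuming 48-bit virtual addressing. The helper names `reg_to_ascii` and `is_canonical` are illustrative and do not come from any existing tool.

```python
import struct


def reg_to_ascii(hex_value):
    """Render a 64-bit register value as the bytes it would occupy in memory.

    x86_64 is little-endian, so the lowest byte of the register comes first.
    Non-printable bytes are shown as '.'.
    """
    raw = struct.pack("<Q", int(hex_value, 16))
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def is_canonical(hex_value):
    """Check canonical form, assuming 48-bit virtual addresses.

    Bits 63..47 must be all zeros or all ones.
    """
    top = int(hex_value, 16) >> 47
    return top == 0 or top == 0x1FFFF


# Register values copied from the oops above.
registers = {"RAX": "ff0074757074756f", "R15": "2d316f6974726976"}

for name, value in registers.items():
    print(name, repr(reg_to_ascii(value)), "canonical:", is_canonical(value))
```

Under those assumptions, R15 decodes to `virtio1-` and RAX to `output` followed by two non-printable bytes, and RAX is not canonical, which is consistent with a general protection fault on the call. This hints that the structure holding the callback pointer may have been overwritten with string data, but it does not by itself identify the culprit.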

Comment 6 sumantro 2017-03-20 12:06:18 UTC
I have the same problem when booting Fedora 26 Alpha 1.1 in Virtual Machine Manager.

Comment 7 Kamil Páral 2017-03-20 14:15:28 UTC
Petr Schindler has hit this as well. Even though it seems like a race, it's clearly very common.

Comment 8 Adam Williamson 2017-03-20 15:33:05 UTC
https://fedoraproject.org/wiki/Fedora_26_Final_Release_Criteria#Media_consistency_verification seems like the most relevant criterion here, and is for Final.

Comment 9 Mike Ruckman 2017-03-20 16:19:44 UTC
This didn't happen on my latest bare metal installation.

Comment 10 Geoffrey Marr 2017-03-20 21:00:21 UTC
Discussed during the 2017-03-20 blocker review meeting: [1]

The decision was made to classify this bug as an AcceptedBlocker (Final) as it violates the following criteria:

"Validation of install media must work correctly for all release-blocking images."

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2017-03-20/f26-blocker-review.2017-03-20-16.06.txt

Comment 11 Kamil Páral 2017-03-22 13:15:16 UTC
See bug 1434462 comment 11, which was supposed to be posted here (Stephen talking about booting Live). Re-proposing for Alpha, since it's happening even without mediacheck. My suspicion is that this is the same bug as bug 1434462.

Comment 12 Jan Kurik 2017-03-23 13:23:36 UTC
Does it happen only in a VM? If so, we might use the same criterion as in https://bugzilla.redhat.com/show_bug.cgi?id=1434462#c14 and leave this blocker for Beta, instead of blocking Alpha.

Comment 13 Adam Williamson 2017-03-23 15:38:31 UTC
It's probably the same bug, as Kamil said. I was expecting us to wind up marking them as dupes.

Comment 14 Adam Williamson 2017-03-23 17:13:04 UTC
I also suspect https://bugzilla.redhat.com/show_bug.cgi?id=1430297 is the same bug, and they're all the same as https://bugzilla.kernel.org/show_bug.cgi?id=194911 .
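
When judging whether several reports describe the same crash, one rough heuristic is to compare the faulting symbol from each oops. What follows is a minimal sketch, assuming the log format shown in comment 5. `oops_signature` is a hypothetical helper, and a matching signature is only a hint, not proof of a common root cause.

```python
import re

# Matches e.g. "RIP: 0010:rcu_nocb_kthread+0x15d/0x500".
# The "0010:" code-segment prefix is optional.
RIP_RE = re.compile(
    r"RIP: (?:[0-9a-f]{4}:)?(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/0x[0-9a-f]+"
)
PANIC_RE = re.compile(r"Kernel panic - not syncing: (?P<reason>.+)")


def oops_signature(log):
    """Return the faulting symbol, offset and panic reason, or None if no RIP line."""
    rip = RIP_RE.search(log)
    if rip is None:
        return None
    panic = PANIC_RE.search(log)
    return {
        "symbol": rip.group("symbol"),
        "offset": rip.group("offset"),
        "panic": panic.group("reason").strip() if panic else None,
    }


# Lines copied from the serial console output in comment 5.
sample = """\
[    3.742400] general protection fault: 0000 [#1] SMP
[    3.746307] RIP: 0010:rcu_nocb_kthread+0x15d/0x500
[    3.759157] Kernel panic - not syncing: Fatal exception in interrupt
"""

print(oops_signature(sample))
```

Offsets change between kernel builds, so when comparing reports from different kernels the symbol name is the more useful field.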

Comment 15 Adam Williamson 2017-03-23 21:14:21 UTC
As 1430297 is the earliest report, and we're fairly sure these are all the same problem, marking as a dupe of that. A kernel build with a potential fix is currently running; we will ask all affected people to test with that build once it's done. We can un-dupe reports later if there turn out to be separate bugs.

*** This bug has been marked as a duplicate of bug 1430297 ***

Comment 16 Adam Williamson 2017-04-04 19:18:35 UTC
This got fixed before Alpha, so it doesn't need commonbugs.

