Bug 1176087
| Summary: | [6.6-3.5]kernel panic occurred when boot hypervisor from UEFI machine | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | cshao <cshao> |
| Component: | ovirt-node | Assignee: | Fabian Deutsch <fdeutsch> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 3.5.0 | CC: | cshao, ecohen, fdeutsch, gklein, hadong, huiwa, iheim, lsurette, msnitzer, rbarry, yaniwang, ycui |
| Target Milestone: | --- | Keywords: | Regression, TestBlocker |
| Target Release: | 3.5.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | node | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-12-22 10:39:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1164308, 1164311 | | |
| Attachments: | | | |

Mike, the change between the previous RHEV-H, which was not affected by this bug, and the build which is affected is (among others):

-kernel-2.6.32-504.1.3.el6.src.rpm
+kernel-2.6.32-504.3.3.el6.src.rpm

I saw that many dm patches went in between those two versions. Can you tell if this might be related to those patches?

(In reply to Fabian Deutsch from comment #1)
> Mike, the change between the previous RHEV-H which was not affected by this
> bug, and the build which is affected is (beyond others):
>
> -kernel-2.6.32-504.1.3.el6.src.rpm
> +kernel-2.6.32-504.3.3.el6.src.rpm
>
> I saw that many dm patches went in between those two versions. Can you tell
> if this might be related to those patches?

The changes that went in were focused on improving DM thin-provisioning.

$ git log rhel-6.6.z/master -- drivers/md | grep "RHEL6.7 PATCH" | tac
O-Subject: [RHEL6.7 PATCH 01/25] dm thin: fix DMERR typo in pool_status error path
O-Subject: [RHEL6.7 PATCH 02/25] dm thin: cleanup noflush_work to use a proper completion
O-Subject: [RHEL6.7 PATCH 03/25] dm thin metadata: do not allow the data block size to change
O-Subject: [RHEL6.7 PATCH 04/25] dm bufio: use kzalloc when allocating dm_bufio_client
O-Subject: [RHEL6.7 PATCH 05/25] dm bufio: update last_accessed when relinking a buffer
O-Subject: [RHEL6.7 PATCH 06/25] dm bufio: switch from a huge hash table to an rbtree
O-Subject: [RHEL6.7 PATCH 07/25] dm bufio: evict buffers that are past the max age but retain some buffers
O-Subject: [RHEL6.7 PATCH 08/25] dm bio prison: switch to using a red black tree
O-Subject: [RHEL6.7 PATCH 09/25] dm thin metadata: change dm_thin_find_block to allow blocking, but not issuing, IO
O-Subject: [RHEL6.7 PATCH 10/25] dm transaction manager: add support for prefetching blocks of metadata
O-Subject: [RHEL6.7 PATCH 11/25] dm thin: prefetch missing metadata pages
O-Subject: [RHEL6.7 PATCH 12/25] dm thin: throttle incoming IO
O-Subject: [RHEL6.7 PATCH 14/25] dm thin: adjust max_sectors_kb based on thinp blocksize
O-Subject: [RHEL6.7 PATCH 15/25] dm: improve documentation and code clarity in dm_merge_bvec
O-Subject: [RHEL6.7 PATCH 16/25] dm thin: implement thin_merge
O-Subject: [RHEL6.7 PATCH 17/25] dm thin: grab a virtual cell before looking up the mapping
O-Subject: [RHEL6.7 PATCH 18/25] dm thin: performance improvement to discard processing
O-Subject: [RHEL6.7 PATCH 19/25] dm thin: factor out remap_and_issue_overwrite
O-Subject: [RHEL6.7 PATCH 20/25] dm thin: defer whole cells rather than individual bios
O-Subject: [RHEL6.7 PATCH 21/25] dm thin: remap the bios in a cell immediately
O-Subject: [RHEL6.7 PATCH 22/25] dm thin: direct dispatch when breaking sharing
O-Subject: [RHEL6.7 PATCH 23/25] dm thin: sort the deferred cells
O-Subject: [RHEL6.7 PATCH 24/25] dm thin: optimize retry_bios_on_resume
O-Subject: [RHEL6.7 PATCH 25/25] dm thin: refactor requeue_io to eliminate spinlock bouncing
O-Subject: [RHEL6.7 PATCH 26/25] dm thin: fix potential for infinite loop in pool_io_hints
O-Subject: [RHEL6.7 PATCH v2 27/25] dm thin: fix pool_io_hints to avoid looking at max_hw_sectors

I see you're using old DM snapshot (which has nothing to do with dm-thinp), and there are errors about trying to use "DM_snapshot_cow" as a filesystem type when mounting. But beyond that I have no context to be able to _really_ say what the system was doing. But I really doubt these DM changes have anything to do with your UEFI boot problem.

(In reply to Mike Snitzer from comment #2)
> I see you're using old DM snapshot (which has nothing to do with dm-thinp)..

NOTE: dm-snapshot does use dm-bufio. And there were a handful of dm-bufio changes listed in comment#2. But I'm not aware of any potential for dm-snapshot regression with these dm-bufio changes. I think you need to first silence the "mount: unknown filesystem type 'DM_snapshot_cow'" errors.

Created attachment 971897 [details]
r510.log
Hi fabiand,
I just obtained the panic log via the serial console and am providing it for you to debug.
Thanks!
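
Comment 3's point that only the dm-bufio part of the series could plausibly touch dm-snapshot can be checked mechanically against the subjects listed in comment 2. The following is a minimal Python sketch, assuming the subject lines are available as plain strings. The names `SUBJECT_RE`, `patches_by_component`, and `SUBJECTS` are illustrative and not part of any Red Hat tooling, and `SUBJECTS` is only an excerpt (patches 01 to 10) of the output above.

```python
import re

# Matches "[RHEL6.7 PATCH 04/25] dm bufio: ..." and the "v2 27/25" variant.
# Group 1 is the patch number, group 2 the component prefix before the colon.
SUBJECT_RE = re.compile(r"\[RHEL6\.7 PATCH (?:v\d+ )?(\d+)/\d+\] ([^:]+):")

# Excerpt of the subjects from comment 2 (patches 01 to 10 only).
SUBJECTS = [
    "O-Subject: [RHEL6.7 PATCH 01/25] dm thin: fix DMERR typo in pool_status error path",
    "O-Subject: [RHEL6.7 PATCH 02/25] dm thin: cleanup noflush_work to use a proper completion",
    "O-Subject: [RHEL6.7 PATCH 03/25] dm thin metadata: do not allow the data block size to change",
    "O-Subject: [RHEL6.7 PATCH 04/25] dm bufio: use kzalloc when allocating dm_bufio_client",
    "O-Subject: [RHEL6.7 PATCH 05/25] dm bufio: update last_accessed when relinking a buffer",
    "O-Subject: [RHEL6.7 PATCH 06/25] dm bufio: switch from a huge hash table to an rbtree",
    "O-Subject: [RHEL6.7 PATCH 07/25] dm bufio: evict buffers that are past the max age but retain some buffers",
    "O-Subject: [RHEL6.7 PATCH 08/25] dm bio prison: switch to using a red black tree",
    "O-Subject: [RHEL6.7 PATCH 09/25] dm thin metadata: change dm_thin_find_block to allow blocking, but not issuing, IO",
    "O-Subject: [RHEL6.7 PATCH 10/25] dm transaction manager: add support for prefetching blocks of metadata",
]


def patches_by_component(subjects):
    """Group patch numbers by the component prefix of each subject line."""
    groups = {}
    for subject in subjects:
        match = SUBJECT_RE.search(subject)
        if match is None:
            continue  # not a patch subject line, skip it
        number = int(match.group(1))
        component = match.group(2).strip()
        groups.setdefault(component, []).append(number)
    return groups


if __name__ == "__main__":
    for component, numbers in sorted(patches_by_component(SUBJECTS).items()):
        print(component, numbers)
```

In the full list no patch after 07 carries the "dm bufio" prefix, so the excerpt yields the same dm-bufio set (patches 04 to 07) as the complete output would.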
(In reply to shaochen from comment #4)
> Created attachment 971897 [details]
> r510.log
>
> Hi fabiand,
>
> I just obtain the panic log info via serial console, provides for you to
> debug.
> Thanks!

Chen, thanks.
We also need more: please add _rdshell_ and _rdinitdebug_ and remove _quiet_ to get /init.log, which will help. Btw, rdsosreport is not available on RHEL 6.6.

Thanks
Ying

(In reply to Ying Cui from comment #5)
> (In reply to shaochen from comment #4)
> > Created attachment 971897 [details]
> > r510.log
> >
> > Hi fabiand,
> >
> > I just obtain the panic log info via serial console, provides for you to
> > debug.
> > Thanks!
>
> Chen, Thanks.
> We also need more, please add _rdshell_ _rdinitdebug_ and removing _quiet_
> to get /init.log for helps, btw rdsosreport is not available a on rhel 6.6.
>
> Thanks
> Ying

OK, I have added "rdshell" and "rdinitdebug" to the kernel command line and obtained the new log. Please check "r510-new-output.log" and "init.log" for more details.
Thanks!

Created attachment 971916 [details]
r510-new-output.log
Created attachment 971917 [details]
init.log
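
The command-line change requested in comment 5 (add rdshell and rdinitdebug, drop quiet) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the function name `debug_cmdline` and the sample command line are hypothetical, and the actual boot entry on the R510 is not shown in this report.

```python
def debug_cmdline(cmdline, add=("rdshell", "rdinitdebug"), remove=("quiet",)):
    """Return a kernel command line with debug flags added and quiet removed.

    The flag names come from comment 5; everything else is passed through
    unchanged and in its original order.
    """
    # Drop the flags that suppress boot output.
    args = [arg for arg in cmdline.split() if arg not in remove]
    # Append each debug flag once, even if the function is applied twice.
    for flag in add:
        if flag not in args:
            args.append(flag)
    return " ".join(args)


if __name__ == "__main__":
    # Hypothetical command line, not taken from the affected machine.
    original = "ro quiet console=ttyS0,115200"
    print(debug_cmdline(original))
```

Appending the flags only when they are missing keeps the helper idempotent, so running it on an already edited command line leaves it unchanged.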
Test version:
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

Tested 5 times after pulling out the USB disk and did not hit the kernel panic any more, so I am closing this bug as WORKSFORME.
Thanks!

Since this was caused by an environment issue, I think it should be closed as NOTABUG.
Thanks.
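
The closing decision rests on 5 clean boots after the USB disk was removed, against roughly 2 panics in 4 boots in the original description. As a rough sanity check, the Python sketch below computes how likely 5 clean boots in a row would be if nothing had changed. It assumes that boots are independent and that 2 in 4 approximates the true panic rate; neither assumption is established by this report, and the sample is very small. The function name `prob_all_clean` is illustrative.

```python
def prob_all_clean(panic_rate, boots):
    """Probability that `boots` independent boots all succeed.

    Assumes every boot panics independently with probability `panic_rate`.
    """
    return (1.0 - panic_rate) ** boots


if __name__ == "__main__":
    observed_rate = 2 / 4  # 2 panics in about 4 boots, from the description
    print(prob_all_clean(observed_rate, 5))  # prints 0.03125
```

Under those assumptions, 5 consecutive clean boots would happen by chance only about 3% of the time, which is consistent with the USB disk being the trigger but does not prove it.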
Created attachment 971101 [details]
kernel-panic-r510.png

Description of problem:
[6.6-3.5] Kernel panic occurred when booting from a UEFI machine (Dell R510).

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

How reproducible:
Tested about 4 times; hit this bug 2 times.

Steps to Reproduce:
1. Enter UEFI mode on the Dell R510.
2. Attach virtual media and boot from it.
3. Reinstall the hypervisor on the Dell R510 in UEFI mode.
4. Reboot.
5. Boot the hypervisor in UEFI mode.

Actual results:
Kernel panic occurred when booting from the UEFI machine (Dell R510).

Expected results:
The hypervisor boots successfully in UEFI mode.

Additional info:
We did not hit this issue on rhev-hypervisor6-6.6-20141119.0.iso (6.6-3.5), so we consider it a regression.
Due to the kernel panic issue, I can provide more log info,