Created attachment 971101 [details]
kernel-panic-r510.png

Description of problem:
[6.6-3.5] A kernel panic occurred when booting on a UEFI machine (Dell R510).

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

How reproducible:
I tested about 4 times and hit this bug 2 times.

Steps to Reproduce:
1. Enter UEFI mode on the Dell R510.
2. Attach virtual media and boot from it.
3. Reinstall the hypervisor on the Dell R510 in UEFI mode.
4. Reboot.
5. Boot the hypervisor in UEFI mode.

Actual results:
A kernel panic occurred when booting on the UEFI machine (Dell R510).

Expected results:
The hypervisor boots successfully in UEFI mode.

Additional info:
We did not hit this issue on rhev-hypervisor6-6.6-20141119.0.iso (6.6-3.5), so we consider it a regression.
Because of the kernel panic issue, I can provide more log info.
Mike, the changes between the previous RHEV-H build, which was not affected by this bug, and the affected build include (among others):

-kernel-2.6.32-504.1.3.el6.src.rpm
+kernel-2.6.32-504.3.3.el6.src.rpm

I saw that many dm patches went in between those two versions. Can you tell whether this might be related to those patches?
(In reply to Fabian Deutsch from comment #1)
> Mike, the change between the previous RHEV-H which was not affected by this
> bug, and the build which is affected is (beyond others):
>
> -kernel-2.6.32-504.1.3.el6.src.rpm
> +kernel-2.6.32-504.3.3.el6.src.rpm
>
> I saw that many dm patches went in between those two versions. Can you tell
> if this might be related to those patches?

The changes that went in were focused on improving DM thin-provisioning.

$ git log rhel-6.6.z/master -- drivers/md | grep "RHEL6.7 PATCH" | tac
O-Subject: [RHEL6.7 PATCH 01/25] dm thin: fix DMERR typo in pool_status error path
O-Subject: [RHEL6.7 PATCH 02/25] dm thin: cleanup noflush_work to use a proper completion
O-Subject: [RHEL6.7 PATCH 03/25] dm thin metadata: do not allow the data block size to change
O-Subject: [RHEL6.7 PATCH 04/25] dm bufio: use kzalloc when allocating dm_bufio_client
O-Subject: [RHEL6.7 PATCH 05/25] dm bufio: update last_accessed when relinking a buffer
O-Subject: [RHEL6.7 PATCH 06/25] dm bufio: switch from a huge hash table to an rbtree
O-Subject: [RHEL6.7 PATCH 07/25] dm bufio: evict buffers that are past the max age but retain some buffers
O-Subject: [RHEL6.7 PATCH 08/25] dm bio prison: switch to using a red black tree
O-Subject: [RHEL6.7 PATCH 09/25] dm thin metadata: change dm_thin_find_block to allow blocking, but not issuing, IO
O-Subject: [RHEL6.7 PATCH 10/25] dm transaction manager: add support for prefetching blocks of metadata
O-Subject: [RHEL6.7 PATCH 11/25] dm thin: prefetch missing metadata pages
O-Subject: [RHEL6.7 PATCH 12/25] dm thin: throttle incoming IO
O-Subject: [RHEL6.7 PATCH 14/25] dm thin: adjust max_sectors_kb based on thinp blocksize
O-Subject: [RHEL6.7 PATCH 15/25] dm: improve documentation and code clarity in dm_merge_bvec
O-Subject: [RHEL6.7 PATCH 16/25] dm thin: implement thin_merge
O-Subject: [RHEL6.7 PATCH 17/25] dm thin: grab a virtual cell before looking up the mapping
O-Subject: [RHEL6.7 PATCH 18/25] dm thin: performance improvement to discard processing
O-Subject: [RHEL6.7 PATCH 19/25] dm thin: factor out remap_and_issue_overwrite
O-Subject: [RHEL6.7 PATCH 20/25] dm thin: defer whole cells rather than individual bios
O-Subject: [RHEL6.7 PATCH 21/25] dm thin: remap the bios in a cell immediately
O-Subject: [RHEL6.7 PATCH 22/25] dm thin: direct dispatch when breaking sharing
O-Subject: [RHEL6.7 PATCH 23/25] dm thin: sort the deferred cells
O-Subject: [RHEL6.7 PATCH 24/25] dm thin: optimize retry_bios_on_resume
O-Subject: [RHEL6.7 PATCH 25/25] dm thin: refactor requeue_io to eliminate spinlock bouncing
O-Subject: [RHEL6.7 PATCH 26/25] dm thin: fix potential for infinite loop in pool_io_hints
O-Subject: [RHEL6.7 PATCH v2 27/25] dm thin: fix pool_io_hints to avoid looking at max_hw_sectors

I see you're using old DM snapshot (which has nothing to do with dm-thinp), and there are errors about trying to use "DM_snapshot_cow" as a filesystem type when mounting.

But beyond that I have no context to be able to _really_ say what the system was doing. I really doubt these DM changes have anything to do with your UEFI boot problem.
(In reply to Mike Snitzer from comment #2)
> I see you're using old DM snapshot (which has nothing to do with dm-thinp)..

NOTE: dm-snapshot does use dm-bufio, and there were a handful of dm-bufio changes listed in comment #2. But I'm not aware of any potential for dm-snapshot regression with these dm-bufio changes.

I think you need to first silence the "mount: unknown filesystem type 'DM_snapshot_cow'" errors.
Created attachment 971897 [details]
r510.log

Hi fabiand,

I just obtained the panic log via the serial console and am attaching it for you to debug.
Thanks!
(In reply to shaochen from comment #4)
> Created attachment 971897 [details]
> r510.log
>
> Hi fabiand,
>
> I just obtain the panic log info via serial console, provides for you to
> debug.
> Thanks!

Chen, thanks.
We also need more: please add _rdshell_ and _rdinitdebug_ and remove _quiet_ to get /init.log, which will help. By the way, rdsosreport is not available on RHEL 6.6.

Thanks
Ying
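For anyone scripting the command-line change requested above, here is a minimal sketch in Python. It is only an illustration, not a RHEV-H tool: the function name `debug_cmdline` and the sample command line are hypothetical, and the real boot arguments on the R510 will differ. It drops _quiet_ and appends _rdshell_ and _rdinitdebug_ if they are not already present.

```python
def debug_cmdline(cmdline, add=("rdshell", "rdinitdebug"), drop=("quiet",)):
    """Return a kernel command line with `drop` args removed and `add` args appended once.

    Hypothetical helper for illustration; it only manipulates the string
    and does not touch any bootloader configuration.
    """
    # Keep every argument except the ones we want to drop (e.g. "quiet").
    args = [arg for arg in cmdline.split() if arg not in drop]
    # Append each debug argument unless it is already on the command line.
    for arg in add:
        if arg not in args:
            args.append(arg)
    return " ".join(args)


# Made-up example command line, not taken from the affected machine.
print(debug_cmdline("ro root=live:LABEL=Root quiet rhgb"))
# prints: ro root=live:LABEL=Root rhgb rdshell rdinitdebug
```

The resulting string still has to be entered at the boot prompt or in the bootloader configuration by hand.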
(In reply to Ying Cui from comment #5)
> (In reply to shaochen from comment #4)
> > Created attachment 971897 [details]
> > r510.log
> >
> > Hi fabiand,
> >
> > I just obtain the panic log info via serial console, provides for you to
> > debug.
> > Thanks!
>
> Chen, Thanks.
> We also need more, please add _rdshell_ _rdinitdebug_ and removing _quiet_
> to get /init.log for helps, btw rdsosreport is not available a on rhel 6.6.
>
> Thanks
> Ying

OK, I have added "rdshell" and "rdinitdebug" to the kernel command line and obtained a new log. Please check "r510-new-output.log" and "init.log" for more details.

Thanks!
Created attachment 971916 [details] r510-new-output.log
Created attachment 971917 [details] init.log
Test version:
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

I tested 5 times after pulling out the USB disk and did not hit the kernel panic any more, so I am closing this bug as WORKSFORME.

Thanks!
Since this was caused by an environment issue, I am closing it as NOTABUG. Thanks.