Bug 719905

Summary: Kdump is not operational while configured restore(Local)
Product: Red Hat Enterprise Linux 6 Reporter: Guohua Ouyang <gouyang>
Component: ovirt-node Assignee: Alan Pevec <apevec>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.2 CC: apevec, cshao, leiwang, mburns, moli, ovirt-maint, ycui
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: ovirt-node-2.0.2-0.7.gitb88a4ee.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 19:17:30 UTC Type: ---
Attachments:
Description                Flags
ovirt.log                  none
ovirt.log-restore(local)   none

Description Guohua Ouyang 2011-07-08 11:22:04 UTC
Description of problem:
Kdump is not operational when a remote kdump location is configured; it is operational only with "default reboot".


Version-Release number of selected component (if applicable):
rhev-hypervisor-6.2-0.3.el6

How reproducible:
Always. 

Steps to Reproduce:
1. Configure kernel dump, such as nfs: 10.66.9.237:/var/crash
2. reboot
3. Check kdump status
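
Step 3 can also be checked from a shell. The following is a minimal sketch, assuming the running kernel exposes /sys/kernel/kexec_crash_loaded (it contains 1 when a crash kernel is loaded). The function name kdump_loaded and its optional path argument are hypothetical and exist only to make the check easy to test.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a crash (kdump) kernel is currently loaded.
# The optional argument overrides the sysfs path, which is handy for testing.
kdump_loaded() {
    f=${1:-/sys/kernel/kexec_crash_loaded}
    # The file holds "1" when a crash kernel has been loaded via kexec.
    [ "$(cat "$f" 2>/dev/null)" = "1" ]
}
```

For example, "kdump_loaded && echo loaded || echo not loaded" after the reboot; "service kdump status" should report the same state in words.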
  
Actual results:
Kdump is not operational


Expected results:
Kdump is operational

Additional info:
If no remote kernel dump location is configured, kdump is operational.

Comment 2 Guohua Ouyang 2011-07-09 05:53:44 UTC
Correction: Kdump is not operational when configured with restore(Local).

# cat /etc/kdump.conf 
default reboot
ext4 /dev/HostVG/Data
path /core

Note: when configured with ssh/nfs, or left unconfigured (only "default reboot"), kdump is operational.

Comment 3 Guohua Ouyang 2011-07-09 06:01:52 UTC
# service kdump restart
Stopping kdump:                                            [  OK  ]
No kdump initial ramdisk found.                            [WARNING]
Rebuilding /boot-kdump/initrd-2.6.32-131.4.1.el6.x86_64kdump.img
WARNING: No module squashfs found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module mbcache found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module radeon found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module ttm found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module drm-kms-helper found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module drm found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module hwmon found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module i2c-algo-bit found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
WARNING: No module i2c-core found for kernel 2.6.32-131.4.1.el6.x86_64, continuing anyway
device node not found
mount: /dev/mapper/HostVG-Data already mounted or /tmp/tmp.Vdj0rRl8I9 busy
/etc/kdump.conf: Bad mount point /dev/HostVG/Data
Failed to run mkdumprd
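
The warnings in the output above name each module that mkdumprd could not find. A minimal sketch for extracting those names from saved output, assuming the message format is exactly as shown in this comment; the function name missing_modules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the names of modules reported as missing.
# Reads mkdumprd (or "service kdump restart") output on stdin and keeps only
# lines of the form "WARNING: No module <name> found for kernel ...".
missing_modules() {
    sed -n 's/^WARNING: No module \([^ ]*\) found for kernel .*$/\1/p'
}
```

A possible use is "service kdump restart 2>&1 | missing_modules", which prints one module name per line.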

Comment 4 Guohua Ouyang 2011-07-09 06:14:03 UTC
*** Bug 719901 has been marked as a duplicate of this bug. ***

Comment 8 cshao 2011-08-26 06:53:11 UTC
Created attachment 520015 [details]
ovirt.log

Comment 12 Alan Pevec 2011-08-30 22:33:38 UTC
Side effect of the patch from bug 675868 comment 16;
it is going to be reverted in the next build.

Comment 14 cshao 2011-09-05 09:02:43 UTC
Test version:
rhev-hypervisor-6.2-0.15.el6

Configuring kdump to NFS and SSH succeeds.
But configuring kdump to restore(Local) fails.
No dump file is stored in /data/core.
So the bug status is changed to ASSIGNED.

Please see serial output:
========================================================================
]# echo "c" > /proc/sysrq-trigger
SysRq : Trigger a crash
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8131aa36>] sysrq_handle_crash+0x16/0x20
PGD 22d0d9067 PUD 231504067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/firmware/memmap/11/type
CPU 3
Modules linked in: ebtable_nat ebtables lockd sunrpc bridge ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_physdev ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack xt_multiport ip6table_filter ip6_tables ipv6 ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt cryptd aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_intel kvm xhci_hcd dcdbas sg shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sr_mod cdrom sd_mod crc_t10dif usb_storage 8139too 8139cp mii e1000e pata_acpi ata_generic ata_piix nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video output dm_multipath dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: ebtable_nat ebtables lockd sunrpc bridge ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_physdev ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack xt_multiport ip6table_filter ip6_tables ipv6 ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt cryptd aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_intel kvm xhci_hcd dcdbas sg shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sr_mod cdrom sd_mod crc_t10dif usb_storage 8139too 8139cp mii e1000e pata_acpi ata_generic ata_piix nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video output dm_multipath dm_mod [last unloaded: scsi_wait_scan]
Pid: 11847, comm: bash Not tainted 2.6.32-131.12.1.el6.x86_64 #1 OptiPlex 780
RIP: 0010:[<ffffffff8131aa36>]  [<ffffffff8131aa36>] sysrq_handle_crash+0x16/0x20
RSP: 0018:ffff88022dd77e18  EFLAGS: 00010096
RAX: 0000000000000010 RBX: 0000000000000063 RCX: 0000000000000b5f
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000063
RBP: ffff88022dd77e18 R08: ffffffff81bfe340 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff81af9f00 R14: 0000000000000286 R15: 0000000000000001
FS:  00007fdc42e79700(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000230296000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 11847, threadinfo ffff88022dd76000, task ffff88023083ab40)
Stack:
 ffff88022dd77e68 ffffffff8131acf2 ffff88023083ab40 ffff880200000000
<0> 0000000000000022 0000000000000002 ffff88022be4d3c0 00007fdc42e7f000
<0> 0000000000000002 fffffffffffffffb ffff88022dd77e98 ffffffff8131adae
Call Trace:
 [<ffffffff8131acf2>] __handle_sysrq+0x132/0x1a0
 [<ffffffff8131adae>] write_sysrq_trigger+0x4e/0x50
 [<ffffffff811d509e>] proc_reg_write+0x7e/0xc0
 [<ffffffff811727f8>] vfs_write+0xb8/0x1a0
 [<ffffffff810d1b52>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff81173231>] sys_write+0x51/0x90
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: d0 88 81 a3 a2 fc 81 c9 c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 c7 05 bd c6 77 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 c9 c3 55 48 89 e5 0f 1f 44 00 00 8d 47
RIP  [<ffffffff8131aa36>] sysrq_handle_crash+0x16/0x20
 RSP <ffff88022dd77e18>
CR2: 0000000000000000
do_IRQ: 0.97 No irq handler for vector (irq -1)
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 53003c)
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Free memory/Total memory (free %): 79280 / 113240 ( 70.0106 )
Loading dm-mod.ko module
Loading dm-log.ko module
Loading dm-region-hash.ko module
Loading dm-mirror.ko module
Loading dm-zero.ko module
Loading dm-snapshot.ko module
Loading ebtables.ko module
Loading sunrpc.ko module
Loading ipt_REJECT.ko module
Loading nf_defrag_ipv4.ko module
Loading ip_tables.ko module
Loading xt_physdev.ko module
Loading nf_conntrack.ko module
Loading xt_multiport.ko module
Loading ip6_tables.ko module
Loading ipv6.ko module
Loading jbd2.ko module
Loading llc.ko module
Loading sha256_generic.ko module
Loading cbc.ko module
Loading cryptoloop.ko module
Loading dm-crypt.ko module
Loading cryptd.ko module
Loading aes_generic.ko module
Loading macvlan.ko module
Loading tun.ko module
Loading kvm.ko module
Loading xhci-hcd.ko module
Loading dcdbas.ko module
Loading sg.ko module
Loading shpchp.ko module
Loading ext2.ko module
insmod: can't insert '/lib/modules/2.6.32-131.12.1.el6.x86_64/ext2.ko': unknown symbol in module, or unknown parameter
Loading cdrom.ko module
Loading crc-t10dif.ko module
Loading usb-storage.ko module
Waiting 8 seconds for driver initialization.
Loading mii.ko module
Loading e1000e.ko module
Loading pata_acpi.ko module
Loading ata_generic.ko module
Loading ata_piix.ko module
Loading dm-multipath.ko module
Loading ebtable_nat.ko module
Loading lockd.ko module
Loading nf_conntrack_ipv4.ko module
Loading iptable_filter.ko module
Loading ip6t_REJECT.ko module
Loading nf_defrag_ipv6.ko module
Loading xt_state.ko module
Loading ip6table_filter.ko module
Loading ext4.ko module
insmod: can't insert '/lib/modules/2.6.32-131.12.1.el6.x86_64/ext4.ko': unknown symbol in module, or unknown parameter
Loading stp.ko module
Loading aes-x86_64.ko module
Loading macvtap.ko module
Loading kvm-intel.ko module
Loading dm-round-robin.ko module
Loading sr_mod.ko module
Loading sd_mod.ko module
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] Assuming drive cache: write through
Loading 8139too.ko module
Loading 8139cp.ko module
Loading bridge.ko module
Loading nf_conntrack_ipv6.ko module
Loading garp.ko module
Loading vhost_net.ko module
Loading 8021q.ko module
Waiting for required block device discovery
Creating Block Devices
Creating block device loop0
Creating block device loop1
Creating block device loop2
Creating block device loop3
Creating block device loop4
Creating block device loop5
Creating block device loop6
Creating block device loop7
Creating block device ram0
Creating block device ram1
Creating block device ram10
Creating block device ram11
Creating block device ram12
Creating block device ram13
Creating block device ram14
Creating block device ram15
Creating block device ram2
Creating block device ram3
Creating block device ram4
Creating block device ram5
Creating block device ram6
sd 0:0:0:0: [sda] Assuming drive cache: write through
Creating block device ram7
Creating block device ram8
Creating block device ram9
Creating block device sda
Creating block device sdb
Creating block device sr0
Making device-mapper control node
Scanning logical volumes
  Reading all physical volumes.  This may take a while...
  Found volume group "HostVG" using metadata type lvm2
Activating logical volumes
  4 logical volume(s) in volume group "HostVG" now active
Free memory/Total memory (free %): 70504 / 113240 ( 62.2607 )
Saving to the local filesystem /dev/HostVG/Data
e2fsck 1.41.12 (17-May-2010)
DATA: recovering journal
DATA: clean, 14/51296 files, 7529/204800 blocks
mount: mounting /dev/HostVG/Data on /mnt failed: No such device
Restarting system.

Comment 15 cshao 2011-09-05 09:03:59 UTC
Created attachment 521456 [details]
ovirt.log-restore(local)

Comment 16 cshao 2011-09-05 09:43:12 UTC
There is no ext4.ko file in path /lib/modules/2.6.32-131.12.1.el6.x86_64.
The file ext4.ko path is /lib/modules/2.6.32-131.12.1.el6.x86_64/kernel/fs/ext4/ext4.ko

====================================================================
[root@localhost 2.6.32-131.12.1.el6.x86_64]# pwd
/lib/modules/2.6.32-131.12.1.el6.x86_64
[root@localhost 2.6.32-131.12.1.el6.x86_64]# ls
build          modules.alias.bin  modules.dep.bin      modules.isapnpmap    modules.order     modules.symbols.bin  vdso
extra          modules.block      modules.drm          modules.modesetting  modules.pcimap    modules.usbmap       weak-updates
kernel         modules.ccwmap     modules.ieee1394map  modules.networking   modules.seriomap  source
modules.alias  modules.dep        modules.inputmap     modules.ofmap        modules.symbols   updates
[root@localhost 2.6.32-131.12.1.el6.x86_64]# find / -name ext4.ko
/lib/modules/2.6.32-131.12.1.el6.x86_64/kernel/fs/ext4/ext4.ko
[root@localhost 2.6.32-131.12.1.el6.x86_64]#

Comment 17 Alan Pevec 2011-09-05 21:08:13 UTC
(In reply to comment #16)
> There is no ext4.ko file in path /lib/modules/2.6.32-131.12.1.el6.x86_64.
> The file ext4.ko path is
> /lib/modules/2.6.32-131.12.1.el6.x86_64/kernel/fs/ext4/ext4.ko

Paths are different in kdump initramfs and normal rootfs.

> insmod: can't insert '/lib/modules/2.6.32-131.12.1.el6.x86_64/ext4.ko':
> unknown symbol in module, or unknown parameter

The kernel/fs/mbcache.ko dependency is missing; mkdumprd actually complains about that:
WARNING: No module mbcache found for kernel 2.6.32-131.12.1.el6.x86_64, continuing anyway
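
A missing dependency like this one can be looked for before rebuilding the kdump initrd. The sketch below is only an illustration: it assumes modules.dep records paths relative to the module directory and that it was generated before any module files were pruned from the image. The function name missing_deps is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list dependencies that modules.dep records for a module
# but whose files are absent from the module tree.
# Assumes entries of the form (paths relative to the module directory):
#   kernel/fs/ext4/ext4.ko: kernel/fs/mbcache.ko kernel/fs/jbd2/jbd2.ko
missing_deps() {
    moddir=$1   # e.g. /lib/modules/$(uname -r)
    name=$2     # e.g. ext4
    # Strip the "<path>/<name>.ko:" key and keep the dependency list.
    deps=$(sed -n "s|^[^:]*/$name\.ko: *||p" "$moddir/modules.dep")
    for dep in $deps; do
        # Print every dependency whose file does not exist.
        [ -e "$moddir/$dep" ] || echo "$dep"
    done
}
```

Under those assumptions, "missing_deps /lib/modules/$(uname -r) ext4" would print kernel/fs/mbcache.ko on an affected image and nothing on a fixed one.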

Comment 18 Alan Pevec 2011-09-05 21:12:04 UTC
--- a/recipe/common-minimizer.ks
+++ b/recipe/common-minimizer.ks
@@ -56,6 +56,7 @@ keeprpm ConsoleKit-libs
 # filesystems
 drop /lib/modules/*/kernel/fs
 keep /lib/modules/*/kernel/fs/ext*
+keep /lib/modules/*/kernel/fs/mbcache*
 keep /lib/modules/*/kernel/fs/jbd*
 keep /lib/modules/*/kernel/fs/btrfs
 keep /lib/modules/*/kernel/fs/fat
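
Whether a built image actually retains the modules this change is meant to keep can be checked with find, much as in comment 16. A minimal sketch, assuming uncompressed .ko files as seen in this report; the function name require_modules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named modules have no .ko file
# anywhere under the given module directory. Returns non-zero if any is missing.
require_modules() {
    moddir=$1
    shift
    rc=0
    for m in "$@"; do
        if [ -z "$(find "$moddir" -name "$m.ko" 2>/dev/null)" ]; then
            echo "missing: $m"
            rc=1
        fi
    done
    return "$rc"
}
```

For the local ext4 dump target discussed here, a plausible call is "require_modules /lib/modules/$(uname -r) ext4 jbd2 mbcache".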

Comment 24 errata-xmlrpc 2011-12-06 19:17:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1783.html