Bug 703057

Summary: kdump - switch_root not necessary? (at least in some situations)
Product: Red Hat Enterprise Linux 5 Reporter: Ondrej Valousek <ondrejv>
Component: kexec-tools Assignee: Cong Wang <amwang>
Status: CLOSED CURRENTRELEASE QA Contact: Qian Cai <qcai>
Severity: low Docs Contact:
Priority: unspecified    
Version: 5.6 CC: cward, czhang, nhorman, qcai, rkhan, ruyang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-10-26 13:00:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Ondrej Valousek 2011-05-09 07:04:32 UTC
Description of problem:

In a diskless environment (root is mounted over the network, for example via NFS) where
you want to transfer the vmcore to a remote machine (again, for example, via NFS), you do not need to do switch_root because:
1) it will never succeed - mkdumprd is simply not "clever" enough to do it properly
2) it is not needed at all.

I believe that, in general, if we want to transfer the vmcore over the network to a remote machine, no local disks need to be initialized, switch_root can be omitted, and the vmcore can be captured directly.

I can confirm that at least in my case it worked properly (I hacked the initrd manually).

Version-Release number of selected component (if applicable):
all (including RHEL-6)

How reproducible:
always

Steps to Reproduce:
1. set up a diskless machine mounting / over NFS
2. trigger a kernel panic
3. see kdump being unable to copy the vmcore to the requested destination.
  
Actual results:
vmcore is not saved

Expected results:
vmcore is saved

Additional info:
I would also welcome an explanation of why we still care about switch_root. Is it necessary in general, and if so, why?

Comment 1 Han Pingtian 2011-08-10 05:32:31 UTC
I believe this one needs OtherQA, since we don't have diskless environments for testing.

Comment 2 Ondrej Valousek 2011-08-10 06:52:56 UTC
No problem. But in general, I do not understand why we need switch_root (even in disk-based situations), because you can dump the core without it and save yourself a lot of hassle.

Comment 5 Dave Young 2011-10-26 06:45:38 UTC
Strange. I tested RHEL-5.7 and RHEL-6.2 with a diskless nfsroot; both saved the vmcore successfully, and I did not see switch_root occur.

Comment 6 Dave Young 2011-10-26 06:52:00 UTC
BTW, to create the test machine, I just:

1) install RHEL on a KVM guest as VM-a
create an nfs-initrd for the NFS client
copy the kernel and nfs-initrd to the KVM host

2) mount the VM-a image as /dev/loopX
NFS-export the mount point

3) launch KVM guest VM-b with VM-a's kernel and nfs-initrd

4) in VM-b
   add the NFS target in kdump.conf
   start /etc/init.d/kdump
   trigger a crash

Comment 7 Ondrej Valousek 2011-10-26 08:13:57 UTC
The problem only arises if you want to store the vmcore on the network (for example, on an NFS server). You need something like this in your kdump.conf:

net dorado.prague.s3group.com:/exports/ext1/tmp

Then initrd_kdump.img tries to do switch_root BEFORE actually capturing the core. This is wrong. We should not do switch_root at all.

Note that it is obviously a bug because RHEL-6 does not have this problem.
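
To illustrate the distinction drawn here between network and local dump targets, the following is a minimal Python sketch that classifies the first dump target found in a kdump.conf. It is not part of kexec-tools or mkdumprd. The function name `classify_dump_target` and its return labels are hypothetical, and the sketch assumes RHEL 5 style syntax in which `net <host>:<path>` names an NFS export, `net <user>@<host>` names an SSH destination, and `ext2`/`ext3`/`ext4`/`raw` name local devices.

```python
def classify_dump_target(conf_text):
    """Classify the first dump target directive in kdump.conf text.

    Returns a (kind, target) tuple where kind is one of
    "nfs", "ssh", "local", or "default" (no explicit target found).
    The directive names handled here are assumptions about the
    RHEL 5 style kdump.conf syntax, not an exhaustive list.
    """
    local_directives = {"ext2", "ext3", "ext4", "raw"}
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        directive, target = parts[0], parts[1].strip()
        if directive == "net":
            # Assumed convention: user@host means SSH, host:/path means NFS.
            kind = "ssh" if "@" in target else "nfs"
            return kind, target
        if directive in local_directives:
            return "local", target
    return "default", None


if __name__ == "__main__":
    sample = "net dorado.prague.s3group.com:/exports/ext1/tmp\n"
    print(classify_dump_target(sample))
```

With the configuration line from this comment, the sketch reports an NFS target, which is the case where the argument above says no local disk needs to be initialized before the vmcore is captured.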

Comment 8 Dave Young 2011-10-26 08:41:28 UTC
(In reply to comment #7)
> The problem only arises if you want to store the vmcore on the network (for
> example a nfs server) - you need something like this in your kdump.conf:
> 
> net dorado.prague.s3group.com:/exports/ext1/tmp

Yes, mine is similar to yours, except I use an IP address instead of a hostname.

Could you post the console log?

> 
> then inird_kdump.img is trying to do switch_root BEFORE actually capturing the
> core. This is wrong. We should not do switch_root at all.
> 
> Note that it is obviously a bug because RHEL-6 does not have this problem.

Comment 9 Ondrej Valousek 2011-10-26 09:13:28 UTC
I have updated kexec-tools and I cannot replicate the problem either.
Looks like it was fixed in the meantime.
You can close the call, sorry for the noise.

Ondrej

Comment 10 Dave Young 2011-10-26 09:21:18 UTC
(In reply to comment #9)
> I have updated kexec-tools and I can not replicate the problem, either.
> Looks like it was fixed in the mean time.
> You can close the call, sorry for the noise.

No problem, glad to hear that.

Comment 11 Cong Wang 2011-10-26 13:00:01 UTC
Thanks for testing, closing it now.