Bug 600567

Summary:	kdump fails to save vmcore on machine with 1TB memory
Product:	Red Hat Enterprise Linux 6	Reporter:	Qian Cai <qcai>
Component:	kexec-tools	Assignee:	Neil Horman <nhorman>
Status:	CLOSED WORKSFORME	QA Contact:	Red Hat Kernel QE team <kernel-qe>
Severity:	high	Docs Contact:
Priority:	urgent
Version:	6.0	CC:	amwang, cward, dmaley, jwest, ltroan, martinez, mfuruta, nhorman, phan, sbest, shiyer, tao
Target Milestone:	rc	Keywords:	Regression
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2010-07-15 15:12:27 UTC	Type:	---

Comment 1 Cong Wang 2010-06-07 05:01:34 UTC

Since RHEL6 supports more than 1TB of memory, I doubt we need this patch. I mean, if this is still an issue, we probably need another way to fix it.

Comment 3 Neil Horman 2010-06-30 11:25:05 UTC

Note: This patch is required for RHEL6, but it appears the number of supported physical address bits has been expanded from 40 bits in RHEL5 to 46 bits (see the definition of MAX_PHYSMEM_BITS in the kernel).  So this patch needs some adjustment prior to inclusion.
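
To make the bit widths above concrete, here is a minimal sketch of the arithmetic. It only uses the two values quoted in this comment (40 and 46 bits); the function name is made up for illustration and is not taken from the kernel or from kexec-tools.

```python
# Illustrative arithmetic only; the bit widths are the ones quoted in this comment.
TIB = 1 << 40  # number of bytes in one tebibyte


def max_addressable_bytes(physmem_bits: int) -> int:
    """Return the size in bytes of a physical address space of the given width."""
    return 1 << physmem_bits


for label, bits in (("RHEL5", 40), ("RHEL6", 46)):
    size_tib = max_addressable_bytes(bits) // TIB
    print(f"{label}: {bits} bits -> {size_tib} TiB")
```

With 40 bits the limit is 1 TiB, which matches the 1TB machine in the summary; with 46 bits it moves to 64 TiB.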

Comment 4 Neil Horman 2010-07-14 11:24:32 UTC

So what should we do with this, Cai?  It just occurred to me that Amerigo and I were saying pretty much the same thing here.  While we could/should adapt this patch to handle 46 physical bits of address space, that would require a system with 2^46 bytes = 64TB of RAM to validate.  If we don't have such a system, this is an untestable bug.  Do we have such a system?

Comment 8 Steve Best 2010-07-15 14:50:47 UTC

(In reply to comment #4)
> So what should we do with this, Cai?  It just occurred to me that Amerigo and I
> were saying pretty much the same thing here.  While we could/should adapt this
> patch to handle 46 physical bits of address space, that would require a system
> with 2^46 bytes = 64TB of RAM to validate.  If we don't have such a system, this is
> an untestable bug.  Do we have such a system?

Neil,

I don't think we are going to find a 64TB system. Would a 1 or 2 TB system help with the testing effort?

-Steve

Comment 9 Neil Horman 2010-07-15 15:12:27 UTC

Steve, thanks for offering, but I'm afraid that a 1 or 2TB system won't really help much.  The nature of this bug is that the kernel reports, but refuses to use or otherwise recognize, memory above 64TB, and the kexec-tools userspace component has to do the same thing to get a valid vmcore on such systems.  On systems with anything less than that, kexec and the kernel agree on memory size and will work accordingly.  So without an actual 64TB system we have no way to tell if we've fixed the issue.  If we can't get a 64TB system, then I say we close this as WORKSFORME and reopen it if/when a user has such a system that we can work with.
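
The behaviour described above, ignoring memory beyond the kernel's limit, can be sketched roughly as follows. This is a hypothetical illustration, not the actual kexec-tools patch: `clamp_ranges` and its range format (start, end-exclusive byte addresses) are assumptions made for the example, and the 46-bit value is the one quoted in comment 3.

```python
MAX_PHYSMEM_BITS = 46  # value quoted in comment 3 for RHEL6 (assumed here)


def clamp_ranges(ranges, physmem_bits=MAX_PHYSMEM_BITS):
    """Drop or truncate (start, end) byte ranges that exceed the address limit.

    Hypothetical helper: 'end' is exclusive, and ranges are plain tuples.
    """
    limit = 1 << physmem_bits
    clamped = []
    for start, end in ranges:
        if start >= limit:
            # Entire range lies above the limit, so it is ignored.
            continue
        # Keep only the part of the range below the limit.
        clamped.append((start, min(end, limit)))
    return clamped
```

On a 1 or 2TB machine every range is already below the limit, so the function returns its input unchanged, which is why such a machine cannot exercise the fix.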