Bug 239717

Summary: [LSI 5.1 bug] kmap returns wrong kernel virtual address on x86_64 with >4GB memory
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.0
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: urgent
Priority: medium
Reporter: Atul Mukker <atul.mukker>
Assignee: Chip Coldwell <coldwell>
QA Contact: Martin Jenner <mjenner>
CC: andriusb, coldwell, coughlan, dwa, martin.wilck
Doc Type: Bug Fix
Last Closed: 2007-06-08 18:32:31 UTC
Bug Blocks: 217103

Description Atul Mukker 2007-05-10 18:45:44 UTC
Description of problem:
kmap() returns a wrong kernel virtual address on x86_64 systems with more than
4GB of memory.

Version-Release number of selected component (if applicable):
RHEL 5

How reproducible:
Use an x86_64 system with more than 4GB of memory. Verify, for example with a
disk read, that the kernel virtual address returned by kmap() does not
correspond to the DMA address.

Steps to Reproduce:
<details provided to Red Hat in a related mail>
 

-----Original Message-----
From: Mukker, Atul 
Sent: Thursday, May 10, 2007 2:35 PM
To: Chip Coldwell; Andrius T. Benokraitis
Cc: Kolli, Neela; Yang, Bo; Patro, Sumant; McLanahan, Kevin; Bagalkote,
Sreenivas; Tom Coughlan; David W. Aquilina
Subject: RE: Data corruption issue seen with RHEL 5 - request for a conference call

Actually, the issue has been observed on the stock x86_64 kernel with 4GB (or
more) of memory. Here are more details about the issue.

As part of our driver framework, we must be able to access the I/O buffers
within the driver (for example, to generate the RAID 5 data for a degraded
array). Since the mid-layer scatter-gather list does not carry virtual addresses
by default, we spawn an internal thread (using a work queue) in the driver. This
thread calls kmap() on the required buffers and then routes the request back to
our RAID core.

The problem we are facing on RHEL 5 is that kmap() seems to generate invalid
kernel virtual addresses. We have verified this with two test cases:
1.	When the RAID 5 array is optimal (so that the disk DMAs directly into the
buffers), we dump a few bytes through the virtual address returned by kmap().
We do not see the expected data.

2.	When the RAID 5 array is degraded, we generate the missing data in the
buffer pointed to by the kernel virtual address returned by kmap(). The
application gets junk data, but a driver dump of the buffer shows the correct
data.

This issue does not occur with less than 4GB of memory. RHEL 4 does not seem to
have this issue either; it works perfectly with up to 6GB of memory (the maximum
we tried).

Best regards,
-Atul Mukker


-----Original Message-----
From: Chip Coldwell [coldwell]
Sent: Thursday, May 10, 2007 10:42 AM
To: Andrius T. Benokraitis
Cc: Kolli, Neela; Yang, Bo; Patro, Sumant; Mukker, Atul; McLanahan, Kevin;
Bagalkote, Sreenivas; Tom Coughlan; Chip Coldwell; David W. Aquilina
Subject: Re: Data corruption issue seen with RHEL 5 - request for a conference call

On Thu, 10 May 2007, Andrius T. Benokraitis wrote:

> Hi Neela,
> 
> Has a bugzilla been submitted for this yet? I'd like to have Tom/Chip 
> take a look at the details first if possible before setting up a call...
> 
> Thanks!
> 
> Andrius.
> 
> Kolli, Neela wrote:
> > Hi Andrius,
> > 
> > Atul Mukker, our SWR 5 engineer, detected an issue with BigMem kernel.

There is no RHEL-5 HugeMem kernel ... is this problem with a different RHEL?

Chip

--
Charles M. "Chip" Coldwell
Senior Software Engineer
Red Hat, Inc
978-392-2426


Actual results:


Expected results:


Additional info:

Comment 1 Andrius Benokraitis 2007-05-21 03:55:58 UTC
Has LSI tested this with a different/open source driver?

Comment 2 Martin Wilck 2007-05-21 13:11:42 UTC
I find it unlikely that kmap() is broken, because a) it is a trivial function
on 64-bit systems and b) if kmap() didn't work, hardly any driver would.
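
To illustrate point a): on a 64-bit kernel built without CONFIG_HIGHMEM, kmap()
reduces to page_address(), which is the page's physical address plus a fixed
direct-map offset. The following userspace sketch models only that arithmetic.
The base 0xFFFF810000000000 is an assumption taken from the addresses quoted
later in this report, and model_kmap() is a hypothetical name, not a kernel
function.

```c
#include <stdint.h>

/* Assumed direct-map base, inferred from the addresses in this report. */
#define MODEL_PAGE_OFFSET 0xFFFF810000000000ULL
/* 4KB pages. */
#define MODEL_PAGE_SHIFT  12

/*
 * Hypothetical model of kmap()/page_address() on a 64-bit kernel without
 * highmem: the virtual address is the page frame's physical address plus
 * the direct-map base. No mapping state is created or destroyed.
 */
static uint64_t model_kmap(uint64_t pfn)
{
    return MODEL_PAGE_OFFSET + (pfn << MODEL_PAGE_SHIFT);
}
```

Under these assumptions, page frame 0xBF7C9 maps to 0xFFFF8100BF7C9000
regardless of which thread or context performs the call.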


Comment 3 Atul Mukker 2007-05-21 13:23:19 UTC
This issue could not be reproduced with LSI's open-source MegaRAID SAS driver.
This leads us to the following hypothesis:

Since our software RAID stack generates kernel virtual addresses in a
low-priority thread (outside the mid-layer's queuecommand interface) using work
queues, these addresses may no longer be valid when we access them later. The
(experimental) MegaRAID SAS driver kmaps the buffer in the ISR and therefore has
correct virtual addresses.

Is this a correct hypothesis?

To verify this, we are implementing a double-buffering mechanism in our software
RAID driver. With double buffering in place, we will be able to generate virtual
addresses in ISR context.

Also, as we told Red Hat in our last conference call, the Linux-specific portion
of our software stack can be shared with Red Hat. That is the code that does the
translation for the RAID stack.

Thanks
-Atul

Comment 4 Atul Mukker 2007-05-21 13:25:52 UTC
> I find it unlikely that kmap() is broken, because a) it is a trivial function
> on 64-bit systems and b) if kmap() didn't work, hardly any driver would.

Please see the data previously sent to Red Hat, and note that we are generating
these addresses in a low-priority thread.

<quote>
One more piece of information I would like to add. If I limit the amount of
memory available to the kernel with the "mem=" kernel parameter, the behavior
can be toggled.

At about "mem=4350M", things seem to work; at "mem=4360M", data corruption
happens. In between, it sometimes works and sometimes doesn't.

Another stark contrast is between the page address and the corresponding kernel
virtual address:

Working configuration:
----------------------
PAGE ADDRESS    KERNEL VIRTUAL ADDRESS
0xBF7C9000      0xFFFF8100BF7C9000
0x37FB7000      0xFFFF810037FB7000
0xBF0EA000      0xFFFF8100BF0EA000
0xBDA1A000      0xFFFF8100BDA1A000

Non-working configuration:
--------------------------
PAGE ADDRESS    KERNEL VIRTUAL ADDRESS
0x91D5000       0xFFFF81010982F000
0x91D6000       0xFFFF81010995F000
0x91D7000       0xFFFF81010AF00000
0x91D8000       0xFFFF81010DED7000

In the working configuration, the kernel virtual address contains the page
address in its lower word, whereas in the non-working configuration it does not?
</quote>

Comment 5 Chip Coldwell 2007-05-21 15:09:43 UTC
(In reply to comment #4)
>
> Another stark contrast is between the page address and the corresponding kernel
> virtual address:
> 
> Working configuration:
> ----------------------
> PAGE ADDRESS    KERNEL VIRTUAL ADDRESS
> 0xBF7C9000      0xFFFF8100BF7C9000
> 0x37FB7000      0xFFFF810037FB7000
> 0xBF0EA000      0xFFFF8100BF0EA000
> 0xBDA1A000      0xFFFF8100BDA1A000
> 
> Non-working configuration:
> --------------------------
> PAGE ADDRESS    KERNEL VIRTUAL ADDRESS
> 0x91D5000       0xFFFF81010982F000
> 0x91D6000       0xFFFF81010995F000
> 0x91D7000       0xFFFF81010AF00000
> 0x91D8000       0xFFFF81010DED7000
> 
> In the working configuration, the kernel virtual address contains the page
> address in its lower word, whereas in the non-working configuration it does not?

In the tables above, I assume that the PAGE ADDRESS column refers to
page_address(page), not the address of the page pointer itself, right?

Chip


Comment 6 Atul Mukker 2007-05-21 16:38:08 UTC
Actually, that column is scatterlist::dma_address; the header there is wrong.
kmap(page) and page_address(page) return the same value.
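
Given that correction, the two tables from comment 4 can be read as pairs of
(DMA address, kernel virtual address). A small userspace check, assuming the
direct-map base 0xFFFF810000000000 that the working rows imply, recovers the
physical address behind each virtual address and compares it with the DMA
address. The struct and function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIRECT_MAP_BASE 0xFFFF810000000000ULL  /* assumed, see text above */
#define FOUR_GB         0x100000000ULL

struct sample {
    uint64_t dma_addr;  /* scatterlist dma_address as reported */
    uint64_t kvirt;     /* value of kmap(page) / page_address(page) */
};

/* Rows copied from the tables in comment 4. */
static const struct sample working[4] = {
    { 0xBF7C9000ULL, 0xFFFF8100BF7C9000ULL },
    { 0x37FB7000ULL, 0xFFFF810037FB7000ULL },
    { 0xBF0EA000ULL, 0xFFFF8100BF0EA000ULL },
    { 0xBDA1A000ULL, 0xFFFF8100BDA1A000ULL },
};

static const struct sample failing[4] = {
    { 0x91D5000ULL, 0xFFFF81010982F000ULL },
    { 0x91D6000ULL, 0xFFFF81010995F000ULL },
    { 0x91D7000ULL, 0xFFFF81010AF00000ULL },
    { 0x91D8000ULL, 0xFFFF81010DED7000ULL },
};

/* Physical address backing a direct-mapped kernel virtual address. */
static uint64_t phys_of(uint64_t kvirt)
{
    return kvirt - DIRECT_MAP_BASE;
}

/* True when the device is handed the page's own physical address. */
static bool dma_hits_page(const struct sample *s)
{
    return phys_of(s->kvirt) == s->dma_addr;
}

/* True when the page itself lies at or above the 4GB boundary. */
static bool page_above_4g(const struct sample *s)
{
    return phys_of(s->kvirt) >= FOUR_GB;
}
```

For every working row the DMA address equals the page's physical address and
the page sits below 4GB. For every non-working row the page sits above 4GB
while the DMA address is below 4GB, so the device transfers to a different
location than the one kmap() exposes. That pattern would be consistent with the
DMA mapping layer substituting a low bounce buffer for a device that advertises
a 32-bit DMA mask, although nothing in this report confirms that mechanism.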

Comment 7 Martin Wilck 2007-05-22 07:43:27 UTC
Atul, it would be really helpful if you could come up with a small piece of
open-source code that demonstrates the problem.

Comment 8 Atul Mukker 2007-06-08 18:30:55 UTC
Please close this defect. We found out that we have to set the DMA mask to
0xffffffffffffffff for it to work correctly.
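
The resolution can be pictured with a small model. A device's DMA mask states
which bus addresses the device can reach; a buffer whose last byte lies above
the mask cannot be handed to the device directly, so the mapping layer has to
substitute memory the device can reach. The sketch below is a userspace model of
that reachability check only, with hypothetical names. In a 2.6-era PCI driver
the mask itself would normally be set with pci_set_dma_mask() during probe; how
the LSI driver does it is not shown in this report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_32BIT 0x00000000FFFFFFFFULL
#define MASK_64BIT 0xFFFFFFFFFFFFFFFFULL  /* value quoted in comment 8 */

/*
 * Hypothetical model: can a device with the given DMA mask address every
 * byte of the buffer [phys, phys + len) directly?
 */
static bool device_can_reach(uint64_t phys, size_t len, uint64_t dma_mask)
{
    if (len == 0)
        return true;

    uint64_t last = phys + (uint64_t)len - 1;

    /* Reject wrap-around, then compare the last byte against the mask. */
    return last >= phys && last <= dma_mask;
}
```

With the 32-bit mask, the page at physical address 0x10982F000 (the first
non-working row in comment 4, under the assumed direct-map base) is out of
reach; with the 64-bit mask it is reachable, which matches the fix reported
above.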