The Red Hat 2.1 enterprise and summit kernels (2.4.9-e.XXenterprise,
2.4.9-e.XXsummit) seem to have a broken implementation of page_to_phys(). They
define the macro with the code
#ifdef CONFIG_HIGHMEM64G
#define page_to_phys(page) ((u64)(page - mem_map) << PAGE_SHIFT)
#else
#define page_to_phys(page) ((page - mem_map) << PAGE_SHIFT)
#endif
but in their autoconf.h they have
#define CONFIG_HIGHMEM64G_HIGHPTE 1
Since CONFIG_HIGHMEM64G is NOT set, they get the wrong definition of
page_to_phys() and truncate the resulting addresses to 32 bits, so things get
screwed up on machines with more than 4G of RAM.
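To make the truncation concrete, here is a minimal user-space sketch of the two definitions. It is only a model: uint32_t stands in for the 32-bit arithmetic the non-u64 macro ends up using on i386, and the function names are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4 KiB pages, as on i386 */

/* Models the definition without the u64 cast: the shift is done in
 * 32-bit arithmetic, so bits above bit 31 are lost. */
static uint32_t page_to_phys_narrow(uint32_t pfn)
{
    return pfn << PAGE_SHIFT;
}

/* Models the definition with the u64 cast: widen first, then shift. */
static uint64_t page_to_phys_wide(uint32_t pfn)
{
    return (uint64_t)pfn << PAGE_SHIFT;
}
```

With these, page frame 0x100000 (the first page at 4GB) maps to 0 with the narrow version and to 0x100000000 with the wide one; pages below 4GB give the same result either way.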
The use of page_to_phys() in the RH 2.1 kernel seems to be limited, but this bug
does affect our out-of-tree driver.
This applies to all enterprise and summit kernels up to (at least) 2.4.9-e.27.
Looks like an easy-to-fix bug that should just be fixed. Jason?
SCSI drivers and co. shouldn't really get 64-bit addresses, actually; more stuff
is broken than just this...
Unless this is causing a specific problem, I vote to close this.
This affects our (out-of-tree) InfiniBand drivers. Right now we just have
# if defined(__i386__) && \
     (LINUX_VERSION_CODE == KERNEL_VERSION(2,4,9)) && \
     defined(CONFIG_HIGHMEM64G_HIGHPTE)
/* Work around RH AS 2.1 configuration bug */
# undef page_to_phys
# define page_to_phys(page) ((u64)(page - mem_map) << PAGE_SHIFT)
# endif
which is ugly but works. However I'm not sure why you want to leave
an obvious, simple-to-fix bug in your kernel (lurking to bite other
people in the future, since it causes someone to silently get the
wrong physical address for a page, leading to all sorts of fun
depending on how that address is used).
Up to you I guess.
page_to_phys() is referenced by page_to_bus(), which is in turn used
by pci_map_sg(), which is used all over the place in driver code.
Severe problems will arise whenever a driver capable of using bus
addresses >32bit does DMA. AFAICS, several such drivers come with
RH AS 2.1 (think megaraid, for example).
I think that this is a serious bug that may lead to machine crashes
and/or data corruption, and is unacceptable in an enterprise Linux
distribution.
Please fix it as soon as possible, or show me why my argument is wrong.
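A rough sketch of the failure mode being described, assuming a mapping path that builds the bus address from a 32-bit page_to_phys(). The struct and function here are hypothetical stand-ins, not the kernel's scatterlist or pci_map_sg():

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4 KiB pages, as on i386 */

/* Hypothetical stand-in for a scatter-gather entry; not the kernel's struct. */
struct sg_model {
    uint64_t dma_address;  /* bus address handed to the device */
};

/* Models a mapping path that computes the address in 32 bits:
 * the shift wraps before the value is widened. */
static void map_page_narrow(struct sg_model *sg, uint32_t pfn)
{
    uint32_t phys = pfn << PAGE_SHIFT;  /* truncated to 32 bits here */
    sg->dma_address = phys;             /* widening afterwards is too late */
}
```

In this model a page just above 4GB (frame 0x100001) and a page in low memory (frame 0x1) end up with the same bus address, 0x1000, so a device would DMA into the wrong page.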
If, as comment #3 suggests, "more stuff [of similar severity] is
broken than just this", then I would really like to know what that
broken stuff is and what our enterprise customers should do to avoid
being hit by it.
Added myself to cc list.
"Severe problems will arise whenever a driver capable of using bus
addresses >32bit does DMA. "
That's the flawed reasoning, since in AS 2.1 no driver will.
(In your example: even though megaraid tells the kernel it can do more than
4GB, the kernel will NEVER give it such an address, and will pretend
megaraid told it that its limit was 4GB.)
Created attachment 100153
Fix for the problem
I do not understand why this simple fix isn't just applied.
thanks for your reply.
Can you point me to the code where the kernel makes sure no address
>4GB is ever used in an SG list?
I can't find it.
+ bounce_limit = (unsigned long)SHpnt->pci_dev->dma_mask;
that line in drivers/scsi/scsi_merge.c
Uff, the "(unsigned long)". Thanks. Man, that line deserves a comment.
I'll be the first to admit that it's really really subtle and I only
know it's there because I put it there ;(
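For anyone else reading along, a small sketch of why that cast caps the limit. This assumes dma_mask is wider than 32 bits and that unsigned long is 32 bits, as on i386; the typedef and function name are invented for the example:

```c
#include <stdint.h>

/* Stand-in for a 32-bit unsigned long on i386 (an assumption of this sketch). */
typedef uint32_t ulong32;

/* Models "bounce_limit = (unsigned long)dma_mask" on a 32-bit kernel. */
static ulong32 bounce_limit_from_mask(uint64_t dma_mask)
{
    return (ulong32)dma_mask;  /* upper 32 bits are silently dropped */
}
```

So a driver that advertises a full 64-bit mask still ends up with a bounce limit of 0xFFFFFFFF, and anything above that gets bounced into low memory.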
It helps only for SCSI drivers though. What about network drivers? Or
other PCI devices, or even 3rd party modules? I still think
page_to_phys() should be fixed, or at least a big fat comment should
be placed there that it's only valid below 4GB.
Network drivers are unaffected by this; others are expected to follow
the example. I can agree with the idea of putting a comment there,
sure. I wonder if that's worth it at this stage in the product