Bug 210013 - kernel unaligned access messages in rhel5a1
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: ia64 Linux
Priority: medium
Severity: high
Assigned To: Dave Jones
QA Contact: Brian Brock
Depends On: 198572
Reported: 2006-10-09 11:13 EDT by Prarit Bhargava
Modified: 2015-01-04 17:28 EST

Doc Type: Bug Fix
Last Closed: 2006-10-12 02:18:58 EDT


Description Prarit Bhargava 2006-10-09 11:13:15 EDT
+++ This bug was initially created as a clone of Bug #198572 +++

Description of problem:
When booting a system with a new install of rhel5a1, I'm getting a lot of
unaligned access messages with the kernel.

Version or Release number of selected component (if n/a, use `uname -a`):
Linux max-mont.diablo.test 2.6.16-1.2290_EL #1 SMP Thu Jun 15 15:08:40 EDT 2006
ia64 ia64 ia64 GNU/Linux
	
How reproducible: 100%
	
	
Steps to Reproduce:
1. Install rhel5a1 on an rx4640 with montecitos
2. Boot the installed system and watch for the unaligned messages
	    
Actual results:
ACPI: bus type pci registered
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
ACPI: PCI Root Bridge [L000] (0000:00)
kernel unaligned access to 0xe0000040fee5140c, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee51444, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee5147c, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee514cc, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee51404, ip=0xa000000100317af1
ACPI: PCI Root Bridge [L001] (0000:20)
ACPI: PCI Root Bridge [L002] (0000:40)
kernel unaligned access to 0xe0000040fee50dc4, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee50dfc, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee50e4c, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee50dbc, ip=0xa000000100317af1
kernel unaligned access to 0xe0000040fee50dc4, ip=0xa000000100317b01
ACPI: PCI Root Bridge [L004] (0000:80)
ACPI: PCI Root Bridge [L005] (0000:a0)
kernel unaligned access to 0xe0000040fee50994, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee509cc, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee50a1c, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee5098c, ip=0xa000000100317af1
kernel unaligned access to 0xe0000040fee50994, ip=0xa000000100317b01
ACPI: PCI Root Bridge [L006] (0000:c0)
kernel unaligned access to 0xe0000040fee5077c, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee507b4, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee50804, ip=0xa000000100318fc1
kernel unaligned access to 0xe0000040fee50774, ip=0xa000000100317af1
kernel unaligned access to 0xe0000040fee5077c, ip=0xa000000100317b01
Linux Plug and Play Support v0.97 (c) Adam Belay

Expected results:
ACPI: bus type pci registered
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOSAPIC for interrupt routing
ACPI: PCI Root Bridge [L000] (0000:00)
ACPI: PCI Root Bridge [L001] (0000:20)
ACPI: PCI Root Bridge [L002] (0000:40)
ACPI: PCI Root Bridge [L004] (0000:80)
ACPI: PCI Root Bridge [L005] (0000:a0)
ACPI: PCI Root Bridge [L006] (0000:c0)
Linux Plug and Play Support v0.97 (c) Adam Belay

Additional info:
You can find a patch for this problem here:
https://launchpad.net/distros/ubuntu/+bug/43913

-- Additional comment from bjohnson@redhat.com on 2006-08-30 15:55 EST --
Moving from kernel-maint to kernel-mgr, per process.

-- Additional comment from prarit@redhat.com on 2006-09-11 21:48 EST --
Doug, did you get any farther with the "unaligned access" code you were looking at?

P.

-- Additional comment from dchapman@redhat.com on 2006-09-12 09:26 EST --
I dug into it but with little success.  I can probably get back on to looking at
this one later this week.


-- Additional comment from dchapman@redhat.com on 2006-09-12 15:34 EST --
After much more digging......

It appears that the location of the signature in the modules is what is not
aligned.  I made a minor change to the source to print out the location of
sctx->buffer in sha1_update and I see:

in sha1_update sctx->buffer = 0xe0000707e65c7db4

From digging through the code, it appears that this is based on the address in
the module itself.  I looked at the scripts used to insert the signature into
the .ko files and I can't see a way to force alignment on it.  It simply does an
objcopy to add the .module_sig section.  Looking at the man page for objcopy, I
don't see a way to force this, but I imagine there has to be a way.
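
The address printed by the sha1_update debug change above can be checked
against a given access size with a little arithmetic.  The following is only a
user-space sketch; the helper names is_aligned and natural_alignment are
hypothetical and are not part of the kernel.  By this arithmetic
0xe0000707e65c7db4 is a multiple of 4 but not of 8, which by itself does not
show which access faulted or how wide that access was.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if addr is a multiple of align.
 * align must be a non-zero power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Largest power of two that divides addr.
 * Returns 0 for addr == 0 (a convention chosen for this sketch). */
static uint64_t natural_alignment(uint64_t addr)
{
    return addr & (~addr + 1);   /* isolate the lowest set bit */
}
```

For example, natural_alignment(0xe0000707e65c7db4ULL) evaluates to 4, and the
same holds for 0xe0000040fee5140c from the boot log in the description.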

I think we need someone familiar with the module signing code to look at this.

FYI, I forced a kernel panic at the first unaligned access to see the full call
stack trace:

Call Trace:
 [<a000000100013e60>] show_stack+0x40/0xa0
                                sp=e0000707f59171d0 bsp=e0000707f5911638
 [<a000000100014760>] show_regs+0x840/0x880
                                sp=e0000707f59173a0 bsp=e0000707f59115d8
 [<a000000100037b60>] die+0x1c0/0x2a0
                                sp=e0000707f59173a0 bsp=e0000707f5911590
 [<a000000100037c90>] die_if_kernel+0x50/0x80
                                sp=e0000707f59173c0 bsp=e0000707f5911560
 [<a00000010061d730>] ia64_bad_break+0x270/0x4a0
                                sp=e0000707f59173c0 bsp=e0000707f5911538
 [<a00000010000c700>] __ia64_leave_kernel+0x0/0x280
                                sp=e0000707f5917470 bsp=e0000707f5911538
 [<a000000100039ff0>] ia64_handle_unaligned+0x3b0/0x2d80
                                sp=e0000707f5917640 bsp=e0000707f5911470
 [<a00000010000cd10>] ia64_prepare_handle_unaligned+0x30/0x60
                                sp=e0000707f5917830 bsp=e0000707f5911470
 [<a00000010000c700>] __ia64_leave_kernel+0x0/0x280
                                sp=e0000707f5917a40 bsp=e0000707f5911470
 [<a0000001002a4850>] sha_transform+0x30/0x500
                                sp=e0000707f5917c10 bsp=e0000707f5911450
 [<a000000100264b80>] sha1_update+0xc0/0x160
                                sp=e0000707f5917c10 bsp=e0000707f5911408
 [<a000000100263650>] update_kernel+0x50/0xa0
                                sp=e0000707f5917d50 bsp=e0000707f59113d0
 [<a0000001000c6d90>] module_verify_signature+0x14b0/0x1660
                                sp=e0000707f5917d50 bsp=e0000707f5911330
 [<a0000001000c5850>] module_verify+0x1130/0x11c0
                                sp=e0000707f5917d80 bsp=e0000707f59112d8
 [<a0000001000bfe20>] load_module+0x1a0/0x30c0
                                sp=e0000707f5917e00 bsp=e0000707f5911198
 [<a0000001000c43d0>] sys_init_module+0xb0/0x400
                                sp=e0000707f5917e30 bsp=e0000707f5911128
 [<a00000010000c560>] ia64_ret_from_syscall+0x0/0x40
                                sp=e0000707f5917e30 bsp=e0000707f5911128
 [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400
                                sp=e0000707f5918000 bsp=e0000707f5911128



-- Additional comment from prarit@redhat.com on 2006-09-12 18:44 EST --
Adding David to this as he is most familiar with the module signing code...

P.

-- Additional comment from dchapman@redhat.com on 2006-09-12 19:15 EST --
I did some investigating of the .ko files themselves with readelf.  I noticed
something that might be related.  The elf section header has a field called
sh_addralign.  Using readelf we can see what this is:

# readelf -a lockd.ko | grep -A1 .module_sig
  [43] .module_sig       PROGBITS         0000000000000064  0003a796
       0000000000000041  0000000000000000           0     0     1

That last "1" on the second line is the alignment.  It would seem that this
makes the section byte aligned, but the code that is failing probably needs it
to be word aligned, or possibly just int aligned.

I am not sure how this is used but if I had to guess I would say that when the
module is loaded it will force this section to start on an address matching that
alignment.  Since it is 1 it will just load wherever it fits next.

I don't see any way to change this value.  The .module_sig section is inserted
via objcopy but that doesn't seem to have any alignment options.
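
The guess above (a section is placed at the next position that satisfies its
sh_addralign) can be illustrated with a rounding helper.  This is a sketch of
the arithmetic only, assuming power-of-two alignments; align_up is a
hypothetical name, and nothing here is taken from the module loader or from
the signing scripts.  Per the ELF specification, sh_addralign values of 0 and
1 both mean "no alignment constraint".

```c
#include <assert.h>
#include <stdint.h>

/* Round offset up to the next multiple of align.
 * align values 0 and 1 impose no constraint, so offset is returned as is.
 * Any other align must be a power of two. */
static uint64_t align_up(uint64_t offset, uint64_t align)
{
    if (align <= 1)
        return offset;
    assert((align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

Taking the value 0003a796 from the readelf output (read here as the section's
file offset), align_up(0x3a796, 1) leaves it at 0x3a796, while an alignment of
4 or 8 would move it to 0x3a798.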


-- Additional comment from yanmin.zhang@intel.com on 2006-09-13 22:47 EST --
The function should be changed. Its assumption that the parameter "in" is
4-byte aligned is incorrect.

I will post a patch to fix it. The patch will be posted to LKML first.
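
The patch itself is not reproduced in this report.  As a rough user-space
illustration of the idea (not the posted patch), a hash routine can avoid any
alignment assumption on its input by assembling each 32-bit word from
individual bytes.  SHA-1 reads its input as big-endian words; load_be32 and
load_block are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit word from p one byte at a time.
 * Works for any address because only byte loads are issued. */
static uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Fill 16 words from a 64-byte input block, as a SHA-1 style
 * transform would do before its rounds.  in may be unaligned. */
static void load_block(uint32_t w[16], const unsigned char *in)
{
    for (size_t i = 0; i < 16; i++)
        w[i] = load_be32(in + 4 * i);
}
```

Casting the byte pointer to a 32-bit pointer and dereferencing it directly is
the pattern that requires an aligned address; the byte-wise form trades a few
extra instructions for independence from where the buffer happens to sit.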

-- Additional comment from yanmin.zhang@intel.com on 2006-09-14 05:09 EST --
Created an attachment (id=136241)
Patch to fix it

I tried upstream 2.6.18-rc7 and didn't find the same problem. Perhaps the mm
tree has it.

Anyway, I am posting the patch here first.


-- Additional comment from dchapman@redhat.com on 2006-09-14 11:20 EST --
(In reply to comment #8)
> Created an attachment (id=136241)
> Patch to fix it
> 
> I tried upstream 2.6.18-rc7 and didn't find the same problem. Perhaps the mm
> tree has it.
> 
> Anyway, I am posting the patch here first.
> 

I have verified the patch does indeed fix the unaligned access errors.  Thanks!

Prarit, can we see about getting this into the fedora and RHEL5 trees?


-- Additional comment from prarit@redhat.com on 2006-09-14 13:33 EST --
Yanmin,

IIRC this only happens if you use the module signing code from Fedora & RHEL5. 
That's probably why you're not seeing it by doing a make, make install, etc., on
an upstream tree.

P.

-- Additional comment from yanmin.zhang@intel.com on 2006-09-14 22:14 EST --
Prarit,

You are right. I checked 2.6.18-rc7 and the mm tree; upstream doesn't use the
module signing code, so this is specific to Fedora & RHEL5.

Thanks,
Yanmin


-- Additional comment from pm-rhel@redhat.com on 2006-09-18 15:20 EST --
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

-- Additional comment from jturner@redhat.com on 2006-09-19 15:42 EST --
QE ack for 5.0.0.

-- Additional comment from prarit@redhat.com on 2006-09-20 07:45 EST --
Created an attachment (id=136728)
New patch using get_unaligned

After submission, dhowells noted that most arches can handle an unaligned
address without throwing an exception -- it's better to use get_unaligned here,
rather than memcpy.
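
For readers outside the kernel, the difference between the two approaches can
be sketched in user space.  get_unaligned is a kernel macro and is not
available here, so get_unaligned_u32 below is a hypothetical stand-in built on
the GCC/Clang packed and may_alias attributes; it is one way such a helper can
be written, not the kernel's definition.  memcpy_u32 shows the copy-based
alternative.  Both return the word in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A packed wrapper tells the compiler the member may sit at any address,
 * so it emits whatever load sequence the target needs.  may_alias keeps
 * the access valid when p points into a plain byte buffer.
 * (Compiler-specific: GCC and Clang attributes.) */
struct unaligned_u32 {
    uint32_t v;
} __attribute__((packed, may_alias));

/* Hypothetical user-space analogue of get_unaligned() for 32-bit values. */
static uint32_t get_unaligned_u32(const void *p)
{
    assert(p != NULL);
    return ((const struct unaligned_u32 *)p)->v;
}

/* Copy-based alternative: move the bytes into an aligned temporary. */
static uint32_t memcpy_u32(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

On architectures that tolerate unaligned loads, the first form can compile to
a single ordinary load, which matches the reasoning given for preferring
get_unaligned over memcpy.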

-- Additional comment from prarit@redhat.com on 2006-09-21 06:37 EST --
*** Bug 206367 has been marked as a duplicate of this bug. ***

-- Additional comment from dzickus@redhat.com on 2006-09-26 18:04 EST --
kernel-2.6.18-1.2699.el5

-- Additional comment from dchapman@redhat.com on 2006-09-27 11:08 EST --
I have verified the fix is in the latest RHEL5 kernel 2.6.18-1.2702.el5.  The
fix has not made its way into rawhide yet as of 2.6.18-1.2699.fc6.
