Bug 198573

Summary: hald unaligned access messages
Product: Red Hat Enterprise Linux 5
Reporter: Bryan Stillwell <stillwell>
Component: hal
Assignee: David Zeuthen <davidz>
Status: CLOSED RAWHIDE
QA Contact:
Severity: high
Docs Contact:
Priority: medium
Version: 5.0
CC: davidz, erikj, mclasen, rick.hester
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Whiteboard:
Fixed In Version: 0.5.8.1-2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-09-27 20:12:46 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 150223    
Attachments:
Description                           Flags
lshal output                          none
Requested log file                    none
Fix hald unaligned access messages    none

Description Bryan Stillwell 2006-07-12 00:39:52 UTC
Description of problem:
When booting an rx4640 after the installation of rhel5a1, I'm getting a few
'hald unaligned access' messages.

Version-Release number of selected component (if applicable):
Linux max-mont.diablo.test 2.6.16-1.2290_EL #1 SMP Thu Jun 15 15:08:40 EDT 2006 ia64 ia64 ia64 GNU/Linux

How reproducible: 100%


Steps to Reproduce:
1. Install rhel5a1 on an rx4640
2. Boot the installed system and watch for hald unaligned access messages
  
Actual results:
eth0: no IPv6 routers present
hald(2488): unaligned access to 0x60000ffffeeeb5dc, ip=0x200000000051e630
hald(2488): unaligned access to 0x60000ffffeeeb5dc, ip=0x200000000051e521
Removing netfilter NETLINK layer.

Expected results:
eth0: no IPv6 routers present
Removing netfilter NETLINK layer.

Additional info:

Comment 1 erikj 2006-07-28 03:31:50 UTC
This was observed by SGI QA on SGI Altix as well.

Starting HAL daemon: hald(15205): unaligned access to 0x60000ffffe0675dc, ip=0x200000000052a630
hald(15205): unaligned access to 0x60000ffffe0675dc, ip=0x200000000052a521

This also was with 2.6.16-1.2290_EL (the rhel5 alpha kernel).

Comment 2 John (J5) Palmieri 2006-07-28 14:42:35 UTC
David, have you seen anything like this?  It looks like HAL is trying to probe a
device and running into a bug either in the way HAL probes the device, the way
the kernel accesses it, or in the device itself.

Erik and Bryan, can you attach the output of lshal?  Note that in order for this
to be fixed in RHEL4 you will need to go through your support rep. For RHEL5 we
should be able to get a fix in if we can identify the issue.

Comment 3 John (J5) Palmieri 2006-07-28 14:44:36 UTC
Reading the bug again, I noticed this is a RHEL5-only issue but was filed
against RHEL4.  Please confirm.

Comment 4 erikj 2006-07-28 14:56:14 UTC
Created attachment 133241 [details]
lshal output

WRT Comment #3, SGI is seeing this in RHEL5, so if we convert this bug to a
RHEL4 bug, I can open a RHEL5 one :)

With RHEL5 Alpha in mind... here is the requested lshal attachment.

Comment 5 David Zeuthen 2006-07-28 15:39:15 UTC
Hi,

So, does hald crash? I guess not since you can get lshal output. 

Also, do you see any peculiar output when running

 # /usr/sbin/hald --daemon=no --verbose=yes

as root (remember to stop the haldaemon service, 'service haldaemon stop').

Thanks,
David


Comment 6 Bryan Stillwell 2006-07-28 20:06:51 UTC
John,

This is a rhel5 issue only.  My account didn't have access to file against rhel5
when I filed this bug, and now that it does I'm unable to change it.  I've been
working with Chris Williams at RH to try and get this fixed...  If you could
change it to "RHEL Beta" and version "5.0.0", that'd be great!

Also, hald doesn't crash.  I didn't see anything peculiar in the output of the
command you asked us to run, but I did see 'error', 'fail', and 'warn' show up a
few times:

[root@min ~]# /usr/sbin/hald --daemon=no --verbose=yes &>hald-log
hald(2931): unaligned access to 0x60000fffff24f51c, ip=0x200000000050a630
hald(2931): unaligned access to 0x60000fffff24f51c, ip=0x200000000050a630
hald(2931): unaligned access to 0x60000fffff24f51c, ip=0x200000000050a521
hald(2931): unaligned access to 0x60000fffff24f51c, ip=0x200000000050a521

[root@min ~]# grep -i error hald-log
14:12:55.771 [E] acpi.c:795: Couldn't open /proc/acpi/battery: Error opening directory '/proc/acpi/battery': No such file or directory
14:12:55.771 [E] acpi.c:795: Couldn't open /proc/acpi/ac_adapter: Error opening directory '/proc/acpi/ac_adapter': No such file or directory
14:12:55.771 [E] acpi.c:795: Couldn't open /proc/acpi/button/lid: Error opening directory '/proc/acpi/button/lid': No such file or directory
2955: 14:12:56.478: probe-input.c:181: Error: EVIOCGID failed: Inappropriate ioctl for device
[root@min ~]# grep -i fail hald-log
2936: 14:12:55.784: probe-smbios.c:140: Failed to execute dmidecode!
2955: 14:12:56.478: probe-input.c:181: Error: EVIOCGID failed: Inappropriate ioctl for device
[root@min ~]# grep -i warn hald-log
2972: 14:13:00.038: probe-volume.c:592: warning: partition_number=2 not in [0;1[

Bryan

Comment 7 David Zeuthen 2006-07-28 20:43:55 UTC
Hi Bryan,

Is it possible for you to attach the full log, and also redirect stderr to
stdout, so we can see where this

 hald(2931): unaligned access to 0x60000fffff24f51c, ip=0x200000000050a521

happens? Or maybe that message comes from the kernel and is just printed to the
console by the kernel. Either way, having the full log would be useful. Thanks.

I will need to find a box to reproduce this.

Comment 8 Bryan Stillwell 2006-07-28 21:38:47 UTC
Created attachment 133269 [details]
Requested log file

The unaligned access messages are definitely coming from the kernel.  For the
most part they're harmless.  They basically tell you that the CPU had to use an
inefficient method for accessing the requested memory.  However, these messages
can be confusing to our customers, so we'd like them fixed before rhel5 ships.
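
For anyone reading along, here is a minimal sketch of the general problem and
the usual way to avoid it. This is NOT the hal code and NOT the attached patch;
the function names is_aligned and load_u64 are made up for illustration. The
addresses in the messages above (ending in ...5dc and ...51c) are multiples of
4 but not of 8, which would be consistent with an 8-byte access through a
pointer that is only 4-byte aligned, although nothing in this bug confirms the
access width. Copying the bytes with memcpy instead of dereferencing a cast
pointer lets the compiler pick an access sequence that is safe on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if addr is a multiple of align.
 * align must be a nonzero power of two. */
int is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: reads a 64-bit value from any byte address.
 * Dereferencing (const uint64_t *)p directly would be an unaligned
 * load whenever p is not 8-byte aligned; memcpy avoids that. */
uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With these, is_aligned(0x60000ffffeeeb5dcULL, 8) is false while
is_aligned(0x60000ffffeeeb5dcULL, 4) is true, and load_u64(buf + 4) reads back
a value stored at a 4-byte offset without an unaligned 8-byte load.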

Comment 9 Prarit Bhargava 2006-09-13 11:20:52 UTC
Created attachment 136145 [details]
Fix hald unaligned access messages

Patch submitted by Yanmin Zhang to fix unaligned access messages.

Comment 11 David Zeuthen 2006-09-26 17:59:50 UTC
I think we fixed this upstream some time ago. Please try with hal-0.5.8,
available from here:

 http://people.freedesktop.org/~david/hal-0.5.8.1-fc6/

You need to rebuild the SRPM for ia64. Note that this src.rpm is missing a
BuildRequires (BR) on libvolume_id-devel (fixed in a later version), so please
install that RPM separately. It was built for FC6 but should work on RHEL5 too.

Thanks.

Comment 12 David Zeuthen 2006-09-27 20:12:46 UTC
Should be fixed in 0.5.8.1-2; otherwise, please reopen.

Comment 13 erikj 2006-12-21 02:59:53 UTC
Yes, seems fixed - checked rhel5 rc snap3.