Description of problem:

Please note that this is in addition to the issue fixed by the patch in bug 434826. The only way to test that fix was to try an install with an anaconda image built from the new libnl, so I was unable to test it until the previous fix was pulled into RHEL5.2, which didn't happen until last night. That patch does make things better, but we still hit unaligned accesses during stage1 anaconda:

    loader(761): unaligned access to 0x60000000001a2b54, ip=0x40000000000e4420
    loader(761): unaligned access to 0x60000000001a2b5c, ip=0x40000000000e4430
    loader(761): unaligned access to 0x60000000001a2d3c, ip=0x40000000000e43e0
    loader(761): unaligned access to 0x60000000001a2d44, ip=0x40000000000e4420

I dug through the binaries, did some disassembly, and found that these IPs resolve to these lines of code in link_msg_parser():

    lib/route/link.c:289    link->l_mask |= LINK_ATTR_MAP;
    lib/route/link.c:293    link->l_master = nla_get_u32(tb[IFLA_MASTER]);
    lib/route/link.c:294    link->l_mask |= LINK_ATTR_MASTER;

I have confirmed that the offsets of l_mask and l_master are properly aligned, so it would appear the problem is that link itself is not aligned properly. I manually traced the code to figure out how "link" is allocated:

    lib/route/link.c:188    link = rtnl_link_alloc();
    lib/route/link.c:599    return (struct rtnl_link *) nl_object_alloc_from_ops(&rtnl_link_ops);
    lib/object.c:70         new = nl_object_alloc(ops->co_size);
    lib/object.c:49         new = calloc(1, size);

I am not sure how this is possible. Unless there is some way to trick it, calloc should never be able to return an unaligned pointer. Perhaps my assumption is wrong here. I need to find a way to reproduce this outside of anaconda.

Version-Release number of selected component (if applicable):
libnl-1.0-0.10.pre5.5

How reproducible:
100%

Steps to Reproduce:
1. do a network based install on ia64
2. note unaligned access messages during anaconda stage1

Actual results:

Expected results:

Additional info:
needinfo on tgraf...
Thomas, the only change in libnl-1.0-0.10.pre5.5 is the addition of the patch that Doug attached to the original libnl unaligned access bug. It's based on libnl-1.0-pre5 with a few patches.
I am continuing to debug this issue... I was able to run anaconda's stage one "/sbin/loader" in userspace, so I built my own copy of anaconda to try to do some debugging. With my binary I do _not_ see the unaligned accesses, but if I run the original binary I do see them. I need to check with release engineering to see if it is possible that last night's anaconda was somehow still built with the older version of libnl. It is statically linked, so I don't know how to check this from the binary (if someone has a suggestion, please let me know).
Ah HA! Anaconda stage 1 is statically linked, and it has not been rebuilt since libnl was fixed. This explains why the stage2 messages are resolved: that part of anaconda is dynamically linked. Sorry for the noise.