Description of problem:
Unaligned access in stage2 of the installer
Version-Release number of selected component (if applicable):
anaconda-11.1.2.128-1
How reproducible:
Always
Steps to Reproduce:
1. Install into text mode
Actual results:
Running anaconda, the Red Hat Enterprise Linux Server system installer - please wait...
anaconda(527): unaligned access to 0x2000000001956b44, ip=0x2000000000018880
anaconda(527): unaligned access to 0x2000000001956b44, ip=0x2000000000018890
anaconda(527): unaligned access to 0x2000000001956b5c, ip=0x2000000000018880
anaconda(527): unaligned access to 0x2000000001956b5c, ip=0x2000000000018890
anaconda(527): unaligned access to 0x2000000001956b74, ip=0x2000000000018880
Probing for video card: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]
Expected results:
No unaligned access
Additional info:
This is on host roentgen
Was this on some type of special hardware, such as s390, ia64, or ppc? Is there a test log that we can see? Does the installation stop, or does it continue? If it does not stop, do you see any unusual behavior in the installed system? If you can install a system, do you see these messages when you run another application? Thanks for the info.
OK, it's ia64. I'm guessing the installation continues, and I'm further guessing that no other apps show this message. Look at http://kbase.redhat.com/faq/FAQ_105_9111.shtm. Additional question: does this happen consistently? How many installs have shown this message? If it does not happen consistently, we might be able to safely ignore it.
As discussed in the anaconda IRC channel, this is just the kernel being the kernel. Additionally, if this message occurred in stage2 (which has no C code in anaconda), it is safe to say that the anaconda component did not cause the message. Pasting IRC log:
"
<hansg> As for what an unaligned access is. On i386 and only on i386 (and x86_64) it's allowed to read a 32-bit integer at an address which is not a multiple of 4 bytes
<hansg> So on i386 you can read an integer starting at 2 bytes from the start of a page, then the hardware will do two 32-bit reads, and shift them and OR them together to get the integer you want
<hansg> On all other (read all sane hardware) this is not allowed (and on intel it's dog slow)
<hansg> On other hardware the kernel emulates the i386 behavior within the trap handler to stop code from crashing, and complains loudly while it does this
<hansg> This was done as most code is written for and only ever tested on that shitty i386 architecture
<hansg> So this can pretty much be ignored, it should be fixed one day but it's not urgent
"
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.
1. This BZ should not be closed. It is a very public facing issue and happens in the install. While it isn't an anaconda issue, this is a bug -- it should have been properly reassigned to the appropriate group (kernel). 2. Adding dchapman -- Doug, have you seen this? P.
Exception request: This type of bug will be seen during the install -- at a time when we definitely do not want any type of errors or warnings on the screen. This *must* be fixed prior to 5.3 shipping. P.
Just FYI - the machine is from HP, will test on Intel machines as well
I will dig into this one. FYI it is NOT a kernel bug: anaconda(527): unaligned access to 0x2000000001956b44, ip=0x2000000000018880 If it were in the kernel it would be "kernel unaligned access". This is probably in one of the shared libraries that anaconda uses. I will assign to myself and then re-assign to the proper component once that is known. - Doug
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
May I suggest that, as these messages are really harmless and the entire issue now seems to be the fact that there are messages at all, we just disable the messages during the installation?
... and then we hope we don't encounter them when the OS boots? Let's find out what part of glibc is causing the problem and then decide what to do about them. P.
(In reply to comment #8)
> I will dig into this one. FYI it is NOT a kernel bug:
>
> anaconda(527): unaligned access to 0x2000000001956b44, ip=0x2000000000018880
>
> If it were in the kernel it would be "kernel unaligned access". This is
> probably in one of the shared libraries that anaconda uses. I will assign to
> myself and then re-assign to the proper component once that is known.
>
> - Doug
Absolutely ;) -- I just had to put it somewhere :)
P.
The problem appears to lie in python-pyblock. More specifically, it appears to be triggered when /usr/lib/python2.4/site-packages/block/dmmodule.so is loaded at runtime via dlopen(). Here is a simple reproducer; compile it using cc foo.c -ldl
    #include <dlfcn.h>
    int main(void)
    {
        /* Loading the module is enough to trigger the messages. */
        dlopen("/usr/lib/python2.4/site-packages/block/dmmodule.so", RTLD_NOW);
        return 0;
    }
This will reproduce the unaligned accesses on ia64. I will continue to dig.
Interestingly, python-pyblock has not been rebuilt in over a year. Also, it no longer builds (I tried to build it from the .src.rpm with no success). I guess it must have been a glibc change that triggered this? Suggestions welcome.
OK, it appears this is a python-pyblock bug. With my reproducer from comment #13 I can reproduce it on a RHEL5.2 system so the bug has been there all along but evidently there was a change in anaconda that causes it to get loaded now when it wasn't in RHEL5.2. python-pyblock doesn't build anymore (which itself is a concern) so we need to fix that before we can debug this issue.
It turns out that python-pyblock is linked to libdmraid and the real issue is in libdmraid itself. I have narrowed it down to something in the source file lib/format/format.c in libdmraid but nothing obvious jumps out. But then again it is late, will dig more tomorrow.
There's debugging you can enable on that arch that should take you straight to the problem. Install the debug packages, then run under gdb:
    prctl --unaligned=signal gdb <program>
Then you break to the gdb prompt with:
    Program received signal SIGBUS, Bus error.
But a common cause is casting variables into pointers that aren't aligned correctly, so any casts of non-pointers into pointers are suspect. (One near the start of the upstream file I have in front of me needs checking, for example.) The fix may involve appending __attribute__((aligned(8))) to troublesome declarations, such as char[N] or unaligned fields within structs.
(In reply to comment #15)
> OK, it appears this is a python-pyblock bug. With my reproducer from comment
> #13 I can reproduce it on a RHEL5.2 system so the bug has been there all along
> but evidently there was a change in anaconda that causes it to get loaded now
> when it wasn't in RHEL5.2.
>
>
> python-pyblock doesn't build anymore (which itself is a concern) so we need to
> fix that before we can debug this issue.
It built OK when I did a scratch build. It also builds if I use the command `make`, and installs correctly when I run `make install`. Am I missing something here?
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1496274
(In reply to comment #19)
> (In reply to comment #15)
> > OK, it appears this is a python-pyblock bug. With my reproducer from comment
> > #13 I can reproduce it on a RHEL5.2 system so the bug has been there all along
> > but evidently there was a change in anaconda that causes it to get loaded now
> > when it wasn't in RHEL5.2.
> >
> >
> > python-pyblock doesn't build anymore (which itself is a concern) so we need to
> > fix that before we can debug this issue.
> It built ok when I did a scratch build. It also builds with if I use the
> command `make` and installs correctly when I `make install`, Am I missing
> something here.
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1496274
Strange, I get an error when I try to build via rpmbuild on a freshly installed RHEL5.3 system. I have another BZ open for that issue:
https://bugzilla.redhat.com/show_bug.cgi?id=463857
The unaligned accesses come from this code in lib/format/format.c. We have a packed struct:
    struct format_member {
        const char *msg;
        const unsigned short offset;
        const unsigned short flags;
    } __attribute__ ((packed));
then we declare an array of that type of struct:
    static struct format_member format_member[] = {
        { "name", offset(name), FMT_ALL },
        { "description", offset(descr), FMT_ALL },
        { "capabilities", offset(caps), 0 },
        { "read", offset(read), FMT_ALL | FMT_METHOD },
        { "write", offset(write), FMT_METHOD },
        { "create", offset(create), FMT_METHOD },
        ......
At link time, when it tries to relocate this, we hit an unaligned access on every other entry in the array. I imagine we would hit the same thing at runtime when handlers are registered.
It appears that format_member[] is only used by functions in this file that verify the validity of handlers. Also, I don't see any cases of us casting format_member or using anything that would care that it is packed. In this situation it appears the only advantage of using __attribute__ ((packed)) here is that we save a few bytes (a whopping total of 44 bytes). Am I missing something?
I removed the packed attribute and it does get rid of the warnings during linking, but I am not sure how to test it.
Doug, this is a regression; CHECK_FORMAT_HANDLER shouldn't be defined in format.c. Patch in CVS. Can I get a pm_ack to check in and build?
Fix checked in. Build dmraid-1_0_0_rc13-14_el5 done.
State -> MODIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-0078.html