Bug 427707

Summary: i686 io_getevents syscall clobbers registers it shouldn't
Product: [Fedora] Fedora
Reporter: Robert Scheck <redhat-bugzilla>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: low
Version: rawhide
CC: drepper, jakub, mebrown, mingo, roland, tglx
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-04-11 21:46:12 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 235706    
Attachments:
Description                                              Flags
Output of "rpmbuild -ba --target i686,i386 glibc.spec"   none
build.log                                                none
io_getevents_bug.c                                       none

Description Robert Scheck 2008-01-06 23:28:00 UTC
Description of problem:
I have been doing glibc rebuilds for years, and today I ran into behaviour that
looks like a bug to me. Maybe it's just my dumbness, but who knows? I don't. I got
gcc-4.3.0-0.4.i386 via Koji, installed it on my Fedora 8/Rawhide system and tried
to rebuild glibc-2.7.90-3 for i686 and i386. The rebuild itself worked, but
make check had several errors, e.g.:

make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/math/test-float.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/math/test-ildoubl.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/math/test-ifloat.out] Error 1

Expected signal 'Alarm clock' from child, got none
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/nptl/tst-eintr2.out] Error 1
Expected signal 'Alarm clock' from child, got none
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/nptl/tst-eintr5.out] Error 1

Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/rtkaio/tst-aiod.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/rtkaio/tst-aiod64.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/rtkaio/tst-aiod3.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/rtkaio/tst-aiod4.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl/rtkaio/tst-aiod5.out] Error 1

make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/math/test-ifloat.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/math/test-float.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/math/test-ildoubl.out] Error 1

make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/nptl/tst-eintr5.out] Error 1

make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rt/tst-cpuclock1.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rt/tst-cpuclock2.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rt/tst-mqueue5.out] Error 1

Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod3.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod4.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod5.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-cpuclock1.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod64.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080103T1958/build-i686-linuxnptl-nosegneg/rtkaio/tst-cpuclock2.out] Error 1

And in /var/log/messages I found lines like:

kernel: ld-linux.so.2[3571]: segfault at 00000000 eip 00000000 esp 401cd248 error 4
kernel: ld-linux.so.2[3579]: segfault at 00000000 eip 00000000 esp 401c6248 error 4
kernel: ld-linux.so.2[3584]: segfault at 00000000 eip 00000000 esp 401c6248 error 4
kernel: ld-linux.so.2[3588]: segfault at 00000000 eip 00000000 esp 401c6248 error 4
kernel: ld-linux.so.2[13467]: segfault at 00000000 eip 00000000 esp 401c9248 error 4
kernel: ld-linux.so.2[13496]: segfault at 00000000 eip 00000000 esp 401c9248 error 4
kernel: ld-linux.so.2[13498]: segfault at 00000000 eip 00000000 esp 401c9248 error 4

I'll attach the full build log soon, as I don't know enough to debug this myself.
My guess is that the problems are somehow caused by gcc-4.3.0-0.4. But
you are the experts.

Version-Release number of selected component (if applicable):
gcc-4.3.0-0.4
glibc-2.7.90-3

How reproducible:
Every time, see above.

Actual results:
Strange messages like
 - Expected signal 'Alarm clock' from child, got none
 - Didn't expect signal from child: got `Segmentation fault'
 - Float errors (?)

Expected results:
No such strange messages... ;-)

Additional info:
Please let me know if you need further information or if I can help in some
other way with this.

Comment 1 Robert Scheck 2008-01-07 00:34:44 UTC
Created attachment 290923 [details]
Output of "rpmbuild -ba --target i686,i386 glibc.spec"

Comment 2 Robert Scheck 2008-01-11 16:44:10 UTC
Jakub, I'm seeing the same with gcc-4.3.0-0.5...

Comment 3 Robert Scheck 2008-01-30 23:58:01 UTC
I'm still seeing the same when building glibc-2.7.90-4 by using gcc-4.3.0-0.6

Comment 4 Robert Scheck 2008-02-06 22:45:30 UTC
Oh, same with gcc-4.3.0-0.7 and glibc-2.7.90-6 as well as in a local mock.

Comment 5 Robert Scheck 2008-02-10 22:13:33 UTC
Jakub? Roland?

Comment 6 Roland McGrath 2008-02-10 22:33:42 UTC
Those crashes are not coming up in the koji builds.
Maybe experiment with different kernels to see if that affects this showing up.

Comment 7 Robert Scheck 2008-02-10 22:51:19 UTC
Roland, is it enough to switch kernel-headers or should I completely change
the kernel including reboots etc.?

Comment 8 Roland McGrath 2008-02-10 23:01:28 UTC
I was referring to the kernel running when you do 'make check', not to the
headers used to build glibc.

Comment 9 Robert Scheck 2008-03-01 12:56:40 UTC
Roland, I can see this with any F8 kernel, e.g. 2.6.24.3-12.fc8. I haven't tried all
Rawhide kernels yet, but it seems to succeed only on a RHEL5 kernel, which is just
horribly wrong then. Could you please take a closer look at it?

Comment 10 Robert Scheck 2008-03-09 14:06:28 UTC
I think I know why this does not show up in a mock build: because of mock itself! :-(
Here are the parallels I can see:

Mockbuild (part a):
make -s subdir=rtkaio -C rtkaio ..=../ tests
make[2]: Entering directory `/builddir/build/BUILD/glibc-20080305T0857/rtkaio'
failed to create a shared memory object: shm_open: Function not implemented
make[2]: Leaving directory `/builddir/build/BUILD/glibc-20080305T0857/rtkaio'

My build without mock (part a):
make -s subdir=rtkaio -C rtkaio ..=../ tests
make[2]: Entering directory `/usr/src/rpm/BUILD/glibc-20080305T0857/rtkaio'
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl/rtkaio/tst-aiod.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl/rtkaio/tst-aiod64.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl/rtkaio/tst-aiod3.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl/rtkaio/tst-aiod4.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl/rtkaio/tst-aiod5.out] Error 1
make[2]: Target `tests' not remade because of errors.
make[2]: Leaving directory `/usr/src/rpm/BUILD/glibc-20080305T0857/rtkaio'
make[1]: *** [rtkaio/tests] Error 2


Mockbuild (part b):
make -s subdir=rtkaio -C rtkaio ..=../ tests
make[2]: Entering directory `/builddir/build/BUILD/glibc-20080305T0857/rtkaio'
make[2]: Leaving directory `/builddir/build/BUILD/glibc-20080305T0857/rtkaio'
make[2]: Entering directory `/builddir/build/BUILD/glibc-20080305T0857/rtkaio'
failed to create a shared memory object: shm_open: Function not implemented
make[2]: Leaving directory `/builddir/build/BUILD/glibc-20080305T0857/rtkaio'

My build without mock (part b):
make -s subdir=rtkaio -C rtkaio ..=../ tests
make[2]: Entering directory `/usr/src/rpm/BUILD/glibc-20080305T0857/rtkaio'
make[2]: Leaving directory `/usr/src/rpm/BUILD/glibc-20080305T0857/rtkaio'
make[2]: Entering directory `/usr/src/rpm/BUILD/glibc-20080305T0857/rtkaio'
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod3.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod5.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod4.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-cpuclock1.out] Error 1
Didn't expect signal from child: got `Segmentation fault'
Didn't expect signal from child: got `Segmentation fault'
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod64.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-aiod.out] Error 1
make[2]: *** [/usr/src/rpm/BUILD/glibc-20080305T0857/build-i686-linuxnptl-nosegneg/rtkaio/tst-cpuclock2.out] Error 1
make[2]: Target `tests' not remade because of errors.
make[2]: Leaving directory `/usr/src/rpm/BUILD/glibc-20080305T0857/rtkaio'
make[1]: *** [rtkaio/tests] Error 2

To me it seems that it doesn't change anything when 'make check' is executed
with another kernel running. Is it possible that mock doesn't support shm_open?

Comment 11 Michael E Brown 2008-03-09 17:14:15 UTC
I have added support for shm to mock in git. Can you please test this version to
see if it fixes your problem?

Instructions on how to download and compile new version are here:
https://fedorahosted.org/mock

Comment 12 Robert Scheck 2008-03-09 20:40:48 UTC
Using the latest mock, I'm now able to see these segmentation faults in mock as
well. Thanks to Ricky Zhou for testing this on a faster machine.

Okay...work for you, Roland and/or Jakub -- "make check" of glibc now always
segfaults in these particular tests in mock...upgrading to blocker, as this seems
more critical to me now.

Comment 13 Robert Scheck 2008-03-09 20:42:10 UTC
Created attachment 297377 [details]
build.log

Comment 14 Jesse Keating 2008-04-01 20:19:51 UTC
Jakub, ping on this issue, do you think it's reasonable to have a fix for Fedora
9 or should we punt to 10?

Comment 15 Jakub Jelinek 2008-04-10 15:44:47 UTC
I got to test this on an i686 kernel, and this is a kernel bug.  See the following
small testcase:
gcc -m32 -O2 -o io_getevents_bug io_getevents_bug.c
./io_getevents_bug

For i?86, glibc (and other userland code) assumes that syscalls clobber only the
%eax register (which holds the return value); all other registers have their
values preserved.
I don't have access to many i686 kernels anymore, so all I could verify is that
on 2.6.23.12-52.fc7.x86_64 and the current RHEL5.2 beta x86_64 kernels %esi is
preserved across the io_getevents int $0x80, but on 2.6.24.4-64.fc8.i686 and, as
ajax tested for me, also on 2.6.25-0.204.rc8.git4.fc9.i686 this prints:
Bug - %esi modified by io_getevents syscall, 2 * sizeof (struct kio_event) has been added
(where instead of 2 it can print 1 through 10; basically io_getevents changes
%esi to point just past the last struct kio_event).

This seems to be a pretty serious bug. Could the kernel folks please investigate
ASAP, at least to determine which syscalls are broken in which kernels?
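
For readers without the attachment, the kind of check described in this comment can be sketched roughly as follows. This is not the attached io_getevents_bug.c: the function names are hypothetical, and the probe uses getpid (a harmless syscall) rather than io_getevents, so it only demonstrates how to compare %esi before and after an int $0x80. The probe is compiled only for i386 Linux (for example with gcc -m32) and reports "not applicable" elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>

/* Hypothetical helper: number of event records of event_size bytes that a
   register value advanced by, or -1 if the change is not a whole multiple. */
long events_advanced(uintptr_t before, uintptr_t after, uintptr_t event_size)
{
    uintptr_t delta = after - before;

    assert(event_size != 0);
    if (delta % event_size != 0)
        return -1;
    return (long) (delta / event_size);
}

/* Hypothetical probe: returns 1 if %esi kept its value across the syscall,
   0 if it changed, and -1 when not built for i386 Linux. */
int esi_preserved_across_getpid(void)
{
#if defined(__i386__) && defined(__linux__)
    unsigned long before = 0x12345678UL;
    unsigned long after;
    long ret;

    /* %eax carries the syscall number in and the return value out;
       %esi is loaded with a known value and read back afterwards. */
    __asm__ __volatile__ ("int $0x80"
                          : "=a" (ret), "=S" (after)
                          : "0" ((long) SYS_getpid), "1" (before)
                          : "memory");
    (void) ret;
    return after == before;
#else
    return -1;
#endif
}
```

With a 32-byte event record, events_advanced(p, p + 64, 32) yields 2, which corresponds to the "2 * sizeof (struct kio_event)" wording in the message quoted above.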

Comment 16 Jakub Jelinek 2008-04-10 15:47:08 UTC
Created attachment 302012 [details]
io_getevents_bug.c

Comment 17 Jakub Jelinek 2008-04-10 16:23:11 UTC
Looking at the kernel code and googling around, this smells like
http://kerneltrap.org/node/6521, except that in this case it isn't a sibcall
that causes the problem, but probably high register pressure together with inlining
the read_events function, which increments the events argument.
As a quick hack, adding the noinline attribute to read_events could very likely help
(just guessing, haven't tried that), and I guess we should revive PR27234.
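
The suspected mechanism can be modelled in portable C. The sketch below is only a simulation under stated assumptions: it assumes the i386 convention in which a syscall handler reads its arguments from the stack slots where the entry code saved the user's registers, and in which the exit path restores the registers from those same slots. All type and function names are hypothetical, and the 32-byte record is a stand-in rather than the real struct kio_event.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-byte stand-in for one event record. */
struct model_event { uint64_t data, obj; int64_t res, res2; };

/* Hypothetical model of the register save area built at syscall entry.
   For io_getevents, edx would hold nr and esi the events pointer. */
struct saved_regs { uintptr_t ebx, ecx, edx, esi, edi; };

/* Variant 1: the handler advances the argument slot itself, which is what
   the generated code effectively did when it reused the slot as a cursor. */
long getevents_in_place(struct saved_regs *frame, long ready)
{
    long n = 0;

    while (n < ready && n < (long) frame->edx) {
        frame->esi += sizeof(struct model_event);   /* cursor lives in the slot */
        n++;
    }
    return n;
}

/* Variant 2: the handler copies the argument and advances only the copy. */
long getevents_with_copy(const struct saved_regs *frame, long ready)
{
    uintptr_t cursor = frame->esi;
    long n = 0;

    while (n < ready && n < (long) frame->edx) {
        cursor += sizeof(struct model_event);
        n++;
    }
    (void) cursor;
    return n;
}

/* Returns how far the user-visible %esi moved once the "exit path" has
   reloaded it from the save area. */
uintptr_t esi_delta_after_syscall(int in_place, long ready)
{
    struct saved_regs frame = { .ebx = 0, .ecx = 1, .edx = 10,
                                .esi = 0x1000, .edi = 0 };
    uintptr_t user_esi = frame.esi;

    assert(sizeof(struct model_event) == 32);
    if (in_place)
        getevents_in_place(&frame, ready);
    else
        getevents_with_copy(&frame, ready);
    return frame.esi - user_esi;
}
```

In this model, the in-place variant with two ready events moves %esi by 2 * sizeof(struct model_event), while the copying variant leaves it untouched.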

Comment 18 Roland McGrath 2008-04-10 21:14:34 UTC
We definitely need that ABI-changing attribute to solve this sanely in the future.

I reproduced a build with the problem using vanilla upstream sources (and f8's
compiler).  I tried an asm hack akin to prevent_tail_call(), to keep the args
live at the end of the function, and that changed code generation not to tickle
the problem.  I'll turn it into a general macro hack and send it upstream.
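
A minimal sketch of what such an "asm hack" might look like, assuming GCC-style extended asm. The macro name keep_args_live and the surrounding functions are hypothetical; this is not the macro that was sent upstream. The idea is that an empty asm statement at the end of the function names the incoming arguments as memory inputs, so the compiler has to keep their original values in place until that point instead of reusing their storage for an inlined callee's cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: emits no instructions, but makes the return value an
   in/out operand and the two arguments memory inputs, keeping them live. */
#define keep_args_live(ret, a, b) \
    __asm__ __volatile__ ("" : "=r" (ret) : "0" (ret), "m" (a), "m" (b))

/* Hypothetical stand-in for one event record. */
struct model_event { long data; };

/* Inlined helper that advances its own copy of the pointer, in the same
   way the comment above describes for read_events. */
static inline long fill_events(struct model_event *event, long nr)
{
    long n = 0;

    while (n < nr) {
        event->data = n;
        event++;            /* only the callee's copy moves */
        n++;
    }
    return n;
}

long get_events_protected(struct model_event *events, long nr)
{
    long ret;

    assert(events != NULL && nr >= 0);
    ret = fill_events(events, nr);
    keep_args_live(ret, events, nr);   /* arguments must still be intact here */
    return ret;
}
```

In a userspace build this has no visible effect on the result; it only constrains code generation, which is the whole point of the hack.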

Comment 19 Chuck Ebbert 2008-04-11 21:46:12 UTC
This is fixed in 2.6.25-rc9. A subset of that fix is in 2.6.24.4-80.fc8