Bug 162118 - apps segfault since updating kernel to 2.6.11-1.35_FC3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-06-30 02:47 UTC by Ignacio G.
Modified: 2015-01-04 22:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-07-28 18:57:14 UTC
Type: ---
Embargoed:


Description Ignacio G. 2005-06-30 02:47:46 UTC
We have some hardware compilers provided by third parties that started failing
since updating the kernel from kernel-smp-2.6.11-1.27-FC3.i686.rpm to
kernel-smp-2.6.11-1.35-FC3.i686.rpm.

These problems occur every time when running 2.6.11-1.35 with the default settings.

If, when running 2.6.11-1.35, we set /proc/sys/kernel/exec-shield to 0 or to 1
(1 was the default in 2.6.11-1.27), these problems do not occur.

Booting the machine with 2.6.11-1.27 also fixes these problems, regardless of
the value in /proc/sys/kernel/exec-shield.


We don't have sources for these applications. If it helps, they use j2re v1.4.1.


Sample output:

With kernel-smp-2.6.11-1.35_FC3.i686.rpm:

$ fplc -o route.fpo -s route.fps ../../np/microcode/route.fpl

/sw/tools/agere-3.3.0.78/fplc: line 10: 31226 Segmentation fault     
"/usr/local/j2re-1.4.1/bin/java"
-Dagere.log_level="$LOG_LEVEL" -Dinstall.root="/sw/tools/agere-3.3.0.78"
-classpath
"/sw/tools/agere-3.3.0.78/lib:/sw/tools/agere-3.3.0.78/lib/agere.jar:/sw/tools/agere-3.3.0.78/lang:$CLASSPATH:/tmp"
-Xms"$MINHEAPSIZE" $MAXHEAPSIZE com.agere.fplc.fplc "$@"
$

With kernel-smp-2.6.11-1.27_FC3.i686.rpm:

$ fplc -o route.fpo -s route.fps ../../np/microcode/route.fpl
[Info] Agere FPL Compiler 3.3.0.78(Thu Jul  1 15:10:56 CDT 2004) swbuilds
[Info] Preprocessing Stage
[Info] Compiling ../../np/microcode/route.fpl for processor -- APP500
[Info] Compilation Stage
[Info] Output Generation Stage
[Info] Compilation Done
$

Additional info:

$ uname -a
Linux XXXXX 2.6.11-1.35_FC3smp #1 SMP Mon Jun 13 01:17:35 EDT 2005 i686 i686 i386 GNU/Linux
$
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.20GHz
stepping        : 1
cpu MHz         : 3201.365
cache size      : 1024 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm pni monitor
ds_cpl cid cx16 xtpr
bogomips        : 6340.60

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.20GHz
stepping        : 1
cpu MHz         : 3201.365
cache size      : 1024 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm pni monitor
ds_cpl cid cx16 xtpr
bogomips        : 6389.76

$

Defaults for kernel-smp-2.6.11-1.35_FC3.i686.rpm:

$ grep '' /proc/sys/kernel/exec*
/proc/sys/kernel/exec-shield:2
/proc/sys/kernel/exec-shield-randomize:1
$


Defaults for kernel-smp-2.6.11-1.27_FC3.i686.rpm:

$ grep '' /proc/sys/kernel/exec*
/proc/sys/kernel/exec-shield:1
/proc/sys/kernel/exec-shield-randomize:1
$
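
The defaults listed above, and the workaround of writing 0 or 1 to
/proc/sys/kernel/exec-shield, can be sketched as two small shell helpers. This
is only an illustration: show_exec_shield and set_exec_shield are hypothetical
names, not Fedora tools, and the optional directory argument exists only so the
helpers can be tried against a scratch directory instead of the real /proc.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of any Fedora package.

# Print the exec-shield settings found under a sysctl directory,
# in the same "path:value" form as the grep output in this report.
show_exec_shield() {
    dir="${1:-/proc/sys/kernel}"
    for name in exec-shield exec-shield-randomize; do
        if [ -r "$dir/$name" ]; then
            printf '%s:%s\n' "$dir/$name" "$(cat "$dir/$name")"
        fi
    done
}

# Write a new exec-shield value (needs root when used on the real /proc).
# Only the values that appear in this report (0, 1, 2) are accepted.
set_exec_shield() {
    value="$1"
    dir="${2:-/proc/sys/kernel}"
    case "$value" in
        0|1|2) printf '%s\n' "$value" > "$dir/exec-shield" ;;
        *) echo "usage: set_exec_shield 0|1|2 [dir]" >&2; return 2 ;;
    esac
}
```

On an affected machine, running set_exec_shield 1 as root would restore the
2.6.11-1.27 default until the next reboot. Assuming the kernel exposes this
sysctl, a "kernel.exec-shield = 1" line in /etc/sysctl.conf is the usual way to
make the setting persistent.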

Comment 1 Marek Kassur 2005-07-04 01:42:09 UTC
I have a similar exec-shield problem with a library application for which I
don't have the source. It got stuck consuming all CPU time and, worse, stopped
my MTA (sendmail[3872]: rejecting connections on daemon MTA: load average: 27).

Interesting part of strace:
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=39544576, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 4, 0) = 0xb7d60000
mmap2(NULL, 184320, PROT_READ, MAP_PRIVATE, 4, 0xb78) = 0xb7d33000
mmap2(NULL, 28672, PROT_READ, MAP_PRIVATE, 4, 0xbc2) = 0xb7d2c000
close(4)                                = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
sigreturn()                             = ? (mask now [RTMIN])
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
sigreturn()                             = ? (mask now [RTMIN])
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
sigreturn()                             = ? (mask now [RTMIN])
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
sigreturn()                             = ? (mask now [RTMIN])
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
sigreturn()                             = ? (mask now [RTMIN])
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
sigreturn()
....

See also: bug #162182 and bug #162329

Comment 2 Dave Jones 2005-07-08 22:44:47 UTC
FWIW, the 2.6.12 based kernel in updates-testing resets the default exec-shield
setting to '1' again.

Hopefully that update will be going live soon.

Comment 3 Dave Jones 2005-07-15 18:29:19 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 4 i.goyret 2005-07-21 23:08:25 UTC
I tried kernel 2.6.12-1.1372, but it fails to boot. See bug #163917 for the
failure details.

We are not planning to upgrade to FC4 for the time being.

Comment 5 Dave Jones 2005-07-28 06:21:13 UTC
OK, can you check whether setting /proc/sys/kernel/exec-shield to 1 makes things
work again?



Comment 6 Ignacio G. 2005-07-28 17:02:32 UTC
Yes, that makes things work fine, as I had mentioned in the original bug report.
Setting it to 0 also works fine.


Comment 7 Dave Jones 2005-07-28 18:57:14 UTC
OK, just go with that until we get an mkinitrd update pushed out, which should
resolve bug 163917, allowing you to update to the fixed kernel.


Comment 8 Ignacio G. 2005-08-03 23:11:04 UTC
Now that the mkinitrd update has been pushed out, I tried kernel 2.6.12-1.1372,
but it fails a little differently from 2.6.11-1.35.

With 2.6.11-1.27, everything works perfectly.
With 2.6.11-1.35, every invocation of the compiler fails.
With 2.6.12-1.1372, a few invocations of the compiler fail when doing multiple
simultaneous compiles. It seems that setting kernel.randomize_va_space=0 helps.

How about if the exec-shield/randomization patch is removed or disabled? It is
obviously not ready for prime time.
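
One way to put a number on "a few invocations fail" is to start several copies
of a command at once and count the non-zero exit statuses. The sketch below is
only an illustration; count_failures is a hypothetical helper, not something
used in this report.

```shell
#!/bin/sh
# Sketch only: count_failures is a hypothetical helper.
# Usage: count_failures N command [args...]
# Starts N copies of the command at once, waits for all of them, and prints
# how many exited with a non-zero status (a process killed by SIGSEGV is
# typically reported by the shell as status 139).
count_failures() {
    n="$1"; shift
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}
```

Running it once with the default settings and once after
"sysctl -w kernel.randomize_va_space=0" (as root) would show whether the
failure count really drops to zero. For a real compiler run, each copy would
need its own output file so the parallel jobs do not overwrite each other.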

Comment 9 Dave Jones 2005-08-04 00:12:58 UTC
That sounds like a bug in the compiler. Please file a separate bug.

The randomisation code is now upstream btw.

Comment 10 Ignacio G. 2005-08-04 02:32:59 UTC
Sorry, but I don't understand what you want me to do. File a separate bug for what?

Using the older kernel the app works flawlessly.

Using the newer kernel the app fails. Turning off exec-shield and the
randomization makes the app work again.

This is not a problem with the application but with the kernel.


Comment 11 Dave Jones 2005-08-04 04:15:14 UTC
There should be no reason for an application to fail due to randomisation. If it
goes away when you disable it, the application is faulty.


Comment 12 Ignacio G. 2005-08-04 04:59:24 UTC
And why would that be?

The fact that randomization of the user space introduces a problem means that
the problem lies with how the randomization is done or what it does, not with
the app that fails.

The proof is in the fact that the app works with previous kernels but doesn't
work with newer kernels. The problem is the kernel, not the app.

