Bug 375201

Summary: Illegal Instruction upon launching
Product: [Fedora] Fedora
Reporter: Name Witheld <sakuramboo>
Component: libzzub
Assignee: Alexander Kahl <fedora>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 7
CC: hdegoede, ville.skytta
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 0.2.3-10.fc7
Doc Type: Bug Fix
Last Closed: 2007-11-22 03:38:05 UTC
Attachments:
Description                                 Flags
Aldrin core dump                            none
Aldrin backtrace and partial disassembly    none

Description Name Witheld 2007-11-11 00:26:35 UTC
Description of problem:

After installing aldrin through yum, running it from the command line produces only this output:

"Illegal instruction"

Version-Release number of selected component (if applicable): aldrin-0.11-5.fc7


How reproducible: Reproducible every time for me.

Steps to Reproduce:
1. yum install aldrin
2. Try to launch aldrin
  
Actual results:
[sakuramboo@~]$ aldrin
Illegal instruction

Expected results:
It should launch.


Additional info:

Comment 1 Alexander Kahl 2007-11-11 03:54:07 UTC
Hello Name,

The error you've encountered usually means that a program was built with
instructions that the CPU it is running on does not support. Could you please
tell me which architecture (i386, x86_64 or ppc; aldrin is not available on
ppc64) you're trying to run aldrin on?

This is most probably a libzzub or pyzzub bug.

Comment 2 Alexander Kahl 2007-11-11 03:56:08 UTC
Sorry for the double post, I've just realized you've correctly set the
architecture for the bug. I'm going to investigate the i386 build.

Comment 3 Alexander Kahl 2007-11-14 18:28:27 UTC
I've tested aldrin on f7-i386 and cannot confirm the bug.

Name, could you please give me:
- the output of 'rpm -q libzzub pyzzub aldrin'
- the output of 'strace -F aldrin 2>&1 > aldrin.log' (attach aldrin.log)
- a core dump:
(in a terminal / console) >>>
ulimit -c unlimited
aldrin
<<<

This should create a core file in the current directory, please attach it.
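
For reference, a shell reports a process killed by a signal with an exit status of 128 plus the signal number, so a crash from SIGILL (signal 4 on Linux) shows up as 132. The sketch below only illustrates that convention with a stand-in child shell that sends itself SIGILL; it is not aldrin, and core dumps are switched off inside the demonstration so that it leaves no file behind.

```shell
# Stand-in for the crashing program: a child shell that sends itself SIGILL.
# "ulimit -c 0" keeps this demonstration from writing a core file; for the real
# report the opposite setting ("ulimit -c unlimited") is what is wanted.
status=0
( ulimit -c 0; sh -c 'kill -s ILL $$' ) 2>/dev/null || status=$?

# 128 + 4 (SIGILL) = 132
echo "exit status: $status"
```

On the affected system, running 'echo $?' directly after aldrin exits would be the equivalent check.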

Comment 4 Name Witheld 2007-11-14 22:37:03 UTC
Created attachment 259071
Aldrin core dump

Comment 5 Name Witheld 2007-11-14 22:39:14 UTC
Comment on attachment 259071
Aldrin core dump

[sakuramboo@~/Desktop]$ rpm -q libzzub pyzzub aldrin
libzzub-0.2.3-8.fc7
pyzzub-0.2.3-8.fc7
aldrin-0.11-5.fc7


the strace log is completely empty.
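
The empty log is most likely explained by the redirection order in the requested command rather than by aldrin itself: strace writes its trace to stderr, and '2>&1 > aldrin.log' duplicates stderr onto the terminal before stdout is pointed at the file. A minimal sketch of the difference, using a hypothetical stand-in function (trace_like) that writes to stderr the way strace does:

```shell
# Hypothetical stand-in for strace: writes one line to stderr.
trace_like() { echo "trace line" >&2; }

tmpdir=$(mktemp -d)

# Order from the request: stderr is duplicated onto the current stdout first,
# then only stdout is sent to the file, so nothing is written to it.
trace_like 2>&1 > "$tmpdir/wrong.log"

# Redirect stdout to the file first, then duplicate stderr onto it.
trace_like > "$tmpdir/right.log" 2>&1

if [ -s "$tmpdir/wrong.log" ]; then echo "wrong.log: has content"; else echo "wrong.log: empty"; fi
if [ -s "$tmpdir/right.log" ]; then echo "right.log: has content"; else echo "right.log: empty"; fi
```

With strace itself, 'strace -f -o aldrin.log aldrin' writes the trace straight to the file and sidesteps the ordering question.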

Comment 6 Alexander Kahl 2007-11-14 22:57:33 UTC
Thank you, I'm going to continue the investigation tomorrow.
Could you check whether there are any non-i386/noarch packages installed on your
system? 
Furthermore the output of
cat /proc/cpuinfo
could be useful here.

Comment 7 Name Witheld 2007-11-14 23:28:22 UTC
[sakuramboo@~]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 6
model name      : AMD Athlon(tm) XP 2100+
stepping        : 2
cpu MHz         : 1726.068
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 mtrr pge mca cmov pat pse36
mmx fxsr sse syscall mmxext 3dnowext 3dnow up ts
bogomips        : 3452.84
clflush size    : 32

there are a bunch of noarch packages installed, in fact aldrin is noarch (there
is no other option for me through yum). I wouldn't even know where to begin
checking to see which noarch packages might be the problem.

Comment 8 Alexander Kahl 2007-11-14 23:54:28 UTC
> [sakuramboo@~]$ cat /proc/cpuinfo
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 6
> model           : 6
[..]
OK, illegal i686/extended instructions are highly unlikely here.
 
> there are a bunch of noarch packages installed, in fact aldrin is noarch (there
> is no other option for me through yum). I wouldn't even know where to begin
> checking to see which noarch packages might be the problem.
aldrin and pyzzub consist only of byte-compiled Python code, but since libzzub
is architecture-dependent and pyzzub is a subpackage of it, pyzzub is not noarch
either.
From what we know so far, the faulty processor instruction can only be in one of:
- libzzub
- python itself
- one of aldrin's/libzzub's bindings' underlying libraries (there are lots!)

This is why I need you to check for accidentally installed arch-dependent
packages that are neither i386 nor i686. I've double-checked aldrin and pyzzub,
and they include no arch-dependent files; libzzub also works properly on
my f-7 i386 test system. Let's hope the core dump will be insightful when I
investigate it tomorrow.
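
One possible way to run that check is to have rpm print every installed package with its architecture and filter out the expected ones. This is a sketch only: filter_foreign_arch is a hypothetical helper name, and the sample input (including the x86_64 entry and the .i386 suffix on libzzub) is made up to show what a match looks like.

```shell
# filter_foreign_arch: read "name-version-release.arch" lines on stdin and
# print those whose arch is not i386, i586, i686 or noarch.
filter_foreign_arch() {
    grep -Ev '\.(i386|i586|i686|noarch)$'
}

# On the affected system (assumes rpm is available):
#   rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | filter_foreign_arch

# Illustrative sample input; the last entry is invented.
printf '%s\n' \
    'libzzub-0.2.3-8.fc7.i386' \
    'aldrin-0.11-5.fc7.noarch' \
    'example-1.0-1.fc7.x86_64' | filter_foreign_arch
```

Entries such as gpg-pubkey report '(none)' as their architecture and would also be printed; they can be ignored.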

Could you also check whether other pygtk2-dependent programs crash on your
system? You can get a list of the installed ones with
rpm -q --whatrequires pygtk2
and for non-installed packages
repoquery --whatrequires pygtk2

(repoquery is available in yum-utils)

Comment 9 Name Witheld 2007-11-16 02:33:45 UTC
I checked the other pygtk2 programs; going down the list, all of them work
except one: aldrin.

Comment 10 Alexander Kahl 2007-11-16 17:25:42 UTC
Created attachment 261511
Aldrin backtrace and partial disassembly

Created while running
$ gdb python core.2805
(-> Attachment 259067)

Comment 11 Alexander Kahl 2007-11-16 17:27:14 UTC
OK, I've examined the stack at the time of the crash; the error occurred in
libzzub during an assignment (sic!):

Core was generated by `/usr/bin/python /usr/bin/aldrin'.
Program terminated with signal 4, Illegal instruction.
#0  player (this=0xb7b0f008) at src/libzzub/player.cpp:200
200         workFracs=0.0f;

I've attached a backtrace and the disassembled region from which class player's
constructor was called; I cannot get any further than that on my own.
A colleague I consulted said he has encountered signal 4 before, when the
compiler was missing a prototype and produced code that only ran on some CPUs;
this or something similar could be the case here. This is either indeed a
libzzub bug or even a g++ bug.
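
When a core file ends with signal 4, it can help to look at the single instruction at the crash address, since after SIGILL the program counter normally points at the instruction the CPU rejected. The following is a rough sketch, not the procedure used for the attachment: show_faulting_insn is a hypothetical wrapper, and it assumes gdb is installed and that the interpreter path matches the one recorded in the core.

```shell
# show_faulting_insn CORE [PROGRAM]
# Print a backtrace and the instruction at the crash address of a core file.
# PROGRAM defaults to /usr/bin/python, as in the "Core was generated by" line.
show_faulting_insn() {
    core=$1
    prog=${2:-/usr/bin/python}
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 2
    fi
    # -batch exits after running the -ex commands; x/i $pc disassembles
    # the one instruction at the program counter.
    gdb -batch -ex 'bt' -ex 'x/i $pc' "$prog" "$core"
}

# Example call on the reporter's core file:
#   show_faulting_insn core.2805
```

The mnemonic printed by 'x/i $pc' can then be looked up to see which instruction-set extension it belongs to.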

Comment 12 Hans de Goede 2007-11-19 11:07:44 UTC
Alexander, have you checked which CFLAGS libzzub gets compiled with? I
think the use of invalid CFLAGS during compilation is the most likely culprit here.


Comment 13 Ville Skyttä 2007-11-19 18:36:21 UTC
Indeed.  https://koji.fedoraproject.org/koji/getfile?taskID=213416&name=build.log:

[...]
g++ -o src/libzzub/host.os -c -D__SCONS__ -DPOSIX -fPIC -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4
-m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -DNDEBUG
-mfpmath=sse -msse2 [...]

I'm pretty sure that -mfpmath=sse and especially -msse2 would be inappropriate
even for an i686 build, let alone i386.
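
A check like this can be roughly automated by splitting a compiler invocation into words and listing the ones that switch on SSE code generation. The sketch below uses the command line quoted above with the elided parts left out; list_sse_flags is a hypothetical helper, and its pattern covers only the -msse* and -mfpmath=sse spellings, not every GCC option that enables an instruction-set extension.

```shell
# Compiler invocation as quoted from the build log (elisions omitted).
line='g++ -o src/libzzub/host.os -c -D__SCONS__ -DPOSIX -fPIC -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -DNDEBUG -mfpmath=sse -msse2'

# list_sse_flags: print each word of a command line that enables SSE code
# generation (-msse, -msse2, ... or -mfpmath=sse), one per line.
list_sse_flags() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^-m(sse[0-9.]*|fpmath=sse)$'
}

list_sse_flags "$line"
```

For this line the helper prints -mfpmath=sse and -msse2, the two flags singled out above; -march=i386 and -mtune=generic are left alone.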

Comment 14 Callum Lerwick 2007-11-20 04:57:12 UTC
Ummm yeah. Anything older than a Pentium 4 or Athlon64 is going to barf on SSE2
code. Which I dare say are probably a majority of machines out there. :) Smolt
doesn't seem to give out as specific information as this though. It would be
nice to know how many machines out there have cmov, sse and sse2.
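
For a single machine, the question can be answered from the flags line of /proc/cpuinfo. As a small sketch, the snippet below tests the flags reported earlier in this bug for cmov, sse and sse2; has_cpu_flag is a hypothetical helper name.

```shell
# Flags line copied from the /proc/cpuinfo output in comment 7.
flags="fpu vme de pse tsc msr pae mce cx8 mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up ts"

# has_cpu_flag FLAGS NAME: succeed if NAME is one of the space-separated FLAGS.
# Padding both sides with spaces prevents "sse" from matching inside "sse2".
has_cpu_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

for f in cmov sse sse2; do
    if has_cpu_flag "$flags" "$f"; then
        echo "$f: present"
    else
        echo "$f: missing"
    fi
done
```

For the Athlon XP in this report, cmov and sse are present and sse2 is missing, which matches the crash on code built with -msse2.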

Comment 15 Alexander Kahl 2007-11-20 09:03:50 UTC
You guys are right, explicit sse(2) optimization definitely doesn't belong in
the CFLAGS; I'm going to fix the build. This explains the illegal instruction
crash; I should have thought of this myself earlier.
Thanks a lot!

Comment 16 Alexander Kahl 2007-11-20 16:51:51 UTC
Name,

would you please install the new libzzub build from updates-testing and report
whether it works for you? The command is
yum --enablerepo=updates-testing update libzzub

Thank you

Comment 17 Fedora Update System 2007-11-20 17:59:01 UTC
libzzub-0.2.3-10.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update libzzub'

Comment 18 Name Witheld 2007-11-20 22:46:45 UTC
Everything is working beautifully. The SSE2 CFLAG was the problem.

Thank you, everyone.

Comment 19 Alexander Kahl 2007-11-21 08:57:25 UTC
You're welcome.

Comment 20 Fedora Update System 2007-11-22 03:38:04 UTC
libzzub-0.2.3-10.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.