Bug 448124
| Summary: | RHEL4-U7/ppc gnome desktop fails to start | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | James Laska <jlaska> |
| Component: | comps | Assignee: | Daniel Mach <dmach> |
| Status: | CLOSED ERRATA | QA Contact: | James Laska <jlaska> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.7 | CC: | atodorov, dgregor, dmach, jrb, jturner, mbarnes, mikem, npetrov, rstrode, zcerza |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHBA-2008-0655 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2008-07-24 19:07:09 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 354111 | | |
| Attachments: | | | |
Description
James Laska
2008-05-23 16:07:05 UTC
Created attachment 306515 [details]
Screenshot demonstrating error dialogs on login
Can you attach the output of
ps -ef
and
ls -la /tmp/orbit-root
and
rpm -qa libbonobo gnome-panel --qf="%{name}-%{version}-%{release}.%{arch}\n"
(In reply to comment #2)
> Can you attach the output of
> ps -ef

UID PID PPID C STIME TTY TIME CMD
root 1 0 0 13:21 ? 00:00:00 init [5]
root 2 1 0 13:21 ? 00:00:00 [migration/0]
root 3 1 0 13:21 ? 00:00:00 [ksoftirqd/0]
root 4 1 0 13:21 ? 00:00:00 [migration/1]
root 5 1 0 13:21 ? 00:00:00 [ksoftirqd/1]
root 6 1 0 13:21 ? 00:00:00 [events/0]
root 7 1 0 13:21 ? 00:00:00 [events/1]
root 8 6 0 13:21 ? 00:00:00 [khelper]
root 47 6 0 13:21 ? 00:00:00 [kblockd/0]
root 48 6 0 13:21 ? 00:00:00 [kblockd/1]
root 49 1 0 13:21 ? 00:00:00 [khubd]
root 76 1 0 13:21 ? 00:00:00 [rtasd]
root 77 6 0 13:21 ? 00:00:00 [pdflush]
root 78 6 0 13:21 ? 00:00:00 [pdflush]
root 79 1 0 13:21 ? 00:00:00 [kswapd0]
root 80 6 0 13:21 ? 00:00:00 [aio/0]
root 81 6 0 13:21 ? 00:00:00 [aio/1]
root 240 6 0 13:21 ? 00:00:00 [khvcd]
root 247 1 0 13:21 ? 00:00:00 [kseriod]
root 467 7 0 13:21 ? 00:00:00 [ata/0]
root 468 6 0 13:21 ? 00:00:00 [ata/1]
root 469 7 0 13:21 ? 00:00:00 [ata_aux]
root 473 1 0 13:21 ? 00:00:00 [scsi_eh_0]
root 474 1 0 13:21 ? 00:00:00 [scsi_eh_1]
root 482 1 0 13:21 ? 00:00:00 [scsi_eh_2]
root 494 1 0 13:21 ? 00:00:00 [scsi_eh_3]
root 527 1 0 13:21 ? 00:00:00 [kjournald]
root 1042 7 0 13:21 ? 00:00:00 [kauditd]
root 1911 1 0 13:21 ? 00:00:00 udevd
root 2460 6 0 13:21 ? 00:00:00 [kmpathd/0]
root 2461 6 0 13:21 ? 00:00:00 [kmpathd/1]
root 2552 1 0 13:21 ? 00:00:00 [kjournald]
root 2730 6 0 13:21 ? 00:00:00 [ib_mcast]
root 2731 6 0 13:21 ? 00:00:00 [ib_inform]
root 2732 7 0 13:21 ? 00:00:00 [local_sa]
root 2735 7 0 13:21 ? 00:00:00 [ib_cm/0]
root 2736 6 0 13:21 ? 00:00:00 [ib_cm/1]
root 2756 7 0 13:21 ? 00:00:00 [ib_addr]
root 2762 6 0 13:21 ? 00:00:00 [iw_cm_wq]
root 2770 6 0 13:21 ? 00:00:00 [rdma_cm]
root 3148 7 0 13:21 ? 00:00:00 [ipoib]
root 3246 7 0 13:21 ? 00:00:00 [sdp]
root 10173 1 0 13:21 ? 00:00:00 /sbin/dhclient -1 -q -lf /var/lib/dhcp/dhclient-eth0.leases -pf /var/run/dhclient-eth0.pid eth0
root 10988 1 0 13:21 ? 00:00:00 syslogd -m 0
root 11132 1 0 13:21 ? 00:00:00 klogd -x
rpc 11598 1 0 13:21 ? 00:00:00 portmap
rpcuser 12064 1 0 13:21 ? 00:00:00 rpc.statd
root 13599 1 0 13:21 ? 00:00:00 rpc.idmapd
root 15379 1 0 13:21 ? 00:00:01 /sbin/iprinit --daemon
root 16282 1 0 13:21 ? 00:00:01 /sbin/iprupdate --daemon
root 25674 1 0 13:22 ? 00:00:00 /usr/sbin/smartd
root 25707 1 0 13:22 ? 00:00:00 cupsd
root 25760 1 0 13:22 ? 00:00:00 /usr/sbin/sshd
root 25775 1 0 13:22 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 25811 1 0 13:22 ? 00:00:00 sendmail: accepting connections
smmsp 25821 1 0 13:22 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root 25878 1 0 13:22 ? 00:00:00 gpm -m /dev/input/mice -t imps2
htt 25989 1 0 13:22 ? 00:00:00 /usr/sbin/htt -retryonerror 0
htt 25990 25989 0 13:22 ? 00:00:00 htt_server -nodaemon
canna 26002 1 0 13:22 ? 00:00:01 /usr/sbin/cannaserver -syslog -u canna
root 26014 1 0 13:22 ? 00:00:00 crond
xfs 26081 1 0 13:22 ? 00:00:00 xfs -droppriv -daemon
root 26091 1 0 13:22 ? 00:00:00 anacron -s
root 26100 1 0 13:22 ? 00:00:00 /usr/sbin/atd
dbus 26126 1 0 13:22 ? 00:00:00 dbus-daemon-1 --system
root 26140 1 0 13:22 ? 00:00:00 cups-config-daemon
root 26161 1 0 13:22 ? 00:00:00 /sbin/iprdump
root 26394 1 0 13:22 hvsi0 00:00:00 /sbin/agetty hvsi0 9600 vt100-nav
root 26395 1 0 13:22 tty1 00:00:00 /sbin/mingetty tty1
root 26396 1 0 13:22 tty2 00:00:00 /sbin/mingetty tty2
root 26397 1 0 13:22 tty3 00:00:00 /sbin/mingetty tty3
root 26398 1 0 13:22 tty4 00:00:00 /sbin/mingetty tty4
root 26399 1 0 13:22 tty5 00:00:00 /sbin/mingetty tty5
root 26400 1 0 13:22 tty6 00:00:00 /sbin/mingetty tty6
root 26401 1 0 13:22 ? 00:00:00 /usr/bin/gdm-binary -nodaemon
root 27129 26401 0 13:22 ? 00:00:00 /usr/bin/gdm-binary -nodaemon
root 27131 27129 0 13:22 ? 00:00:02 /usr/X11R6/bin/X :0 -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp
gdm 27358 27129 0 13:22 ? 00:00:01 /usr/bin/gdmgreeter
root 27740 25760 0 13:28 ? 00:00:00 sshd: root@pts/0
root 27742 27740 0 13:29 pts/0 00:00:00 -bash
root 27923 1 0 13:31 ? 00:00:00 [rpciod]
root 27924 1 0 13:31 ? 00:00:00 [lockd]
root 28114 1 1 13:47 pts/0 00:00:00 Xvnc :1 -desktop ibm-l4a-lp1.test.redhat.com:1 (root) -httpd /usr/share/vnc/classes -auth /root/.Xauthority -geometry 1024x768 -depth 16 -rfbwait 30000 -rfbauth /root/.vnc/passwd -rfbport 5901 -pn
root 28116 1 2 13:47 pts/0 00:00:00 /usr/bin/gnome-session
root 28139 1 0 13:47 pts/0 00:00:00 /usr/bin/dbus-launch --exit-with-session /etc/X11/xinit/Xclients
root 28140 1 0 13:47 ? 00:00:00 dbus-daemon-1 --fork --print-pid 8 --print-address 6 --session
root 28146 1 2 13:47 pts/0 00:00:00 /usr/libexec/gconfd-2 13
root 28148 1 0 13:47 pts/0 00:00:00 /usr/bin/gnome-keyring-daemon
root 28150 1 0 13:47 ? 00:00:00 /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=19
root 28152 1 0 13:47 ? 00:00:00 /usr/bin/metacity --sm-client-id=default1
root 28156 1 1 13:47 ? 00:00:00 gnome-panel --sm-client-id default2
root 28158 1 1 13:47 ? 00:00:00 nautilus --no-default-window --sm-client-id default3
root 28162 1 0 13:47 ? 00:00:00 eggcups --sm-client-id default5
root 28164 1 0 13:47 ? 00:00:00 pam-panel-icon --sm-client-id default0
root 28166 1 0 13:47 ? 00:00:00 /usr/libexec/gam_server
root 28171 28164 0 13:47 ? 00:00:00 /sbin/pam_timestamp_check -d root
root 28174 27742 0 13:47 pts/0 00:00:00 ps -wef

> ls -la /tmp/orbit-root

total 56
drwx------ 2 root root 4096 May 23 13:47 .
drwxrwxrwt 11 root root 4096 May 23 13:48 ..
-rwx------ 1 root root 0 May 23 13:30 bonobo-activation-register.lock
-rw-r--r-- 1 root root 661 May 23 13:47 bonobo-activation-server-ior
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6dd4-0-5f2a6b8338bc2
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6df2-0-3ba4b83337f9e
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6df6-0-10e3cab25cc3e
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6df8-0-1686f0bdb062c
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6dfc-0-cc28e7824b5a
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6dfe-0-cc28e7839cf9
srwxr-xr-x 1 root root 0 May 23 13:47 linc-6e02-0-cc28e787477e

> rpm -qa libbonobo gnome-panel --qf="%{name}-%{version}-%{release}.%{arch}\n"

libbonobo-2.8.0-2.ppc64
libbonobo-2.8.0-2.ppc
gnome-panel-2.8.1-9.el4.ppc

so bonobo and multilib don't get along in rhel 4. We fixed this in rhel5 but we can't do the same fix in rhel 4 (it's way too invasive).

For some reason the ppc64 bit libbonobo is getting pulled in now. That's what's causing this problem. We need to figure out why libbonobo.ppc64 is getting pulled in on ppc.

Created attachment 308445 [details]
Find offending package
So jlaska gave me access to the machine over irc and I investigated a bit
today.
The above shell script tries to figure out which chain of packages is pulling
in libbonobo.ppc64
[root@ibm-l4a-lp1 ~]# sh find-the-evil-one.sh libbonobo.ppc64
gnome-vfs2.ppc64
gtkhtml2.ppc64
EVIL: gtkhtml2.ppc64
So it seems that gtkhtml2 is what is causing the problem. Note though that
gtkhtml2 isn't in the list of packages that were mentioned as new in 4.7 from
4.6.
I wonder if it got installed manually after the upgrade? jlaska, can you
reproduce this problem on any other machine? or on this machine doing a fresh
install and upgrade?
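The attached script is not reproduced in this report. As a rough illustration only, the kind of walk it describes (follow "who requires this package?" upward until reaching a package that nothing else requires) might look like the sketch below. The function name `find_pullers` and the `required_by` mapping are hypothetical; on a real system the mapping would have to be built from rpm queries such as `rpm -q --whatrequires`, which is not shown here.

```python
from collections import deque


def find_pullers(target, required_by):
    """Walk the reverse-dependency graph breadth-first from `target`.

    `required_by` maps a package (name.arch) to the installed packages that
    require it. Returns (chain, roots): every package reached, in visit
    order, and the reached packages that nothing else requires.
    """
    seen = {target}
    queue = deque([target])
    chain, roots = [], []
    while queue:
        pkg = queue.popleft()
        parents = required_by.get(pkg, [])
        if pkg != target:
            chain.append(pkg)
            if not parents:
                roots.append(pkg)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return chain, roots


# Hypothetical data shaped like the chain reported in this comment.
required_by = {
    "libbonobo.ppc64": ["gnome-vfs2.ppc64"],
    "gnome-vfs2.ppc64": ["gtkhtml2.ppc64"],
}
chain, roots = find_pullers("libbonobo.ppc64", required_by)
for pkg in chain:
    print(pkg)
for pkg in roots:
    print("EVIL: " + pkg)
```

A package that is only reachable through a dependency cycle is never reported as a root, because it always has at least one requirer.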
I've just reinstalled the ibm-l4a-lp1.test.redhat.com system using RHEL4-U7-re20080604.0 and it continues to show the failure on startup of the gnome desktop.

$ vncviewer ibm-l4a-lp1.test.redhat.com:1
(passwd == redhat)

Comparing the RHEL4/U6/AS/ppc tree with RHEL4-U7-re20080604.0/AS/ppc shows the following added packages:

WARNING 55 package(s) added to 'RPMS'
INFO Name                      Arch
INFO ------------------------------------------
INFO aide                      ppc
INFO bzip2-devel               ppc64
INFO compat-boost-1331-devel   ppc
INFO compat-boost-1331         ppc
INFO compat-dapl-1.2.5         ppc
INFO compat-dapl-devel-1.2.5   ppc
INFO compat-dapl-static-1.2.5  ppc
INFO dapl-devel                ppc
INFO dapl-static               ppc
INFO dapl-utils                ppc
INFO dapl                      ppc
INFO finch-devel               ppc
INFO finch                     ppc
INFO gnome-vfs2                ppc64
INFO ibsim                     ppc
INFO ibvexdmtools              ppc
INFO infiniband-diags          ppc
INFO libaio-devel              ppc64
INFO libbonobo                 ppc64
INFO libcxgb3-static           ppc
INFO libehca-static            ppc
INFO libibcm-static            ppc
INFO libibcommon-static        ppc
INFO libibmad-static           ppc
INFO libibumad-static          ppc
INFO libibverbs-static         ppc
INFO libipathverbs             ppc64
INFO libmlx4-static            ppc
INFO libmlx4                   ppc
INFO libmthca-static           ppc
INFO libnes-static             ppc
INFO libnes                    ppc
INFO libpurple-devel           ppc
INFO libpurple-perl            ppc
INFO libpurple-tcl             ppc
INFO libpurple                 ppc
INFO librdmacm-static          ppc
INFO libsmi-devel              ppc
INFO libsmi                    ppc
INFO mpi-selector              noarch
INFO nspr-devel                ppc
INFO nspr                      ppc
INFO nspr                      ppc64
INFO nss-devel                 ppc
INFO nss                       ppc
INFO nss                       ppc64
INFO ofed-docs                 ppc
INFO opensm-static             ppc
INFO pexpect                   noarch
INFO pidgin-devel              ppc
INFO pidgin-perl               ppc
INFO qperf                     ppc
INFO srptools                  ppc
INFO unixODBC-devel            ppc64
INFO wacomexpresskeys          ppc

Additionally, there are some potentially interesting lines in the compose logs that might help identify what's pulling in the bad ppc64 packages:

http://porkchop.devel.redhat.com/rel-eng/RHEL4-U7-re20080604.0/ppc/ppc-logs/

Diff'ing the comps.xml files for RHEL4/U6 and RHEL4/U7 shows that gnome-vfs2 has
been added to the compat-arch group
$ diff -u /mnt/redhat/released/RHEL-4/U6/AS/ppc/tree/RedHat/base/comps.xml
/mnt/redhat/rel-eng/RHEL4-U7-re20080604.0/ppc/ppc-AS/RedHat/base/comps.xml
@@ -1274,6 +1191,7 @@
<packagereq type='default'>gmp</packagereq>
<packagereq type='default'>gnome-keyring</packagereq>
<packagereq type='default'>gnome-themes</packagereq>
+ <packagereq type='default'>gnome-vfs2</packagereq>
<packagereq type='default'>gphoto2</packagereq>
<packagereq type='default'>gpm</packagereq>
<packagereq type='default'>gsl</packagereq>
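A textual diff like the one above depends on line positions. A comparison of the package sets per group is one alternative, sketched here under assumptions: that each `<group>` element carries an `<id>` child and `<packagereq>` descendants, and that the group in question has the id `compat-arch-support`. The function names and the inline sample documents are hypothetical and far smaller than a real comps.xml.

```python
import xml.etree.ElementTree as ET


def group_packages(comps_xml, group_id):
    """Return the set of package names listed in one comps group."""
    root = ET.fromstring(comps_xml)
    for group in root.iter("group"):
        if group.findtext("id") == group_id:
            return {req.text.strip() for req in group.iter("packagereq") if req.text}
    return set()


def added_packages(old_xml, new_xml, group_id):
    """Return, sorted, the packages present in the new group but not the old."""
    return sorted(group_packages(new_xml, group_id) - group_packages(old_xml, group_id))


# Hypothetical, heavily trimmed stand-ins for the U6 and U7 comps.xml files.
OLD = """<comps><group><id>compat-arch-support</id><packagelist>
<packagereq type='default'>gnome-themes</packagereq>
<packagereq type='default'>gphoto2</packagereq>
</packagelist></group></comps>"""

NEW = """<comps><group><id>compat-arch-support</id><packagelist>
<packagereq type='default'>gnome-themes</packagereq>
<packagereq type='default'>gnome-vfs2</packagereq>
<packagereq type='default'>gphoto2</packagereq>
</packagelist></group></comps>"""

print(added_packages(OLD, NEW, "compat-arch-support"))
```

For files on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`; whether the real files parse cleanly this way has not been checked here.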
ah that would explain it. It obviously doesn't explain why gtkhtml2.ppc64 is also getting pulled in, though. Anyway, that's our problem, so we should figure out why it got added and solve that problem in a different way.

Looks like it was added by Mike. Mike, why do we need gnome-vfs2 in comps?

I take that back, Mike was just moving stuff around I think. Looking at cvs log we have:

----------------------------
revision 1.1
date: 2008/01/23 17:50:21; author: dmach; state: Exp;
created rhel4.7 comps, added compat-libstdc++-33 to base and gnome-vfs2 to multilib

Daniel, why was that needed?

This change was for bug#354111 (Daniel, in future, please reference bug numbers in the changelog if possible when making comps changes).

This situation nicely illustrates why I tend to push back on multilib changes, and why I prefer to get them in early so that there is time for the bugs to shake out. (see https://bugzilla.redhat.com/show_bug.cgi?id=354111#c43 )

The comps change went in /months/ ago. Why are we only seeing this issue come up now? (bug filed about 2 weeks ago)

Mike: fair question. I suspect several factors ... it's ppc, it's RHEL4, and it's gnome. I suspect we *only* hit this if you perform an @everything or a @compat-arch install ... which is common for installation test, but not for desktop testing. I'll chase that portion down.

Ray: Since this does seem to pop up every so often, is there any way we can detect such multilib failure conditions in an automated fashion? Or is this just specific to these *exact* versions of bonobo not being multilib friendly?

Let me explain the problem.

libbonobo ships with a program called bonobo-activation-server. Apps that link against libbonobo talk to bonobo-activation-server whenever they want to run a component that provides an interface via bonobo. The apps normally say something like "bonobo-activation-server, please launch a program that provides the function Foo()" and then bonobo-activation-server looks for the binary and starts it and then when the program is started the client tells the program to run the function Foo(). That works fine in multilib setups.

Some bonobo components aren't separate binaries though. They run in the process of the client instead. In those cases, a client will say "bonobo-activation-server, please tell me the path of the library that provides the function Foo()" and then bonobo-activation-server looks for the library and returns it back to the client who dlopens the library and then runs Foo().

Now 64-bit programs can't dlopen() 32-bit libraries and vice versa. This means that for in-process components, bonobo-activation-server needs to return the path to a 32-bit library for 32-bit clients and 64-bit library for 64-bit clients. It doesn't do this in RHEL 4. Instead, if bonobo-activation-server is 32-bit it will always return a path to a 32-bit library and if it is 64-bit it will always return a path to a 64-bit library.

I fixed this in rhel5, but the fix requires changing every package that ships shared library (in-process) bonobo components. We can't make a change that is that invasive in RHEL4.

Now one important thing to keep in mind is we hardly ever want to run 64-bit ppc binaries (32-bit binaries on ppc outperform and in general are just better than 64-bit binaries on ppc). Normally we want a 64-bit ppc kernel and everything else 32-bit. That means we always want to ship a 32-bit bonobo-activation-server, 32-bit bonobo components and 32-bit programs to use those components.

A bug in our multilib setup is that 64-bit packages always "win". If you've got the 64-bit and 32-bit package with bonobo-activation-server in it, then we'll end up with a 64-bit bonobo-activation-server on disk. For ppc, we really want 32-bit packages to win, but that's not going to change in RHEL4.

As mentioned earlier, bonobo-activation-server is in the libbonobo package. This means we can't install the 64-bit libbonobo package on a multilib ppc installation.

For automated testing purposes, you can detect this failure by identifying whether bonobo-activation-server is 32-bit or 64-bit and then checking the opposite bonobo server dir for shlib components. I'll attach a little shell script that I think should work as an example.

Created attachment 308536 [details]
look for incompatible .server files
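The attachment itself is not reproduced in this report. The check described in the comment (find out whether bonobo-activation-server is 32-bit or 64-bit, then look in the opposite server directory for shlib components) could be sketched roughly as follows. This is not the attached script. The two server directory paths and the `.server` layout (`<oaf_server ... type="shlib">`) are assumptions, and all function names are hypothetical; only the bonobo-activation-server path comes from the `ps` output earlier in this report.

```python
import os
import xml.etree.ElementTree as ET


def elf_bits(path):
    """Return 32 or 64 from the ELF header's EI_CLASS byte, or None if not ELF."""
    with open(path, "rb") as f:
        ident = f.read(5)
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        return None
    # EI_CLASS (byte 4): 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: 32, 2: 64}.get(ident[4])


def shlib_servers(servers_dir):
    """List .server files in servers_dir that declare an in-process (shlib) component."""
    hits = []
    if not os.path.isdir(servers_dir):
        return hits
    for name in sorted(os.listdir(servers_dir)):
        if not name.endswith(".server"):
            continue
        path = os.path.join(servers_dir, name)
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        if any(s.get("type") == "shlib" for s in root.iter("oaf_server")):
            hits.append(path)
    return hits


def incompatible_servers(activation_server, dir32, dir64):
    """Return (bits, files): the server's word size and the shlib .server
    files found in the directory for the opposite word size."""
    bits = elf_bits(activation_server)
    if bits is None:
        raise ValueError("not an ELF file: " + activation_server)
    opposite = dir64 if bits == 32 else dir32
    return bits, shlib_servers(opposite)


SERVER = "/usr/libexec/bonobo-activation-server"  # path as shown in the ps output
if os.path.exists(SERVER):
    bits, bad = incompatible_servers(
        SERVER,
        "/usr/lib/bonobo/servers",    # assumed 32-bit server directory
        "/usr/lib64/bonobo/servers",  # assumed 64-bit server directory
    )
    print("%d-bit server, %d shlib component(s) in the other directory" % (bits, len(bad)))
```

Reading the ELF header directly avoids depending on the output format of an external tool; a non-empty list signals the mismatch the comment describes.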
(note this test isn't sufficient for testing the fix deployed on rhel5, it's only sufficient for finding the problem on rhel4)

Summary of previous comments:
- it's not reasonable to fix libbonobo in rhel4
- to fix this issue, we have to revert bug#354111 (gnome-vfs2 was added to multilib list in comps, we need to remove it)
- according to the email I got from John Jarvis, IBM agrees with this revert

gnome-vfs2 has been removed from compat-arch-support group.

VERIFIED changes on RHEL4-U7-re20080618.0 and RHEL4/ppc desktop is now happy.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0655.html