Bug 808152

Summary: libnm-glib.so.4(nm_remote_settings_new..) crashes gnome-shell Was:Xorg can't use basic framebuffer (/dev/fb0) driver.
Product: [Fedora] Fedora Reporter: Konrad Rzeszutek Wilk <ketuzsezr>
Component: NetworkManager Assignee: Dan Williams <dcbw>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: awilliam, dcbw, jklimes, mishu, robatino, tflink, xen-maint, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-04-04 07:35:13 UTC Type: ---
Bug Depends On:    
Bug Blocks: 752649    
Attachments:
Description                                       Flags
Xorg.0.log file                                   none
Output of running 'startx' (stdout)               none
More interesting output (the stderr) of startx    none

Description Konrad Rzeszutek Wilk 2012-03-29 17:35:19 UTC
Description of problem:

After installing Fedora 17 in graphical mode (no trouble), the first reboot shows that the framebuffer is working just fine, but when Xorg starts I get "Oh no, Something has gone wrong".

Thinking this might be related to the framebuffer, I ran:

"exec /sbin/init 3"
logged in, and ran 'fbterm' without any trouble - so it looks like the kernel /dev/fb0 is OK.

The last entries in the Xorg.0.log  are:

[  9204.704] (EE) FBDEV(0): FBIOBLANK: Invalid argument

and thinking this might be ShadowFB related I added this file:

[root@f17 ~]# more /etc/X11/xorg.conf.d/01-turn-off.conf 
Section "Device"
	Identifier "ugh"
	Driver "fbdev"
	Option	"UseFBDev"	"true"
	Option "ShadowFB" "Off"
EndSection
[root@f17 ~]# 


But no luck.

Version-Release number of selected component (if applicable):

1.12.0
How reproducible:

Always
Steps to Reproduce:
1. Launch a Xen virtual guest, using these parameters (you will need to adjust the HTTP URLs, of course):

virt-install -x "console=hvc0 loglevel=8 root=live:http://build/tftpboot/f17-i386/LiveOS/squashfs.img" -n F17 --ram 1024 -p --disk /dev/vg_guest/f17_32 --location http://build/tftpboot/f17-i386/

2. Install (also clicked on Fedora 17
3. Boot guest
  
Actual results:

"Oh no, Something has gone wrong".


Expected results:

A nice Xorg screen.

Additional info:

Comment 1 Adam Jackson 2012-04-02 14:32:34 UTC
The FBIOBLANK message is harmless, that's just fbdev's way of saying the device doesn't have DPMS controls.

If the default depth of the framebuffer is 16bpp then the crash is a bug in Mesa that should be "fixed" in current builds, in that it'll render wrong but at least not crash. Can you attach /var/log/Xorg.0.log from this scenario?
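
For anyone who wants to check the framebuffer depth on a similar guest, here is a minimal Python sketch. It assumes the kernel exposes the depth through the sysfs attribute /sys/class/graphics/fb0/bits_per_pixel. The function name and its parameters are made up for illustration and are not part of any tool mentioned in this bug.

```python
from pathlib import Path


def framebuffer_depth(fb="fb0", sysfs_root="/sys/class/graphics"):
    """Return the framebuffer depth in bits per pixel, or None if it cannot be read."""
    attr = Path(sysfs_root) / fb / "bits_per_pixel"
    try:
        return int(attr.read_text().strip())
    except (OSError, ValueError):
        # No such framebuffer, no permission, or unexpected file content.
        return None


if __name__ == "__main__":
    depth = framebuffer_depth()
    if depth is None:
        print("fb0 depth: not available")
    else:
        print(f"fb0 depth: {depth} bpp")
```

A result of 16 would match the 16bpp case described above. Any other value suggests looking elsewhere.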

Comment 2 Konrad Rzeszutek Wilk 2012-04-02 18:38:28 UTC
Created attachment 574587 [details]
Xorg.0.log file

So I downloaded the latest builds:

[root@f17 ~]# rpm -qa | grep mesa
mesa-libxatracker-8.0.2-1.fc17.i686
mesa-dri-filesystem-8.0.2-1.fc17.i686
mesa-libglapi-8.0.2-1.fc17.i686
mesa-dri-drivers-8.0.2-1.fc17.i686
mesa-libGL-8.0.1-9.fc17.i686
mesa-libGLU-8.0.2-1.fc17.i686

and I still get the issue. Let me upload the Xorg.0.log files.
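
In the listing above, mesa-libGL is at 8.0.1-9.fc17 while the other Mesa packages are at 8.0.2-1.fc17. Whether that matters here is not established. A small Python sketch for spotting this kind of skew in `rpm -qa` output could look like the following. The helper name is hypothetical, and it assumes each line has the usual name-version-release.arch form.

```python
from collections import defaultdict


def group_by_version(nvra_lines):
    """Group package names by their 'version-release' string."""
    groups = defaultdict(list)
    for line in nvra_lines:
        line = line.strip()
        if not line:
            continue
        # Drop the trailing ".arch", then split off release and version from the right,
        # because package names may themselves contain dashes.
        nvr, _, _arch = line.rpartition(".")
        name, version, release = nvr.rsplit("-", 2)
        groups[f"{version}-{release}"].append(name)
    return dict(groups)


if __name__ == "__main__":
    listing = """
    mesa-libxatracker-8.0.2-1.fc17.i686
    mesa-dri-filesystem-8.0.2-1.fc17.i686
    mesa-libglapi-8.0.2-1.fc17.i686
    mesa-dri-drivers-8.0.2-1.fc17.i686
    mesa-libGL-8.0.1-9.fc17.i686
    mesa-libGLU-8.0.2-1.fc17.i686
    """
    for version, names in group_by_version(listing.splitlines()).items():
        print(version, ", ".join(names))
```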

Comment 3 Konrad Rzeszutek Wilk 2012-04-02 19:53:28 UTC
Adam recommended I try installing KDM to see if the problem disappears.

And sure enough, once I did the install and did:

system-switch-displaymanager kdm

and exec /sbin/init 5

I got a normal splash screen with blue-red-white fireworks. I can log in (the session is set to GNOME by default) and everything works fine. (If I select KDE Plasma I can log in as well.)

Comment 4 Tim Flink 2012-04-02 20:25:08 UTC
Since this only seems to affect xen and the use of KDM is a workaround, I'm -1 blocker on this and about +.5 NTH. Even though this happens right after first install, it could be fixed through updates as long as people are installing w/ updates enabled.

Comment 5 Adam Williamson 2012-04-02 20:35:51 UTC
so we're definitely on llvmpipe stuff here.

Annoyingly, I can't seem to get fbdev to work at all in a KVM/cirrus/vnc VM to see if I can reproduce in that config. It just says the framebuffer device doesn't exist, even though I can do ls -l /dev/fb0 and it's right there.

But we have multiple reports of https://bugzilla.redhat.com/show_bug.cgi?id=809052 , which shows pretty strongly that a standard KVM/cirrus/vnc config doesn't hit *this* bug. I'm therefore -1 beta blocker on this one. Even if it hits any attempt to use fbdev, honestly, that's niche enough to not be a beta blocker in my view.

Other votes?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 6 Konrad Rzeszutek Wilk 2012-04-03 14:52:23 UTC
Created attachment 574889 [details]
Output of running 'startx' (stdout)

Comment 7 Konrad Rzeszutek Wilk 2012-04-03 14:53:42 UTC
Created attachment 574891 [details]
More interesting output (the stderr) of startx

.. which shows:

ing up monitor for changes to config file:'/root/.config/tracker/tracker-store.cfg'
Tracker-Message: Setting up monitor for changes to config file:'/root/.config/tracker/tracker-store.cfg'
Gtk-Message: Failed to load module "pk-gtk-module"
Tracker-Message: Setting up monitor for changes to config file:'/root/.config/tracker/tracker-miner-fs.cfg'
Failed to play sound: File or data not found

(nm-applet:13422): GLib-WARNING **: GError set over the top of a previous GError or uninitialized memory.
This indicates a bug in someone's code. You must ensure an error is NULL before it's set. 
The overwriting error message was: Unit dbus-org.freedesktop.NetworkManager.service failed to load: No such file or directory. See system logs and 'systemctl status dbus-org.freedesktop.NetworkManager.service' for details.

** (nm-applet:13422): WARNING **: _nm_remote_settings_ensure_inited: (NMRemoteSettings) error initializing: @@\xa3      

*** glibc detected *** nm-applet: munmap_chunk(): invalid pointer: 0x09a33f50 ***
======= Backtrace: =========
/lib/libc.so.6[0x44dac302]
/lib/libglib-2.0.so.0[0x4ccbc27c]
/lib/libglib-2.0.so.0(g_free+0x28)[0x4ccbc438]
/lib/libglib-2.0.so.0(g_error_free+0x2a)[0x4cca1d7a]
/lib/libnm-glib.so.4[0x4d92df3d]
/lib/libnm-glib.so.4(nm_remote_settings_new+0x45)[0x4d92ffe5]
nm-applet[0x805ae8b]
/lib/libgobject-2.0.so.0(g_object_newv+0x2dd)[0x4cdb14dd]
/lib/libgobject-2.0.so.0(g_object_new_valist+0x21a)[0x4cdb1d1a]
/lib/libgobject-2.0.so.0(g_object_new+0x84)[0x4cdb1f74]
nm-applet(nm_applet_new+0x29)[0x8060129]
nm-applet(main+0x18a)[0x805781a]
/lib/libc.so.6(__libc_start_main+0xf5)[0x44d52785]
nm-applet[0x8057885]
======= Memory map: ========
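
To make the trace above easier to read, here is a rough Python sketch that splits glibc backtrace lines into (object, symbol, address) tuples and picks out the first frame from a given library. The regular expression and the helper names are illustrative assumptions based only on the line format seen in this paste.

```python
import re

# Matches lines such as:
#   /lib/libglib-2.0.so.0(g_free+0x28)[0x4ccbc438]
#   nm-applet[0x805ae8b]
FRAME_RE = re.compile(
    r"^(?P<obj>[^\[(]+)"                                         # binary or shared object
    r"(?:\((?P<sym>[^+)]*)(?:\+(?P<off>0x[0-9a-fA-F]+))?\))?"    # optional (symbol+offset)
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"                             # [address]
)


def parse_frames(text):
    """Return a list of (object, symbol_or_None, address) for each frame line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group("obj"), m.group("sym") or None, m.group("addr")))
    return frames


def first_frame_in(frames, needle):
    """Return the first frame whose object path contains `needle`, or None."""
    return next((f for f in frames if needle in f[0]), None)


SAMPLE = """
======= Backtrace: =========
/lib/libc.so.6[0x44dac302]
/lib/libglib-2.0.so.0(g_free+0x28)[0x4ccbc438]
/lib/libglib-2.0.so.0(g_error_free+0x2a)[0x4cca1d7a]
/lib/libnm-glib.so.4[0x4d92df3d]
/lib/libnm-glib.so.4(nm_remote_settings_new+0x45)[0x4d92ffe5]
nm-applet[0x805ae8b]
"""

if __name__ == "__main__":
    for obj, sym, addr in parse_frames(SAMPLE):
        print(f"{addr}  {obj}  {sym or '?'}")
```

Read top-down, the frames show g_error_free() being called from inside libnm-glib on the way out of nm_remote_settings_new(). That fits the GError warning printed just before the crash.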

Comment 8 Konrad Rzeszutek Wilk 2012-04-03 23:21:37 UTC
The more I play with this, the more I am coming to the conclusion that it is:
 - not related to xorg-x11-fbdrv
 - not related to xorg-x11-server

but rather that libnm-glib.so.4 is causing gnome-shell to die. This looks to be quite similar to Debian: http://groups.google.com/group/linux.debian.bugs.dist/browse_thread/thread/6d7adbda10fc2340/6170139fc5786b36?lnk=raot

which implies that the Red Hat BZ 802536 caused the regression.

Comment 9 Konrad Rzeszutek Wilk 2012-04-03 23:24:02 UTC
changing title and product..

Comment 10 Adam Williamson 2012-04-04 01:05:09 UTC
huh. well, that's interesting, but it's still the case that no-one else has seen it, afaict.

Ah. this is interesting, from the Debian report:

"Indeed, installing network-manager makes the bug disappear."

Are you doing some kind of custom package install where NetworkManager isn't included?




Comment 11 Adam Williamson 2012-04-04 01:08:06 UTC
According to the Debian bug, this and #809123 are likely dupes.

Jiri, what's your take on the blockeriness of this? What exactly is the scenario that reproduces it? NetworkManager not running? Any interface being handled by network rather than NM? I can't tell from jlaska's report whether he has NM running at all in his case.

Comment 12 Adam Williamson 2012-04-04 01:30:18 UTC
We made a call on this bug at the 2012-04-03 emergency blocker review meeting, but the understanding of the bug has changed completely since then, so that call is no longer relevant. Leaving the status open for further review.

We _really_ need to know the exact circumstance that triggers this. I _think_ it's 'starting GNOME with NetworkManager disabled', but would like definite confirmation.
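
For anyone trying to confirm this on an affected system, one way to record whether NetworkManager is running is to ask systemd. The Python sketch below shells out to `systemctl is-active`. The function name and the injectable `run` parameter are illustrative only, and the result says nothing by itself about whether this is the actual trigger.

```python
import subprocess


def unit_is_active(unit="NetworkManager.service", run=subprocess.run):
    """Return True if `systemctl is-active <unit>` prints 'active'."""
    try:
        result = run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit status just means "not active"
        )
    except OSError:
        # systemctl is not installed or could not be executed.
        return False
    return result.stdout.strip() == "active"


if __name__ == "__main__":
    state = "active" if unit_is_active() else "not active"
    print(f"NetworkManager.service is {state}")
```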

Comment 13 Jirka Klimes 2012-04-04 07:35:13 UTC
(In reply to comment #11)
> According to the Debian bug, this and #809123 are likely dupes.
> 
> Jiri, what's your take on the blockeriness of this? What exactly is the
> scenario that reproduces it? NetworkManager not running? Any interface being
> handled by network rather than NM? I can't tell from jlaska's report whether he
> has NM running at all in his case.

The error was in libnm-glib while initializing NMRemoteSettings. However, it was
in an error path, so it only affected cases where the initialization failed for
some reason. It could crash any binary using libnm-glib (NM, nm-applet, ...).
I've built and submitted an update for F17 (F16's update had been done before).

(In reply to comment #12)
> We _really_ need to know the exact circumstance that triggers this. I _think_
> it's 'starting GNOME with NetworkManager disabled', but would like definite
> confirmation.
I don't know the reproducer as I wasn't hit by it myself. But as I said before, the
problem is in the error path of settings initialization.

Duping to bug 809123 based on backtrace in comment #7.

*** This bug has been marked as a duplicate of bug 809123 ***