Created attachment 518063 [details]
RPM list for system WITH bug
Description of problem:
When using a remote desktop session via VNC (started using vncserver), gnome-panel hangs when the System menu is clicked. This causes Gnome in general to stop responding: Terminals are no longer selectable, windows are no longer switchable, etc. As such, the desktop session is entirely unusable when this bug occurs. Sending gnome-panel a SIGHUP (kill -HUP) restores functionality, though the hang can easily recur. This bug appears to be due to some deeper dependency (likely X11), as it is not tied to the specific version of gnome-panel (see Additional Info, below).
Version-Release number of selected component (if applicable):
2.16.1-7.el5 (though it would appear the bug may be due to a dependency, not with gnome-panel itself)
How reproducible:
Always, on affected systems; never, on unaffected systems (see Additional Info). Occurs only for sessions started using vncserver; does not occur for console sessions accessed via vino-server.
Steps to Reproduce:
1. Start a VNC session using vncserver; make sure it runs Gnome (issue the 'startx' command in the .vnc/xstartup)
2. Log in to the session via a VNC client (any client). Open some Terminal windows (and any other windows you choose) to test step 4.
3. Click around in the menus - Applications and Places work fine, but when clicking on System, gnome-panel will hang and become completely unresponsive.
4. Note that the Terminals are now also unresponsive.
5. Via console access or ssh, issue a kill -HUP to gnome-panel, to get a new instance started; functionality is restored, though performing step 3 will repeat the hang.
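For step 5, a minimal sketch of sending the SIGHUP from an ssh session without looking up the PID by hand. It assumes Python 3 is available on the machine (not a given on a stock RHEL 5 install) and a Linux /proc filesystem; the function names `find_pids` and `hup_gnome_panel` are made up for illustration.

```python
import os
import signal


def find_pids(name, proc_root="/proc", uid=None):
    """Return sorted PIDs whose command name is `name`, optionally owned by `uid`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        proc_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(proc_dir, "stat")) as fh:
                stat_line = fh.read()
            owner = os.stat(proc_dir).st_uid
        except OSError:
            continue  # process exited in the meantime, or is not readable
        # /proc/<pid>/stat looks like "1234 (gnome-panel) S ..."; the command
        # name sits between the first "(" and the last ")".
        start, end = stat_line.find("("), stat_line.rfind(")")
        if start == -1 or end == -1:
            continue
        comm = stat_line[start + 1:end]
        if comm == name and (uid is None or owner == uid):
            pids.append(int(entry))
    return sorted(pids)


def hup_gnome_panel():
    """Send SIGHUP to every gnome-panel process owned by the current user."""
    for pid in find_pids("gnome-panel", uid=os.getuid()):
        os.kill(pid, signal.SIGHUP)
```

Calling `hup_gnome_panel()` as the affected user should have the same effect as the manual kill -HUP; it only touches that user's own processes, so it does not need root.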
Actual results:
Upon clicking the System menu, gnome-panel hangs and causes other Gnome services to become completely unresponsive.
Expected results:
Clicking the System menu should open the menu, with gnome-panel and the rest of the session staying responsive.
Additional info:
This bug was originally filed for CentOS (see external bug link), and a workaround described there is to boot the system into runlevel 3. At runlevel 3, this bug does not manifest. However:
1) This bug manifests on RHEL as well (thus it is upstream from CentOS) and
2) I have another system at runlevel 5 where this bug does _not_ occur...
#2 means that the issue must be tied to some library (likely X11?) that is (a) different between the two systems, and (b) disabled at runlevel 3. Hopefully, this information can help to narrow down which library might be the culprit that is causing gnome-panel to experience these hangs.
The two RHEL systems are very similar, but not entirely identical; importantly, they are both running RHEL 5.5 with the _SAME_ versions of Gnome and all of its packages (except for a couple of gnome-python packages), but with slightly different versions of some X11 packages and a few other RPMs.
As a start, I've attached the list of installed RPMs for the two systems (the one with the bug, and the one without), along with a diff to highlight which packages are different. I am happy to provide any other info you need, though please note that I do not have root access or console access to these machines, so all the information has to be obtainable via a regular user.
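For anyone who wants to repeat the comparison on their own pair of machines, here is a rough sketch of how two RPM lists could be diffed by package name. It assumes each list has one name-version-release entry per line (the style `rpm -qa` prints on RHEL 5, with no architecture suffix); the function names are illustrative and this is not how the attached diff was produced.

```python
def parse_rpm_list(lines):
    """Map package name -> set of 'version-release' strings found in the list."""
    pkgs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout: <name>-<version>-<release>; the name may contain '-'.
        parts = line.rsplit("-", 2)
        if len(parts) != 3:
            continue  # not in the assumed layout, skip it
        name, version, release = parts
        pkgs.setdefault(name, set()).add(version + "-" + release)
    return pkgs


def diff_rpm_lists(with_bug, without_bug):
    """Return {name: (versions_with_bug, versions_without_bug)} where they differ."""
    a = parse_rpm_list(with_bug)
    b = parse_rpm_list(without_bug)
    diffs = {}
    for name in sorted(set(a) | set(b)):
        va, vb = a.get(name, set()), b.get(name, set())
        if va != vb:
            diffs[name] = (sorted(va), sorted(vb))
    return diffs
```

Feeding it the lines of the two saved lists returns only the packages that differ or are missing on one side, which is the part worth looking at when hunting for the culprit library.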
Created attachment 518064 [details]
RPM list for system WITHOUT bug
Created attachment 518065 [details]
diff of YES vs. NO (showing differences only, side-by-side)
I'm a real newbie, but I have a small comment to add... I don't know if it will help.
At the office, we have 3 PCs running CentOS 5.7.
I installed NX on each one so that I can remote-control these 3 PCs from my Windows 7 computer:
yum install nx = nx-3.5.0-1.el5.centos.x86_64
yum install freenx = freenx.x86_64 0:0.7.3-8.el5.centos
On my Windows 7 machine, I installed the NX Client for Windows from NoMachine (nxclient-3.5.0-7.exe).
I can connect remotely very easily. I can open a browser, click on many desktop icons, click on the Gnome menus... but as soon as I just move the mouse over the System menu (only this one), it hangs: no more mouse, no more keyboard.
So: this is not related to VNC only, as the same thing occurs with NX.
Something else, really strange... At work, I installed these 3 CentOS stations in my own office. While they were in my office, I was able to remote-control all 3 stations from my Windows 7 computer. Then I moved the 3 computers to another building of our company (to the desks of 3 users), and that is when the bug started occurring... it never happened while the stations were in my own office.
The fact that this affects NX as well as VNC makes me further suspect the XDM library, or some interaction between XDM and Xvnc, where the bug crops up when running on a "non-console" X session.
As to your last point: when the stations were in your office and you used NX to control them, were you logged into the console and used NX to control your console session? Or, did you start a "new" X server and use NX to log into that? When the stations moved to your 3 users, were you using NX to control a console-initiated session or one that you initiated remotely?
The reason I ask is because if I start an X session on the console (i.e. by just logging into the console GUI) and then use VNC to log into THAT session, the bug doesn't manifest. The bug only manifests when I start a "non-console" X session (e.g. using vncserver, the interface to Xvnc) and then use VNC to log into it. In other words, it's not the fact that one is using VNC or NX (or any other software) to connect, it's the fact that the X session was started by the console or by Xvnc. I think.
First of all, I have to specify 2 things to make sure you understand my situation:
1- I speak French and my English (believe me) is so-so... it means that I only partially understand your answer, with the help of Google Translate ;-)
2- I'm a newbie: in my case, it means that these are my very first steps in a non-Windows environment (my first CentOS install was about one month ago...). It also means that "non-console X session", "XDM and Xvnc"... are expressions I don't clearly understand for now... sorry about that...
That said, here are the steps, one by one, in my own office (so with my Win7 computer and 3 fresh CentOS installs):
1- I boot the CentOS station and get the graphical login interface (so I'm not in a kind of "DOS prompt").
2- I log in to the station as root and install NX.
3- Maybe not required, but I restart the station at the end.
4- Once the station has rebooted, I do not log in (with my account or another one). It stays at the graphical login interface.
5- From my Win7, I connect with NX Client to one of the CentOS stations. It logs me in as root (or another account, which works well also).
6- I have remote control of the CentOS station, I click on everything, and all goes well.
7- At the end, still remotely, I log out and it closes my NX Client.
8- If I go physically to the Linux station and log in as a user, then connect remotely from my Win7 to the Linux station, it works well also.
9- I call the users and tell them that after some tests it works fine and I will deliver the 3 Linux stations to their desks (in another building).
So, at the users' desks:
1- I plug in the Linux station and boot it (no monitor plugged in, as they will use remote control). I have no monitor, but I know that the system is running and is at the graphical login interface (as it was at my desk). I can ping the Linux station from another computer... it's online and of course no user is logged in at this moment...
2- I install NX Client for Windows on their Windows XP computers.
3- I connect remotely to their Linux stations, click on everything, it works well... users are happy... and suddenly I just move the mouse over the System menu (no click... just a mouse-over) and it hangs.
4- Unable to do anything, it just hangs. At that moment (with my limited knowledge), I power off the Linux station physically, then boot it. I connect remotely again and the same behavior occurs: hang. Since then, I have learned that I can SSH in and reboot the station instead of powering it off...
5- Back in my office, on my Win7 PC, I try to connect remotely: I am able to connect, able to click on everything, but it always hangs on the System menu.
I hope this answers your question... otherwise, just ask me again (in easy English please ;-))))
Well, it sounds like you are doing exactly the same thing with the computer at the user desks as you did at your desk, which disproves my hypothesis about the bug occurring only for sessions initiated when not on the console, at least when using NX... however, I can confirm that when using VNC, the bug occurs only when the X server is started separately from the console (e.g. via vncserver); when using VNC to log into the console X session (Gnome's "Remote Desktop" option, which runs through vino), the bug doesn't occur.
Hopefully someone more experienced with X (and VNC and NX) can help to figure out where this bug is residing...
I have the same issue, which was sending me down a very dark rabbit trail. :-)
I built a computer and patched it to CentOS 5.7.
I logged into this system remotely using Xvnc and XRdp with Kerberos authentication.
Everything worked fine - could use any menu including system with no problem.
I moved this system to its final resting place.
I logged into this system remotely using Xvnc and XRdp with Kerberos authentication.
As soon as I touched the System Menu tab my Gnome session locked up.
If I do a "kill -HUP <processid of gnome-panel>" the Remote Desktop responds again.
One major difference between these two locations is that the cable connecting the computer to the network switch is about 100 feet at the location that does NOT work and somewhat less at the other.
Any chance of a race/timer issue?
I am also very interested in having someone more knowledgeable than I am resolve this issue.
(In reply to comment #7)
> I logged into this system remotely using Xvnc and XRdp with Kerberos
> Everything worked fine - could use any menu including system with no problem.
> I moved this system to its final resting place.
> As soon as I touched the System Menu tab my Gnome session locked up.
When you logged into the system remotely during your first attempts (when you set the system up but before you put it into its "final resting place"), was there a user already logged into :0 on the console? For example, were you logged into the console, then logging remotely into :1 or some other instance of the X server?
When you logged into the system remotely after the system was moved, was someone logged into :0 on the console there?
My experience has been that this bug does NOT occur if someone is logged into :0 on the console... even for people who are logging into other instances (e.g. :1 or :2 or whatever) when the console user is logged in. However, when there is nobody logged into the console, this bug manifests for remote users.
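To answer the "was someone logged into :0?" question from an ssh session, something like the sketch below may help. It assumes the text comes from the `who` command and that a console X login shows up either with `:0` in the terminal column or with `(:0)` / `(:0.0)` as the trailing origin field; the exact `who` layout varies between systems, so treat the parsing as an approximation. The function name is made up.

```python
def console_users(who_output, display=":0"):
    """Return the sorted user names that `who` output associates with `display`."""
    users = set()
    for line in who_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        user, tty = fields[0], fields[1]
        # The last field is sometimes the origin in parentheses, e.g. "(:0.0)".
        origin = fields[-1].strip("()")
        if tty == display or origin == display or origin.startswith(display + "."):
            users.add(user)
    return sorted(users)
```

An empty result while the hang is reproducible, and a non-empty one while it is not, would be consistent with the pattern described above; the input can be captured by running `who` and passing its text to the function.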
As mentioned upthread, I've got two nearly identical systems where one exhibits the bug and the other does not; the differences between them are shown in the RPM lists. I have no idea which RPM(s) is/are the one(s) to blame for this problem... I can't troubleshoot by switching RPMs around because I don't have root on those systems.
Thanks for your advice.
In my case, the system where I am experiencing the problem is headless.
Does anyone have any idea how to log into the console remotely on a headless system?
Any guidance would be appreciated.
I should have mentioned that in my case the two systems are identical except for the IP address - they are both built off of a Kickstart configuration.
This leads me to believe that something is going on with latency (or similar) and the VNC GUI, since the command line works fine.
Problem has been resolved - do not know exactly what caused or resolved it though.
Halt and Shutdown permissions were set to 750
A switch was inserted to reduce the direct cable length from the computer from 100 feet to 3 feet.
If someone else experiencing the problem could try these steps in their scenario and report back their findings, it would be interesting.
Having VNCSERVER run on system startup and the system defaulting to runlevel 3 also worked for me.
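When comparing machines, it may be useful to confirm which runlevel a box actually defaults to. The sketch below reads the `initdefault` entry from inittab-style text; it assumes a SysV-style /etc/inittab (as on RHEL/CentOS 5) that is readable by a regular user, and the function name is made up for illustration.

```python
def default_runlevel(inittab_text):
    """Return the default runlevel from inittab-style text, or None if absent."""
    for line in inittab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # inittab entries have the form id:runlevels:action:process
        fields = line.split(":")
        if len(fields) >= 3 and fields[2] == "initdefault":
            return int(fields[1])
    return None
```

Passing it the contents of /etc/inittab returns 3 or 5 on a typical install, which makes it easy to record alongside whether the hang reproduces on that machine.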
I think that with some combination of the settings in:
SYSTEM | PREFERENCES | REMOTE DESKTOP | "Allow other users to view your desktop", you can also work around this issue.
I checked "Allow other users to view your desktop" and "Allow other users to control your desktop"
I also put a password in under the SECURITY heading and unchecked "Ask you for confirmation".
I think you can init to runlevel 3 or 5 in this case and the system will not freeze up. This is the same for vnc and nomachine (or NX).
Tanner, this is only applicable if you're logged into the console and want to VNC into your already-logged-in session. As I mentioned above, that scenario DOES work without a bug. However, if you are _not_ logged into the console (e.g. if the machine is headless, or if you have no physical access to the machine, etc.) and start your session with vncserver/Xvnc, then the bug manifests... VNCing into an Xvnc session rather than a console session is exactly what triggers this. Logging into the console isn't a workaround, unfortunately, since in many cases, console access is not available/desirable.
In my case, I have a headless application and having it boot to runlevel 3 works well.
I do notice a slightly different behavior however. If I have it boot to runlevel 5 with a monitor attached, I do not have the issue of the system menu causing a freeze. I do not have to logon to the console, but it does seem to require that a monitor is attached locally.
Hope this helps lead to a more general fix. :o)
Update - the fix that I thought worked (setting Halt and Shutdown permissions to 750) did not actually solve the problem - the problem was just momentarily gone :-(
Does anyone know if 5.9 fixed the issue?
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.
Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.
This Bugzilla has been reviewed by Red Hat and is not planned on being addressed in Red Hat Enterprise Linux 5, and therefore will be closed. If this bug is critical to production systems, please contact your Red Hat support representative and provide sufficient business justification.