Bug 180119

Summary: X eats all CPU, breaking display, when coming back from text mode
Product: [Fedora] Fedora
Reporter: Alexandre Oliva <oliva>
Component: xorg-x11-server
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED RAWHIDE
QA Contact:
Severity: high
Docs Contact:
Priority: medium    
Version: rawhide   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-02-11 09:36:37 UTC Type: ---

Description Alexandre Oliva 2006-02-06 04:08:33 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.0.1) Gecko/20060202 Fedora/1.5.0.1-2 Firefox/1.5.0.1

Description of problem:
If I switch to VT1 and then back to VT7, the display gets completely stuck.  X starts eating all CPU, constantly getting SIGALRM signals and returning immediately.  It doesn't matter if I switch with Ctrl-Alt-F# keystrokes or with vtswitch.  Killing X with SIGKILL doesn't bring the display back, even if I switch to runlevel 3 first, and then back to runlevel 5.

The main driver is nv.  I also have the vnc module loaded, but I'm not sure it makes any difference.  This only happens on my AMD64 notebook; a 32-bit notebook that also tracks rawhide and has a very similar configuration doesn't trigger the problem.

This is a regression from FC5T2.  It started a few days ago, but I can't tell exactly when: I don't switch to VT1 very often, since I generally run a single login session per boot.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xorg-1.0.1-1 xorg-x11-drv-nv-1.0.1.5-1 vnc-server-4.1.1-34 kernel-2.6.15-1.1909_FC5

How reproducible:
Always

Steps to Reproduce:
1. Switch to VT1 then back to VT7 (the latter running gdm or a login session)

Actual Results:  X freezes.  The box is still otherwise responsive, but I can't figure out how to get the display back.  Not even vtswitch 1, from a remote login, brings the display back.

Expected Results:  X should repaint the display and keep going.

Additional info:

Comment 1 Mike A. Harris 2006-02-06 20:26:20 UTC
Please try to determine what update caused the problem for you.  If we know
which package is the point of regression, it'll be easier to diagnose.

Also, please disable VNC to take it out of the equation for now.  Does the
problem occur without vnc loaded?

Comment 2 Alexandre Oliva 2006-02-07 22:37:55 UTC
I took VNC out of the equation, and the problem remains.  I'm not sure whether it
actually changed the behavior or I was just attaching to the X server too late in
previous attempts.  This time I switched to VT1, logged in as root, and started
strace on the running X (this time, on the SSH agent passphrase prompt, right
after logging in) with syscall output to a file.  When I switched back to VT7, X
repainted the entire screen except for the SSH agent window, and then it
stopped.  This is a snippet from the strace output, that appears to be relevant:

select(1024, [6], NULL, NULL, {0, 0})   = 1 (in [6], left {0, 0})
read(6, "Z\3\351C\0\0\0\0\252=\5\0\0\0\0\0\3\0\0\0\35\1\0\0Z\3\351"..., 64) = 48
select(1024, [6], NULL, NULL, {0, 0})   = 1 (in [6], left {0, 0})
read(6, "Z\3\351C\0\0\0\0\262=\5\0\0\0\0\0\3\0\30\0H\0\0\0Z\3\351"..., 64) = 48
rt_sigprocmask(SIG_BLOCK, [], [IO], 8)  = 0
rt_sigprocmask(SIG_BLOCK, [], [IO], 8)  = 0
select(1024, [6], NULL, NULL, {0, 0})   = 0 (Timeout)
rt_sigreturn(0x1)                       = 4294967295
--- SIGIO (I/O possible) @ 0 (0) ---
select(8, [6 7], NULL, NULL, {0, 0})    = 1 (in [6], left {0, 0})
rt_sigprocmask(SIG_BLOCK, [IO], [IO], 8) = 0
select(1024, [6], NULL, NULL, {0, 0})   = 1 (in [6], left {0, 0})
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 1
read(6, "Z\3\351C\0\0\0\0\325m\5\0\0\0\0\0\3\0\0\0\33\1\0\0Z\3\351"..., 64) = 48
select(1024, [6], NULL, NULL, {0, 0})   = 1 (in [6], left {0, 0})
read(6, "Z\3\351C\0\0\0\0\336m\5\0\0\0\0\0\0\0\0\0\0\0\0\0", 64) = 24
rt_sigprocmask(SIG_BLOCK, [], [], 8)    = 0
write(2, "(EE) SIGIO not blocked at xf86eq"..., 40) = 40
write(0, "(EE) SIGIO not blocked at xf86eq"..., 40) = 40
rt_sigprocmask(SIG_BLOCK, [], [], 8)    = 0
write(2, "(EE) SIGIO not blocked at xf86eq"..., 40) = 40
write(0, "(EE) SIGIO not blocked at xf86eq"..., 40) = 7
write(0, "GIO not blocked at xf86eqEnqueue"..., 33) = 0
write(0, "GIO not blocked at xf86eqEnqueue"..., 33) = 0
[...] (repeats forever)
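
For anyone reading the trace: the rt_sigprocmask(SIG_BLOCK, [], ...) calls pass an empty set, so they change nothing and only report the current signal mask. Early in the trace the reported mask is [IO] (SIGIO blocked); just before the "(EE) SIGIO not blocked" messages it is [] (SIGIO not blocked). Below is a minimal sketch of that query pattern in C. It is not the X server's code; the helper names signal_is_blocked and set_signal_blocked are hypothetical and exist only for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `sig` is blocked in the caller's
 * signal mask, 0 if it is not, and -1 on error. */
static int signal_is_blocked(int sig)
{
    sigset_t empty, current;

    sigemptyset(&empty);
    /* SIG_BLOCK with an empty set leaves the mask unchanged and just
     * returns the current mask in `current`.  In strace this shows up
     * as rt_sigprocmask(SIG_BLOCK, [], [...], 8). */
    if (sigprocmask(SIG_BLOCK, &empty, &current) != 0)
        return -1;
    return sigismember(&current, sig);
}

/* Hypothetical helper: blocks `sig` when `block` is nonzero, unblocks it
 * otherwise.  Returns 0 on success, -1 on error. */
static int set_signal_blocked(int sig, int block)
{
    sigset_t set;

    sigemptyset(&set);
    if (sigaddset(&set, sig) != 0)
        return -1;
    return sigprocmask(block ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL);
}
```

Calling set_signal_blocked(SIGIO, 1) and then signal_is_blocked(SIGIO) corresponds to the [IO] state seen at the top of the trace; after set_signal_blocked(SIGIO, 0) the query reports the empty-mask state that precedes the error messages.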

I suppose I can get earlier versions of the X packages out of the internal Red
Hat build system, but would you hazard a guess as to which package this might be
related to?  This would surely help narrow down the big and painful search
I have ahead of me :-)  Thanks,

Comment 3 Alexandre Oliva 2006-02-11 09:36:37 UTC
This got fixed in yesterday's rawhide.  Yay!