Bug 452136
Summary: | RHEL5.2 - rx2660 will not install in graphics mode | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Alan Matsuoka <alanm>
Component: | xorg-x11-drv-ati | Assignee: | Adam Jackson <ajax>
Status: | CLOSED NOTABUG | QA Contact: | desktop-bugs <desktop-bugs>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 5.2 | CC: | tao
Target Milestone: | rc | Keywords: | Regression
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2008-06-19 18:38:18 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 391501 | |
Attachments:

Created attachment 309851
sosreport-rx2660.3611111111-825161-284597.tar.bz2

Created attachment 309853
in house sysreport

Created attachment 309855
Xorg.0.log from in house system

Created attachment 309856
hp-merlion-01-softlockup.txt

Created attachment 309857
sosreport-hp-merlion-01.186429-296757-f529cd.tar.bz2

This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.

It appears that this is due to a bad BIOS firmware rev on the VGA controller. I re-flashed the card on the system here at Red Hat and it now works fine. I am working with HP support so they can get the customer system fixed as well. Closing as NOTABUG.
Description of problem:

Customer has an rx2660 and is trying to install RHEL5.2 in graphics mode. Graphics will not come up, console appears to hang. Installing in text mode via MP (er, ILO2) does work properly. Once installed, the problems persist. Sometimes the graphics console comes up, other times X is running but there is no output on the screen. The /var/log/Xorg.0.log simply terminates with the message of 'Backtrace', but no backtrace.

WTEC reproduced what appears to be the same problem. Installation worked via graphics mode properly, however once the system was installed, the Xserver would not start reliably. In these cases the following was observed:

- the signal light on the monitor would go amber, indicating no sync from the VGA card

- the Xserver would be running in an apparent loop

    (EE) RADEON(0): Idle timed out, resetting engine...
    (**) RADEON(0): DC flush timeout: ffffffff
    (**) RADEON(0): EngineRestore (32/32)
    (**) RADEON(0): Idle timed out: 127 entries, stat=0xffffffff

- If killed, the Xserver could not be restarted. X would show:

    (EE) No devices detected.

- dmesg would sometimes show soft lockups:

    BUG: soft lockup - CPU#0 stuck for 10s! [X:7588]
    Modules linked in: ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp rfcomm l2cap bluetooth sunrpc ip_conntrack_ftp ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api vfat fat dm_multipath button parport_pc lp parport joydev sr_mod cdrom shpchp sg tg3 dm_snapshot dm_zero dm_mirror dm_mod usb_storage cciss mptspi scsi_transport_spi mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd

    Pid: 7588, CPU 0, comm: X
    psr : 0000141008526010 ifs : 8000000000000001 ip : [<a0000001002ca002>] Not tainted
    ip is at __ia64_inb+0x82/0xc0
    unat: 0000000000000000 pfs : 0000000000000388 rsc : 0000000000000003
    rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000555699
    ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
    csd : 0000000000000000 ssd : 0000000000000000
    b0 : a0000001004dc4c0 b6 : a0000001002c9f80 b7 : a000000100201b80
    f6 : 000000000000000000000 f7 : 000000000000000000000
    f8 : 000000000000000000000 f9 : 000000000000000000000
    f10 : 000000000000000000000 f11 : 000000000000000000000
    r1 : a000000100be0270 r2 : c003fffffc000000 r3 : 0000000000000001
    r8 : 00000000000000ff r9 : a000000100a3fd38 r10 : 00000000000000f2
    r11 : 0000000000000fff r12 : e0000100762dfe20 r13 : e0000100762d8000
    r14 : c003fffffc0f23c8 r15 : 00000000000f2000 r16 : a000000100a3fd30
    r17 : 00000000000003c8 r18 : a000000100a3fd30 r19 : 0000000000000000
    r20 : a000000100a3fd30 r21 : 0000000000ffffff r22 : e00001007586d0b0
    r23 : e00001007586d120 r24 : a0000001002d8480 r25 : a0000001009de128
    r26 : e0000100f20f36a0 r27 : a000000100201b80 r28 : 0000000000000100
    r29 : a0000001002bb780 r30 : 0000000000000004 r31 : 0000000000000692

    Call Trace:
    [<a000000100013ae0>] show_stack+0x40/0xa0 sp=e0000100762dfa80 bsp=e0000100762d9518
    [<a0000001000143e0>] show_regs+0x840/0x880 sp=e0000100762dfc50 bsp=e0000100762d94c0
    [<a0000001000e8510>] softlockup_tick+0x2b0/0x320 sp=e0000100762dfc50 bsp=e0000100762d9478
    [<a000000100093cf0>] run_local_timers+0x30/0x60 sp=e0000100762dfc50 bsp=e0000100762d9458
    [<a000000100093da0>] update_process_times+0x80/0x100 sp=e0000100762dfc50 bsp=e0000100762d9420
    [<a0000001000376a0>] timer_interrupt+0x180/0x360 sp=e0000100762dfc50 bsp=e0000100762d93d8
    [<a0000001000e8bb0>] handle_IRQ_event+0x90/0x120 sp=e0000100762dfc50 bsp=e0000100762d9398
    [<a0000001000e8d70>] __do_IRQ+0x130/0x420 sp=e0000100762dfc50 bsp=e0000100762d9350
    [<a000000100011750>] ia64_handle_irq+0xf0/0x1a0 sp=e0000100762dfc50 bsp=e0000100762d9320
    [<a00000010000c020>] __ia64_leave_kernel+0x0/0x280 sp=e0000100762dfc50 bsp=e0000100762d9320
    [<a0000001002ca000>] __ia64_inb+0x80/0xc0 sp=e0000100762dfe20 bsp=e0000100762d9318
    [<a0000001004dc4c0>] ia64_pci_legacy_read+0x100/0x140 sp=e0000100762dfe20 bsp=e0000100762d92e0
    [<a0000001002d8530>] pci_read_legacy_io+0xb0/0xe0 sp=e0000100762dfe20 bsp=e0000100762d92a8
    [<a000000100201cd0>] read+0x150/0x240 sp=e0000100762dfe20 bsp=e0000100762d9268
    [<a000000100164880>] vfs_read+0x200/0x3a0 sp=e0000100762dfe20 bsp=e0000100762d9218
    [<a000000100164f50>] sys_read+0x70/0xe0 sp=e0000100762dfe20 bsp=e0000100762d9198
    [<a00000010000bdb0>] __ia64_trace_syscall+0xd0/0x110 sp=e0000100762dfe30 bsp=e0000100762d9198
    [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400 sp=e0000100762e0000 bsp=e0000100762d9198

- lspci -nvv (from sosreport) shows errors

    lspci -nvv:
    00:01.0 ff00: 103c:1303 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
    00:01.1 0780: 103c:1302 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
    00:01.2 0700: 103c:1048 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
    00:02.0 0c03: 1033:0035 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
    00:02.1 0c03: 1033:0035 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
    00:02.2 0c03: 1033:00e0 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
    00:03.0 0300: 1002:515e (rev ff) (prog-if ff)
        !!! Unknown header type 7f

- At times, the serial console would become unresponsive

How reproducible:
Somewhat random. Once it occurs it becomes more consistent. Frequency is higher if the kernel does NOT direct output to a serial console.

Steps to Reproduce:
- ensure primary console is VGA (through conconfig in EFI)
- install RHEL5.2 on rx2660 via graphics head

Actual results:
Some graphics hangs, blackouts.

Expected results:
Works.

Additional info:
We have a system in Atlanta that shows this problem. Am attempting to gather kdump information. Customer in Mexico experiencing problem. sosreport attached is from system in Atlanta lab.

On Mon, 2008-06-16 at 21:39 +0000, Red Hat Issue Tracker wrote:
> I'll see if I can track down one of our rx2660's here so I can hand an
> engineer a system that reproduces the problem. In the meantime, how
> critical is this issue to the customer? These days one typically
> doesn't associate Itaniums with desk-side systems that are used
> graphically...

The criticality is unclear. This is a new customer to HP, and the first experience they had with the machine was the problems during installation. As a result, they are left with very bad impressions of the hardware and with the Red Hat OS.

You may need to reboot several times. Concurrent access to the MP console may also be necessary, but it is not clear. In my testing after a cold boot, I had 6 or 7 boots without an issue, and then 5 or 6 with the problem followed by 2 or 3 with no problem.

I think this needs at least a medium priority since:
- there are no release notes on this issue and no warning to avoid graphical installation or usage
- the kernel soft-lockups could lead to real panic
- the serial console has also become inoperable during the soft-lockups, reducing the amount of control the user has over the system

My guess is that this is somehow related to the ILO2 functionality that is new with this machine. The lspci output seems to illustrate that.
Rick

Rick,

I seem to have duplicated the problem our customer sees. Here's a couple of notes from my testing:

- The issue seems to occur far more regularly when the remote console is open and being used, however it did occur once without the console open as well.
- It also seems to occur more regularly with the Xen kernel, however I was able to trigger it with the bare-metal kernel as well (Xorg.0.log and softlockup message attached from that session)
- The softlockups do cause the system as a whole to become extremely slow until it begins printing the errors to the X log, at which point it seems to return to some semblance of normalcy.

I'll ask Doug (the Integrity onsite engineer) if he's ever come across anything like this before, and escalate to SEG at the same time as well.

-David

Problem Summary:
Graphics often don't work on HP rx2660 systems. Usually accompanied by soft lockups (possibly unrelated?) and always by the RADEON errors in the Xorg.0.log.

Supporting Materials:
sosreports, X log, traceback messages.

Reproducer:
You can reproduce the problem simply by starting/stopping (init 3/5) X on a system. You don't even have to be looking at a monitor - I wasn't. hp-merlion-01 in RHTS will reproduce the problem. hp-rx2660-03 in RHTS is the same model system and should as well.

Requested Action from SEG:
Fix/escalate.
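
Since the reproducer can be run without a monitor, one way to tell whether a given start/stop cycle hit the problem is to scan the X log for the RADEON messages quoted in this report. The following is only a minimal sketch in Python and not part of any Red Hat tooling: the names `SIGNATURES`, `scan_xorg_log` and `looks_like_this_bug` are hypothetical, and the patterns are taken verbatim from the log excerpts in this bug rather than from the driver source, so other driver versions may word the messages differently.

```python
import re
from collections import Counter

# Message patterns copied from the log excerpts in this report.
# Assumption: the driver prints them in exactly this wording.
SIGNATURES = {
    "idle_timeout": re.compile(
        r"\(EE\) RADEON\(\d+\): Idle timed out, resetting engine"),
    "dc_flush_timeout": re.compile(
        r"RADEON\(\d+\): DC flush timeout: ffffffff"),
    "no_devices": re.compile(r"\(EE\) No devices detected"),
}


def scan_xorg_log(text):
    """Count how many lines of the log text match each signature."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def looks_like_this_bug(text):
    """Heuristic: the engine-reset loop or the failed restart was logged."""
    counts = scan_xorg_log(text)
    return counts["idle_timeout"] > 0 or counts["no_devices"] > 0


if __name__ == "__main__":
    # /var/log/Xorg.0.log is the path named in the description above.
    with open("/var/log/Xorg.0.log", errors="replace") as log:
        contents = log.read()
    print(dict(scan_xorg_log(contents)))
    print("matches report:", looks_like_this_bug(contents))
```

Running this after each init 3/5 cycle gives a per-cycle hit count. Because the report describes the failure as intermittent, a single clean log does not show that the system is unaffected.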