242746 – radeon driver doesn't hold lock during some kernel calls

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	xorg-x11-drv-ati
Sub Component:
Version:	7
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Dave Airlie
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-06-05 16:05 UTC by Adam Tkac
Modified:	2018-04-11 09:15 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-05-05 07:27:23 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
My config file (847 bytes, text/plain) 2007-06-06 12:08 UTC, Adam Tkac
Xorg log before crash (38.95 KB, text/plain) 2007-06-06 12:16 UTC, Adam Tkac
dmesg immediately before crash (20.95 KB, text/plain) 2007-06-07 10:45 UTC, Adam Tkac
my xorg.conf (553 bytes, application/octet-stream) 2008-02-27 08:12 UTC, Steve

Description Adam Tkac 2007-06-05 16:05:50 UTC

Description of problem:
I've set up the libvnc.so X module, and when I connect with vncviewer, the
server machine hangs completely.

Version-Release number of selected component (if applicable):
rpm -q xorg-x11-drv-ati
xorg-x11-drv-ati-6.6.3-2.fc7

How reproducible:
always

Steps to Reproduce:
1. Install the vnc-server package.
2. Add Load "vnc" to the Module section of xorg.conf.
3. Add Option "SecurityTypes" "None" to the Screen section of xorg.conf.
4. Start Xorg with the radeon driver and be prepared for a complete hang.
5. Connect to the server with vncviewer.
  
Actual results:
Complete hang; the computer must be restarted with the hardware button.

Expected results:
A working VNC session to the server.

Additional info:
These messages are in the log:
Jun  5 19:55:30 traged kernel: [drm:radeon_cp_start] *ERROR* radeon_cp_start called without lock held
Jun  5 19:55:30 traged kernel: [drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held
Jun  5 19:55:30 traged kernel: [drm:radeon_cp_reset] *ERROR* radeon_cp_reset called without lock held
Jun  5 19:55:30 traged kernel:  called without lock held
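
To see which DRM entry points are complaining and how often, messages like the ones above can be tallied from a saved copy of the system log. A minimal sketch in Python, assuming the message format quoted in this report; the name count_lock_errors and the sample text are illustrative only and not part of any existing tool:

```python
import re
from collections import Counter

# Matches e.g. "[drm:radeon_cp_start] *ERROR* radeon_cp_start called without lock held".
# Using \s+ between words also tolerates a message wrapped across two lines.
LOCK_ERROR = re.compile(
    r"\[drm:(\w+)\]\s+\*ERROR\*\s+\w+\s+called\s+without\s+lock\s+held"
)

def count_lock_errors(log_text):
    """Return a Counter mapping DRM function name to number of lock errors."""
    return Counter(LOCK_ERROR.findall(log_text))

# Sample text copied from the log excerpt in this report.
sample = """\
Jun  5 19:55:30 traged kernel: [drm:radeon_cp_start] *ERROR* radeon_cp_start
called without lock held
Jun  5 19:55:30 traged kernel: [drm:radeon_cp_idle] *ERROR* radeon_cp_idle
called without lock held
Jun  5 19:55:30 traged kernel: [drm:radeon_cp_reset] *ERROR* radeon_cp_reset
called without lock held
"""

for func, n in count_lock_errors(sample).items():
    print(func, n)
```

On a real system the text would come from wherever the kernel messages are stored, which varies by setup.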

Comment 1 Henrique Martins 2007-06-05 17:17:39 UTC

This happened on FC6 too, the first time I tried sharing the main display.

Comment 2 Matěj Cepl 2007-06-06 07:19:20 UTC

Adam, could we get your /etc/X11/xorg.conf and /var/log/Xorg.0.log as
attachments to this bug, please?

Comment 3 Adam Tkac 2007-06-06 12:08:57 UTC

Created attachment 156340 [details]
My config file

Comment 4 Adam Tkac 2007-06-06 12:16:07 UTC

Created attachment 156343 [details]
Xorg log before crash

Comment 5 Adam Tkac 2007-06-06 12:27:27 UTC

The system log repeatedly shows the messages I quoted in the original bug report.

Adam

Comment 6 Adam Jackson 2007-06-06 18:32:28 UTC

The X server always takes the DRI lock when it wakes up.  So I don't think
that's the problem.

Can you attach the dmesg output as well?  I suspect foul play in the kernel.

Comment 7 Adam Tkac 2007-06-07 10:44:23 UTC

(In reply to comment #6)
> Can you attach the dmesg output as well?  I suspect foul play in the kernel.

I don't know how to get dmesg output after the hang. The computer locks up
immediately, so I can only get debug info from the log on disk. As I wrote above,
a bunch of messages like

kernel: [drm:radeon_cp_start] *ERROR* radeon_cp_start called without lock held
kernel: [drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held
kernel: [drm:radeon_cp_reset] *ERROR* radeon_cp_reset called without lock held

are in the log. I remember that this worked about three months ago. I tried
installing an older kernel, and that doesn't solve the problem. I didn't change
anything in vnc, so this looks like a driver/X issue. Any hints on how to get
more debug info?

Adam

Comment 8 Adam Tkac 2007-06-07 10:45:00 UTC

Created attachment 156443 [details]
dmesg immediately before crash

Comment 9 Adam Jackson 2007-06-07 18:16:01 UTC

dmesg has:

CPU 0: aperture @ 1a2c000000 size 32 MB
Aperture too small (32 MB)

Does your BIOS have options for making the video memory aperture larger?

Comment 10 Henrique Martins 2007-06-07 18:56:57 UTC

I have the exact same problem as Adam (and started talking to him on the
vnc-list).  My dmesg doesn't have that warning message. It claims instead:

Linux agpgart interface v0.102 (c) Dave Jones
agpgart: Detected VIA KT400/KT400A/KT600 chipset
agpgart: AGP aperture is 128M @ 0xf0000000

So that may not be the problem.

-- Henrique

Comment 11 Adam Tkac 2007-06-08 10:59:31 UTC

I did some investigation. This is only reproducible when X is started with the
dri and glx modules. When those modules aren't loaded simultaneously, everything
works as expected. By the way, my BIOS is very limited and offers no way to
enlarge the video memory aperture.

Adam
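
For anyone checking whether their own xorg.conf loads the module combination described in the comment above, the Load lines can be listed quickly. A rough sketch in Python; loaded_modules and the sample config are hypothetical, and the code only looks for plain Load "name" lines rather than parsing the full xorg.conf grammar:

```python
import re

# One Load "name" entry per line; xorg.conf keywords are case-insensitive.
# Lines that start with '#' are comments and do not match.
LOAD_LINE = re.compile(r'^\s*Load\s+"([^"]+)"', re.IGNORECASE | re.MULTILINE)

def loaded_modules(conf_text):
    """Return the set of module names named in Load lines (lowercased)."""
    return {name.lower() for name in LOAD_LINE.findall(conf_text)}

# Hypothetical config text, not the attachment from this bug.
sample_conf = '''
Section "Module"
    Load "vnc"
    Load "dri"
    Load "glx"
    # Load "dbe"
EndSection
'''

mods = loaded_modules(sample_conf)
# True when both dri and glx are loaded, the case reported as failing.
dri_and_glx = {"dri", "glx"} <= mods
print(sorted(mods), dri_and_glx)
```

Commenting out either Load line in the config and rerunning the check should make dri_and_glx come out False.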

Comment 12 Dave Airlie 2007-09-26 00:26:53 UTC

This sounds like a missing block handler or missing DRI lock entry points. In
theory the server should always hold the lock while it is running, so it must be
a weird VNC interaction.

Comment 13 Bill Randle 2008-02-24 19:06:13 UTC

Has anyone discovered anything on this bug? I am seeing it in a slightly
different situation. I have an embedded Linux system with no native X11. I built
RPM packages for it based on the Fedora 7 SRPMS. The video chip is an ATI M22.
Everything works just fine when DRI is disabled, but with DRI enabled and
running glxgears, I see this error message quite often. It is especially prone
to happen when moving or resizing the gears window. Sometimes, glxgears just
exits; other times the system locks up. The two error messages I get are:
  [drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held
usually repeated several times, then
  [drm:radeon_irq_emit] *ERROR* radeon_irq_emit called without lock held

In addition, I get an error message in the xterm running gears that says:
  radeonEmitIrqLocked: drmRadeonEmitIrq: -22

Comment 14 Bill Randle 2008-02-24 19:38:49 UTC

I should add the easiest way to reproduce this is to run multiple instances of
glxgears. It doesn't show up so much with just one copy running. I upgraded from
Mesa 6.5.2 to Mesa 7.0.2, to pick up a newer r300_dri.so but that didn't make
any difference. Kernel is 2.6.14.7 with drm upgraded to 1.24.0. I can't use a
newer drm without moving to a newer kernel, which is not an option at this time.

Comment 15 Dave Airlie 2008-02-24 21:22:03 UTC

Bill, can you open a new bug, please? What you are seeing isn't the same issue
as this one.

Can you attach a log to the new bug as well?

Comment 16 Steve 2008-02-27 08:11:05 UTC

I see exactly the same errors in the system log, but in f9-rawhide.

[drm:radeon_cp_reset] *ERROR* radeon_cp_reset called without lock held, held  0 owner ea000150 ea000150
[drm:radeon_cp_start] *ERROR* radeon_cp_start called without lock held, held  0 owner ea000150 ea000150
[drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held, held  0 owner ea000150 ea000150

When I try to log in with GDM, X (or GDM) freezes (mouse and keyboard). The
system itself is still alive.

Comment 17 Steve 2008-02-27 08:12:20 UTC

Created attachment 296029 [details]
my xorg.conf

Comment 18 Shawn Starr 2008-04-05 19:21:53 UTC

ss

Comment 19 Shawn Starr 2008-04-06 08:38:04 UTC

oops, I can reproduce this in rawhide also as of April 4th/5th

Comment 20 Adam Tkac 2008-04-23 08:53:40 UTC

Can anyone confirm that this bug still exists in F9, please? I'm not able to
reproduce it with the latest vnc-server build (4.1.2-29.fc9).

Comment 21 Adam Tkac 2008-05-05 07:27:23 UTC

Looks like it's fixed; closing.
