Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Initializing cgroup subsys cpuacct
Linux version 4.2.0-0.rc0.git4.1.fc23.x86_64 (mockbuild.fedoraproject.org) (gcc version 5.1.1 20150618 (Red Hat 5.1.1-4) (GCC) ) #1 SMP Wed Jul 1 16:19:30 UTC 2015
Command line: BOOT_IMAGE=vmlinuz0 initrd=initrd0.img root=live:CDLABEL=Rawhide-Live-Xfce-702 rootfstype=auto ro rd.live.image vga=789 video=1024x768 rd.luks=0 rd.md=0 rd.dm=0
x86/fpu: Legacy x87 FPU detected.
x86/fpu: Using 'lazy' FPU context switches.
e820: BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x0000000073fddfff] usable
BIOS-e820: [mem 0x0000000073fde000-0x0000000073ffffff] reserved
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
NX (Execute Disable) protection: active
SMBIOS 2.8 present.
DMI: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.8.1-20150619_004852- 04/01/2014
Hypervisor detected: KVM
e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
e820: remove [mem 0x000a0000-0x000fffff] usable
e820: last_pfn = 0x73fde max_arch_pfn = 0x400000000
MTRR default type: write-back
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-BFFFF uncachable
  C0000-FFFFF write-protect
MTRR variable ranges enabled:
  0 base 00C0000000 mask FFC0000000 uncachable
  1 disabled
  2 disabled
  3 disabled
  4 disabled
  5 disabled
  6 disabled
  7 disabled
x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT
found SMP MP-table at [mem 0x000f64e0-0x000f64ef] mapped at [ffff8800000f64e0]
Scanning 1 areas for low memory corruption
Base memory trampoline at [ffff880000099000] 99000 size 24576
Using GB pages for direct mapping
init_memory_mapping: [mem 0x00000000-0x000fffff]
 [mem 0x00000000-0x000fffff] page 4k
BRK [0x0346b000, 0x0346bfff] PGTABLE
BRK [0x0346c000, 0x0346cfff] PGTABLE
BRK [0x0346d000, 0x0346dfff] PGTABLE
init_memory_mapping: [mem 0x71800000-0x719fffff]
 [mem 0x71800000-0x719fffff] page 2M
BRK [0x0346e000, 0x0346efff] PGTABLE
init_memory_mapping: [mem 0x60000000-0x717fffff]
 [mem 0x60000000-0x717fffff] page 2M
init_memory_mapping: [mem 0x40000000-0x5fffffff]
 [mem 0x40000000-0x5fffffff] page 2M
init_memory_mapping: [mem 0x00100000-0x3fffffff]
 [mem 0x00100000-0x001fffff] page 4k
 [mem 0x00200000-0x3fffffff] page 2M
init_memory_mapping: [mem 0x71a00000-0x73fddfff]
 [mem 0x71a00000-0x73dfffff] page 2M
 [mem 0x73e00000-0x73fddfff] page 4k
BRK [0x0346f000, 0x0346ffff] PGTABLE
RAMDISK: [mem 0x71ae2000-0x73fddfff]
ACPI: Early table checksum verification disabled
ACPI: RSDP 0x00000000000F6310 000014 (v00 BOCHS )
ACPI: RSDT 0x0000000073FE2223 000034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
ACPI: FACP 0x0000000073FE1DF8 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
ACPI: DSDT 0x0000000073FE0040 001DB8 (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001)
ACPI: FACS 0x0000000073FE0000 000040
ACPI: SSDT 0x0000000073FE1E6C 0002FB (v01 BOCHS BXPCSSDT 00000001 BXPC 00000001)
ACPI: APIC 0x0000000073FE2167 000080 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
ACPI: MCFG 0x0000000073FE21E7 00003C (v01 BOCHS BXPCMCFG 00000001 BXPC 00000001)
ACPI: Local APIC address 0xfee00000
No NUMA configuration found
Faking a node at [mem 0x0000000000000000-0x0000000073fddfff]
NODE_DATA(0) allocated [mem 0x71abb000-0x71ae1fff]
kvm-clock: Using msrs 4b564d01 and 4b564d00
kvm-clock: cpu 0, msr 0:71a3b001, primary cpu clock
clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ffffea0000000000-ffffea0001dfffff] PMD -> [ffff88006f400000-ffff8800711fffff] on node 0
Zone ranges:
  DMA      [mem 0x0000000000001000-0x0000000000ffffff]
  DMA32    [mem 0x0000000001000000-0x0000000073fddfff]
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node 0: [mem 0x0000000000001000-0x000000000009efff]
  node 0: [mem 0x0000000000100000-0x0000000073fddfff]
Initmem setup node 0 [mem 0x0000000000001000-0x0000000073fddfff]
On node 0 totalpages: 475004
  DMA zone: 64 pages used for memmap
  DMA zone: 21 pages reserved
  DMA zone: 3998 pages, LIFO batch:0
  DMA32 zone: 7360 pages used for memmap
  DMA32 zone: 471006 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x608
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ5 used by override.
ACPI: IRQ9 used by override.
ACPI: IRQ10 used by override.
ACPI: IRQ11 used by override.
Using ACPI (MADT) for SMP configuration information
smpboot: Allowing 2 CPUs, 0 hotplug CPUs
PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]

=============================================
[ INFO: possible recursive locking detected ]
4.2.0-0.rc0.git4.1.fc23.x86_64 #1 Not tainted
---------------------------------------------
kworker/u4:5/99 is trying to acquire lock:
 (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa014713d>] qxl_bo_unref+0x4d/0xf0 [qxl]

but task is already holding lock:
 (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa01471b0>] qxl_bo_unref+0xc0/0xf0 [qxl]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&dev->struct_mutex);
  lock(&dev->struct_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by kworker/u4:5/99:
 #0: ("%s""qxl_gc"){.+.+.+}, at: [<ffffffff810cb34b>] process_one_work+0x19b/0x840
 #1: ((&qdev->gc_work)){+.+.+.}, at: [<ffffffff810cb34b>] process_one_work+0x19b/0x840
 #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa01471b0>] qxl_bo_unref+0xc0/0xf0 [qxl]
 #3: (&qdev->surf_evict_mutex){+.+.+.}, at: [<ffffffffa01490ba>] qxl_surface_evict+0x2a/0x70 [qxl]

stack backtrace:
CPU: 0 PID: 99 Comm: kworker/u4:5 Not tainted 4.2.0-0.rc0.git4.1.fc23.x86_64 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.8.1-20150619_004852- 04/01/2014
Workqueue: qxl_gc qxl_gc_work [qxl]
 0000000000000000 000000003c3cef53 ffff880000147948 ffffffff81862c73
 0000000000000000 ffffffff82bb1a40 ffff880000147a28 ffffffff81106ed8
 3a11d10d00000000 3a11d10db870636c 0000000000000000 ffff88006f1d8200
Call Trace:
 [<ffffffff81862c73>] dump_stack+0x4c/0x65
 [<ffffffff81106ed8>] __lock_acquire+0x1cd8/0x1d00
 [<ffffffff810e670b>] ? sched_clock_local+0x1b/0x90
 [<ffffffff810e693a>] ? sched_clock_cpu+0x8a/0xb0
 [<ffffffff81107887>] lock_acquire+0xc7/0x270
 [<ffffffffa014713d>] ? qxl_bo_unref+0x4d/0xf0 [qxl]
 [<ffffffffa0107b50>] ? ttm_mem_io_reserve+0x40/0xd0 [ttm]
 [<ffffffffa0147163>] qxl_bo_unref+0x73/0xf0 [qxl]
 [<ffffffffa014713d>] ? qxl_bo_unref+0x4d/0xf0 [qxl]
 [<ffffffffa014c172>] qxl_alloc_surface_release_reserved+0xc2/0x110 [qxl]
 [<ffffffffa01490ba>] ? qxl_surface_evict+0x2a/0x70 [qxl]
 [<ffffffffa014814e>] qxl_hw_surface_dealloc.part.3+0x3e/0x110 [qxl]
 [<ffffffffa01490d4>] qxl_surface_evict+0x44/0x70 [qxl]
 [<ffffffffa01475ba>] qxl_gem_object_free+0x3a/0x70 [qxl]
 [<ffffffffa008e197>] drm_gem_object_free+0x27/0x30 [drm]
 [<ffffffffa01471ce>] qxl_bo_unref+0xde/0xf0 [qxl]
 [<ffffffff81248f8c>] ? kfree+0x28c/0x2f0
 [<ffffffffa014b7ac>] qxl_release_free_list+0x4c/0x90 [qxl]
 [<ffffffffa014bb22>] qxl_release_free+0x82/0xf0 [qxl]
 [<ffffffffa01485d5>] qxl_garbage_collect+0xd5/0x1b0 [qxl]
 [<ffffffffa0142305>] qxl_gc_work+0x15/0x20 [qxl]
 [<ffffffff810cb3e2>] process_one_work+0x232/0x840
 [<ffffffff810cb34b>] ? process_one_work+0x19b/0x840
 [<ffffffff8112535d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
 [<ffffffff810cbac5>] ? worker_thread+0xd5/0x450
 [<ffffffff810cba3e>] worker_thread+0x4e/0x450
 [<ffffffff810cb9f0>] ? process_one_work+0x840/0x840
 [<ffffffff810d2404>] kthread+0x104/0x120
 [<ffffffff810d2300>] ? kthread_create_on_node+0x250/0x250
 [<ffffffff8186c85f>] ret_from_fork+0x3f/0x70
 [<ffffffff810d2300>] ? kthread_create_on_node+0x250/0x250
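For anyone not fluent in lockdep reports: reading the call trace bottom-up, the cycle is qxl_garbage_collect() -> qxl_release_free() -> qxl_release_free_list() -> qxl_bo_unref() (takes dev->struct_mutex) -> drm_gem_object_free() -> qxl_gem_object_free() -> qxl_surface_evict() -> qxl_hw_surface_dealloc() -> qxl_alloc_surface_release_reserved() -> qxl_bo_unref() (tries to take dev->struct_mutex again in the same task). Below is a minimal userspace sketch of that pattern; the names are hypothetical and this is not the driver code, just an illustration where a plain non-recursive pthread mutex stands in for dev->struct_mutex:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for dev->struct_mutex (default pthread mutexes, like kernel
 * mutexes, are not recursive). */
static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;

struct bo {
    int refcount;
    struct bo *surface;   /* freeing this bo also releases a surface bo */
};

static void bo_unref(struct bo **bop)
{
    struct bo *bo = *bop;

    *bop = NULL;
    pthread_mutex_lock(&struct_mutex);        /* first acquisition */
    if (--bo->refcount == 0) {
        /* The destructor runs while the lock is held, mirroring
         * qxl_gem_object_free() -> qxl_surface_evict() -> ... */
        if (bo->surface)
            bo_unref(&bo->surface);           /* second acquisition: deadlock */
        free(bo);
    }
    pthread_mutex_unlock(&struct_mutex);
}

int main(void)
{
    struct bo *surf = calloc(1, sizeof(*surf));
    struct bo *bo = calloc(1, sizeof(*bo));

    surf->refcount = 1;
    bo->refcount = 1;
    bo->surface = surf;

    puts("dropping the last reference; the nested bo_unref() self-deadlocks");
    bo_unref(&bo);                            /* never returns */
    puts("not reached");
    return 0;
}

The sketch hangs every time because the nested unref always drops the last reference; in the driver the mutex is only taken when the refcount actually hits zero (the unref goes through kref_put_mutex(), as far as I can tell), which would explain why this bites intermittently rather than hanging on every unref.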
*** Bug 1240619 has been marked as a duplicate of this bug. ***
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle. Changing version to '23'. (As we did not run this process for some time, it could also affect pre-Fedora 23 development cycle bugs. We are very sorry. It will help us with cleanup during the Fedora 23 End Of Life. Thank you.) More information and the reason for this action are here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23
Why CLOSED/DEFERRED? Did this bug stop happening?
$ qemu-system-x86_64 ... -device virtio-vga

# lspci -d 1af4:1050 -knn
00:03.0 VGA compatible controller [0300]: Red Hat, Inc Device [1af4:1050] (rev 01)
	Subsystem: Red Hat, Inc Device [1af4:1100]
	Kernel driver in use: virtio-pci
	Kernel modules: virtio_pci

# dmesg | grep virtio
[ 1.727390] [drm] pci: virtio-vga detected
[ 1.729315] [drm] virtio vbuffers: 80 bufs, 192B each, 15kB total.
[ 2.023845] virtio_gpu virtio0: fb0: virtiodrmfb frame buffer device
[ 2.023846] virtio_gpu virtio0: registered panic notifier
[ 2.043135] [drm] Initialized virtio_gpu 0.0.1 0 on minor 0
(In reply to Christophe Fergeau from comment #3)
> Why CLOSED/DEFERRED? Did this bug stop happening ?

Ask yourself why you never responded to a similar report from almost two years ago: https://bugzilla.redhat.com/show_bug.cgi?id=1003924 There are many neglected reports like this one, and yet you act surprised. Really.
Another similar bug trace (from IRC chat)

[ 319.555857] =============================================
[ 319.557699] [ INFO: possible recursive locking detected ]
[ 319.559717] 4.2.0-rc5-debug #1 Not tainted
[ 319.561183] ---------------------------------------------
[ 319.563084] kworker/u14:1/71 is trying to acquire lock:
[ 319.564900] (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa079090f>] qxl_bo_unref+0x6f/0x120 [qxl]
[ 319.570138] but task is already holding lock:
[ 319.572433] (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa079098a>] qxl_bo_unref+0xea/0x120 [qxl]
[ 319.575940] other info that might help us debug this:
[ 319.578281] Possible unsafe locking scenario:
[ 319.580538]        CPU0
[ 319.581441]        ----
[ 319.582367]   lock(&dev->struct_mutex);
[ 319.583970]   lock(&dev->struct_mutex);
[ 319.585522] *** DEADLOCK ***
[ 319.587910] May be due to missing lock nesting notation
[ 319.590727] 4 locks held by kworker/u14:1/71:
[ 319.592428] #0: ("%s""qxl_gc"){++++.+}, at: [<ffffffff810efed7>] process_one_work+0x347/0xcf0
[ 319.595957] #1: ((&qdev->gc_work)){+.+.+.}, at: [<ffffffff810efed7>] process_one_work+0x347/0xcf0
[ 319.599580] #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa079098a>] qxl_bo_unref+0xea/0x120 [qxl]
[ 319.603407] #3: (&qdev->surf_evict_mutex){+.+.+.}, at: [<ffffffffa0793f4a>] qxl_surface_evict+0x2a/0x70 [qxl]
[ 319.607373] stack backtrace:
[ 319.609022] CPU: 2 PID: 71 Comm: kworker/u14:1 Not tainted 4.2.0-rc5-debug #1
[ 319.611620] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 319.614490] Workqueue: qxl_gc qxl_gc_work [qxl]
[ 319.616225] ffff8801f0d323d8 000000002e1deaf2 ffff8801f0dd75c8 ffffffff8193eef7
[ 319.619037] 0000000000000000 ffffffff82e813d0 ffff8801f0dd76d8 ffffffff81153c3a
[ 319.621818] ffff8801f0d32430 0000623e00000004 ffff880100000004 0000000000038e00
[ 319.624618] Call Trace:
[ 319.625482] [<ffffffff8193eef7>] dump_stack+0x4c/0x6e
[ 319.627339] [<ffffffff81153c3a>] __lock_acquire+0x1dfa/0x1f20
[ 319.629938] [<ffffffff81151e40>] ? debug_check_no_locks_freed+0x1d0/0x1d0
[ 319.634704] [<ffffffffa0607783>] ? ttm_mem_io_reserve+0x83/0x1c0 [ttm]
[ 319.639489] [<ffffffffa078db30>] ? qxl_ttm_fault+0x60/0x60 [qxl]
[ 319.643385] [<ffffffff81154c6d>] lock_acquire+0xdd/0x2e0
[ 319.646794] [<ffffffffa079090f>] ? qxl_bo_unref+0x6f/0x120 [qxl]
[ 319.650642] [<ffffffff812f6fee>] ? kasan_kmalloc+0x5e/0x70
[ 319.653799] [<ffffffffa0790935>] qxl_bo_unref+0x95/0x120 [qxl]
[ 319.657117] [<ffffffffa079090f>] ? qxl_bo_unref+0x6f/0x120 [qxl]
[ 319.660537] [<ffffffffa0798f3b>] qxl_alloc_surface_release_reserved+0x1bb/0x220 [qxl]
[ 319.664532] [<ffffffffa0798d80>] ? qxl_alloc_release_reserved+0x410/0x410 [qxl]
[ 319.667891] [<ffffffff8115181b>] ? trace_hardirqs_on_caller+0x18b/0x290
[ 319.670571] [<ffffffff8115192d>] ? trace_hardirqs_on+0xd/0x10
[ 319.673247] [<ffffffffa0793f4a>] ? qxl_surface_evict+0x2a/0x70 [qxl]
[ 319.675882] [<ffffffffa07922df>] qxl_hw_surface_dealloc.part.1+0x9f/0x1e0 [qxl]
[ 319.678512] [<ffffffff81943bf0>] ? mutex_trylock+0x250/0x250
[ 319.680579] [<ffffffff81151666>] ? mark_held_locks+0x96/0xc0
[ 319.682644] [<ffffffffa0792240>] ? qxl_push_command_ring_release+0x1c0/0x1c0 [qxl]
[ 319.685376] [<ffffffff8115181b>] ? trace_hardirqs_on_caller+0x18b/0x290
[ 319.687798] [<ffffffffa0793f6f>] qxl_surface_evict+0x4f/0x70 [qxl]
[ 319.690012] [<ffffffffa0790f8d>] qxl_gem_object_free+0x9d/0xe0 [qxl]
[ 319.692280] [<ffffffffa0790ef0>] ? qxl_vram_evict+0x20/0x20 [qxl]
[ 319.695032] [<ffffffff81151666>] ? mark_held_locks+0x96/0xc0
[ 319.695035] [<ffffffff812f1a65>] ? kfree+0xe5/0x340
[ 319.695052] [<ffffffffa04efa18>] drm_gem_object_free+0x58/0x60 [drm]
[ 319.695058] [<ffffffffa07909a8>] qxl_bo_unref+0x108/0x120 [qxl]
[ 319.695062] [<ffffffff812f1c0c>] ? kfree+0x28c/0x340
[ 319.695068] [<ffffffffa0797f39>] qxl_release_free_list+0xb9/0x170 [qxl]
[ 319.695079] [<ffffffff814a9ec0>] ? idr_mark_full+0x80/0x80
[ 319.695085] [<ffffffffa0797e80>] ? qxl_nop_signaling+0x10/0x10 [qxl]
[ 319.695090] [<ffffffffa079848e>] ? qxl_release_free+0x7e/0x130 [qxl]
[ 319.695093] [<ffffffff8115afbf>] ? do_raw_spin_unlock+0x9f/0x140
[ 319.695098] [<ffffffffa07984de>] qxl_release_free+0xce/0x130 [qxl]
[ 319.695104] [<ffffffffa0792b60>] qxl_garbage_collect+0x1f0/0x340 [qxl]
[ 319.695110] [<ffffffffa0792970>] ? qxl_io_log+0x120/0x120 [qxl]
[ 319.695112] [<ffffffff8114b5b6>] ? __lock_is_held+0x96/0xe0
[ 319.695116] [<ffffffffa0788445>] qxl_gc_work+0x15/0x20 [qxl]
[ 319.695119] [<ffffffff810effad>] process_one_work+0x41d/0xcf0
[ 319.695121] [<ffffffff810efed7>] ? process_one_work+0x347/0xcf0
[ 319.695123] [<ffffffff8194a4b0>] ? _raw_spin_unlock_irq+0x30/0x60
[ 319.695126] [<ffffffff810efb90>] ? cancel_delayed_work_sync+0x20/0x20
[ 319.695129] [<ffffffff810f090c>] worker_thread+0x8c/0x7e0
[ 319.695132] [<ffffffff810f0880>] ? process_one_work+0xcf0/0xcf0
[ 319.695133] [<ffffffff810fa9be>] kthread+0x17e/0x1a0
[ 319.695135] [<ffffffff810fa840>] ? kthread_create_on_node+0x350/0x350
[ 319.695137] [<ffffffff811515f3>] ? mark_held_locks+0x23/0xc0
[ 319.695140] [<ffffffff810fa840>] ? kthread_create_on_node+0x350/0x350
[ 319.695143] [<ffffffff8194b41f>] ret_from_fork+0x3f/0x70
[ 319.695145] [<ffffffff810fa840>] ? kthread_create_on_node+0x350/0x350
(reporter from IRC here)

This problem occurred on vanilla mainline v4.2-rc5 using a kernel configuration derived from Arch Linux 4.1.3-1-ARCH (with KASAN and lockdep additionally enabled). The QEMU version is 2.0.0+dfsg-2ubuntu1.15; guest package versions:
- xorg-server 1.17.2
- xf86-video-qxl 0.1.4
- spice-agent 0.16.0 (enabled/disabled does not affect results)

Steps to reproduce:
- Boot a kernel with lockdep enabled (see below for the qemu command).
- Connect with remote-viewer (I am using a wireless network, if that matters; the VM is on a different machine).
- Invoke startx, which will start X and xterm.
- Watch the lockdep spew in the logs.

Full QEMU command (note: the default vgamem is doubled):

qemu-system-x86_64 -machine pc,accel=kvm -kernel 4.2.0-rc5-debug/bzImage -initrd 4.2.0-rc5-debug/initrd.gz -serial stdio -drive media=cdrom,if=virtio,file=archlinux.iso -net user,hostfwd=tcp:0:2222-:22 -net nic,model=virtio -device usb-ehci -vga std -cpu host -m 8G -smp 7 -virtfs local,mount_tag=shared,security_model=none,path=/tmp/spice -vga none -device qxl-vga,vgamem_mb=32 -spice port=5930,disable-ticketing -device virtio-serial -chardev spicevmc,id=vdagent,name=vdagent -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -append 'vga=792 quiet console=tty0 console=ttyS0,115200n8 app.uid=1000 app.gid=1000 slub_debug=U loglevel=8 console=hvc0,115200n8' -monitor stdio -chardev file,path=/tmp/serial.txt,id=log -device virtconsole,chardev=log

Here is a possibly related follow-up event: after starting spice-vdagent, spice-vdagentd, and mutter, adding two additional screens, and positioning Virtual-2 --below Virtual-1, Xorg becomes unresponsive when I keep resizing glxgears across the three screens with the mouse. gdb hangs forever while attaching, and Xorg only becomes responsive again if I start `perf record -a -g sleep 5`. This pattern was first noticed with Firefox (in a single screen while four were enabled). Below is a log excerpt from this anomaly. The problem was first noticed with the 4.1.3-1-ARCH kernel.
Aug 10 16:22:07 builder spice-vdagent[6251]: 0x2521010 sent guest xorg resolution, arg1: 2048, arg2: 1492, size 48
Aug 10 16:22:07 builder spice-vdagent[6251]: Screen 0 1024x768+1024+0
Aug 10 16:22:07 builder spice-vdagent[6251]: Screen 1 1024x746+0+0
Aug 10 16:22:07 builder spice-vdagent[6251]: Screen 2 1024x746+0+746
Aug 10 16:22:07 builder spice-vdagent[6251]: 0x2521010 sent guest xorg resolution, arg1: 2048, arg2: 1492, size 48
Aug 10 16:22:16 builder kernel: [drm:qxl_release_from_id_locked [qxl]] *ERROR* failed to find id in release_idr
Aug 10 16:23:50 builder sudo[6308]: arch : TTY=pts/1 ; PWD=/home/arch ; USER=root ; COMMAND=/usr/bin/pacman -S perf
Aug 10 16:23:50 builder sudo[6308]: pam_unix(sudo:session): session opened for user root by arch(uid=0)
Aug 10 16:23:53 builder sudo[6308]: pam_unix(sudo:session): session closed for user root
Aug 10 16:24:03 builder sudo[6354]: arch : TTY=pts/1 ; PWD=/home/arch ; USER=root ; COMMAND=/usr/bin/perf record -a -g sleep 10
Aug 10 16:24:03 builder sudo[6354]: pam_unix(sudo:session): session opened for user root by arch(uid=0)
Aug 10 16:24:03 builder kernel: perf interrupt took too long (6276 > 4960), lowering kernel.perf_event_max_sample_rate to 25200
Aug 10 16:24:03 builder kernel: perf interrupt took too long (9960 > 9920), lowering kernel.perf_event_max_sample_rate to 12600
Aug 10 16:24:03 builder kernel: perf interrupt took too long (19853 > 19841), lowering kernel.perf_event_max_sample_rate to 6300
Aug 10 16:24:04 builder kernel: f 4026531847#136039: failed to wait on release 7 after spincount 301
Aug 10 16:24:04 builder kernel: perf interrupt took too long (39767 > 37878), lowering kernel.perf_event_max_sample_rate to 3300
Aug 10 16:24:04 builder kernel: f 4026531865#136044: failed to wait on release 25 after spincount 301
Aug 10 16:24:04 builder kernel: f 4026531879#136024: failed to wait on release 39 after spincount 301
Aug 10 16:24:05 builder kernel: f 4026531892#136026: failed to wait on release 52 after spincount 301
Aug 10 16:24:05 builder kernel: f 4026531909#136027: failed to wait on release 69 after spincount 301
Aug 10 16:24:05 builder kernel: f 4026531916#136028: failed to wait on release 76 after spincount 301
Aug 10 16:24:06 builder kernel: f 4026531929#136029: failed to wait on release 89 after spincount 301
Aug 10 16:24:06 builder kernel: f 4026531930#136030: failed to wait on release 90 after spincount 301
Aug 10 16:24:06 builder kernel: f 4026531935#136031: failed to wait on release 95 after spincount 301
Aug 10 16:24:07 builder kernel: f 4026531937#136032: failed to wait on release 97 after spincount 301
Aug 10 16:24:07 builder kernel: f 4026531940#136033: failed to wait on release 100 after spincount 301
Aug 10 16:24:07 builder kernel: f 4026531941#136034: failed to wait on release 101 after spincount 301
Aug 10 16:24:08 builder kernel: f 4026531944#136035: failed to wait on release 104 after spincount 301
Aug 10 16:24:08 builder kernel: f 4026531945#136036: failed to wait on release 105 after spincount 301
Aug 10 16:24:08 builder kernel: f 4026531946#136037: failed to wait on release 106 after spincount 301
Aug 10 16:24:09 builder kernel: f 4026531857#136040: failed to wait on release 17 after spincount 301
Aug 10 16:24:09 builder kernel: f 4026531858#136041: failed to wait on release 18 after spincount 301
Aug 10 16:24:10 builder kernel: f 4026531859#136042: failed to wait on release 19 after spincount 301
Aug 10 16:24:10 builder kernel: f 4026531862#136043: failed to wait on release 22 after spincount 301
Aug 10 16:24:10 builder kernel: f 4026531870#136045: failed to wait on release 30 after spincount 301
Aug 10 16:24:11 builder kernel: f 4026531873#136046: failed to wait on release 33 after spincount 301
Aug 10 16:24:11 builder kernel: f 4026531877#136047: failed to wait on release 37 after spincount 301
Aug 10 16:24:11 builder kernel: f 4026531880#136048: failed to wait on release 40 after spincount 301
Aug 10 16:24:12 builder kernel: f 4026531886#136049: failed to wait on release 46 after spincount 301
[ 41.688607] =============================================
[ 41.688608] [ INFO: possible recursive locking detected ]
[ 41.688611] 4.2.0-0.rc6.git0.1.fc24.x86_64+debug #1 Not tainted
[ 41.688612] ---------------------------------------------
[ 41.688614] kworker/u4:7/105 is trying to acquire lock:
[ 41.688616] (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa010e13d>] qxl_bo_unref+0x4d/0xf0 [qxl]
[ 41.688627] but task is already holding lock:
[ 41.688629] (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa010e1b0>] qxl_bo_unref+0xc0/0xf0 [qxl]
[ 41.688634] other info that might help us debug this:
[ 41.688635] Possible unsafe locking scenario:
[ 41.688637]        CPU0
[ 41.688638]        ----
[ 41.688638]   lock(&dev->struct_mutex);
[ 41.688640]   lock(&dev->struct_mutex);
[ 41.688642] *** DEADLOCK ***
[ 41.688643] May be due to missing lock nesting notation
[ 41.688645] 4 locks held by kworker/u4:7/105:
[ 41.688646] #0: ("%s""qxl_gc"){++++.+}, at: [<ffffffff810cb35b>] process_one_work+0x19b/0x840
[ 41.688652] #1: ((&qdev->gc_work)){+.+.+.}, at: [<ffffffff810cb35b>] process_one_work+0x19b/0x840
[ 41.688656] #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa010e1b0>] qxl_bo_unref+0xc0/0xf0 [qxl]
[ 41.688661] #3: (&qdev->surf_evict_mutex){+.+.+.}, at: [<ffffffffa01100ba>] qxl_surface_evict+0x2a/0x70 [qxl]
[ 41.688667] stack backtrace:
[ 41.688670] CPU: 1 PID: 105 Comm: kworker/u4:7 Not tainted 4.2.0-0.rc6.git0.1.fc24.x86_64+debug #1
[ 41.688671] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150718_101000-lnx 04/01/2014
[ 41.688675] Workqueue: qxl_gc qxl_gc_work [qxl]
[ 41.688676] 0000000000000000 00000000fc23f67f ffff88005c977948 ffffffff81868d8e
[ 41.688679] 0000000000000000 ffffffff82aed9d0 ffff88005c977a28 ffffffff81109108
[ 41.688682] 39f1cf0d00000000 39f1cf0da46ee367 0000000000000000 ffff8800595d8200
[ 41.688684] Call Trace:
[ 41.688688] [<ffffffff81868d8e>] dump_stack+0x4c/0x65
[ 41.688691] [<ffffffff81109108>] __lock_acquire+0x1cd8/0x1d00
[ 41.688694] [<ffffffff810e6b2b>] ? sched_clock_local+0x1b/0x90
[ 41.688696] [<ffffffff810e6d5a>] ? sched_clock_cpu+0x8a/0xb0
[ 41.688698] [<ffffffff81109ab7>] lock_acquire+0xc7/0x270
[ 41.688701] [<ffffffffa010e13d>] ? qxl_bo_unref+0x4d/0xf0 [qxl]
[ 41.688707] [<ffffffffa00ceb00>] ? ttm_mem_io_reserve+0x40/0xd0 [ttm]
[ 41.688710] [<ffffffffa010e163>] qxl_bo_unref+0x73/0xf0 [qxl]
[ 41.688714] [<ffffffffa010e13d>] ? qxl_bo_unref+0x4d/0xf0 [qxl]
[ 41.688718] [<ffffffffa0113172>] qxl_alloc_surface_release_reserved+0xc2/0x110 [qxl]
[ 41.688721] [<ffffffffa01100ba>] ? qxl_surface_evict+0x2a/0x70 [qxl]
[ 41.688724] [<ffffffffa010f14e>] qxl_hw_surface_dealloc.part.3+0x3e/0x110 [qxl]
[ 41.688728] [<ffffffffa01100d4>] qxl_surface_evict+0x44/0x70 [qxl]
[ 41.688731] [<ffffffffa010e5ba>] qxl_gem_object_free+0x3a/0x70 [qxl]
[ 41.688746] [<ffffffffa0049247>] drm_gem_object_free+0x27/0x30 [drm]
[ 41.688749] [<ffffffffa010e1ce>] qxl_bo_unref+0xde/0xf0 [qxl]
[ 41.688752] [<ffffffff8124a54c>] ? kfree+0x28c/0x2f0
[ 41.688756] [<ffffffffa01127ac>] qxl_release_free_list+0x4c/0x90 [qxl]
[ 41.688759] [<ffffffffa0112b22>] qxl_release_free+0x82/0xf0 [qxl]
[ 41.688763] [<ffffffffa010f5d5>] qxl_garbage_collect+0xd5/0x1b0 [qxl]
[ 41.688766] [<ffffffffa0109305>] qxl_gc_work+0x15/0x20 [qxl]
[ 41.688768] [<ffffffff810cb3f2>] process_one_work+0x232/0x840
[ 41.688770] [<ffffffff810cb35b>] ? process_one_work+0x19b/0x840
[ 41.688773] [<ffffffff811274ad>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 41.688776] [<ffffffff810cbad5>] ? worker_thread+0xd5/0x450
[ 41.688778] [<ffffffff810cba4e>] worker_thread+0x4e/0x450
[ 41.688780] [<ffffffff810cba00>] ? process_one_work+0x840/0x840
[ 41.688783] [<ffffffff810d2694>] kthread+0x104/0x120
[ 41.688785] [<ffffffff810d2590>] ? kthread_create_on_node+0x250/0x250
[ 41.688788] [<ffffffff81872c5f>] ret_from_fork+0x3f/0x70
[ 41.688790] [<ffffffff810d2590>] ? kthread_create_on_node+0x250/0x250
@poma Frediano (freddy77 on IRC) proposed this patch, which works for me (Tested-by: Peter Wu <peter>). It will presumably be submitted upstream once it has a commit message.

diff --git a/drivers/gpu/drm/qxl/qxl_gem.c b/drivers/gpu/drm/qxl/qxl_gem.c
index d9746e9..6eaa8db 100644
--- a/drivers/gpu/drm/qxl/qxl_gem.c
+++ b/drivers/gpu/drm/qxl/qxl_gem.c
@@ -31,15 +31,18 @@
 void qxl_gem_object_free(struct drm_gem_object *gobj)
 {
 	struct qxl_bo *qobj = gem_to_qxl_bo(gobj);
+	struct drm_device *dev = gobj->dev;
 	struct qxl_device *qdev;
 	struct ttm_buffer_object *tbo;
 
-	qdev = (struct qxl_device *)gobj->dev->dev_private;
+	qdev = (struct qxl_device *)dev->dev_private;
 
+	mutex_unlock(&dev->struct_mutex);
 	qxl_surface_evict(qdev, qobj, false);
 
 	tbo = &qobj->tbo;
 	ttm_bo_unref(&tbo);
+	mutex_lock(&dev->struct_mutex);
 }
 
 int qxl_gem_object_create(struct qxl_device *qdev, int size,
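A note on why the unlock/re-lock pair is safe here (my reading, not the patch author's words): the unref path takes dev->struct_mutex only when it drops the final reference, so qxl_gem_object_free() is entered with that mutex held. Releasing it around the surface eviction lets the nested qxl_bo_unref() acquire it cleanly, and re-taking it before returning restores the state the caller expects. This assumes every path into qxl_gem_object_free() holds struct_mutex, which is what the DRM GEM free path guaranteed at the time, as far as I can tell.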
Created attachment 1061721 [details]
Fix INFO: possible recursive locking detected

kernel-core-4.2.0-0.rc6.git0.2.fc24.x86_64 - OK, thanks for the patch.
$ uname -r
4.2.0-0.rc6.git0.2.fc24.x86_64
$ dmesg | grep -i qxl
[ 1.571388] [drm] qxl: 16M of VRAM memory size
[ 1.572323] [drm] qxl: 63M of IO pages memory ready (VRAM domain)
[ 1.573220] [drm] qxl: 64M of Surface memory size
[ 1.634890] fb: switching to qxldrmfb from VESA VGA
[ 1.635933] fbcon: qxldrmfb (fb0) is primary device
[ 1.643858] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 1.646298] qxl 0000:00:02.0: fb0: qxldrmfb frame buffer device
[ 1.646299] qxl 0000:00:02.0: registered panic notifier
[ 1.670068] [drm] Initialized qxl 0.1.0 20120117 for 0000:00:02.0 on minor 0
[ 5.569794] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 5.890723] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 5.903841] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 9.653615] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 10.082617] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 10.244574] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0
[ 10.260143] [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc900003ca000, 0

$ uname -r
4.2.0-0.rc6.git0.2.fc24.x86_64+debug
$ dmesg | grep -i qxl
[ 5.949796] [drm] qxl: 16M of VRAM memory size
[ 5.951002] [drm] qxl: 63M of IO pages memory ready (VRAM domain)
[ 5.952171] [drm] qxl: 64M of Surface memory size
[ 6.003472] fb: switching to qxldrmfb from VESA VGA
[ 6.010208] fbcon: qxldrmfb (fb0) is primary device
[ 6.026639] qxl 0000:00:02.0: fb0: qxldrmfb frame buffer device
[ 6.026641] qxl 0000:00:02.0: registered panic notifier
[ 6.034516] [drm] Initialized qxl 0.1.0 20120117 for 0000:00:02.0 on minor 0
Has the patch been merged? I'm still encountering this crash on Fedora Rawhide with kernel '4.3.0-0.rc0.git6.1.fc24.x86_64'.
[ 61.127393] =============================================
[ 61.127395] [ INFO: possible recursive locking detected ]
[ 61.127397] 4.3.0-0.rc0.git7.1.fc24.x86_64 #1 Not tainted
[ 61.127399] ---------------------------------------------
[ 61.127401] kworker/u4:6/101 is trying to acquire lock:
[ 61.127402] (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa01030bd>] qxl_bo_unref+0x4d/0xf0 [qxl]
[ 61.127413] but task is already holding lock:
[ 61.127415] (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0103130>] qxl_bo_unref+0xc0/0xf0 [qxl]
[ 61.127420] other info that might help us debug this:
[ 61.127422] Possible unsafe locking scenario:
[ 61.127423]        CPU0
[ 61.127424]        ----
[ 61.127425]   lock(&dev->struct_mutex);
[ 61.127427]   lock(&dev->struct_mutex);
[ 61.127428] *** DEADLOCK ***
[ 61.127430] May be due to missing lock nesting notation
[ 61.127431] 4 locks held by kworker/u4:6/101:
[ 61.127432] #0: ("%s""qxl_gc"){++++.+}, at: [<ffffffff810c63d9>] process_one_work+0x199/0x6a0
[ 61.127438] #1: ((&qdev->gc_work)){+.+.+.}, at: [<ffffffff810c63d9>] process_one_work+0x199/0x6a0
[ 61.127442] #2: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0103130>] qxl_bo_unref+0xc0/0xf0 [qxl]
[ 61.127446] #3: (&qdev->surf_evict_mutex){+.+.+.}, at: [<ffffffffa010503a>] qxl_surface_evict+0x2a/0x70 [qxl]
[ 61.127452] stack backtrace:
[ 61.127455] CPU: 1 PID: 101 Comm: kworker/u4:6 Not tainted 4.3.0-0.rc0.git7.1.fc24.x86_64 #1
[ 61.127456] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150718_101000-lnx 04/01/2014
[ 61.127460] Workqueue: qxl_gc qxl_gc_work [qxl]
[ 61.127462] 0000000000000000 00000000bc09806a ffff88005ca039e0 ffffffff81847841
[ 61.127465] ffffffff82bc6660 ffff88005ca03ab8 ffffffff81104a64 ffffffff82b5e800
[ 61.127467] 39d1cf0d00000000 0000000000000000 ffff8800595d8580 ffff88005ca03a00
[ 61.127470] Call Trace:
[ 61.127474] [<ffffffff81847841>] dump_stack+0x4b/0x63
[ 61.127477] [<ffffffff81104a64>] __lock_acquire+0x1b74/0x1ba0
[ 61.127479] [<ffffffff810e2207>] ? sched_clock_local+0x17/0x80
[ 61.127481] [<ffffffff81105406>] lock_acquire+0xc6/0x1c0
[ 61.127485] [<ffffffffa01030bd>] ? qxl_bo_unref+0x4d/0xf0 [qxl]
[ 61.127488] [<ffffffffa01030e3>] qxl_bo_unref+0x73/0xf0 [qxl]
[ 61.127491] [<ffffffffa01030bd>] ? qxl_bo_unref+0x4d/0xf0 [qxl]
[ 61.127495] [<ffffffffa0108062>] qxl_alloc_surface_release_reserved+0xc2/0x110 [qxl]
[ 61.127498] [<ffffffffa01040ce>] qxl_hw_surface_dealloc.part.3+0x3e/0x110 [qxl]
[ 61.127502] [<ffffffffa0105054>] qxl_surface_evict+0x44/0x70 [qxl]
[ 61.127505] [<ffffffffa010353a>] qxl_gem_object_free+0x3a/0x70 [qxl]
[ 61.127519] [<ffffffffa0055197>] drm_gem_object_free+0x27/0x30 [drm]
[ 61.127527] [<ffffffffa010314e>] qxl_bo_unref+0xde/0xf0 [qxl]
[ 61.127530] [<ffffffff81240717>] ? kfree+0x277/0x280
[ 61.127534] [<ffffffffa01076cc>] qxl_release_free_list+0x4c/0x90 [qxl]
[ 61.127537] [<ffffffffa0107a2e>] qxl_release_free+0x7e/0xe0 [qxl]
[ 61.127540] [<ffffffffa0104555>] qxl_garbage_collect+0xd5/0x1b0 [qxl]
[ 61.127543] [<ffffffffa00fe305>] qxl_gc_work+0x15/0x20 [qxl]
[ 61.127545] [<ffffffff810c6470>] process_one_work+0x230/0x6a0
[ 61.127547] [<ffffffff810c63d9>] ? process_one_work+0x199/0x6a0
[ 61.127549] [<ffffffff810c692e>] worker_thread+0x4e/0x450
[ 61.127550] [<ffffffff810c68e0>] ? process_one_work+0x6a0/0x6a0
[ 61.127553] [<ffffffff810cd9a1>] kthread+0x101/0x120
[ 61.127556] [<ffffffff81102b09>] ? trace_hardirqs_on_caller+0x129/0x1b0
[ 61.127558] [<ffffffff810cd8a0>] ? kthread_create_on_node+0x250/0x250
[ 61.127561] [<ffffffff8185186f>] ret_from_fork+0x3f/0x70
[ 61.127563] [<ffffffff810cd8a0>] ? kthread_create_on_node+0x250/0x250
The latest trace, in comment #13, is from a 4.3.0-0.rc0.git7.1.fc24.x86_64 kernel; moving back to rawhide.
poma, please don't close bugs without adding an explanation why you are closing it.
Looks like the new kernel does not have this issue. Either the debugging settings were turned off, or the problem disappeared for some reason. I have a possible alternative patch, which is this:

diff --git a/qxl/qxl_release.c b/qxl/qxl_release.c
index b66ec33..4efa8e2 100644
--- a/qxl/qxl_release.c
+++ b/qxl/qxl_release.c
@@ -307,7 +307,7 @@ int qxl_alloc_surface_release_reserved(struct qxl_device *qdev,
 	idr_ret = qxl_release_alloc(qdev, QXL_RELEASE_SURFACE_CMD, release);
 	if (idr_ret < 0)
 		return idr_ret;
-	bo = qxl_bo_ref(to_qxl_bo(entry->tv.bo));
+	bo = to_qxl_bo(entry->tv.bo);
 
 	(*release)->release_offset = create_rel->release_offset + 64;
 
@@ -316,8 +316,6 @@
 	info = qxl_release_map(qdev, *release);
 	info->id = idr_ret;
 	qxl_release_unmap(qdev, *release, info);
-
-	qxl_bo_unref(&bo);
 
 	return 0;
 }

However, as I didn't manage to reproduce the error with the new kernel, I don't know whether it fixes the issue.
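If I read this alternative correctly, it removes the nested locking instead of working around it: entry->tv.bo already holds a reference that keeps the buffer object alive for the lifetime of the release, so the temporary qxl_bo_ref()/qxl_bo_unref() pair is redundant, and dropping it also drops the re-entrant unref path that lockdep complains about.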
Debug options are turned off on the first -rcX build of every kernel. The next rawhide build will have them enabled again.
Laughter is the best medicine. https://bugzilla.kernel.org/show_bug.cgi?id=104581
Patch posted at http://lists.freedesktop.org/archives/dri-devel/2015-September/090891.html
- [PATCH v2 0/2] Dependency locks fixes for QXL driver
  http://lists.freedesktop.org/archives/dri-devel/2015-September/090889.html
- [v2,1/2] drm/qxl: avoid buffer reservation in qxl_crtc_page_flip
  https://patchwork.kernel.org/patch/7256651
- [v2,2/2] drm/qxl: avoid dependency lock
  https://patchwork.kernel.org/patch/7256661

- Rawhide-Xfce-Live-1001.iso \
  4.3.0-0.rc3.git2.4.fc24.x86_64
  + v2-1-2-drm-qxl-avoid-buffer-reservation-in-qxl_crtc_page_flip.patch
  + v2-2-2-drm-qxl-avoid-dependency-lock.patch
(In reply to poma from comment #21)
> - [PATCH v2 0/2] Dependency locks fixes for QXL driver
...

PASSED
Moving back to default assignee for kernel bugs so that these patches can be added to the fedora kernel package if needed.
Patches added in Fedora rawhide git. Thanks.
4.3.0-0.rc4.git0.1.fc24.x86_64+debug PASSED
I'm still seeing these "failed to wait on release" messages on kernel-4.17.0-0.rc0.git7.1.fc29, along with these errors:

kernel: qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
(In reply to Steve Dickson from comment #26)
> I'm still seeing these "failed to wait on release" on
> kernel-4.17.0-0.rc0.git7.1.fc29
>
> Along with these error messages
> kernel: qxl 0000:00:02.0: object_init failed for (3149824, 0x00000001)
> kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO

Please open a separate bug for this; it does not seem closely related. Thanks