Bug 662219
| Summary: | Panic in ttm_tt_swapout | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Jay Fenlason <fenlason> |
| Component: | kernel | Assignee: | Dave Airlie <airlied> |
| Status: | CLOSED WONTFIX | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 6.8 | CC: | arozansk, jfeeney, jglisse |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Last Closed: | 2017-12-06 11:13:34 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Blocks: | 846704 | | |
Jay, I guess it happens after a long time?

My box had only been up for a while when it happened, well less than a month, possibly only a few days. What do you consider a "long time"? According to last, it was booted on "Mon Nov 22 08:40" and again on "Fri Dec 10 11:48", so it had been up for ~17-18 days.

Is the oops message in the log? If so, please attach the full log here with the oops. Also, what is your GPU?

There are some ttm_tt_ related messages in /var/log/messages*, but the actual crash was apparently only sent to the screen, where I took the attached picture. I have more pictures of the crash, which may or may not be useful. The machine's here in Westford if you want to log in and poke around on it.

lspci -v says:
05:00.0 VGA compatible controller: nVidia Corporation NV37GL [Quadro PCI-E Series] (rev a2) (prog-if 00 [VGA controller])
	Subsystem: nVidia Corporation Device 0215
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at dd000000 (32-bit, non-prefetchable) [size=16M]
	Memory at c0000000 (32-bit, prefetchable) [size=256M]
	Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	Expansion ROM at dfc00000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
	Kernel modules: nouveau, nvidiafb

It's a fairly stock Dell Precision 470, one of a batch that we received a few years ago.

Please attach /var/log/messages. If the oops happens again, please fetch me before rebooting.

Created attachment 468983 [details]
/var/log/messages
Created attachment 468984 [details]
/var/log/messages-20101212
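The "~17-18 days" figure quoted from last can be checked with a little date arithmetic. This is only a sketch: the year 2010 is an assumption (last did not print it; it is inferred from the messages-20101212 file name), and the result is an upper bound on the uptime, since the panic happened some time before the second boot.

```python
from datetime import datetime

# Boot times as quoted from `last`. The year is an assumption: `last`
# printed none, and 2010 is inferred from the messages-20101212 file name.
previous_boot = datetime(2010, 11, 22, 8, 40)
boot_after_panic = datetime(2010, 12, 10, 11, 48)

# The panic occurred at some point before the second boot, so this
# difference is an upper bound on how long the machine had been up.
upper_bound = boot_after_panic - previous_boot
print(upper_bound)  # 18 days, 3:08:00
```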
Do you remember if you saw a similar ttm error preceding the oops?

Not that I noticed, but I wasn't paying any particular attention to dmesg at the time.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. If you would like it considered as an exception in the current release, please ask your support representative.

This request was erroneously denied for the current release of Red Hat Enterprise Linux. The error has been fixed and this request has been re-proposed for the current release.

Guys, I have had an occurrence of the same issue.
System is Dell Precision WorkStation T3500
02:00.0 VGA compatible controller: nVidia Corporation G98 [Quadro NVS 295] (rev a1)
KERNEL: usr/lib/debug/lib/modules/2.6.32-71.el6.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 4
DATE: Fri Feb 18 10:08:03 2011
UPTIME: 2 days, 17:56:35
LOAD AVERAGE: 0.01, 0.09, 0.06
TASKS: 327
NODENAME: desktop2.example.com
RELEASE: 2.6.32-71.el6.x86_64
VERSION: #1 SMP Wed Sep 1 01:33:01 EDT 2010
MACHINE: x86_64 (3066 Mhz)
MEMORY: 4 GB
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
PID: 157
COMMAND: "ttm_swap"
TASK: ffff8801094294e0 [THREAD_INFO: ffff88010942a000]
CPU: 2
STATE: TASK_RUNNING (PANIC)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa00a090e>] ttm_tt_swapout+0x5e/0x2c0 [ttm]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/topology/thread_siblings
CPU 2
Modules linked in: dm_snapshot fuse tun ip6table_filter ip6_tables ebtable_nat ebtables xt_CHECKSUM ipt_REJECT autofs4 sunrpc bridge stp llc ipv6 dm_mirror dm_region_hash dm_log kvm_intel kvm uinput ppdev parport_pc parport wmi sg dcdbas i2c_i801 iTCO_wdt iTCO_vendor_support tg3 serio_raw snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac edac_core ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom ahci nouveau ttm drm_kms_helper drm i2c_algo_bit video output i2c_core dm_mod [last unloaded: nf_conntrack]
Modules linked in: dm_snapshot fuse tun ip6table_filter ip6_tables ebtable_nat ebtables xt_CHECKSUM ipt_REJECT autofs4 sunrpc bridge stp llc ipv6 dm_mirror dm_region_hash dm_log kvm_intel kvm uinput ppdev parport_pc parport wmi sg dcdbas i2c_i801 iTCO_wdt iTCO_vendor_support tg3 serio_raw snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac edac_core ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom ahci nouveau ttm drm_kms_helper drm i2c_algo_bit video output i2c_core dm_mod [last unloaded: nf_conntrack]
Pid: 157, comm: ttm_swap Not tainted 2.6.32-71.el6.x86_64 #1 Precision WorkStation T3500
RIP: 0010:[<ffffffffa00a090e>] [<ffffffffa00a090e>] ttm_tt_swapout+0x5e/0x2c0 [ttm]
RSP: 0018:ffff88010942bcf0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88010f26d600 RCX: 0000000000001000
RDX: ffff8800ccfa6780 RSI: ffff8800ccfa6780 RDI: ffff8800ea841c40
RBP: ffff88010942bd40 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffff8800ea841c40 R14: ffff88010f26d660 R15: ffff88010a8adb20
FS: 0000000000000000(0000) GS:ffff88002c240000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000001001000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ttm_swap (pid: 157, threadinfo ffff88010942a000, task ffff8801094294e0)
Stack:
ffff88010aa8e278 ffff8800ccfa6780 ffff8800ccfa6780 ffff88010f26d600
<0> 0000000000000000 ffff88010f26d600 0000000000000000 ffff88010f26d644
<0> ffff88010f26d660 ffff88010a8adb20 ffff88010942bdc0 ffffffffa00a2008
Call Trace:
[<ffffffffa00a2008>] ttm_bo_swapout+0x1d8/0x270 [ttm]
[<ffffffffa009f4f3>] ttm_shrink+0xe3/0x130 [ttm]
[<ffffffffa009f540>] ? ttm_shrink_work+0x0/0x20 [ttm]
[<ffffffffa009f559>] ttm_shrink_work+0x19/0x20 [ttm]
[<ffffffff8108c610>] worker_thread+0x170/0x2a0
[<ffffffff81091ca0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8108c4a0>] ? worker_thread+0x0/0x2a0
[<ffffffff81091936>] kthread+0x96/0xa0
[<ffffffff810141ca>] child_rip+0xa/0x20
[<ffffffff810918a0>] ? kthread+0x0/0xa0
[<ffffffff810141c0>] ? child_rip+0x0/0x20
Code: 02 00 00 f6 47 20 02 0f 85 f0 01 00 00 48 8b 45 c0 48 85 c0 48 89 45 b8 0f 84 02 02 00 00 48 8b 55 b8 49 83 7d 28 00 48 8b 42 18 <48> 8b 40 10 4c 8b b0 18 01 00 00 0f 84 81 01 00 00 65 48 8b 1c
RIP [<ffffffffa00a090e>] ttm_tt_swapout+0x5e/0x2c0 [ttm]
RSP <ffff88010942bcf0>
CR2: 0000000000000010
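The register dump and the Code: line above agree with each other, which can be checked mechanically. The following is a minimal sketch, not a general disassembler: it decodes only the one x86-64 encoding that appears at the marked byte (`REX.W 8B /r` with an 8-bit displacement), and the names `decode_mov_disp8` and `analyse` are made up for this illustration. The oops text embedded in it is abridged to the lines the script actually reads.

```python
import re

# Abridged copy of the oops quoted above: only the lines this sketch reads.
# The Code: line is shortened to the bytes around the <faulting> byte.
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa00a090e>] ttm_tt_swapout+0x5e/0x2c0 [ttm]
RAX: 0000000000000000 RBX: ffff88010f26d600 RCX: 0000000000001000
RDX: ffff8800ccfa6780 RSI: ffff8800ccfa6780 RDI: ffff8800ea841c40
Code: 48 8b 55 b8 49 83 7d 28 00 48 8b 42 18 <48> 8b 40 10 4c 8b b0 18 01 00 00
CR2: 0000000000000010
"""

REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_disp8(code):
    """Decode `REX.W 8B /r` in its [base + disp8] form (no SIB byte).

    Returns (destination register, base register, signed displacement).
    Any other encoding raises ValueError.
    """
    rex, opcode, modrm, disp = code[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a 64-bit register load without extended registers")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the [base + disp8] addressing form is handled")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS64[reg], REGS64[rm], disp


def analyse(oops):
    """Extract the fault address and check it against the faulting instruction."""
    fault = int(re.search(r"^CR2: ([0-9a-f]+)", oops, re.M).group(1), 16)
    symbol, offset = re.search(
        r"^IP: \[<[0-9a-f]+>\] (\w+)\+0x([0-9a-f]+)/", oops, re.M).groups()
    regs = {name.lower(): int(value, 16)
            for name, value in re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})", oops)}
    code_line = re.search(r"^Code: (.*)$", oops, re.M).group(1)
    # The kernel wraps the first byte of the faulting instruction in <>.
    faulting = code_line[code_line.index("<"):].replace("<", "").replace(">", "")
    dst, base, disp = decode_mov_disp8([int(b, 16) for b in faulting.split()])
    return {
        "symbol": symbol,
        "offset": int(offset, 16),
        "fault_address": fault,
        "instruction": f"mov {disp:#x}(%{base}),%{dst}",
        "base_value": regs[base],
        "consistent": regs[base] + disp == fault,
    }


result = analyse(OOPS)
print(result)
```

The faulting instruction decodes to `mov 0x10(%rax),%rax` with RAX equal to 0, which matches the fault address 0x10 in CR2. The four bytes just before it, `48 8b 42 18`, decode the same way to `mov 0x18(%rdx),%rax`, so the NULL value was loaded from offset 0x18 of whatever RDX points to. Which source-level pointer that corresponds to is not established by anything in this report.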
vmcore is available inside this directory:
http://jb.usersys.redhat.com/data/desktop2-vmcore/
I'm attaching some output from crash for convenience.
Created attachment 479624 [details]
crash sys output
Created attachment 479625 [details]
log from crash/vmcore
Created attachment 479626 [details]
kmem -f from crash
Created attachment 479627 [details]
bt -a from crash
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Moving to 6.6 so it has a chance of being retested there and closed accordingly.

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available. The official life cycle policy can be reviewed here: http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL: https://access.redhat.com/
Created attachment 468072 [details]
picture of the screen.

Description of problem:
While using my desktop, it panicked.

Version-Release number of selected component (if applicable):
2.6.32-71.el6.x86_64

How reproducible:
Once was enough

Steps to Reproduce:
1. Use desktop
2. Observe panic
3.

Actual results:
System panicked and had to be power cycled.

Expected results:
System continued to work.

Additional info: