Bug 587265 - [abrt] crash in kernel: BUG: soft lockup - CPU#1 stuck for 4096s! [sync_supers:18]
Keywords:
Status: CLOSED DUPLICATE of bug 550724
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Andrew Jones
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard: abrt_hash:501133792
Depends On:
Blocks:
Reported: 2010-04-29 13:53 UTC by Andrew Hecox
Modified: 2011-01-05 10:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-16 09:07:37 UTC
Target Upstream Version:
Embargoed:


Attachments
sos from domU (368.21 KB, application/x-bzip2)
2010-04-29 14:03 UTC, Andrew Hecox

Description Andrew Hecox 2010-04-29 13:53:26 UTC
abrt 1.0.7 detected a crash.

architecture: x86_64
cmdline: not_applicable
component: kernel
executable: kernel
kernel: 2.6.32-19.el6.x86_64
package: kernel
reason: BUG: soft lockup - CPU#1 stuck for 4096s! [sync_supers:18]
release: Red Hat Enterprise Linux release 6.0 Beta (Santiago)

kerneloops
-----
BUG: soft lockup - CPU#1 stuck for 4096s! [sync_supers:18]
Modules linked in: autofs4(U) sunrpc(U) ip6t_REJECT(U) nf_conntrack_ipv6(U) ip6table_filter(U) ip6_tables(U) ipv6(U) dm_mirror(U) dm_region_hash(U) dm_log(U) joydev(U) xen_netfront(U) ext4(U) mbcache(U) jbd2(U) xen_blkfront(U) dm_mod(U) [last unloaded: scsi_wait_scan]
CPU 1:
Modules linked in: autofs4(U) sunrpc(U) ip6t_REJECT(U) nf_conntrack_ipv6(U) ip6table_filter(U) ip6_tables(U) ipv6(U) dm_mirror(U) dm_region_hash(U) dm_log(U) joydev(U) xen_netfront(U) ext4(U) mbcache(U) jbd2(U) xen_blkfront(U) dm_mod(U) [last unloaded: scsi_wait_scan]
Pid: 18, comm: sync_supers Not tainted 2.6.32-19.el6.x86_64 #1 
RIP: e030:[<ffffffff8100922a>]  [<ffffffff8100922a>] hypercall_page+0x22a/0x1010
RSP: e02b:ffff88007d13fd50  EFLAGS: 00000246
RAX: 0000000000030001 RBX: ffff880003795400 RCX: ffffffff8100922a
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88007d13fd68 R08: ffff88007d13e000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000246 R12: ffff88007d13d540
R13: ffff88007d337540 R14: 0000000000000001 R15: ffff880004324100
FS:  00007fbb108657c0(0000) GS:ffff88000430e000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f1c9d489000 CR3: 0000000003528000 CR4: 0000000000000660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Call Trace:
[<ffffffff8100f19d>] ? xen_force_evtchn_callback+0xd/0x10
[<ffffffff8100f9c2>] check_events+0x12/0x20
[<ffffffff8100f969>] ? xen_irq_enable_direct_end+0x0/0x7
[<ffffffff810566e3>] ? finish_task_switch+0x53/0xd0
[<ffffffff814c04ce>] thread_return+0x4e/0x740
[<ffffffff811224e0>] ? bdi_sync_supers+0x0/0x60
[<ffffffff8112251b>] bdi_sync_supers+0x3b/0x60
[<ffffffff8108d8a6>] kthread+0x96/0xa0
[<ffffffff810141ca>] child_rip+0xa/0x20
[<ffffffff81013391>] ? int_ret_from_sys_call+0x7/0x1b
[<ffffffff81013b1d>] ? retint_restore_args+0x5/0x6
[<ffffffff810141c0>] ? child_rip+0x0/0x20
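
For triage, the fields in the soft-lockup header line can be pulled out mechanically. The following is a minimal sketch and not part of abrt: the function name `parse_soft_lockup` and the regular expression are assumptions for illustration, inferred only from the message format shown in this report. Other kernel versions may format the line differently.

```python
import re

# Pattern inferred from the message in this report (an assumption, not a
# kernel-defined grammar): CPU number, stuck duration, task name and PID.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<seconds>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return a dict with cpu, seconds, comm and pid, or None if no match."""
    match = SOFT_LOCKUP_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match.group("cpu")),
        "seconds": int(match.group("seconds")),
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
    }


if __name__ == "__main__":
    reason = "BUG: soft lockup - CPU#1 stuck for 4096s! [sync_supers:18]"
    print(parse_soft_lockup(reason))
```

The task name and PID extracted this way match the `Pid: 18, comm: sync_supers` line of the oops.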

Comment 1 Andrew Hecox 2010-04-29 14:03:22 UTC
Created attachment 410117
sos from domU

Comment 3 RHEL Program Management 2010-04-29 15:25:39 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 4 Andrew Jones 2010-04-30 09:00:00 UTC
There's a discussion about this upstream on xen-devel right now:

http://lists.xensource.com/archives/html/xen-devel/2010-03/msg01561.html

That machine and the machine this bug is reported for are both AMD.

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 5
model name	: AMD Opteron(tm) Processor 250
stepping	: 10
cpu MHz		: 2393.180
cache size	: 1024 KB
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow rep_good
bogomips	: 4786.36
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp


This machine has 2 vcpus and the xen-devel one has 3. While I don't have much confidence it will help, it would probably be a good idea to test a guest with only 1 vcpu. Can you try running this guest again with only one vcpu?

Comment 5 Andrew Jones 2010-04-30 09:16:47 UTC
Oops, I think I shot from the hip a bit too fast there on my last comment. Now that I'm looking closer, it doesn't look like the upstream issue is related, so sorry about that noise. This bug does look like something I've seen before, though: bug 550724. It looks like that bug because I poked at the sos report and see that dmesg shows all tasks are getting stuck, i.e. locked up in D state. I'll look closer now before I make my next comment :-)  Also, is it possible for me to get access to this machine?
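
The D-state observation above came from dmesg in the sos report. On a live guest, a similar check can be made by scanning `/proc`. This is a minimal sketch, assuming the standard Linux `/proc/<pid>/stat` layout; the helper names `parse_stat` and `tasks_in_d_state` are hypothetical, and it inspects only per-process entries, not individual threads under `/proc/<pid>/task`.

```python
import glob


def parse_stat(stat_text):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen].strip())
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return pid, comm, state


def tasks_in_d_state(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks in uninterruptible sleep."""
    stuck = []
    for path in glob.glob(proc_root + "/[0-9]*/stat"):
        try:
            with open(path) as stat_file:
                pid, comm, state = parse_stat(stat_file.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited while scanning, or entry unreadable
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, comm in tasks_in_d_state():
        print(pid, comm)
```

A handful of tasks briefly in D state is normal under I/O load; the pattern described in this bug is tasks that stay there.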

Comment 6 Andrew Hecox 2010-05-02 21:40:12 UTC
Yeah, access is not a problem. Should I switch back to 2 vCPUs?

I think this might have happened right after ntp-syncing post-install, if that helps.

Comment 9 RHEL Program Management 2010-07-15 14:50:47 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 10 Andrew Jones 2010-08-02 15:55:36 UTC
Resetting this to 6.1. We need a reliable reproducer to work on it.

Comment 11 Andrew Jones 2010-09-20 08:43:18 UTC
This is likely a dup of bug 550724. Are you still seeing this problem? If so, please try running with irqbalance off and see if it goes away so we can dup it.

Thanks,
Drew

Comment 13 Andrew Jones 2010-11-16 09:07:37 UTC
Closing this as a dup of bug 550724. If the latest kernels still have the problem, then it should be reopened with more information.

*** This bug has been marked as a duplicate of bug 550724 ***

