Bug 239601

Summary: kernel: BUG: at kernel/lockdep.c:1858 trace_hardirqs_on()
Product: [Fedora] Fedora
Reporter: Robert Scheck <redhat-bugzilla>
Component: kernel-xen
Assignee: Eduardo Habkost <ehabkost>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: urgent
Priority: medium
Version: rawhide
CC: itamar, vikigoyal, xen-maint
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: i686
OS: Linux
Fixed In Version: 2.6.20-2925.11.fc7
Doc Type: Bug Fix
Last Closed: 2007-06-18 16:41:27 UTC
Bug Blocks: 150225
Attachments:
Relevant parts from /var/log/messages (flags: none)

Description Robert Scheck 2007-05-09 20:09:04 UTC
Description of problem:
Well, the Xen kernel is broken: when booting the machine using kernel-xen,
the machine hangs during boot.

May  9 21:48:45 rsc kernel: BUG: at kernel/lockdep.c:1858 trace_hardirqs_on()
May  9 21:48:45 rsc kernel:  [<c1005d9a>] show_trace_log_lvl+0x1a/0x2f
May  9 21:48:45 rsc kernel:  [<c1006343>] show_trace+0x12/0x14
May  9 21:48:45 rsc kernel:  [<c10063be>] dump_stack+0x16/0x18
May  9 21:48:45 rsc kernel:  [<c1037439>] trace_hardirqs_on+0xc4/0x143
May  9 21:48:45 rsc kernel:  [<c10055d4>] restore_all+0x3b/0x3e
May  9 21:48:45 rsc kernel:  =======================

Version-Release number of selected component (if applicable):
kernel-xen-2.6.20-2925.5.fc7

How reproducible:
Every time for me.

Steps to Reproduce:
1. Install the virtualization stuff using pirut
2. Change boot order
3. Boot system
  
Actual results:
No booting possible when using kernel-xen package.

Expected results:
A working system, no more and no less ;-)

Additional info:
Syslog is attached to this report. I gave the kernel two tries as you can see.

Comment 1 Robert Scheck 2007-05-09 20:10:11 UTC
Created attachment 154426 [details]
Relevant parts from /var/log/messages

Comment 2 Itamar Reis Peixoto 2007-05-15 03:23:19 UTC
the same problem with

xen-libs-3.1.0-0.rc7.1.fc7
kernel-xen-2.6.20-2925.8.fc7
xen-3.1.0-0.rc7.1.fc7



Comment 3 Daniel Berrangé 2007-05-15 03:36:13 UTC
The lockdep bug (while it should be fixed) is fairly harmless as far as we can
tell.

Reis - when you say you have the same problem with the latest kernel, are you
referring to the lockdep bug warning message, or an actual fatal hang / crash at
boot time?


Comment 4 Itamar Reis Peixoto 2007-05-15 04:26:17 UTC
I am able to boot dom0, with this warning:

device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel
BUG: at kernel/lockdep.c:1858 trace_hardirqs_on()
 [<c1005d9e>] show_trace_log_lvl+0x1a/0x2f
 [<c1006347>] show_trace+0x12/0x14
 [<c10063c2>] dump_stack+0x16/0x18
 [<c1037435>] trace_hardirqs_on+0xc4/0x143
 [<c10055d4>] restore_all+0x3b/0x3e
 =======================
printk: 20312 messages suppressed.
4gb seg fixup, process nash-hotplug (pid 219), cs:ip 73:00279a1c
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.




but I am unable to create domU's


Comment 5 Eduardo Habkost 2007-05-16 15:04:27 UTC
Robert Scheck,

Does the problem persist when using the new kernel-xen?
(kernel-xen-2.6-2.6.20-2925.8.fc7)

Comment 6 Itamar Reis Peixoto 2007-05-16 18:10:19 UTC
Eduardo Habkost, have you read my message ?

I have tested with kernel-xen-2.6-2.6.20-2925.8.fc7 and the problem persists

Comment 7 Eduardo Habkost 2007-05-16 18:22:06 UTC
My question is which problem this bug is about, and what we need to do to close it.

We have three distinct problems reported here:

a) Crashes and hangs (same as bug 234008, but for rawhide, and solved with the
newer kernel)
b) BUG: at kernel/lockdep.c:1858 trace_hardirqs_on() (not solved yet)
c) Not being able to create domU virtual machines (need more information)


If this bug is about (a), then it can be closed if the problem is solved. If
this is about (b), then we can keep it open until we resolve the "BUG"
message. It is up to the bug reporter to decide which is the case, so I asked
Robert about the problem he reported.

Itamar, regarding not being able to create domUs, could you provide more
details about the problem? How are you trying to create them, and what
error messages or unexpected behaviour are you getting? Maybe this problem is
appropriate for a separate bug report, as it seems to be independent of (a)
and (b). Are you able to create domUs using kernel-xen 2.6.19?

Comment 8 Itamar Reis Peixoto 2007-05-16 18:34:59 UTC
I am able to create domU's using  xen 3.1 rc10 and a 2.6.18 kernel, compiled 
from 

http://xenbits.xensource.com/xen-3.1-testing.hg

Comment 9 Itamar Reis Peixoto 2007-05-16 20:10:07 UTC
Eduardo 

If you have an SSH key, I can provide root access to the machine.


Comment 10 Leslie Satenstein 2007-05-16 21:13:56 UTC
Have this bug with

-rw-r--r-- 1 root root  875161 2007-05-10 18:00 System.map-2.6.20-2925.8.fc7xen

Leslie

Comment 11 Robert Scheck 2007-05-18 23:22:58 UTC
The problem also exists with 2.6.20-2925.8.fc7xen.

Comment 12 Itamar Reis Peixoto 2007-05-24 04:20:29 UTC
this problem still persists in 2.6.20-2925.9.fc7xen

Comment 13 Eduardo Habkost 2007-05-24 12:51:02 UTC
Do you still see hangs or crashes, or just the "BUG: at kernel/lockdep.c:1858 
trace_hardirqs_on()" persisted?

Comment 14 Eduardo Habkost 2007-05-24 15:24:27 UTC
It seems that upstream xen doesn't support CONFIG_PROVE_LOCKING. The check
for enabled interrupts on TRACE_IRQS_IRET seems to be incorrect for xen, as
they are disabled/enabled through evtchn_upcall_mask(%esi), and not
PT_EFLAGS(%esp).

Relevant part of arch/i386/kernel/entry-xen.S:

.macro TRACE_IRQS_IRET
#ifdef CONFIG_TRACE_IRQFLAGS
	testl $IF_MASK,PT_EFLAGS(%esp)     # interrupts off?
	jz 1f
	TRACE_IRQS_ON
1:
#endif
.endm


Changing this to check evtchn_upcall_mask(%esi) instead may solve the problem,
but I don't know if there may be other problems related to
CONFIG_PROVE_LOCKING on xen. Disabling CONFIG_PROVE_LOCKING on kernel-xen
seems safer.
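
As a rough illustration of the mismatch described above (a Python model, not
kernel code): the macro decides whether interrupts are coming back on by
testing the IF bit in the saved EFLAGS, while a Xen paravirtualized guest
masks and unmasks event delivery through the per-VCPU evtchn_upcall_mask
field. The function names below are hypothetical and exist only for this
sketch. It assumes the usual x86 value of IF_MASK (bit 9, 0x200) and that a
non-zero upcall mask means events are blocked.

```python
# Illustrative model only; these helpers are hypothetical, not kernel APIs.

IF_MASK = 0x200  # x86 EFLAGS interrupt-enable flag (bit 9)


def irqs_on_native(saved_eflags: int) -> bool:
    """What TRACE_IRQS_IRET tests: is IF set in the saved EFLAGS?"""
    return bool(saved_eflags & IF_MASK)


def irqs_on_xen(evtchn_upcall_mask: int) -> bool:
    """What a Xen guest actually uses: events are deliverable when the
    per-VCPU upcall mask is zero (assumption stated in the lead-in)."""
    return evtchn_upcall_mask == 0


def check_disagrees(saved_eflags: int, evtchn_upcall_mask: int) -> bool:
    """True when the EFLAGS-based test and the Xen mask give different
    answers, i.e. the tracing hook would be told the wrong state."""
    return irqs_on_native(saved_eflags) != irqs_on_xen(evtchn_upcall_mask)
```

For example, check_disagrees(0x202, 1) is true: the saved EFLAGS has IF set,
so the macro would call TRACE_IRQS_ON, even though event delivery is still
masked from the guest's point of view. This sketch only models the two
conditions; it says nothing about lockdep's own internal checks.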

Comment 15 Itamar Reis Peixoto 2007-05-24 16:18:26 UTC
(In reply to comment #13)
> Do you still see hangs or crashes, or just the "BUG: at kernel/lockdep.c:1858 
> trace_hardirqs_on()" persisted?

Yes, dom0 boots fine with this warning. At this moment there are no crashes,
but I am unable to create domU's.

Comment 16 Eduardo Habkost 2007-05-24 16:48:25 UTC
Itamar, could you provide more information regarding the problem you have
creating domU's? Preferably in a new bugzilla bug, as it is not related
to the lockdep warning (which is harmless).

Please detail the exact steps you are taking to create the new domU domains,
the error messages or unexpected behaviour you are seeing, and what the
expected behaviour was.

Comment 17 Eduardo Habkost 2007-05-24 16:50:35 UTC
CONFIG_PROVE_LOCKING was disabled on F-7 CVS, and a build was submitted:
http://koji.fedoraproject.org/koji/taskinfo?taskID=16419

It probably won't make it into Fedora 7 final, however, but will go out as an additional update.

Comment 18 Robert Scheck 2007-05-24 19:43:02 UTC
Itamar, that looks related to my problem. I can create domUs, but when I try
to boot them, they simply crash. xm list reports that they are running,
but the log shows that they have crashed...

Comment 19 Robert Scheck 2007-05-24 19:43:50 UTC
Eduardo, IIRC nothing is gold for now. It can slip in, if you're fast enough.

Comment 20 Eduardo Habkost 2007-06-11 19:25:00 UTC
Ouch. Wrong bug closed, sorry.

Comment 21 Itamar Reis Peixoto 2007-06-11 19:27:09 UTC
this is fixed in 2.6.20-2925.10.fc7



Comment 22 Fedora Update System 2007-06-11 22:05:00 UTC
kernel-xen-2.6-2.6.20-2925.11.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.

Comment 23 Eduardo Habkost 2007-06-18 13:55:56 UTC
*** Bug 244561 has been marked as a duplicate of this bug. ***

Comment 24 Fedora Update System 2007-06-18 16:41:15 UTC
kernel-xen-2.6-2.6.20-2925.11.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.