Bug 211090 - latest kernel breaks xend
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: xen
Version: 5
Hardware: i386 Linux
Priority: medium Severity: high
Assigned To: Xen Maintainance List
QA Contact: Brian Brock
Duplicates: 212870
Reported: 2006-10-17 06:30 EDT by Lars Volker
Modified: 2007-11-30 17:11 EST (History)
Doc Type: Bug Fix
Last Closed: 2006-11-04 14:18:47 EST
Attachments
xend.log and xend-debug.log of failing xen-3.0.2-5.fc5 (3.46 KB, text/plain)
2006-10-21 03:13 EDT, Daniel Tschan
Description Lars Volker 2006-10-17 06:30:27 EDT
Description of problem: Upgrading to the latest dom0 kernel
(2.6.18-1.2200.fc5xenU) breaks Xen: xend won't start anymore.


How reproducible: yum update, service xend restart
Comment 1 Lars Volker 2006-10-17 06:33:54 EDT
Sorry, I made a mistake. The kernel version should of course be
2.6.18-1.2200.fc5xen0, since it is the dom0 kernel that breaks things.
Comment 2 Lars Volker 2006-10-17 07:24:04 EDT
Here is an excerpt from the logs on xend-startup.

[2006-10-17 12:24:49 xend] INFO (SrvDaemon:283) Xend Daemon started
[2006-10-17 12:24:49 xend] INFO (SrvDaemon:287) Xend changeset: unavailable .
[2006-10-17 12:24:49 xend] ERROR (SrvDaemon:297) Exception starting xend ((38, 'Function not implemented'))
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvDaemon.py", line 291, in run
    servers = SrvServer.create()
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvServer.py", line 108, in create
    root.putChild('xend', SrvRoot())
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvRoot.py", line 40, in __init__
    self.get(name)
  File "/usr/lib/python2.4/site-packages/xen/web/SrvDir.py", line 82, in get
    val = val.getobj()
  File "/usr/lib/python2.4/site-packages/xen/web/SrvDir.py", line 52, in getobj
    self.obj = klassobj()
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvDomainDir.py", line 39, in __init__
    self.xd = XendDomain.instance()
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 609, in instance
    inst.init()
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 76, in init
    self._add_domain(
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 139, in xen_domains
    domlist = xc.domain_getinfo()
Error: (38, 'Function not implemented')
[2006-10-17 12:24:49 xend] INFO (SrvDaemon:183) Xend exited with status 1.
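
For readers decoding the (38, 'Function not implemented') pair in the log above: it is an errno number with its message, and on Linux 38 corresponds to ENOSYS. The following is only an illustrative sketch (written for a current Python 3, not the Python 2.4 that xend ran on); the helper name describe_errno is made up for this example, and errno numbering is platform specific.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    # errno.errorcode maps numbers to names such as 'ENOSYS'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 38 is the number reported by xend in the log; its meaning is
    # Linux specific, so other platforms may print something else.
    print(38, describe_errno(38))
```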
Comment 3 Jirka Pech 2006-10-17 12:13:36 EDT
Confirming the same behaviour on clean (but updated) FC5.
Comment 4 Jirka Pech 2006-10-17 14:36:57 EDT
Confirming the same behaviour on clean (but updated) FC5 with obsoleted testing
kernel 2.6.18-1.2189.fc5xen0.
Comment 5 JM 2006-10-18 14:05:30 EDT
Same behaviour on an x86_64 FC5 system. The kernel-xen0-2.6.18-1.2200.fc5
kernel breaks xend. With kernel-xen0-2.6.17-1.2187_FC5, xend works without any
problems.
Comment 6 Dave South 2006-10-20 07:32:04 EDT
If you strace the xm list process it shows a whole list of missing files. I will
not attach the output from strace here as it is massive.

Comment 7 Daniel Berrange 2006-10-20 08:47:45 EDT
A fix for this issue was pushed to the fedora-updates repository for FC5
yesterday. Please do a yum update to the following RPM versions & re-test to
confirm the fix:

xen-3.0.2-5.fc5
libvirt-0.1.7-2.FC5
Comment 8 Dave South 2006-10-20 09:06:26 EDT
I have these versions on my machine. The same problem still occurs.
Comment 9 Daniel Veillard 2006-10-20 09:48:46 EDT
W.r.t. comment #8: can you confirm that it's i686, that you rebooted (or at
least that it's the new xend showing the problem), and provide the end of the
xend startup logs?

  thanks

Daniel
Comment 10 Daniel Tschan 2006-10-21 03:13:42 EDT
Created attachment 139049
xend.log and xend-debug.log of failing xen-3.0.2-5.fc5

I can confirm that xen-3.0.2-5.fc5 still doesn't work on i686 kernel-xen and
kernel-xen0. All updates available at this time are installed. Strangely enough
it works when I boot with xen-3.0.2-0.fc5.3, then upgrade to xen-3.0.2-5.fc5
and restart xend.
Comment 11 Jirka Pech 2006-10-26 04:42:19 EDT
Daniel,

that's really strange, I can't confirm that on my systems. Have you changed
anything else?

Thank you,
Jirka Pech
Comment 12 Daniel Tschan 2006-10-26 05:23:57 EDT
Thanks for the info. The only difference that came to my mind was that I had to
create the initial ramdisk manually because of bug #211030 . And indeed this was
the cause of the problem. I now upgraded to xen-3.0.2-5.fc5, then rebuilt the
initial ramdisk and rebooted. Now the problem is gone.
Comment 13 Juliano F. Ravasi 2006-10-26 18:17:25 EDT
I'm still observing this bug on my system. Xend dies with ERROR (SrvDaemon:297)
Exception starting xend ((111, 'Connection refused'))

[root@misuzu ~]# uname -a
Linux misuzu.jr 2.6.18-1.2200.fc5xen0 #1 SMP Sat Oct 14 17:49:47 EDT 2006 i686 i686 i386 GNU/Linux
[root@misuzu ~]# rpm -q xen libvirt kernel-xen0
xen-3.0.2-5.fc5
libvirt-0.1.7-2.FC5
kernel-xen0-2.6.17-1.2187_FC5
kernel-xen0-2.6.18-1.2200.fc5
Comment 14 Juliano F. Ravasi 2006-10-26 18:43:07 EDT
Stracing xend, I found this, which seems to be relevant:

[pid  4165] connect(14, {sa_family=AF_FILE, path="/var/run/xenstored/socket"}, 110) = -1 ECONNREFUSED (Connection refused)

The file exists, but xenstored is not running.
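
The situation described here (socket file present, nothing listening, connect() returning ECONNREFUSED) can be reproduced with a small probe. This is a sketch under assumptions: it is written for a current Python 3, the function name probe_unix_socket is hypothetical, and it assumes the xenstored socket is a stream socket, which the strace line does not show.

```python
import errno
import socket


def probe_unix_socket(path):
    """Try to connect to a Unix stream socket.

    Returns None if a listener accepted the connection, otherwise the
    symbolic errno name (for example 'ECONNREFUSED' or 'ENOENT').
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        s.close()
    return None


if __name__ == "__main__":
    # Path taken from the strace output above.
    print(probe_unix_socket("/var/run/xenstored/socket"))
```

A result of ECONNREFUSED with the file present points at the daemon behind the socket rather than at the client, which is why the next step is to look at xenstored itself.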

Stracing xenstored, it looks like it bails out during initialization after
failing to open /dev/xen/evtchn:

[pid  4157] send(9, "<27>Oct 26 19:25:42 xenstored: C"..., 55, MSG_NOSIGNAL) = 55
[pid  4157] open("/proc/xen/privcmd", O_RDWR) = 10
[pid  4157] fcntl64(10, F_GETFD)        = 0
[pid  4157] fcntl64(10, F_SETFD, FD_CLOEXEC) = 0
[pid  4157] lstat64("/dev/xen/evtchn", {st_mode=S_IFCHR|0600, st_rdev=makedev(10, 201), ...}) = 0
[pid  4157] open("/dev/xen/evtchn", O_RDWR) = -1 ENODEV (No such device)
[pid  4157] write(2, "ERROR: Could not open event chan"..., 68) = 68
[pid  4157] write(2, "FATAL: ", 7)      = 7
[pid  4157] write(2, "Failed to open evtchn device: No"..., 45) = 45
[pid  4157] close(10)                   = 0
[pid  4157] close(5)                    = 0
[pid  4157] close(4)                    = 0
[pid  4157] exit_group(1)               = ?
Process 4157 detached


Comment 15 Daniel Tschan 2006-10-27 02:46:23 EDT
The problem reappeared on my system today, though I haven't changed anything
in the meantime. I probably booted the wrong kernel after rebuilding the initrd
yesterday. Sorry for the wrong comment. Strace output is identical to Juliano's.
Comment 16 Henning Schmiedehausen 2006-10-29 16:53:25 EST
Same problem here. Fully patched FC5

Linux sinus.hometree.net 2.6.18-1.2200.fc5xen0 #1 SMP Sat Oct 14 17:49:47 EDT 2006 i686 i686 i386 GNU/Linux
xen-3.0.2-5.fc5
libvirt-0.1.7-2.FC5
kernel-xen0-2.6.18-1.2200.fc5

strace shows:

open("/proc/xen/privcmd", O_RDWR)       = 10
fcntl64(10, F_GETFD)                    = 0
fcntl64(10, F_SETFD, FD_CLOEXEC)        = 0
lstat64("/dev/xen/evtchn", {st_mode=S_IFCHR|0600, st_rdev=makedev(10, 201), ...}) = 0
open("/dev/xen/evtchn", O_RDWR)         = -1 ENODEV (No such device)
write(2, "ERROR: Could not open event chan"..., 68ERROR: Could not open event channel interface (19 = No such device)) = 68
write(2, "FATAL: ", 7FATAL: )                  = 7
write(2, "Failed to open evtchn device: No"..., 45Failed to open evtchn device: No such device) = 45
close(10)                               = 0
close(5)                                = 0
close(4)                                = 0
exit_group(1)                           = ?

when starting xenstored. So it seems to be the same problem as the x86_64 cases.
Do you guys do *any* QA BTW?
Comment 17 Daniel Berrange 2006-10-29 18:09:30 EST
The kernel was updated without the xen userspace being updated to match. Since
the 'evtchn' device was changed to use a dynamically allocated minor number,
xenstored failed during startup when trying to open the device with the old
static minor number. A temporary fix is in this thread:

http://www.redhat.com/archives/fedora-xen/2006-October/msg00151.html

Official updated xen RPMs will be pushed out in the very near future.
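
To check whether a machine is affected by the mismatch described above, one could compare the minor number of the /dev/xen/evtchn node with the minor the kernel currently lists for the device. The sketch below is not the fix from the linked thread. It assumes that the kernel exposes the device in /proc/misc under the name "evtchn" (the strace output shows major 10, the misc major, but the name is an assumption), it targets a current Python 3, and all function names are hypothetical.

```python
import os
import stat


def parse_proc_misc(text):
    """Map misc device names to minor numbers from /proc/misc-style text."""
    minors = {}
    for line in text.splitlines():
        parts = line.split()
        # Each line is expected to look like "<minor> <name>".
        if len(parts) == 2 and parts[0].isdigit():
            minors[parts[1]] = int(parts[0])
    return minors


def node_minor(path):
    """Return the minor number of a character device node, or None."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    if not stat.S_ISCHR(st.st_mode):
        return None
    return os.minor(st.st_rdev)


def evtchn_is_stale(misc_text, node_path="/dev/xen/evtchn"):
    """True if the node exists but its minor differs from the kernel's."""
    registered = parse_proc_misc(misc_text).get("evtchn")  # assumed name
    on_disk = node_minor(node_path)
    return (registered is not None and on_disk is not None
            and registered != on_disk)


if __name__ == "__main__":
    try:
        with open("/proc/misc") as f:
            misc_text = f.read()
    except OSError:
        misc_text = ""
    print("stale evtchn node:", evtchn_is_stale(misc_text))
```

In the strace output earlier in this bug the node carries minor 201, so a different number for evtchn in the kernel's list would match the ENODEV seen on open().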
Comment 18 Daniel Berrange 2006-10-29 18:12:24 EST
*** Bug 212870 has been marked as a duplicate of this bug. ***
Comment 19 Rob Dyke 2006-11-01 18:48:58 EST
Downloaded and installed Daniel's patch. Worked great and restarted xen on our
2.6.18-1.2200.fc5xen0. Thank you.
Comment 20 JM 2006-11-02 11:57:05 EST
I am using the patch on an x86_64 system (kernel-xen0-2.6.18-1.2200.fc5) and so
far it works. Thank you for the patch.
Comment 21 Daniel Berrange 2006-11-02 16:23:27 EST
Please test the updated Xen 3.0.3-1.fc5 available in updates-testing.

http://www.redhat.com/archives/fedora-test-list/2006-October/msg01060.html
Comment 22 Fedora Update System 2006-11-03 11:30:22 EST
The xen-3.0.3-1.fc5 errata fixes the issues observed in this bug.