Bug 247523
Summary: | Xen kernel breaks Wake-on-LAN from shutdown state
---|---
Product: | Red Hat Enterprise Linux 5
Reporter: | Chris Snook <csnook>
Component: | xen
Assignee: | Michal Novotny <minovotn>
Status: | CLOSED CANTFIX
QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | low
Priority: | low
Version: | 5.0
CC: | areis, berrange, clalance, jdenemar, jreznik, martin.wilck, mrezanin, riel, simon.matter, syeghiay, tao, xen-maint
Keywords: | Reopened
Hardware: | All
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2010-04-08 16:51:08 UTC
Bug Depends On: | 490053
Bug Blocks: | 499522, 514499

Description
Chris Snook
2007-07-09 18:40:55 UTC
change QA contact

Still broken in RHEL 5.3 beta, on the same hardware. The acpi_power_off bug has been fixed, so it's possible the real bug is the bridging configuration, not Xen itself.

By any chance, do the xen network scripts make the NIC think it has a different MAC address? Maybe this is related to bug #235502 or bug #458806?

Created attachment 335038 [details]
Proposed patch

With this patch and the patch from Bug 490053 applied to my system, I can wake the system with a magic packet after running "poweroff" from the command line.

Well, I have tested it and the patch is working. The problem was that the network was not stopped, which caused this Wake-on-LAN (WOL) issue. Stopping the network makes WOL work even after a Xen kernel shutdown.

Created attachment 339466 [details]
New version of this patch

I have created a better version of this patch, thanks to this information, which checks the runlevel first and stops the network only when shutting down or powering off (runlevel 0 or 6). It has been tested and it causes no network disruption now, unlike the proposed patch (attachment #335041 [details]), because the network is stopped only when Dom0 is shutting down. Wake-on-LAN using etherwake has been tested with this patch applied and it works fine: it starts the bare-metal box.

This is still not safe as it will break any machine using NFS root, or network block devices like NBD/GNBD/iSCSI.

Well, I am not using NFS root or the devices listed there... Any idea what we could do about that? I've been investigating the network scripts, but there should be no problem, because when network-bridge is started with Xen on a network root, it returns an error. It just logs that bridging on network root is not supported and returns from the script, so the bridge is not created. Stopping it is then not possible either, because the code exits the script when no bridge is found.

Is this still an issue then?

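The runlevel check described for the second patch (tear the bridge down only in runlevel 0 or 6) could look roughly like the sketch below. This is not the code from attachment 339466; the function names are hypothetical, and it assumes the `runlevel` command prints the previous and current runlevel as two fields, as it does on SysV-init systems.

```shell
#!/bin/sh
# Hypothetical helpers; NOT the code from the attached patch.

# Succeeds (exit status 0) only for runlevel 0 (halt) or 6 (reboot).
is_shutdown_runlevel() {
    case "$1" in
        0|6) return 0 ;;
        *)   return 1 ;;
    esac
}

# `runlevel` prints "<previous> <current>", e.g. "N 3"; keep the second field.
# Prints nothing if the command is unavailable.
current_runlevel() {
    runlevel 2>/dev/null | awk '{ print $2 }'
}

# Tear the bridge down only on the way to halt or reboot, so a plain
# xend stop/restart in runlevel 3 or 5 leaves Dom0 networking alone.
stop_bridge_if_shutting_down() {
    if is_shutdown_runlevel "$(current_runlevel)"; then
        /etc/xen/scripts/network-bridge stop
    fi
}
```

Nothing runs when the file is sourced; an init script would call `stop_bridge_if_shutting_down` from its stop action. The sketch does not address the NFS/iSCSI objection raised above.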
As Dan said on list long time ago:

> I don't see how this is safe if you have filesystems on NFS, or iSCSI. Later
> initscripts in the shutdown sequence may still need to access the filesystems
> and this just tore network out from under them.

That is, neither xen nor its init script is allowed to stop network. Setting back to assigned.

Wait a minute . . . It's been a while since I looked at this, but IIRC, the call to Vifctl.network('stop') is just supposed to break down the bridge and set everything back to the state before xend started, right? If that's the case, then how would this affect NFS or iSCSI filesystems in use in Dom0? Taking down the bridge shouldn't change any connectivity to external mounts, should it? I could see a problem if there were DomUs that depended on NFS or iSCSI filesystems in Dom0, but all the DomUs should have been shut down when xendomains is stopped, which occurs before xend is stopped. What am I missing?

(In reply to comment #24)
> Wait a minute . . . It's been a while since I looked at this, but IIRC, the
> call to Vifctl.network('stop') is just supposed to break down the bridge and
> set everything back to the state before xend started, right? If that's the
> case, then how would this affect NFS or iSCSI filesystems in use in Dom0?
> Taking down the bridge shouldn't change any connectivity to external mounts,
> should it?

Yes, it does. The process of breaking down the bridge causes dom0 to lose connectivity while the bridge is being broken down. If the commands you need to run happen to be on remote storage, then once you execute the first of the breakdown commands, you can no longer get to the rest of the commands you need to bring networking back up.

Chris Lalancette

(In reply to comment #25)
> The process of breaking down the bridge causes dom0 to lose
> connectivity while the bridge is being broken down. ...

Aha. Didn't know that. Thanks for the explanation.
~ Bryan

(In reply to comment #25)
> (In reply to comment #24)
> > Wait a minute . . . It's been a while since I looked at this, but IIRC, the
> > call to Vifctl.network('stop') is just supposed to break down the bridge and
> > set everything back to the state before xend started, right? If that's the
> > case, then how would this affect NFS or iSCSI filesystems in use in Dom0?
> > Taking down the bridge shouldn't change any connectivity to external mounts,
> > should it?
>
> Yes, it does. The process of breaking down the bridge causes dom0 to lose
> connectivity while the bridge is being broken down. If the commands you need
> to run happen to be on remote storage, then once you execute the first of the
> breakdown commands, you can no longer get to the rest of the commands you need
> to bring networking back up.
>
> Chris Lalancette

Yeah, that makes sense. The problem is how to solve this, because when we don't run the script that stops the bridge, the network interface is shut down in a strange state, which makes Wake-on-LAN impossible. What I am thinking about is to run the script right before the shutdown, i.e. ideally after unmounting the file systems or so... Anyway, the scripts have code that prevents them from starting when network root devices are in use, so there should be no problem with that, should there?

/etc/xen/scripts/network-bridge has:

    ...
    if is_network_root ; then
        [ -x /usr/bin/logger ] && /usr/bin/logger "network-bridge: bridging not supported on network root; not starting"
        return
    fi
    ...

for the op_start() operation (which can be run as /etc/xen/scripts/network-bridge start).

The function 'is_network_root' is defined as:

    is_network_root () {
        local rootfs=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $3; }}' /etc/mtab)
        local rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)

        [[ "$rootfs" =~ "^nfs" ]] || [[ "$rootopts" =~ "_netdev" ]] && return 0 || return 1
    }

This means that the bridge should *not* be able to start when a network root is in use, so stopping the network-bridge will do nothing because no bridge can be found.

... and that means it would be safe to call "network-bridge stop" at shutdown time.

Well, you can also have a look at http://kbase.redhat.com/faq/docs/DOC-21309 . It is about setting up the network bridge manually in order to use SNMPd along with Xen, but the reason for setting up the bridge manually is not important. What matters is that it describes how to set up the bridge manually, which is the best and supported solution.

Michal

This is a limitation of bridging. We can't do anything about it in the virtualization stack.

Michal

My workaround is to put the script below into /sbin/halt.local. (The biggest problem is that brctl is not in /sbin but in /usr/sbin, which seems like a bug to me.)

Simon

    #!/bin/sh
    # workaround hack for WOL issues with XEN kernels
    # see also https://bugzilla.redhat.com/show_bug.cgi?id=247523
    mount -o ro /usr
    /etc/xen/scripts/network-bridge stop
    umount /usr
    modprobe -r eth0
    modprobe eth0
    ethtool -s eth0 wol g

> If we can't run Dom0 from a network file system, then
> shutting down the bridge when xend shuts down won't cause any harm.
>
We can run Dom0 from a network filesystem (iSCSI storage tested by me).
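
To see how a network-backed root can get past a check of this kind, here is a minimal sketch of the same test rewritten to read an arbitrary mtab-style file, so it can be tried against sample entries. The function name and the file argument are hypothetical and this is not the shipped script. It uses `case` patterns rather than `=~`, which sidesteps the fact that bash 3.2 and later treat a quoted right-hand side of `=~` as a literal string.

```shell
#!/bin/sh
# Sketch only: a variant of the network-root test that reads a given
# mtab-style file (default /etc/mtab). Name and argument are hypothetical.
root_is_on_network() {
    mtab="${1:-/etc/mtab}"
    # Take "<fstype> <options>" of the last entry mounted on "/",
    # skipping comment lines.
    entry=$(awk '$1 !~ /^#/ && $2 == "/" { print $3, $4 }' "$mtab" | tail -n 1)
    fstype=${entry%% *}
    opts=${entry#* }

    # Root file system type starts with "nfs" (nfs, nfs4, ...).
    case "$fstype" in
        nfs*) return 0 ;;
    esac
    # Root is mounted with the _netdev option.
    case ",$opts," in
        *,_netdev,*) return 0 ;;
    esac
    return 1
}
```

With this logic, a root entry such as `/dev/sda1 / ext3 rw 0 0` is reported as local even if `/dev/sda1` is an iSCSI disk, unless `_netdev` appears in its mount options. Whether that is what happened in the iSCSI test mentioned above is an assumption; the thread does not say.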