Created attachment 523731
A screenshot made at the last system hang mentioned.
Description of problem:
With KVM (libvirtd) installed, when kernel 126.96.36.199-5 is used and the NetworkManager service is disabled, the system hangs during shutdown. The trace message starts with:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000228
The trace information visible on screen mentions dnsmasq, which is used by libvirtd.
If the system is shut down immediately after a reboot in this configuration, it shuts down normally.
If NetworkManager is started, Fedora 15 shuts down normally. NetworkManager is disabled in the above configuration, since it can't handle bridged networks and incorrectly reports them as disconnected, causing multiple problems.
Nothing about this is written to the system logs (since the filesystems are already unmounted by the time the bug strikes).
Version-Release number of selected component (if applicable):
How reproducible:
Every shutdown of Fedora 15 with the above setup.
Steps to Reproduce:
1. Install libvirtd as part of KVM/QEMU and set it to start automatically.
2. Boot Fedora 15 with the above components and use KVM virtual machines.
3. Shut down the system.
Actual results:
System hangs during shutdown, reporting some trace info on the screen.
Expected results:
System shuts down and turns off the computer.
static int selinux_socket_unix_may_send(struct socket *sock,
                                        struct socket *other)
{
        struct sk_security_struct *ssec = sock->sk->sk_security;
==>     struct sk_security_struct *osec = other->sk->sk_security;
other->sk is NULL, so we get an exception trying to follow the pointer to sk_security
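To illustrate why the fault address is 0x228 rather than 0, here is a minimal user-space sketch. The `mock_*` types and both helper functions are hypothetical stand-ins, not kernel code, and the 0x228 padding is chosen only to match the address in this report; the real `struct sock` layout depends on kernel version and configuration. It also shows the kind of NULL guard that would avoid the dereference, without claiming that this is the right fix for the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures (assumption: the real
 * struct sock layout differs between kernel versions and configurations). */
struct mock_sk_security {
    unsigned int sid;
};

struct mock_sock {
    char other_fields[0x228];             /* padding so the next field sits at 0x228 */
    struct mock_sk_security *sk_security;
};

struct mock_socket {
    struct mock_sock *sk;
};

/* Address that would be read for sk->sk_security when sk is NULL:
 * base address 0 plus the offset of the member within the structure. */
static uintptr_t fault_address_for_null_sk(void)
{
    return (uintptr_t)0 + offsetof(struct mock_sock, sk_security);
}

/* Defensive lookup: returns NULL instead of following a missing sock. */
static struct mock_sk_security *peer_security(const struct mock_socket *other)
{
    if (other == NULL || other->sk == NULL)
        return NULL;
    return other->sk->sk_security;
}
```

With this layout, `fault_address_for_null_sk()` evaluates to 0x228, which is consistent with `other->sk` being NULL at the marked line.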
Not sure what this means? Did the socket at the other end disconnect?
Not sure what this means either, but I'm installing a F15 system right now to find out.
Unfortunately, I'm not able to reproduce the problem on my test system; does this happen every time you shutdown the system or is it sporadic?
I'll go wander through the UNIX socket code now to see if anything jumps out at me.
(In reply to comment #3)
> Unfortunately, I'm not able to reproduce the problem on my test system; does
> this happen every time you shutdown the system or is it sporadic?
> I'll go wander through the UNIX socket code now to see if anything jumps out at
> me.
Nothing looks obviously broken to me in net/af_unix.c but I'm not exactly a UNIX socket expert. The only thing that gave me some pause is a lack of checking in unix_release_sock(), but I'm not sure if that is critical. I'll attach a simple patch which adds some additional checking, but until I can recreate the problem I have no idea if this patch is needed or not.
Konstantin, can you please reply to my questions in comment #3? Also, any chance you can try the attached patch?
Created attachment 524271
Potential AF_UNIX socket fix
Adds some additional checking in unix_release_sock()
Paul, answering the questions in #3: if I
- use the system for more than several minutes (I suspect 3-5 minutes are enough to have all the services started, libvirtd included)
- do NOT stop libvirtd explicitly before shutdown
then the above crash occurs.
I will try the patch on the weekend, since this is a computer I use heavily for business.
Thanks for the information on reproducing the problem. Unfortunately, that matches what I've been trying and I still haven't seen the problem. I suspect the problem you are seeing is a timing/race issue, which can be very tricky to diagnose. I appreciate your willingness to help try the patch and debug the problem.
Let me know what happens with the patch.
Hi Konstantin, any updates?
Hi Paul, I had no chance to handle that last weekend; I will make the next attempt the following weekend.
I noticed that the only way to shut down the system normally is to stop libvirtd and wait for at least 30 seconds before issuing the shutdown command.
Okay, thanks for the update.
I suspect the problem is a socket created by dnsmasq, which is started by libvirtd; when you shut down libvirtd, I suspect it also stops dnsmasq. Now, why does the problem only occur when you shut down the entire system? I suspect it is a very narrow race condition on socket close/destroy that only happens on your particular system during system shutdown.
Sorry for the long silence, Paul.
I have this problem only at shutdown. I will also redirect other people having the same problem here, in case they can provide more details.
No worries on the delay.
Have you had a chance to try the patch in comment #5? Would it help if I built you a kernel RPM with the patch included? If so, let me know what kernel you are currently using (you mention 188.8.131.52-5 but that was over two months ago).
Now that I use the latest update for Fedora 16 x86_64 the system hangs unconditionally when I try to shutdown/reboot it.
Stopping libvirtd doesn't help.
Although patching a live system that is heavily used daily isn't too encouraging, I'd like to ask you for directions on how to test whatever patch I could try.
Otherwise the only option I have is to stop using Fedora altogether, since it's simply unsafe: I have to press the power/reset button to halt/reboot the computer, with obvious consequences for the file systems.
(In reply to comment #13)
> Although patching live system which is heavily used daily isn't too
> encouraging, I'd like to ask you for directions on how to test whatever patch I
> could try.
You would simply apply the patch attached to this bug to the kernel sources and reboot the system using the patched kernel. If the patched kernel solves the hang on shutdown then we have our fix; if not, we can try something else.
Are you able to patch the kernel yourself or do you need a pre-built kernel RPM?
To Paul Moore: I will be able to patch; I just need instructions on building the custom kernel.
I just saw what appears to be the same problem with the new f16 kernel, 3.3.2-1.fc16.x86_64. It was a NULL ptr deref at 0x0000228 and EIP was at selinux_socket_unix_may_send+0x33/0x90.
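For anyone comparing traces: a location such as `selinux_socket_unix_may_send+0x33/0x90` reads as symbol, then the byte offset into the function, then the function's total size. Below is a minimal user-space sketch of splitting such a string; `struct oops_location` and `parse_oops_location` are hypothetical names made up for this illustration, not part of any kernel or tool API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One "symbol+0xOFFSET/0xSIZE" location as printed in a kernel oops. */
struct oops_location {
    char symbol[128];      /* function name */
    unsigned long offset;  /* byte offset of the instruction within the function */
    unsigned long size;    /* total size of the function in bytes */
};

/* Parse "symbol+0xOFF/0xSIZE"; returns 0 on success, -1 on malformed input. */
static int parse_oops_location(const char *text, struct oops_location *out)
{
    if (text == NULL || out == NULL)
        return -1;
    /* %127[^+] reads the symbol name up to the '+'; %lx accepts an optional 0x prefix. */
    if (sscanf(text, "%127[^+]+%lx/%lx", out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

For the trace in this comment, that gives offset 0x33 into a function that is 0x90 bytes long, i.e. the fault is near the start of the function, which fits the two pointer loads at the top of `selinux_socket_unix_may_send`.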
Thanks for the additional information.
Unfortunately, I'm still unable to reproduce the problem on my test system. I'm going to build a test kernel RPM with the patch from comment #5 applied and you guys can try it out to see if it solves the problem.
The test kernel RPM is at the URL below, please give it a try and let me know if it solves the kernel panic/oops/hang at shutdown.