Description of problem:
Attach an NFS volume to a VM, then block the connection to the NFS server using iptables. Destroying the VM then causes libvirt to hang.

Version-Release number of selected component (if applicable):
libvirt-0.9.1-1.el6.x86_64

How reproducible:
always

Steps to Reproduce:
# virsh start vm1
Domain vm1 started

# virsh list
 Id Name                 State
----------------------------------
  2 vm1                  running

# iptables -A OUTPUT -d 10.66.90.115 -p tcp --dport 2049 -j DROP

# LIBVIRT_DEBUG=1 virsh destroy vm1
...
06:54:07.385: 3261: debug : virDomainLookupByName:1995 : conn=0x14f34e0, name=vm1
06:54:07.385: 3261: debug : remoteIO:10761 : Do proc=23 serial=3 length=36 wait=(nil)
06:54:07.385: 3261: debug : remoteIO:10833 : We have the buck 23 0x14f37f0 0x14f37f0
06:54:07.385: 3261: debug : virEventPollUpdateHandle:144 : Update handle w=2 e=0
06:54:07.385: 3261: debug : virEventPollInterruptLocked:686 : Skip interrupt, 0 0
06:54:07.386: 3261: debug : remoteIODecodeMessageLength:10153 : Got length, now need 84 total (80 more)
06:54:07.386: 3261: debug : remoteIOEventLoop:10687 : Giving up the buck 23 0x14f37f0 (nil)
06:54:07.386: 3261: debug : virEventPollUpdateHandle:144 : Update handle w=2 e=1
06:54:07.386: 3261: debug : virEventPollInterruptLocked:686 : Skip interrupt, 0 0
06:54:07.386: 3261: debug : remoteIO:10861 : All done with our call 23 (nil) 0x14f37f0
06:54:07.386: 3261: debug : virDomainDestroy:2040 : dom=0x14f3330, (VM: name=vm1, uuid=4ab7ea78-40b0-475d-9f32-9397d87a76d5),
06:54:07.386: 3261: debug : remoteIO:10761 : Do proc=12 serial=4 length=56 wait=(nil)
06:54:07.386: 3261: debug : remoteIO:10833 : We have the buck 12 0x14f37f0 0x14f37f0
06:54:07.386: 3261: debug : virEventPollUpdateHandle:144 : Update handle w=2 e=0
06:54:07.386: 3261: debug : virEventPollInterruptLocked:686 : Skip interrupt, 0 0

# iptables -D OUTPUT -d 10.66.90.115 -p tcp --dport 2049 -j DROP

# virsh domstate vm1
shut off

Actual results:
The virsh destroy command hangs.

Expected results:
Destroy should fail, and the VM state should remain running.

Additional info:
> Attach a nfs volume to a VM, and block the connection of the nfs by using
> iptables, to destroy the VM will cause libvirt hang up.

This is expected behaviour with NFS by default. NFS mounts default to the 'hard' flag. This means that if the NFS server goes away (e.g. because network connectivity is lost or blocked), the client will retry indefinitely. The application (such as libvirt) has no say in this matter: the kernel retries forever and the system never returns an error to userspace. If the NFS mount uses the 'soft' flag, the kernel will time out after N retries (controlled by the retrans mount flag) and return an error to userspace.

When you kill a guest with libvirt, one of the things it has to do is restore security labelling. This obviously involves I/O operations, so if the NFS server is blocked or dead and the 'hard' mount option is set, libvirtd will "hang" until the NFS server recovers. If you mount with 'soft', libvirt shouldn't "hang", but this introduces some risk to data integrity.
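
To check which behaviour applies on a given host, one rough option is to look at the mount options the kernel reports in /proc/mounts. The sketch below is only an illustration and is not part of libvirt: the function name nfs_mount_modes and the sample mount line (export path, mount point, option values) are hypothetical, and it assumes the usual "device mountpoint fstype options dump pass" layout of that file.

```python
def nfs_mount_modes(mounts_text):
    """Map each NFS mount point to 'hard' or 'soft'.

    mounts_text is text in /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS client mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        # NFS mounts are hard unless 'soft' was requested explicitly.
        modes[fields[1]] = "soft" if "soft" in options else "hard"
    return modes


# Hypothetical sample line; on a real host, pass open("/proc/mounts").read().
sample = (
    "10.66.90.115:/export /mnt/nfs nfs "
    "rw,relatime,vers=3,hard,proto=tcp,timeo=600,retrans=2 0 0\n"
)
print(nfs_mount_modes(sample))
```

A mount reported as 'hard' is the case where virsh destroy can block until the server returns; a mount reported as 'soft' will eventually get an error from the kernel instead.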