Bug 653898

Summary: [kvm] [DW] guest machine hangs after migration using NFS with hard mounts
Product: Red Hat Enterprise Linux 5 Reporter: Haim <hateya>
Component: kvm Assignee: Juan Quintela <quintela>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: low    
Version: 5.5.z CC: abaron, bazulay, bcao, chellwig, danken, hateya, iheim, juzhang, kwolf, mgoldboi, mkenneth, tburke, virt-maint, yeylon, ykaul
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 711374 Environment:
Last Closed: 2011-06-07 10:20:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 654023    
Bug Blocks: 580948, 711374    

Description Haim 2010-11-16 12:29:45 UTC
Description of problem:

The following scenario was tested as part of the DW effort and found problematic.

The guest machine console hung (stopped responding from inside the guest) after the guest was migrated from a source server that had lost its connection to the storage server.

Starting with 3 hosts, one of which acts as SPM, the guest machine runs on a non-SPM host. Communication between the non-SPM host running the guest and the storage server is then blocked, and the guest goes to pause. Then, proactively or not (admin\rhevm), the guest is migrated to a destination server (one that can communicate with storage). Migration succeeds; however, when accessing the guest using the console (vnc), typing a user name and hitting enter, the guest gets stuck.
This was reproducible on all 5 guests.

Please note that the mount type on all hosts was 'hard' and not 'soft'.
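
Since the hard mount type matters for this scenario, it can help to confirm it on each host before testing. The following is only a minimal sketch, assuming the mount table is in the usual /proc/mounts layout; the function name nfs_mount_types is made up for illustration and is not part of any tool mentioned in this bug.

```shell
# Classify NFS entries of a /proc/mounts-style table as hard or soft.
# Reads lines on stdin: <device> <mountpoint> <fstype> <options> ...
# nfs_mount_types is a hypothetical helper name.
nfs_mount_types() {
    while read -r dev mnt fstype opts rest; do
        # Skip everything that is not an NFS mount.
        case "$fstype" in
            nfs|nfs4) ;;
            *) continue ;;
        esac
        # "hard" is the NFS default, so only an explicit "soft" makes it soft.
        case ",$opts," in
            *,soft,*) echo "$mnt soft" ;;
            *)        echo "$mnt hard" ;;
        esac
    done
}

# Usage on a host:
#   nfs_mount_types < /proc/mounts
```

Each output line is a mount point followed by its type, so a host with a soft mount can be spotted before running the repro steps.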

repro steps: 

1) make sure you have at least 2 hosts, better use 3.
2) make sure to use NFS storage with hard mounts
3) start guest machine on host A 
4) block communication between host A and the storage server (using iptables
   on the storage side)
5) perform an I/O action on the guest so it pauses 
6) migrate guest to host B and resume it 
7) open console to guest and try to run some I/O

I think we need to debug it with 'gdb' in order to extract logs.

Comment 1 Haim 2010-11-16 12:33:17 UTC

Comment 2 Haim 2010-11-16 12:36:35 UTC
Please note that the guest machine is reported as running, but it can't be used. This is a very bad user experience; I would have expected the guest to go to a paused or non-responding state, as currently there is no indication that something is wrong.

Comment 18 RHEL Program Management 2011-01-11 20:10:46 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 19 RHEL Program Management 2011-01-11 22:56:24 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 26 Mike Cao 2011-01-31 06:17:01 UTC
Actually, I don't know whether I fully understood comment #0. The following are my steps.

1.configure nfs server
#cat /etc/sysconfig/nfs

2.use RHEL5 host to mount the nfs 
host A:#mount <nfs-ip>:/home /opt
host B:#mount <nfs-ip>:/home /opt

3.start VM in host A and do some I/O operations:
 /usr/libexec/qemu-kvm  -M rhel5.6.0 -m 2G -smp 2 -name win2k3_64 -uuid 12bb419b-8730-cbbd-b0d9-168fa4225b6d -monitor stdio -boot c -drive file=/opt/fedora.img,if=ide,boot=on,format=raw,cache=none,werror=stop -net nic,macaddr=54:52:00:0e:bf:a1,vlan=0 -net tap,script=/etc/qemu-ifup,vlan=0 -serial pty -parallel none -usb -vnc :1 -k en-gb -vga cirrus -balloon virtio

4.start listening on a port on host B;

5.block communication between host A and the storage server 
e.g. on the nfs server host:
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 111 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 111 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 662 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 662 -j REJECT; 
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 892 -j REJECT; 
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 892 -j REJECT; 
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 2049 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 2049 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 32803 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 32803 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 32769 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 32769 -j REJECT
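
The twelve rules above follow one pattern (six ports, each for tcp and udp), so they can be generated in a loop. This is a rough sketch only: the function name nfs_block_rules and the address 192.0.2.10 are placeholders, and it writes -p before -m, which is the more conventional order but differs from the commands as typed above. It only prints the commands; nothing is applied until the output is run as root on the nfs server.

```shell
# Ports taken from the rules listed in step 5.
NFS_PORTS="111 662 892 2049 32803 32769"

# Print (not execute) one iptables command per port and protocol.
# $1: action flag, -I to insert the rules or -D to delete them again
# $2: IP address of the host to cut off (host A)
# nfs_block_rules is a hypothetical helper name.
nfs_block_rules() {
    action=$1
    host_ip=$2
    for port in $NFS_PORTS; do
        for proto in tcp udp; do
            echo "iptables $action INPUT -s $host_ip -p $proto -m $proto --dport $port -j REJECT"
        done
    done
}

# Example with a placeholder address standing in for host A:
nfs_block_rules -I 192.0.2.10
```

Calling the same function with -D prints the matching delete commands, which removes only these rules instead of flushing the whole table with iptables -F.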

Actual Results:
After step 5, the qemu monitor on host A is frozen; I can NOT input the migration command in it.

Additional info:
After flushing all the firewall rules on the nfs server (#iptables -F), the qemu monitor works again; after executing the migration command in the qemu monitor, the guest on host B works fine.
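
For reference, the listening step on host B and the migration command in the monitor would typically look like the sketch below. The exact values used in this test are not given in the comment, so the port 4444, the address 192.0.2.20 and both function names are assumptions made up for illustration.

```shell
# Hypothetical value: the port actually used in the test is not stated.
MIGRATION_PORT=4444

# Option to append to the qemu-kvm command line on host B (step 4),
# so that the destination waits for incoming migration data.
# $1: TCP port to listen on
incoming_option() {
    printf '%s\n' "-incoming tcp:0:$1"
}

# Command to type into the qemu monitor on host A to start the migration.
# $1: address of host B, $2: TCP port host B listens on
monitor_migrate_command() {
    printf '%s\n' "migrate -d tcp:$1:$2"
}

# 192.0.2.20 is a placeholder for host B's address.
incoming_option "$MIGRATION_PORT"
monitor_migrate_command 192.0.2.20 "$MIGRATION_PORT"
```

The first line printed goes at the end of the qemu-kvm command on host B; the second is entered at the (qemu) prompt on host A.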


Hi, Haim
Are my steps above right?
How did you do the migrate operation after blocking communication between host A and the storage server using iptables?