Bug 653898 - [kvm] [DW] guest machine hangs after migration using NFS with hard mounts
Summary: [kvm] [DW] guest machine hangs after migration using NFS with hard mounts
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5.z
Hardware: Unspecified
OS: Unspecified
low
high
Target Milestone: rc
: ---
Assignee: Juan Quintela
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 654023
Blocks: Rhel5KvmTier2 711374
 
Reported: 2010-11-16 12:29 UTC by Haim
Modified: 2014-01-13 00:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 711374 (view as bug list)
Environment:
Last Closed: 2011-06-07 10:20:13 UTC
Target Upstream Version:
Embargoed:



Description Haim 2010-11-16 12:29:45 UTC
Description of problem:

The following scenario was tested as part of the DW effort and found to be problematic.

Synopsis:

The guest console hangs (the guest stops responding from inside) after the guest is migrated away from a source host that has lost its connection to the storage server.

Scenario:

Start with 3 hosts, one of which acts as SPM. The guest runs on a non-SPM host. Communication between that host and the storage server is then blocked, and the guest goes to pause. The guest is then migrated, whether initiated by the admin or by rhevm, to a destination host that can reach the storage. The migration succeeds, but when I open the guest console (VNC), type a user name and hit Enter, the guest hangs.
This was reproducible on all 5 guests.

Please note that the mount type on all hosts was 'hard', not 'soft'.

repro steps: 

1) make sure you have at least 2 hosts, better use 3.
2) make sure to use NFS storage with hard mounts
3) start guest machine on host A 
4) block communication between host A and the storage server (using iptables
   on the storage side)
5) perform an I/O action on the guest so that it pauses
6) migrate the guest to host B and resume it
7) open a console to the guest and try to run some I/O
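
Since step 2 depends on the mounts really being hard, it may help to confirm the mount mode on each host before starting. The following is only a sketch: the function name nfs_mount_mode is hypothetical, and it assumes input in /proc/mounts format, treating an NFS mount as hard unless the option list contains 'soft' (hard is the NFS default).

```shell
# Hypothetical helper (not part of the original report): read
# /proc/mounts-style lines on stdin and print "<mount point> hard|soft"
# for every NFS mount. A mount counts as hard unless 'soft' is listed.
nfs_mount_mode() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        mode = "hard"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "soft") mode = "soft"
        print $2, mode
    }'
}
```

On a host this could be run as: nfs_mount_mode < /proc/mounts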

I think we need to debug it with 'gdb' in order to extract logs.

Comment 1 Haim 2010-11-16 12:33:17 UTC
vdsm22-4.5-62.26.el5_5rhev2_2
kvm-83-164.el5_5.23
kernel-2.6.18-194.17.1.el5

Comment 2 Haim 2010-11-16 12:36:35 UTC
Please note that the guest is reported as running even though it cannot be used. This is a very bad user experience. I would have expected the guest to go to a paused or non-responding state, because currently there is no indication that anything is wrong.

Comment 18 RHEL Program Management 2011-01-11 20:10:46 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 19 RHEL Program Management 2011-01-11 22:56:24 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 26 Mike Cao 2011-01-31 06:17:01 UTC
Actually, I am not sure whether I fully understood comment #0. These are my steps.

1.configure nfs server
#cat /etc/sysconfig/nfs
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
MOUNTD_PORT=892
STATD_PORT=662

2.mount the NFS share on the RHEL5 hosts
host A:#mount <nfs-ip>:/home /opt
host B:#mount <nfs-ip>:/home /opt

3.start VM in host A and do some I/O operations:
eg:
 /usr/libexec/qemu-kvm  -M rhel5.6.0 -m 2G -smp 2 -name win2k3_64 -uuid 12bb419b-8730-cbbd-b0d9-168fa4225b6d -monitor stdio -boot c -drive file=/opt/fedora.img,if=ide,boot=on,format=raw,cache=none,werror=stop -net nic,macaddr=54:52:00:0e:bf:a1,vlan=0 -net tap,script=/etc/qemu-ifup,vlan=0 -serial pty -parallel none -usb -vnc :1 -k en-gb -vga cirrus -balloon virtio

4.start the listening port on host B;

5.block communication between host A and the storage server
eg :on the nfs server host
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 111 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 111 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 662 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 662 -j REJECT; 
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 892 -j REJECT; 
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 892 -j REJECT; 
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 2049 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 2049 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 32803 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 32803 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m tcp -p tcp --dport 32769 -j REJECT;
#iptables -I INPUT -s <host A's ip> -m udp -p udp --dport 32769 -j REJECT
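
The rules in step 5 all follow one pattern: the same REJECT rule for UDP and TCP on each of ports 111, 662, 892, 2049, 32803 and 32769. A minimal sketch that generates them from a host address is shown here. The function name gen_block_rules is hypothetical, and the function only prints the commands so they can be reviewed; it does not run iptables itself.

```shell
# Hypothetical helper: print (not execute) the REJECT rules from step 5
# for the host address given as $1. The port list is copied from the
# rules above.
gen_block_rules() {
    host_ip="$1"
    for port in 111 662 892 2049 32803 32769; do
        for proto in udp tcp; do
            echo "iptables -I INPUT -s $host_ip -m $proto -p $proto --dport $port -j REJECT"
        done
    done
}
```

For example, "gen_block_rules 192.0.2.10" (a placeholder address) prints 12 rules. To apply them, the output could be piped to sh as root on the NFS server.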

Actual Results:
After step 5, the qemu monitor on host A is frozen. I can NOT enter the migration command in it.

Additional info:
After flushing all the firewall rules on the NFS server (#iptables -F), the qemu monitor works again. After executing the migration command in the qemu monitor, the guest on host B works fine.

bcao---->hateya

Hi, Haim
Are my steps above right?
How did you perform the migration after blocking communication between host A and the storage server with iptables?

Mike

