Bug 1254888
| Summary: | live migration Hosted Engine VM failed | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Ying Cui <ycui> |
| Component: | vdsm | Assignee: | Francesco Romani <fromani> |
| Status: | CLOSED NOTABUG | QA Contact: | Ilanit Stein <istein> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.5.4 | CC: | alukiano, bazulay, cshao, dougsland, ecohen, fdeutsch, gklein, huiwa, lpeer, lsurette, mgoldboi, nsednev, ofrenkel, rbarry, sbonazzo, ycui, yeylon, ylavi |
| Target Milestone: | --- | | |
| Target Release: | 3.6.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | virt | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-08-23 11:17:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1250199 | | |
| Attachments: | | | |
| Additional info:
# from /var/log/messages
<snip>
....
Aug 19 04:49:26 rhevhtest-1 journal: metadata not found: Requested metadata element is not present
Aug 19 04:49:26 rhevhtest-1 journal: Forwarding to syslog missed 6 messages.
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Error initiating connection
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 106, in _setupVdsConnection
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
  File "/usr/lib64/python2.7/httplib.py", line 969, in endheaders
  File "/usr/lib64/python2.7/httplib.py", line 829, in _send_output
  File "/usr/lib64/python2.7/httplib.py", line 791, in send
  File "/usr/share/vdsm/kaxmlrpclib.py", line 151, in connect
  File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 92, in connect
  File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
gaierror: [Errno -2] Name or service not known
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::'progress'
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Failed to destroy remote VM
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 164, in _recover
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
  File "/usr/lib64/python2.7/httplib.py", line 969, in endheaders
  File "/usr/lib64/python2.7/httplib.py", line 829, in _send_output
  File "/usr/lib64/python2.7/httplib.py", line 791, in send
  File "/usr/share/vdsm/kaxmlrpclib.py", line 151, in connect
  File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 92, in connect
  File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
gaierror: [Errno -2] Name or service not known
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 231, in run
  File "/usr/share/vdsm/virt/migration.py", line 120, in _setupRemoteMachineParams
  File "/usr/share/vdsm/virt/vm.py", line 2837, in getStats
  File "/usr/share/vdsm/virt/vm.py", line 2883, in _getRunningVmStats
KeyError: 'progress'
</snip>
# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date                  : True
Hostname                           : rhevhtest-1.redhat.com
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 12309
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=12309 (Wed Aug 19 06:06:05 2015)
        host-id=1
        score=2400
        maintenance=False
        state=EngineUp
--== Host 2 status ==--
Status up-to-date                  : True
Hostname                           : rhevhtest-2.redhat.com
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 7451
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7451 (Wed Aug 19 06:06:01 2015)
        host-id=2
        score=2400
        maintenance=False
        state=EngineDown
--== Host 3 status ==--
Status up-to-date                  : True
Hostname                           : rhevhtest-3.redhat.com
Host ID                            : 3
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 6407
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=6407 (Wed Aug 19 06:06:09 2015)
        host-id=3
        score=2400
        maintenance=False
        state=EngineDown
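The `hosted-engine --vm-status` output above can be reduced to a per-host view of the engine VM state before choosing a migration target. The following is only a minimal sketch: it assumes the text layout shown in this report (a `Hostname` line followed by an `Engine status` line holding JSON), and the function name `engine_status_by_host` is hypothetical, not part of any hosted-engine tooling.

```python
import json


def engine_status_by_host(status_text):
    """Map each hostname to its parsed 'Engine status' JSON object.

    Assumes the 'key : value' layout printed by `hosted-engine --vm-status`
    in this report; other versions may format the output differently.
    """
    statuses = {}
    hostname = None
    for line in status_text.splitlines():
        # Split on the first colon only, so JSON values keep their colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            hostname = value
        elif key == "Engine status" and hostname is not None:
            statuses[hostname] = json.loads(value)
    return statuses
```

Applied to the output above, the entry for rhevhtest-1.redhat.com would report `"vm": "up"` and the other two hosts `"vm": "down"`, which matches the report that the engine VM runs on the first host.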
# systemctl status ovirt-ha-agent.service ovirt-ha-broker.service vdsmd.service vdsm-network.service supervdsmd.service
ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled)
   Active: active (running) since Wed 2015-08-19 07:34:17 UTC; 10min ago
  Process: 1824 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-agent start (code=exited, status=0/SUCCESS)
 Main PID: 1944 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─1944 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
Aug 19 07:34:17 localhost systemd-ovirt-ha-agent[1824]: Starting ovirt-ha-agent: [  OK  ]
Aug 19 07:34:17 localhost systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled)
   Active: active (running) since Wed 2015-08-19 07:34:17 UTC; 10min ago
  Process: 1178 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start (code=exited, status=0/SUCCESS)
 Main PID: 1822 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─1822 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
Aug 19 07:34:14 localhost systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Aug 19 07:34:17 localhost systemd-ovirt-ha-broker[1178]: Starting ovirt-ha-broker: [  OK  ]
Aug 19 07:34:17 localhost systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Aug 19 07:35:17 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request...c350d0c1'
                                                              Traceback (most recent call last):
                                                                File "/usr/lib/python2.7/site-packages/ovirt_host...
Aug 19 07:35:20 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request...c350d0c1'
                                                              Traceback (most recent call last):
                                                                File "/usr/lib/python2.7/site-packages/ovirt_host...
Aug 19 07:35:20 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request...c350d0c1'
                                                              Traceback (most recent call last):
                                                                File "/usr/lib/python2.7/site-packages/ovirt_host...
Aug 19 07:38:16 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker cpu_load_no_engine.EngineHealth ERROR Failed to read vm stats: [Errno 2] No such file...c/0/stat'
Aug 19 07:40:58 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.notifications.Notifications ERROR [Errno 111] Connection refused
                                                              Traceback (most recent call last):
                                                                File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 21, in send_email
Aug 19 07:41:08 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.notifications.Notifications ERROR [Errno 111] Connection refused
                                                              Traceback (most recent call last):
                                                                File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 21, in send_email
Aug 19 07:41:19 rhevhtest-1.redhat.com ovirt-ha-broker[1822]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.notifications.Notifications ERROR [Errno 111] Connection refused
                                                              Traceback (most recent call last):
                                                                File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 21, in send_email
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: active (running) since Wed 2015-08-19 07:35:05 UTC; 10min ago
  Process: 17114 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 17232 (vdsm)
   CGroup: /system.slice/vdsmd.service
           ├─17232 /usr/bin/python /usr/share/vdsm/vdsm
           ├─17454 /usr/libexec/ioprocess --read-pipe-fd 57 --write-pipe-fd 56 --max-threads 10 --max-queued-requests 10
           ├─18272 /usr/libexec/ioprocess --read-pipe-fd 46 --write-pipe-fd 45 --max-threads 10 --max-queued-requests 10
           └─18386 /usr/libexec/ioprocess --read-pipe-fd 59 --write-pipe-fd 57 --max-threads 10 --max-queued-requests 10
Aug 19 07:37:30 rhevhtest-1.redhat.com python[17232]: DIGEST-MD5 ask_user_info()
Aug 19 07:37:30 rhevhtest-1.redhat.com python[17232]: DIGEST-MD5 make_client_response()
Aug 19 07:37:30 rhevhtest-1.redhat.com python[17232]: DIGEST-MD5 client step 3
Aug 19 07:37:30 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm WARNING vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Unknown type found, device: '{'device': 'unix'...'}}' found
Aug 19 07:37:30 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm WARNING vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Unknown type found, device: '{'device': 'unix'...'}}' found
Aug 19 07:37:30 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Alias not found for device type graphics during ...ation host
Aug 19 07:39:19 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Error initiating connection
                                                    Traceback (most recent call last):
                                                      File "/usr/share/vdsm/virt/migration.py", line 106, in _setupVdsConnection...
Aug 19 07:39:19 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::'progress'
Aug 19 07:39:19 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Failed to destroy remote VM
                                                    Traceback (most recent call last):
                                                      File "/usr/share/vdsm/virt/migration.py", line 164, in _recover...
Aug 19 07:39:19 rhevhtest-1.redhat.com vdsm[17232]: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Failed to migrate
                                                    Traceback (most recent call last):
                                                      File "/usr/share/vdsm/virt/migration.py", line 231, in run...
vdsm-network.service - Virtual Desktop Server Manager network restoration
   Loaded: loaded (/usr/lib/systemd/system/vdsm-network.service; enabled)
   Active: active (exited) since Wed 2015-08-19 07:34:59 UTC; 10min ago
  Process: 16482 ExecStart=/usr/bin/vdsm-tool restore-nets (code=exited, status=0/SUCCESS)
  Process: 16367 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append --logfile=/var/log/vdsm/upgrade.log upgrade-3.0.0-networks (code=exited, status=0/SUCCESS)
  Process: 14532 ExecStartPre=/usr/bin/vdsm-tool --vvverbose --append --logfile=/var/log/vdsm/upgrade.log upgrade-unified-persistence (code=exited, status=0/SUCCESS)
 Main PID: 16482 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/vdsm-network.service
           └─17070 /sbin/dhclient -H rhevhtest-1 -1 -q -lf /var/lib/dhclient/dhclient--rhevm.lease -pf /var/run/dhclient-rhevm.pid rhevm
Aug 19 07:34:50 rhevhtest-1.redhat.com python[16494]: DIGEST-MD5 client step 2
Aug 19 07:34:50 rhevhtest-1.redhat.com python[16494]: DIGEST-MD5 ask_user_info()
Aug 19 07:34:50 rhevhtest-1.redhat.com python[16494]: DIGEST-MD5 make_client_response()
Aug 19 07:34:50 rhevhtest-1.redhat.com python[16494]: DIGEST-MD5 client step 3
Aug 19 07:34:56 rhevhtest-1.redhat.com dhclient[16993]: DHCPREQUEST on rhevm to 255.255.255.255 port 67 (xid=0x5f4e9b1a)
Aug 19 07:34:56 rhevhtest-1.redhat.com dhclient[16993]: DHCPACK from 10.66.11.253 (xid=0x5f4e9b1a)
Aug 19 07:34:58 rhevhtest-1.redhat.com dhclient[16993]: bound to 10.66.11.167 -- renewal in 36332 seconds.
Aug 19 07:34:59 rhevhtest-1.redhat.com python[16494]: DIGEST-MD5 client mech dispose
Aug 19 07:34:59 rhevhtest-1.redhat.com python[16494]: DIGEST-MD5 common mech dispose
Aug 19 07:34:59 rhevhtest-1.redhat.com systemd[1]: Started Virtual Desktop Server Manager network restoration.
supervdsmd.service - "Auxiliary vdsm service for running helper functions as root"
   Loaded: loaded (/usr/lib/systemd/system/supervdsmd.service; static)
   Active: active (running) since Wed 2015-08-19 07:34:47 UTC; 10min ago
 Main PID: 14533 (supervdsmServer)
   CGroup: /system.slice/supervdsmd.service
           └─14533 /usr/bin/python /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock
Aug 19 07:34:47 rhevhtest-1.redhat.com systemd[1]: Started "Auxiliary vdsm service for running helper functions as root".
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 client step 2
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 parse_server_challenge()
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 ask_user_info()
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 client step 2
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 ask_user_info()
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 make_client_response()
Aug 19 07:34:47 rhevhtest-1.redhat.com python[14533]: DIGEST-MD5 client step 3
Hint: Some lines were ellipsized, use -l to show in full.
Created attachment 1064650 [details]
journalctl
Created attachment 1064651 [details]
rhevh_var_log
Created attachment 1064652 [details]
sosreport from hypervisor
Created attachment 1064655 [details]
engine.log
Created attachment 1064669 [details]
screenshot for migration
Doesn't seem integration related, moving to virt for now.

does it fail between specific hosts? or is it randomly failing?

(In reply to Omer Frenkel from comment #8)
> does it fail between specific hosts? or is it randomly failing?

It fails between any two hosts. See the screenshot, attachment 1064669 [details].
Here are 3 rhevh hosts with the HE VM:
rhevh-1 with HEVM
tried to migrate HEVM to rhevh-2: failed
tried to migrate HEVM to rhevh-3: failed

These:

# from /var/log/messages
<snip>
....
Aug 19 04:49:26 rhevhtest-1 journal: metadata not found: Requested metadata element is not present
Aug 19 04:49:26 rhevhtest-1 journal: Forwarding to syslog missed 6 messages.
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Error initiating connection
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 106, in _setupVdsConnection
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
  File "/usr/lib64/python2.7/httplib.py", line 969, in endheaders
  File "/usr/lib64/python2.7/httplib.py", line 829, in _send_output
  File "/usr/lib64/python2.7/httplib.py", line 791, in send
  File "/usr/share/vdsm/kaxmlrpclib.py", line 151, in connect
  File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 92, in connect
  File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
gaierror: [Errno -2] Name or service not known
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::'progress'
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Failed to destroy remote VM
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 164, in _recover
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
  File "/usr/lib64/python2.7/httplib.py", line 969, in endheaders
  File "/usr/lib64/python2.7/httplib.py", line 829, in _send_output
  File "/usr/lib64/python2.7/httplib.py", line 791, in send
  File "/usr/share/vdsm/kaxmlrpclib.py", line 151, in connect
  File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 92, in connect
  File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
gaierror: [Errno -2] Name or service not known
Aug 19 04:49:27 rhevhtest-1 journal: vdsm vm.Vm ERROR vmId=`c5312b39-316c-4906-bc20-df2a472bda1f`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/migration.py", line 231, in run
  File "/usr/share/vdsm/virt/migration.py", line 120, in _setupRemoteMachineParams
  File "/usr/share/vdsm/virt/vm.py", line 2837, in getStats
  File "/usr/share/vdsm/virt/vm.py", line 2883, in _getRunningVmStats
KeyError: 'progress'
</snip>

are low level network errors.
Looking in /var/log/messages, it seems there are resolving errors from unrelated services. E.g.:
Aug 19 06:21:38 rhevhtest-1 python: Error in communication with subscription manager, trying to recover:
(this repeats many times after)

Can the hosts ping each other (e.g. rhevtest-1 to rhevtest-2)?
Any selinux denials in sight? (probably not, but need to ask anyway...)

> Can the hosts ping each other (e.g. rhevtest-1 to rhevtest-2)?

Yes, they can ping each other. You can see the hosted-engine --vm-status info as well.
And I have one env. that can reproduce this issue. It will be kept until next Monday - Aug 25, so you can ssh to them to check.

rhevh-test1 with HEVM: 10.66.11.167 root password: redhat
rhevh-test2 : 10.66.10.107 root password: redhat
rhevh-test3 : 10.66.65.29 root password: redhat
HEVM : rhevm-appliance-20150727.0-1.x86_64.rhevm.ova 10.66.10.105 root password: redhat

# login 10.66.11.167
[root@rhevhtest-1 admin]# ping 10.66.10.107
PING 10.66.10.107 (10.66.10.107) 56(84) bytes of data.
64 bytes from 10.66.10.107: icmp_seq=1 ttl=64 time=0.574 ms
64 bytes from 10.66.10.107: icmp_seq=2 ttl=64 time=0.306 ms
64 bytes from 10.66.10.107: icmp_seq=3 ttl=64 time=0.324 ms
--- 10.66.10.107 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.306/0.401/0.574/0.123 ms
[root@rhevhtest-1 admin]# ping 10.66.65.29
PING 10.66.65.29 (10.66.65.29) 56(84) bytes of data.
64 bytes from 10.66.65.29: icmp_seq=1 ttl=61 time=0.491 ms
64 bytes from 10.66.65.29: icmp_seq=2 ttl=61 time=0.392 ms
64 bytes from 10.66.65.29: icmp_seq=3 ttl=61 time=0.491 ms
--- 10.66.65.29 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.392/0.458/0.491/0.046 ms
[root@rhevhtest-1 admin]# ping 10.66.10.105
PING 10.66.10.105 (10.66.10.105) 56(84) bytes of data.
64 bytes from 10.66.10.105: icmp_seq=1 ttl=64 time=0.186 ms
64 bytes from 10.66.10.105: icmp_seq=2 ttl=64 time=0.165 ms
64 bytes from 10.66.10.105: icmp_seq=3 ttl=64 time=0.154 ms
--- 10.66.10.105 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.154/0.168/0.186/0.017 ms

> Any selinux denials in sight? (probably not, but need to ask anyway...)

Set setenforce 0 on all rhevh hosts, then tried to migrate; still cannot migrate the HEVM successfully. The same error appears in the log and in the rhevm portal UI.
Thanks.

this seems to be a network configuration issue, so closing.
please re-open if needed.

Please try reproducing the issue, following the 4.4 steps taken from https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html/Installation_Guide/Configuring_the_Self-Hosted_Engine.html
You should not use 127.0.0.1 localhost.localdomain localhost localhost.localdomain
You should use only an environment with reserved FQDNs for your hosts and the engine.
You should reserve a MAC address for your engine in the DHCP server, so it'll always receive the same IP.
You should update the DNS server records, so your FQDNs will be resolvable.
Once you have a properly configured environment, you should proceed with the HE deployment following the user manual provided above or one of mine: https://mojo.redhat.com/docs/DOC-1013769
It looks like you have an unresolvable host FQDN, hence migration to it fails.
Seems like not a bug to me.

Nikolai reopened by mistake

replied comment 18, we already reserved FQDNs issues 3 months ago. just cleanup this needinfo. |
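The `gaierror: [Errno -2] Name or service not known` in the tracebacks is raised by name resolution, not by packet delivery, so a successful ping to an IP address does not rule it out. A minimal sketch of a resolution check that could be run on the source host is shown below. The function name `check_resolution` is hypothetical, and the default port 54321 is an assumption (the usual vdsm management port); adjust it for the actual setup.

```python
import socket


def check_resolution(hostnames, port=54321, resolver=socket.getaddrinfo):
    """Return {hostname: sorted list of addresses, or an error string}.

    Uses getaddrinfo, the same lookup that socket.create_connection
    performs before connecting. The port value is an assumption.
    """
    results = {}
    for name in hostnames:
        try:
            infos = resolver(name, port, 0, socket.SOCK_STREAM)
            # Each entry's last field is the socket address; keep the IP part.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[name] = "unresolvable: %s" % exc
    return results
```

For example, calling `check_resolution(["rhevhtest-2.redhat.com", "rhevhtest-3.redhat.com"])` on rhevhtest-1 would show whether the peer FQDNs resolve there; an "unresolvable" entry would be consistent with the migration failure reported in this bug.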
Description of problem:
Hosted-Engine VM migration failed between two hosts.

Version-Release number of selected component (if applicable):
# rpm -qa ovirt-node ovirt-hosted-engine-setup ovirt-hosted-engine-ha vdsm kernel ovirt-node-plugin-hosted-engine
ovirt-hosted-engine-setup-1.2.5.3-1.el7ev.noarch
vdsm-4.16.24-2.el7ev.x86_64
ovirt-hosted-engine-ha-1.2.6-2.el7ev.noarch
ovirt-node-plugin-hosted-engine-0.2.0-18.0.el7ev.noarch
kernel-3.10.0-229.11.1.el7.x86_64
ovirt-node-3.2.3-18.el7.noarch
# cat /etc/rhev-hypervisor-release
Red Hat Enterprise Virtualization Hypervisor release 7.1 (20150813.0.el7ev)

How reproducible:
Tested this scenario 4 times on the same machines, repeating step 1 to step 7. Three times the HE VM migrated successfully; one time the migration failed. Around 25%.

Steps to Reproduce:
1. Precondition: all RHEV-H hosts are clean installations; ssh and network are already set up.
2. Set up HE on the first RHEV-H successfully.
   - nfs storage
   - em1
3. Set up additional HE on the second or third RHEV-H successfully.
4. All of the above RHEV-H hosts are UP in Hosted Engine.
5. Log in to the engine webadmin portal.
6. Navigate to "Virtual Machine".
7. Select the hosted-engine VM, click the "Migrate" button, then choose a RHEV-H host to migrate the HE VM to.

Actual results:
Migration failed due to Error: Could not connect to peer host

Expected results:
Live migration of the Hosted Engine VM succeeds on RHEV-H.

Note: not sure whether this issue is specific to RHEV-H or whether it can be reproduced on RHEL as well.