Bug 1508341

Summary: VM live migration fails on ComputeOvsDpdk role
Product: Red Hat OpenStack
Component: rhosp-director
Version: 12.0 (Pike)
Status: CLOSED DUPLICATE
Severity: medium
Priority: unspecified
Reporter: Eyal Dannon <edannon>
Assignee: Angus Thomas <athomas>
QA Contact: Amit Ugol <augol>
CC: atelang, dbecker, edannon, jamsmith, mbabushk, mburns, mcornea, morazi, owalsh, rhel-osp-director-maint, sasha, skramaja, supadhya, zgreenbe
Keywords: Reopened, Tracking, Triaged
Hardware: Unspecified
OS: Unspecified
Last Closed: 2018-02-01 13:30:35 UTC
Type: Bug
Bug Depends On: 1495599, 1508867

Description Eyal Dannon 2017-11-01 09:18:06 UTC
Description of problem:
In an OSPD12 containerized environment, block live migration fails with the error in [1].

According to [2] (which was verified 2 days ago), migration works after disabling SELinux, but that did not solve my issue.

As far as I understand, the SSH connectivity should be established as part of the director deployment.


[1]
2017-11-01 08:59:28.501 1 ERROR nova.virt.libvirt.driver [req-e1664229-52cb-4727-ba9b-19c0764d727d 0e5bf6256c9c48e08ea6515cc68d2b5f 892c5bc64cab485881475bf998af8e11 - default default] [instance: d72bb6d0-47e7-4a74-b56d-0aaddb86f7c5] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova_migration:2022/system?keyfile=/etc/nova/migration/identity: Cannot recv data: ssh: connect to host computeovsdpdk-1.localdomain port 2022: Connection timed out: Connection reset by peer: libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova_migration:2022/system?keyfile=/etc/nova/migration/identity: Cannot recv data: ssh: connect to host computeovsdpdk-1.localdomain port 2022: Connection timed out: Connection reset by peer
2017-11-01 08:59:28.748 1 ERROR nova.virt.libvirt.driver [req-e1664229-52cb-4727-ba9b-19c0764d727d 0e5bf6256c9c48e08ea6515cc68d2b5f 892c5bc64cab485881475bf998af8e11 - default default] [instance: d72bb6d0-47e7-4a74-b56d-0aaddb86f7c5] Migration operation has aborted

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1450100
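
The timeout in [1] can be checked independently of nova by testing whether TCP port 2022 on the destination compute node accepts connections at all. The following is a minimal sketch, not part of the original report: the function name port_reachable is hypothetical, and the host and port in the usage comment are the ones quoted in the error message.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False


# Example use from the source compute node (values taken from [1]):
#   port_reachable("computeovsdpdk-1.localdomain", 2022)
```

A False result only shows that nothing is answering on that port from where the check runs; it does not say whether the cause is a missing service, a firewall rule, or SELinux.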

Version-Release number of selected component (if applicable):
OSPD12
openstack-nova-compute-16.0.2-0.20171023105738.a2e4540.el7ost.noarch
container-selinux-2.28-1.git85ce147.el7.noarch
openstack-tripleo-heat-templates-7.0.3-0.20171023134948.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Boot an instance
2. Run "setenforce 0" on both compute nodes
3. Run: "openstack server migrate test --live computeovsdpdk-1.localdomain --block-migration"


Actual results:
Migration fails

Expected results:
Migration should work

Additional info:
SELinux BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1495599

Comment 1 Ollie Walsh 2017-11-01 14:56:54 UTC
You need to set SELinux to permissive mode within the nova_migration_target container, not on the compute nodes.

*** This bug has been marked as a duplicate of bug 1495599 ***

Comment 2 Ollie Walsh 2017-11-01 17:51:54 UTC
Including the OS::TripleO::Services::NovaMigrationTarget service in the ComputeOvsDpdk role should resolve this.
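
One way to confirm whether a role already lists that service is to scan the roles file used for the deployment. The sketch below is illustrative only: it assumes the usual roles_data.yaml layout, where each role starts with a top-level "- name:" line and lists its services under ServicesDefault. The helper name role_has_service and the trimmed SAMPLE text are hypothetical and are not taken from this environment.

```python
SERVICE = "OS::TripleO::Services::NovaMigrationTarget"

# Trimmed, made-up sample in the roles_data.yaml layout (not the real file).
SAMPLE = """\
- name: Compute
  ServicesDefault:
    - OS::TripleO::Services::NovaCompute
    - OS::TripleO::Services::NovaMigrationTarget
- name: ComputeOvsDpdk
  ServicesDefault:
    - OS::TripleO::Services::NovaCompute
"""


def role_has_service(roles_text: str, role: str, service: str) -> bool:
    """Return True if `service` is listed under the role named `role`."""
    in_role = False
    for line in roles_text.splitlines():
        if line.startswith("- name:"):
            # A new top-level role entry begins; track whether it is ours.
            in_role = line.split(":", 1)[1].strip() == role
        elif in_role and line.strip() == f"- {service}":
            return True
    return False


# Example use against a real file (the path is an assumption):
#   with open("roles_data.yaml") as f:
#       print(role_has_service(f.read(), "ComputeOvsDpdk", SERVICE))
```

In the sample, the ComputeOvsDpdk role lacks the service while the Compute role has it, which mirrors the situation described in this comment. The line-based scan is deliberately simple and ignores YAML features such as trailing comments or anchors.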

Comment 3 atelang 2018-02-01 13:30:35 UTC

*** This bug has been marked as a duplicate of bug 1508867 ***