Red Hat Bugzilla – Bug 1313507
live-migration uses same ports as nova-api-metadata (or any service that accepts incoming connections)
Last modified: 2017-09-22 21:54:46 EDT
By default, libvirtd uses ports 49152-49215 for live migration, as specified in qemu.conf:
#migration_port_min = 49152
#migration_port_max = 49215
However, these ports can also be randomly consumed by nova-api-metadata (or any service that accepts incoming connections), e.g.
nova-api-metada 15040 nova 9u IPv4 75481 0t0 TCP node1.example.com:49162->node2.example.com:amqp (ESTABLISHED)
nova-api-metada 15040 nova 10u IPv4 178462 0t0 TCP node1.example.com:49163->node2.example.com:amqp (ESTABLISHED)
nova-api-metada 15045 nova 9u IPv4 179423 0t0 TCP node1.example.com:49160->node2.example.com:amqp (ESTABLISHED)
nova-api-metada 15045 nova 10u IPv4 178458 0t0 TCP node1.example.com:49161->node2.example.com:amqp (ESTABLISHED)
The odds of nova-api-metadata using all 64 ports are very low, but it does happen, and live migration then fails with the following message:
Live Migration failure: internal error: Unable to find an unused port in range 'migration' (49152-49215)
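To gauge how close a host is to this failure, one option is to count how many ports in the migration range are already held as local source ports. The following is a minimal sketch, not part of libvirt or nova: the function name `migration_ports_in_use` is hypothetical, and it assumes input lines in the lsof format shown above, where the local port appears just before `->`.

```python
import re

# Default migration range quoted from qemu.conf in this report.
MIGRATION_RANGE = (49152, 49215)

# Captures the local port that precedes "->" in lsof's NAME column.
LOCAL_PORT = re.compile(r":(\d+)->")


def migration_ports_in_use(lsof_lines, port_range=MIGRATION_RANGE):
    """Return the sorted local source ports that fall inside port_range."""
    low, high = port_range
    used = set()
    for line in lsof_lines:
        match = LOCAL_PORT.search(line)
        if match:
            port = int(match.group(1))
            if low <= port <= high:
                used.add(port)
    return sorted(used)


# The four connections listed in this report.
sample = [
    "nova-api-metada 15040 nova 9u IPv4 75481 0t0 TCP node1.example.com:49162->node2.example.com:amqp (ESTABLISHED)",
    "nova-api-metada 15040 nova 10u IPv4 178462 0t0 TCP node1.example.com:49163->node2.example.com:amqp (ESTABLISHED)",
    "nova-api-metada 15045 nova 9u IPv4 179423 0t0 TCP node1.example.com:49160->node2.example.com:amqp (ESTABLISHED)",
    "nova-api-metada 15045 nova 10u IPv4 178458 0t0 TCP node1.example.com:49161->node2.example.com:amqp (ESTABLISHED)",
]

used = migration_ports_in_use(sample)
total = MIGRATION_RANGE[1] - MIGRATION_RANGE[0] + 1
print(f"{len(used)} of {total} migration ports in use: {used}")
# prints: 4 of 64 migration ports in use: [49160, 49161, 49162, 49163]
```

This only sees the connections it is given. On a live host the lines would have to come from something like lsof output captured at the time of the check.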
Changing the default range in libvirt itself is not an option. We have to expect that existing RHEL users will have configured their firewalls based on the existing port range. So if we changed it in libvirt, upgrades would cause a regression for existing RHEL users.
The only viable option is to have osp-director configure /etc/libvirt/qemu.conf to set a custom migration port range when deploying OpenStack nova compute nodes. This will only affect new deployments, so it minimises the chance of regression for existing users.
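As an illustration of what such a deployment step would need to do to qemu.conf, here is a minimal sketch. It is not how osp-director implements the change. The function name `set_migration_ports` is hypothetical, and the range 61152-61215 is an assumed example value chosen only because it lies above 61000.

```python
import re


def set_migration_ports(conf_text, port_min, port_max):
    """Return conf_text with migration_port_min/max set.

    The first existing line for each key (commented out or not) is
    replaced; if no such line exists, the setting is appended.
    """
    settings = {"migration_port_min": port_min, "migration_port_max": port_max}
    lines = conf_text.splitlines()
    for key, value in settings.items():
        pattern = re.compile(rf"^\s*#?\s*{key}\s*=")
        new_line = f"{key} = {value}"
        for index, line in enumerate(lines):
            if pattern.match(line):
                lines[index] = new_line
                break
        else:
            # No existing line for this key, so add one at the end.
            lines.append(new_line)
    return "\n".join(lines) + "\n"


# The commented-out defaults quoted in this report.
default_conf = "#migration_port_min = 49152\n#migration_port_max = 49215\n"

# Assumed example range, not a value taken from this report.
print(set_migration_ports(default_conf, 61152, 61215), end="")
```

libvirtd has to be restarted for a change to qemu.conf to take effect, and any firewall rules for migration traffic would need to match the new range.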
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
@slee: Where and under what circumstances did this occur? Is this an actual environment? The problem is that the ports can be taken by any _outbound_ connection as a source port, because they fall within the ephemeral port range, which is 32768 to 61000 on Linux by default. The only sure way to avoid that is to move the migration ports out of that range. Expanding the range would reduce the risk but not eliminate it.
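Whether a given migration range collides with the ephemeral range can be checked directly. The following is a minimal sketch with hypothetical helper names (`read_ephemeral_range`, `overlap`). It assumes a Linux host that exposes the range in /proc/sys/net/ipv4/ip_local_port_range, and falls back to the 32768-61000 default mentioned above when that file cannot be read.

```python
def read_ephemeral_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) as reported by the kernel, or None if unavailable."""
    try:
        with open(path) as handle:
            low, high = handle.read().split()
            return int(low), int(high)
    except (OSError, ValueError):
        return None


def overlap(range_a, range_b):
    """Return the inclusive overlap of two (low, high) ranges, or None."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


# Fall back to the default quoted in this comment if /proc is not readable.
ephemeral = read_ephemeral_range() or (32768, 61000)
migration = (49152, 49215)  # libvirt default from qemu.conf

clash = overlap(migration, ephemeral)
if clash:
    print(f"migration range overlaps ephemeral ports {clash[0]}-{clash[1]}")
else:
    print("migration range is outside the ephemeral port range")
```

With the defaults quoted in this report, the whole migration range sits inside the ephemeral range, so any outbound connection can land on a migration port.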