Bug 1469058

Summary: Nova real time instance with SR-IOV interface is shutdown after create.
Product: Red Hat Enterprise Linux 7 Reporter: Peng Liu <pliu>
Component: libvirt Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 7.3 CC: berrange, dasmith, dyuan, eglynn, hhuang, jdenemar, jishao, juzhang, mtessun, pezhang, pliu, rbalakri, rbryant, sbauza, sferdjao, sgordon, srevivo, vromanso, xuzhang, yalzhang, zshi
Target Milestone: pre-dev-freeze   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-02 07:52:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
xml of VM none

Description Peng Liu 2017-07-10 10:47:45 UTC
Description of problem:
A Nova real-time instance with an SR-IOV interface is shut down after creation.

Version-Release number of selected component (if applicable):
0.20170413230745.47ed86e.el7.centos

How reproducible:
It happened every time in my environment.

Steps to Reproduce:
1. Set up a compute node with a real-time kernel and SR-IOV enabled.
2. Enable the tuned profile realtime-virtual-host with isolated CPUs 4-23.
3. Create an SR-IOV VF port with 'neutron port-create'.
4. Boot a nova instance with the real-time flavor and the SR-IOV port.
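
For reference, the isolated CPU range "4-23" used in the steps above is a kernel-style CPU list. The sketch below only illustrates how such a list expands into individual CPU ids; the helper name expand_cpu_list is hypothetical and is not part of tuned or nova.

```python
from typing import List


def expand_cpu_list(spec: str) -> List[int]:
    """Expand a CPU list such as "4-23" or "0,2-3" into sorted CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Inclusive range, e.g. "4-23" covers CPUs 4 through 23.
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


isolated = expand_cpu_list("4-23")
print(len(isolated), isolated[0], isolated[-1])  # prints: 20 4 23
```

With this range, 20 host CPUs are isolated and CPUs 0-3 remain for housekeeping work.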

Actual results:
The VM status became 'active' for a while, then changed to 'shutdown'. The VM status is also 'shutdown' in libvirt.


Expected results:
The VM status should be active.


Additional info:
Host and guest OS: CentOS 7.3
kernel-rt-3.10.0-514.6.1.rt56.429.el7.x86_64
kernel-rt-kvm-3.10.0-514.6.1.rt56.429.el7.x86_64
RDO Ocata

Nova flavor
+----------------------------+---------------------------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                                               |
+----------------------------+---------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                               |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                   |
| disk                       | 20                                                                                                                  |
| extra_specs                | {"hw:cpu_realtime_mask": "^0", "hw:cpu_policy": "dedicated", "hw:mem_page_size": "large", "hw:cpu_realtime": "yes"} |
| id                         | 5                                                                                                                   |
| name                       | m1.3core                                                                                                            |
| os-flavor-access:is_public | True                                                                                                                |
| ram                        | 4096                                                                                                                |
| rxtx_factor                | 1.0                                                                                                                 |
| swap                       |                                                                                                                     |
| vcpus                      | 3                                                                                                                   |
+----------------------------+---------------------------------------------------------------------------------------------------------------------+
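
To make the flavor above easier to read: with vcpus = 3 and hw:cpu_realtime_mask = "^0", vCPU 0 is excluded from the real-time set, so vCPUs 1 and 2 are the real-time ones. The sketch below is a simplified illustration that only handles "^" exclusions; the function realtime_vcpus is hypothetical and is not nova's own parser.

```python
from typing import List


def realtime_vcpus(vcpus: int, mask: str) -> List[int]:
    """Return the vCPUs left as real-time after applying '^' exclusions.

    Assumption: the mask starts from all vCPUs of the flavor and each
    '^N' or '^A-B' entry removes vCPUs from that set. Other mask forms
    are ignored by this sketch.
    """
    selected = set(range(vcpus))
    for part in mask.split(","):
        part = part.strip()
        if not part.startswith("^"):
            continue  # only exclusions are handled here
        body = part[1:]
        if "-" in body:
            first, last = body.split("-", 1)
            selected -= set(range(int(first), int(last) + 1))
        else:
            selected.discard(int(body))
    return sorted(selected)


print(realtime_vcpus(3, "^0"))  # prints: [1, 2]
```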


Log messages found in the qemu log:
qemu: qemu_thread_create: Resource temporarily unavailable
2017-07-10 18:00:39.172+0000: shutting down
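
"Resource temporarily unavailable" is the text Linux uses for the EAGAIN error, so the log line indicates that thread creation inside qemu failed with EAGAIN. A minimal sketch for pulling such lines out of a qemu log follows; the helper name thread_create_failures is hypothetical, and the sample text is copied from the log excerpt above.

```python
import re
from typing import List

# Matches lines such as:
#   qemu: qemu_thread_create: Resource temporarily unavailable
THREAD_CREATE_ERROR = re.compile(r"qemu_thread_create: (?P<reason>.+)$")


def thread_create_failures(log_text: str) -> List[str]:
    """Return the error reason of every qemu_thread_create failure line."""
    reasons = []
    for line in log_text.splitlines():
        match = THREAD_CREATE_ERROR.search(line)
        if match:
            reasons.append(match.group("reason").strip())
    return reasons


sample = """qemu: qemu_thread_create: Resource temporarily unavailable
2017-07-10 18:00:39.172+0000: shutting down
"""
print(thread_create_failures(sample))
```

In practice the text would be read from the per-instance log under /var/log/libvirt/qemu/ instead of the inline sample.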

Comment 1 Sahid Ferdjaoui 2017-07-11 10:22:24 UTC
It's not a known issue, so we will have to investigate a little bit more.

Can you provide a sosreport?

Comment 2 Peng Liu 2017-07-12 08:05:39 UTC
The problem is gone after upgrading libvirt to 3.2. Pei Zhang from QE has successfully reproduced this issue on RHEL 7.3. A new BZ will be created accordingly.

Comment 3 Pei Zhang 2017-07-14 06:12:33 UTC
Created attachment 1298132 [details]
xml of VM

QE reproduced this issue from qemu-kvm-rhev/libvirt layer:

Versions:
3.10.0-514.6.1.rt56.429.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64
tuned-2.7.1-3.el7.noarch
libvirt-2.0.0-10.el7_3.9.x86_64


Steps:
1. Prepare rt host and rt guest

2. In host, create VFs
# echo 2 > /sys/bus/pci/devices/0000\:05\:00.0/sriov_numvfs

# lspci | grep Eth
05:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
05:10.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
05:10.2 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)


3. Boot the VM with a VF; the errors are the same as shown in the Description. See the attachment for the full XML.

# cat /var/log/libvirt/qemu/instance-0000001.log
...
qemu: qemu_thread_create: Resource temporarily unavailable
2017-07-14 04:58:54.099+0000: shutting down

Note: If the guest does boot up, wait longer (e.g., about 1 hour in my testing environment) and the problem will appear. After that, every subsequent boot of the VM is shut down immediately, no matter how many times it is retried.
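
As a cross-check for step 2 above, the number of VFs currently enabled can be read back from the same sysfs directory. This is a minimal sketch: sriov_numvfs is the attribute written in step 2 and sriov_totalvfs is the companion attribute giving the maximum the device supports; the helper name read_vf_counts is hypothetical.

```python
from pathlib import Path


def read_vf_counts(device_dir):
    """Return (enabled VFs, maximum VFs) for a PCI physical function."""
    dev = Path(device_dir)
    num_vfs = int((dev / "sriov_numvfs").read_text().strip())
    total_vfs = int((dev / "sriov_totalvfs").read_text().strip())
    return num_vfs, total_vfs


# PCI address taken from step 2; only queried if it exists on this host.
pf = Path("/sys/bus/pci/devices/0000:05:00.0")
if (pf / "sriov_numvfs").exists():
    print(read_vf_counts(pf))
```

After the echo in step 2, the first value would be expected to be 2, matching the two Virtual Function entries in the lspci output.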


Additional info:
1. This issue is probably caused by libvirt, as after upgrading libvirt to libvirt-2.0.0-10.el7_3.9.x86_64, the problem is gone. So move this bug to 'libvirt' component.


2. With rhel7.4, everything works well. So this problem exists with 7.3.z.

Comment 5 Jiri Denemark 2017-07-14 08:34:18 UTC
(In reply to Pei Zhang from comment #3)
> QE reproduced this issue from qemu-kvm-rhev/libvirt layer:
> 
> Versions:
> libvirt-2.0.0-10.el7_3.9.x86_64
...
> 1. This issue is probably caused by libvirt, as after upgrading libvirt to
> libvirt-2.0.0-10.el7_3.9.x86_64, the problem is gone. So move this bug to
> 'libvirt' component.

Did you make a mistake in one of the versions? As written this doesn't make any sense.

Comment 6 Pei Zhang 2017-07-14 09:22:59 UTC
Sorry, there was a mistake.

> Additional info:
> 1. This issue is probably caused by libvirt, as after upgrading libvirt to
> libvirt-2.0.0-10.el7_3.9.x86_64, the problem is gone. So move this bug to
> 'libvirt' component.

It should say: upgrading to libvirt-3.2.0-14.el7.x86_64.


libvirt-2.0.0-10.el7_3.9.x86_64   fail
libvirt-3.2.0-14.el7.x86_64       work
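
The two package strings above differ in the upstream version field (2.0.0 versus 3.2.0). The sketch below shows one way to extract and compare that field from a name-version-release.arch string. It is deliberately simplified: it compares only the dotted numeric upstream version and does not implement full RPM version comparison. The helper name upstream_version is hypothetical.

```python
import re
from typing import Tuple

# name-version-release.arch, e.g. libvirt-2.0.0-10.el7_3.9.x86_64
NVRA = re.compile(
    r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$"
)


def upstream_version(nvra: str) -> Tuple[int, ...]:
    """Return the upstream version of a package string as a tuple of ints."""
    match = NVRA.match(nvra)
    if match is None:
        raise ValueError("not a name-version-release.arch string: %r" % nvra)
    return tuple(int(part) for part in match.group("version").split("."))


failing = upstream_version("libvirt-2.0.0-10.el7_3.9.x86_64")
working = upstream_version("libvirt-3.2.0-14.el7.x86_64")
print(failing, working, failing < working)  # prints: (2, 0, 0) (3, 2, 0) True
```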

Comment 7 Jiri Denemark 2017-07-14 09:27:39 UTC
OK, which means it is already fixed in 7.4.