Bug 1490158
Summary: Libvirt could not reconnect qemu

| Field | Value |
| --- | --- |
| Product | Red Hat Enterprise Linux 7 |
| Component | libvirt |
| Version | 7.4-Alt |
| Hardware | ppc64le |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Reporter | Junxiang Li <junli> |
| Assignee | Andrea Bolognani <abologna> |
| QA Contact | Junxiang Li <junli> |
| CC | abologna, dzheng, eskultet, gsun, haizhao, jdenemar, jsuchane, junli, yafu |
| Keywords | Automation, TestBlocker |
| Target Milestone | rc |
| Target Release | 7.5-Alt |
| Fixed In Version | libvirt-4.3.0-1.el7 |
| Doc Type | If docs needed, set a value |
| Type | Bug |
| Last Closed | 2018-10-30 09:49:58 UTC |
Description: Junxiang Li, 2017-09-11 01:44:48 UTC

(Comment #3, Erik Skultety)

Does this affect PPC only, or are you able to reproduce it on x86 as well? Anyhow, this should not qualify as a high severity/priority issue, since judging from the reproduction steps, this issue wouldn't occur if you simply relied on systemd to manage the service, right? Also, please attach the libvirtd debug log, as is usually requested in such cases.

Created attachment 1324768 [details]
systemctl restart virtlogd.socket
(In reply to Erik Skultety from comment #3)

This bug affects PPC only. I'm sorry the original steps were not accurate; here are the new steps:

1. Prepare a host whose NUMA nodes are numbered non-contiguously, like this:

       # numactl --hardware
       available: 2 nodes (0,8)
       node 0 cpus: 0 1 2 3 4 5 6 7
       node 0 size: 15290 MB
       node 0 free: 14014 MB
       node 8 cpus: 8 9 10 11 12 13 14 15
       node 8 size: 16303 MB
       node 8 free: 15772 MB

   instead of:

       # numactl --hardware
       available: 2 nodes (0-1)
       node 0 cpus: 0 8 16 24 32
       node 0 size: 32768 MB
       node 0 free: 21097 MB
       node 1 cpus: 40 48 56 64 72
       node 1 size: 0 MB
       node 1 free: 0 MB

2. Start a guest.

3. Check with "virsh list" that the guest is running (ideally wait until the guest can be logged into).

4. Run "systemctl restart virtlogd.socket".

5. The guest now disappears from the "virsh list" output.

So this issue does occur even when I use systemd to manage the service, and I'm not sure the severity is low.

(In reply to junli from comment #6)

I just tried following the steps listed above and I can't reproduce the issue. Can you confirm you're still seeing this with the latest packages?

    $ numactl --hardware
    available: 2 nodes (0-1)
    node 0 cpus: 0 8 16 24 32 40
    node 0 size: 32768 MB
    node 0 free: 23765 MB
    node 1 cpus: 48 56 64 72 80 88
    node 1 size: 32768 MB
    node 1 free: 19371 MB
    node distances:
    node   0   1
      0:  10  20
      1:  20  10

    $ rpm -q libvirt
    libvirt-3.9.0-14.el7.ppc64le

(In reply to junli from comment #11)
> There is a machine that can reproduce the bug easily:
> IP: [REDACTED]
> password: [REDACTED]
>
> # rpm -q libvirt qemu-kvm-rhev kernel
> libvirt-3.9.0-14.virtcov.el7_5.2.ppc64le
> qemu-kvm-rhev-2.10.0-21.el7_5.1.ppc64le
> kernel-3.10.0-862.el7.ppc64le
>
> Steps:
> 1. This XML helps to reproduce:
>        virsh define /root/a.xml
> 2. Start the guest and make sure it can be logged into:
>        virsh start avocado-vt-vm1 --console
> 3. Run virsh list on the host:
>        virsh list
> 4. Restart virtlogd:
>        systemctl restart virtlogd.socket
> 5. Run virsh list again:
>        virsh list --all
>    The guest is now shut off.
> 6. But the qemu process is still running:
>        ps -ef | grep qemu
>
> PS: Since spare machines are scarce, I don't want to hold this one for too long,
> so could you check it this week? Thanks!

Okay, I could reproduce the issue on that machine. You don't even need to restart virtlogd.socket: restarting libvirtd is enough to make the guest disappear, which of course should not happen, because you're supposed to be able to restart libvirtd at any given time without causing your guests to change state.
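The trigger is the difference between the two `numactl --hardware` layouts above: node IDs `0,8` (sparse) versus `0-1` (dense). As a quick sanity check, the following standalone Python sketch (a hypothetical helper, not part of libvirt or its test suite) parses the `available:` line and reports whether a host uses the sparse numbering that reproduces the bug:

```python
# Hypothetical helper: detect sparse NUMA node numbering from the
# "available: N nodes (...)" line printed by `numactl --hardware`.

def parse_node_ids(available_line):
    """Parse e.g. 'available: 4 nodes (0-1,16-17)' into a sorted ID list."""
    inside = available_line[available_line.index("(") + 1:available_line.index(")")]
    ids = []
    for part in inside.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            ids.extend(range(int(lo), int(hi) + 1))
        else:
            ids.append(int(part))
    return sorted(ids)

def has_sparse_numbering(available_line):
    """True when node IDs are not the dense sequence 0..n-1."""
    ids = parse_node_ids(available_line)
    return ids != list(range(len(ids)))

print(has_sparse_numbering("available: 2 nodes (0,8)"))    # True: triggers the bug
print(has_sparse_numbering("available: 2 nodes (0-1)"))    # False: dense numbering
```

On the host from the steps above, `has_sparse_numbering("available: 2 nodes (0,8)")` returns True, matching the configuration on which the guest disappears.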
It does indeed seem to be related to the NUMA topology: I changed the guest from

    <vcpu placement='auto'>2</vcpu>
    <numatune>
      <memory mode='strict' placement='auto'/>
    </numatune>

to

    <vcpu placement='static'>2</vcpu>

and the guest no longer disappears on libvirtd restart. However, I still can't reproduce the issue on a different host that has

    # lscpu
    Architecture:          ppc64le
    Byte Order:            Little Endian
    CPU(s):                96
    On-line CPU(s) list:   0,8,16,24,32,40,48,56,64,72,80,88
    Off-line CPU(s) list:  1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95
    Thread(s) per core:    1
    Core(s) per socket:    6
    Socket(s):             2
    NUMA node(s):          2
    Model:                 2.1 (pvr 004b 0201)
    Model name:            POWER8E (raw), altivec supported
    CPU max MHz:           3325.0000
    CPU min MHz:           2061.0000
    L1d cache:             64K
    L1i cache:             32K
    L2 cache:              512K
    L3 cache:              8192K
    NUMA node0 CPU(s):     0,8,16,24,32,40
    NUMA node1 CPU(s):     48,56,64,72,80,88

    # ppc64_cpu --info
    Core   0:    0*    1    2    3    4    5    6    7
    Core   1:    8*    9   10   11   12   13   14   15
    Core   2:   16*   17   18   19   20   21   22   23
    Core   3:   24*   25   26   27   28   29   30   31
    Core   4:   32*   33   34   35   36   37   38   39
    Core   5:   40*   41   42   43   44   45   46   47
    Core   6:   48*   49   50   51   52   53   54   55
    Core   7:   56*   57   58   59   60   61   62   63
    Core   8:   64*   65   66   67   68   69   70   71
    Core   9:   72*   73   74   75   76   77   78   79
    Core  10:   80*   81   82   83   84   85   86   87
    Core  11:   88*   89   90   91   92   93   94   95

    # numactl --hardware
    available: 2 nodes (0-1)
    node 0 cpus: 0 8 16 24 32 40
    node 0 size: 32768 MB
    node 0 free: 23920 MB
    node 1 cpus: 48 56 64 72 80 88
    node 1 size: 32768 MB
    node 1 free: 18969 MB
    node distances:
    node   0   1
      0:  10  20
      1:  20  10

as opposed to

    # lscpu
    Architecture:          ppc64le
    Byte Order:            Little Endian
    CPU(s):                160
    On-line CPU(s) list:   0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152
    Off-line CPU(s) list:  1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95,97-103,105-111,113-119,121-127,129-135,137-143,145-151,153-159
    Thread(s) per core:    1
    Core(s) per socket:    5
    Socket(s):             4
    NUMA node(s):          4
    Model:                 2.1 (pvr 004b 0201)
    Model name:            POWER8E (raw), altivec supported
    CPU max MHz:           3690.0000
    CPU min MHz:           2061.0000
    L1d cache:             64K
    L1i cache:             32K
    L2 cache:              512K
    L3 cache:              8192K
    NUMA node0 CPU(s):     0,8,16,24,32
    NUMA node1 CPU(s):     120,128,136,144,152
    NUMA node16 CPU(s):    40,48,56,64,72
    NUMA node17 CPU(s):    80,88,96,104,112

    # ppc64_cpu --info
    Core   0:     0*    1    2    3    4    5    6    7
    Core   1:     8*    9   10   11   12   13   14   15
    Core   2:    16*   17   18   19   20   21   22   23
    Core   3:    24*   25   26   27   28   29   30   31
    Core   4:    32*   33   34   35   36   37   38   39
    Core   5:    40*   41   42   43   44   45   46   47
    Core   6:    48*   49   50   51   52   53   54   55
    Core   7:    56*   57   58   59   60   61   62   63
    Core   8:    64*   65   66   67   68   69   70   71
    Core   9:    72*   73   74   75   76   77   78   79
    Core  10:    80*   81   82   83   84   85   86   87
    Core  11:    88*   89   90   91   92   93   94   95
    Core  12:    96*   97   98   99  100  101  102  103
    Core  13:   104*  105  106  107  108  109  110  111
    Core  14:   112*  113  114  115  116  117  118  119
    Core  15:   120*  121  122  123  124  125  126  127
    Core  16:   128*  129  130  131  132  133  134  135
    Core  17:   136*  137  138  139  140  141  142  143
    Core  18:   144*  145  146  147  148  149  150  151
    Core  19:   152*  153  154  155  156  157  158  159

    # numactl --hardware
    available: 4 nodes (0-1,16-17)
    node 0 cpus: 0 8 16 24 32
    node 0 size: 65536 MB
    node 0 free: 57663 MB
    node 1 cpus: 120 128 136 144 152
    node 1 size: 65536 MB
    node 1 free: 61983 MB
    node 16 cpus: 40 48 56 64 72
    node 16 size: 65536 MB
    node 16 free: 57580 MB
    node 17 cpus: 80 88 96 104 112
    node 17 size: 65536 MB
    node 17 free: 61766 MB
    node distances:
    node   0   1  16  17
      0:  10  20  40  40
      1:  20  10  40  40
     16:  40  40  10  20
     17:  40  40  20  10

I think libvirt (or numad?) might get confused either by the NUMA nodes having gaps in their numbering (0, 1, 16, 17) or by the CPUs appearing out of order (0-39, 120-159, 40-79, 80-119). I'll investigate further.

Patches posted upstream: https://www.redhat.com/archives/libvir-list/2018-April/msg00937.html

Fix merged upstream.
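The gap-in-numbering hypothesis can be illustrated schematically. The sketch below is a Python model of the failure mode, not libvirt's actual C code: if the bitmap that records a nodeset is sized from the *number* of NUMA nodes, a sparsely numbered node such as 16 cannot be recorded, whereas sizing it from the highest node ID (as the upstream fix does) lets every node fit.

```python
# Schematic model of the nodeset-parsing bug (assumed simplification,
# not libvirt's real virBitmap implementation).

def parse_nodeset(node_ids, bitmap_size):
    """Set one bit per host NUMA node ID; fail if an ID does not fit."""
    bitmap = [False] * bitmap_size
    for node in node_ids:
        if node >= bitmap_size:
            raise ValueError(
                "node %d does not fit in a bitmap of size %d" % (node, bitmap_size)
            )
        bitmap[node] = True
    return bitmap

# Sparse numbering, as on the affected host: available: 4 nodes (0-1,16-17)
node_ids = [0, 1, 16, 17]

# Buggy sizing (schematic): size the bitmap from the node *count*.
# With 4 nodes the bitmap only has bits 0-3, so node 16 overflows it.
try:
    parse_nodeset(node_ids, bitmap_size=len(node_ids))
except ValueError as err:
    print("count-based sizing fails:", err)

# Fixed sizing: derive the size from the highest node ID, so all
# nodes fit regardless of gaps in the numbering.
bitmap = parse_nodeset(node_ids, bitmap_size=max(node_ids) + 1)
print("ID-based sizing: bits set =", [i for i, b in enumerate(bitmap) if b])
```

On a host with dense numbering (0-1) the two sizing strategies coincide, which would explain why the issue only reproduces on machines whose node IDs have gaps.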
    commit 931144858f7f6f7b82f343c8dc7e4db5d8570d0f
    Author: Andrea Bolognani <abologna>
    Date:   Tue Apr 10 16:12:05 2018 +0200

        qemu: Figure out nodeset bitmap size correctly

        The current private XML parsing code relies on the assumption
        that NUMA node IDs start from 0 and are densely allocated,
        neither of which is necessarily the case.

        Change it so that the bitmap size is dynamically calculated by
        looking at NUMA node IDs instead, which ensures all nodes will
        be able to fit and thus the bitmap will be parsed successfully.

        Update one of the test cases so that it would fail with the
        previous approach, but passes with the new one.

        Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1490158

        Signed-off-by: Andrea Bolognani <abologna>
        Reviewed-by: Ján Tomko <jtomko>

    v4.2.0-422-g931144858f

Verified on packages:

    # rpm -q libvirt qemu-kvm-rhev kernel
    libvirt-4.4.0-2.virtcov.el7.ppc64le
    qemu-kvm-rhev-2.12.0-3.el7.ppc64le
    kernel-3.10.0-897.el7.ppc64le

Verified on the specific machine:

    # numactl --hardware
    available: 4 nodes (0-1,16-17)
    node 0 cpus: 0 8 16 24 32
    node 0 size: 65536 MB
    node 0 free: 58472 MB
    node 1 cpus: 40 48 56 64 72
    node 1 size: 65536 MB
    node 1 free: 57279 MB
    node 16 cpus: 80 88 96 104 112
    node 16 size: 65536 MB
    node 16 free: 63074 MB
    node 17 cpus: 120 128 136 144 152
    node 17 size: 65536 MB
    node 17 free: 63809 MB
    node distances:
    node   0   1  16  17
      0:  10  20  40  40
      1:  20  10  40  40
     16:  40  40  10  20
     17:  40  40  20  10

Steps:

1. Define a guest with

       <numatune>
         <memory mode="strict" placement="auto"/>
       </numatune>

2. Start the guest and make sure it can be logged into:

       virsh start avocado-vt-vm1 --console

3. Run virsh list on the host:

       # virsh list --all
        Id    Name                           State
       ----------------------------------------------------
        1     avocado-vt-vm1                 running

4. Restart virtlogd:

       systemctl restart virtlogd.socket

5. Run virsh list again:

       # virsh list --all
        Id    Name                           State
       ----------------------------------------------------
        1     avocado-vt-vm1                 running

6. Restart libvirtd:

       systemctl restart libvirtd

7. Run virsh list on the host:

       # virsh list --all
        Id    Name                           State
       ----------------------------------------------------
        1     avocado-vt-vm1                 running

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113