Bug 1485260 - virsh vcpupin returns wrong info on large machine
Summary: virsh vcpupin returns wrong info on large machine
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: jiyan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-25 08:37 UTC by chhu
Modified: 2020-05-05 09:45 UTC

Fixed In Version: libvirt-6.0.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-05 09:43:16 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:2017  0        None      None    None     2020-05-05 09:45:46 UTC

Description chhu 2017-08-25 08:37:52 UTC
Description of problem:
virsh vcpupin returns wrong info on a large machine

Version-Release number of selected component (if applicable):
libvirt-3.2.0-14.el7_4.3.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.5.x86_64
kernel: 3.10.0-693.2.1.el7.x86_64

How reproducible:
100% on a large machine

Steps to Reproduce:
1. Start a guest with 384 vCPUs (no NUMA node, hugepage, or CPU pinning settings in the XML):
<vcpu placement='static'>384</vcpu>

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 19    r7-4t                          running

2. Check the vcpupin of the guest
# virsh vcpupin r7-4t
VCPU: CPU Affinity
----------------------------------
   0: 0-359
   1: 0-359
   2: 0-359
......
 280: 0-359
......
 300: 0-359
......

3. Do vcpupin operations:
# virsh vcpupin r7-4t 280 300-301
# virsh vcpupin r7-4t 300 330-331

4. Check the virsh vcpupin output again; the affinity reported for vCPU 280 is wrong.
# virsh vcpupin r7-4t
VCPU: CPU Affinity
----------------------------------
   0: 0-359
   1: 0-359
   2: 0-359
......
 280: 300-301,320,322,325
......
 300: 330-331
......

5. Check that the guest XML is correct.
# virsh dumpxml r7-4t
<domain type='kvm' id='19'>
  <name>r7-4t</name>
......
  <vcpu placement='static'>384</vcpu>
  <cputune>
    <vcpupin vcpu='280' cpuset='300-301'/>
    <vcpupin vcpu='300' cpuset='330-331'/>
  </cputune>
......

6. Check that the cgroup and taskset settings are correct.
# cgget -g cpuset /machine.slice/machine-qemu\\x2d19\\x2dr7\\x2d4t.scope/vcpu280| grep cpuset.cpus
cpuset.cpus: 300-301

# cat 185024/cpuset
/machine.slice/machine-qemu\x2d19\x2dr7\x2d4t.scope/vcpu280
# taskset -c -p 185024
pid 185024's current affinity list: 300,301
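
The mismatch between step 4 and step 6 can be made explicit by expanding both CPU lists into sets and taking the difference. This is a minimal sketch, assuming the comma-and-range list syntax shown in the output above; `parse_cpu_list` is a hypothetical helper name, not part of virsh or libvirt.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as '300-301,320' into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like '300-301' is inclusive on both ends.
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Strings copied from the report: virsh vcpupin (step 4) and taskset (step 6).
reported = parse_cpu_list("300-301,320,322,325")
actual = parse_cpu_list("300,301")

# CPUs that virsh reports but the kernel affinity does not contain.
spurious = sorted(reported - actual)
print(spurious)
```

The difference contains only CPUs 320, 322 and 325, so the pinning itself took effect and only the reported affinity is wrong.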

Actual result:
In step 4, vcpupin returns wrong info.

Expected result:
In step 4, vcpupin returns the correct info.

Others:
1. numad service is not started.

Comment 4 Ján Tomko 2020-02-18 13:40:19 UTC
Fixed upstream by:
commit 51f9f80d350e633adf479c6a9b3c55f82ca9cbd4
Author:     Allen, John <John.Allen>
CommitDate: 2019-04-25 10:18:48 +0200

    Handle copying bitmaps to larger data buffers
    
    If a bitmap of a shorter length than the data buffer is passed to
    virBitmapToDataBuf, it will read off the end of the bitmap and copy junk
    into the returned buffer. Add a check to only copy the length of the
    bitmap to the buffer.
    
    The problem can be observed after setting a vcpu affinity using the vcpupin
    command on a system with a large number of cores:
      # virsh vcpupin example_domain 0 0
      # virsh vcpupin example_domain 0
         VCPU   CPU Affinity
        ---------------------------
         0      0,192,197-198,202
    
    Signed-off-by: John Allen <john.allen>

git describe: v5.2.0-360-g51f9f80d35 contains: v5.3.0-rc1~7
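
The failure mode described in the commit message can be modelled outside libvirt. The following is a minimal Python sketch, not libvirt code: the function names, the 40-byte bitmap length, the 45-byte buffer length and the 0x25 junk byte are all assumptions chosen for illustration, so that the simulated output resembles the affinity shown in the description. It also assumes a CPU map layout in which CPU n is bit n % 8 of byte n // 8.

```python
def format_cpu_list(cpus):
    """Render CPU numbers in the compact 'a-b,c' style that virsh prints."""
    cpus = sorted(cpus)
    parts = []
    i = 0
    while i < len(cpus):
        j = i
        # Extend j to the end of the current run of consecutive CPUs.
        while j + 1 < len(cpus) and cpus[j + 1] == cpus[j] + 1:
            j += 1
        parts.append(str(cpus[i]) if i == j else f"{cpus[i]}-{cpus[j]}")
        i = j + 1
    return ",".join(parts)


def bits_set(buf):
    """Return the indices of all set bits, lowest bit of each byte first."""
    return [8 * i + b for i, byte in enumerate(buf) for b in range(8) if byte >> b & 1]


def copy_unbounded(memory, buf_len):
    """Model of the bug: copy buf_len bytes even if the bitmap is shorter."""
    return bytes(memory[:buf_len])


def copy_bounded(memory, bitmap_len, buf_len):
    """Model of the fix: copy at most bitmap_len bytes, zero-fill the rest."""
    n = min(bitmap_len, buf_len)
    return bytes(memory[:n]) + bytes(buf_len - n)


# Hypothetical layout: a 40-byte bitmap with CPUs 300 and 301 set,
# followed in memory by unrelated bytes (0x25 is an illustrative value).
BITMAP_LEN = 40
BUF_LEN = 45  # room for 360 host CPUs
bitmap = bytearray(BITMAP_LEN)
for cpu in (300, 301):
    bitmap[cpu // 8] |= 1 << (cpu % 8)
memory = bytes(bitmap) + b"\x25" + bytes(7)

buggy = format_cpu_list(bits_set(copy_unbounded(memory, BUF_LEN)))
fixed = format_cpu_list(bits_set(copy_bounded(memory, BITMAP_LEN, BUF_LEN)))
print(buggy)  # 300-301,320,322,325
print(fixed)  # 300-301
```

In this model the extra CPUs come entirely from bytes that lie past the end of the bitmap, which is why the cgroup and taskset values stay correct while the reported affinity does not.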

Comment 7 jiyan 2020-03-12 02:02:04 UTC
According to the patch, this bug seems to be the same as
Bug 1703159 - virsh vcpupin reports bogus affinities (RHEL-7.7) and Bug 1703160 - virsh vcpupin reports bogus affinities (RHEL-8.1.0).

What is strange is that I cannot hit this issue on 5 versions before libvirt-6.0.0-1.el8.
For example: libvirt-5.6.0-7.module+el8.2.0+4670+07fe2774.x86_64

Version:
libvirt-5.6.0-7.module+el8.2.0+4670+07fe2774.x86_64
kernel-4.18.0-187.el8.x86_64
qemu-kvm-4.2.0-13.module+el8.2.0+5898+fb4bceae.x86_64

Steps:
# virsh domstate test82

# virsh dumpxml test82 |grep vcpu
  <vcpu placement='static'>2</vcpu>

# virsh start test82 
Domain test82 started

# virsh vcpupin test82 
 VCPU   CPU Affinity
----------------------
 0      0-447
 1      0-447

# virsh vcpupin test82 0 0

# virsh vcpupin test82 
 VCPU   CPU Affinity
----------------------
 0      0
 1      0-447


The bug description cites libvirt-3.2.0-14.el7_4.3.x86_64, which is a RHEL-7 build. Could you please check whether this problem still exists on RHEL-8.2.0AV?

Comment 8 Ján Tomko 2020-03-20 13:39:49 UTC
The patch was included in the upstream libvirt release v5.3.0, so even RHEL-AV-8.1.0 should be fixed already.
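
As a rough illustration of that reasoning, the upstream version embedded in a package name can be compared against 5.3.0. This is a sketch only: `upstream_version` and `has_upstream_fix` are hypothetical helper names, and downstream builds can carry backported patches, so a lower version number does not prove the fix is absent.

```python
import re

# Upstream release that first contained the fix, per the comment above.
FIX_VERSION = (5, 3, 0)


def upstream_version(nvr):
    """Extract the upstream (major, minor, micro) tuple from a libvirt package name."""
    match = re.match(r"libvirt-(\d+)\.(\d+)\.(\d+)", nvr)
    if match is None:
        raise ValueError(f"not a libvirt package name: {nvr!r}")
    return tuple(int(group) for group in match.groups())


def has_upstream_fix(nvr):
    """True if the package is based on an upstream release at or after the fix."""
    return upstream_version(nvr) >= FIX_VERSION


# Package names copied from this report.
for nvr in (
    "libvirt-3.2.0-14.el7_4.3.x86_64",
    "libvirt-5.6.0-7.module+el8.2.0+4670+07fe2774.x86_64",
    "libvirt-6.0.0-1.el8",
):
    print(nvr, has_upstream_fix(nvr))
```

This matches the observations in the thread: the 3.2.0 build from the description predates the fix, while the 5.6.0 build that could not reproduce the issue is based on a later release.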

Comment 9 jiyan 2020-03-23 01:12:00 UTC
So, according to the previous comment, I will mark this bug as verified.

Comment 11 errata-xmlrpc 2020-05-05 09:43:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

