RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1716387 - internal error: Failed to parse bitmap '' when starting guest
Summary: internal error: Failed to parse bitmap '' when starting guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.7
Hardware: ppc64le
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Andrea Bolognani
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-06-03 11:56 UTC by Andrea Bolognani
Modified: 2019-08-06 13:15 UTC (History)
11 users (show)

Fixed In Version: libvirt-4.5.0-19.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1703661
Environment:
Last Closed: 2019-08-06 13:14:55 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:2294 0 None None None 2019-08-06 13:15:37 UTC

Description Andrea Bolognani 2019-06-03 11:56:24 UTC
+++ This bug was initially created as a clone of Bug #1703661 +++

Description of problem:
When numatune placement auto is set, guest can not be started.

Version-Release number of selected component (if applicable):
# rpm -q libvirt qemu-kvm-rhev kernel
libvirt-4.5.0-15.virtcov.el7.ppc64le
qemu-kvm-rhev-2.12.0-27.el7.ppc64le
kernel-3.10.0-1034.el7.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. To define a guest with:
<numatune><memory mode="strict" placement="auto" /></numatune>
2. Try to start it

Actual results:
# virsh start test1
error: Failed to start domain test1
error: invalid argument: Failed to parse bitmap ''

Expected results:
Domain test1 started

Additional info:
1. It looks like this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1233023
2. When I use the compose RHEL-7.7-20190403.0(libvirt-4.5.0-11), the test pass. And then I update the libvirt to the latest(4.5.0-15), it will reproduce the problem. But when I downgrade it to 4.5.0-11, it still could be reproduced.
3. When I use the compose RHEL-7.7-20190424.0(libvirt-4.5.0-14), it could be reproduced

Comment 2 Andrea Bolognani 2019-06-03 14:11:19 UTC
Alright, I finally managed to reproduce this issue as well, and
looking at the log the cause is pretty clear:

  debug : virCommandRunAsync:2499 : About to run /usr/bin/numad -w 8:1024
  debug : virFileClose:111 : Closed fd 21
  debug : virFileClose:111 : Closed fd 24
  debug : virFileClose:111 : Closed fd 26
  debug : virCommandRunAsync:2502 : Command result 0, with PID 83032
  error : virCommandWait:2623 : internal error: Child process (/usr/bin/numad -w 8:1024) unexpected exit status 1: Mon Jun  3 07:22:27 2019: Cannot open numad log file (errno: 13) -- using stderr
  Mon Jun  3 07:22:27 2019: msgget failed

  debug : virCommandRun:2350 : Result status 0, stdout: '' stderr: 'Mon Jun  3 07:22:27 2019: Cannot open numad log file (errno: 13) -- using stderr
  Mon Jun  3 07:22:27 2019: msgget failed
  '
  debug : virFileClose:111 : Closed fd 25
  error : virNumaGetAutoPlacementAdvice:72 : internal error: Failed to query numad for the advisory nodeset
  debug : virFileClose:111 : Closed fd 22
  debug : qemuProcessPrepareDomainNUMAPlacement:5812 : Nodeset returned from numad:
  error : virBitmapParseSeparator:581 : invalid argument: Failed to parse bitmap ''

libvirt is calling numad asking it to calculate a nodeset for the
guest, but numad fails and prints nothing on stdout; libvirt tries to
parse the empty output as a bitmap and fails, which is expected
because '' is indeed not a valid bitmap.

There are two issues involved here: the first one is that libvirt is
clearly not reporting a good error on numad failure, and the second
one is that numad fails in the first place.

I'll cook up patches for the libvirt issue in a minute; as for the
numad issue, the stderr output provides a clue about what's going on
and audit.log confirms it:

  type=VIRT_MACHINE_ID msg=audit(1559565132.493:29425): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 vm-ctx=system_u:system_r:svirt_t:s0:c528,c731 img-ctx=system_u:object_r:svirt_image_t:s0:c528,c731 model=selinux exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=VIRT_MACHINE_ID msg=audit(1559565132.493:29426): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 vm-ctx=+107:+107 img-ctx=+107:+107 model=dac exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=AVC msg=audit(1559565132.523:29427): avc:  denied  { open } for  pid=85288 comm="numad" path="/var/log/numad.log" dev="dm-0" ino=101386427 scontext=system_u:system_r:numad_t:s0-s0:c0.c1023 tcontext=system_u:object_r:var_log_t:s0 tclass=file permissive=0
  type=SYSCALL msg=audit(1559565132.523:29427): arch=c0000015 syscall=5 success=no exit=-13 a0=13b01a7b8 a1=441 a2=1b6 a3=1 items=0 ppid=81775 pid=85288 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="numad" exe="/usr/bin/numad" subj=system_u:system_r:numad_t:s0-s0:c0.c1023 key=(null)
  type=PROCTITLE msg=audit(1559565132.523:29427): proctitle=2F7573722F62696E2F6E756D6164002D7700383A31303234
  type=AVC msg=audit(1559565132.523:29428): avc:  denied  { unix_read unix_write } for  pid=85288 comm="numad" key=-559038737  scontext=system_u:system_r:numad_t:s0-s0:c0.c1023 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=msgq permissive=0
  type=SYSCALL msg=audit(1559565132.523:29428): arch=c0000015 syscall=117 success=no exit=-13 a0=d a1=ffffffffdeadbeef a2=3b0 a3=0 items=0 ppid=81775 pid=85288 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="numad" exe="/usr/bin/numad" subj=system_u:system_r:numad_t:s0-s0:c0.c1023 key=(null)
  type=PROCTITLE msg=audit(1559565132.523:29428): proctitle=2F7573722F62696E2F6E756D6164002D7700383A31303234
  type=VIRT_RESOURCE msg=audit(1559565132.543:29429): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=disk reason=start vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 old-disk="?" new-disk="/var/lib/avocado/data/avocado-vt/images/jeos-27-ppc64le.qcow2" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=VIRT_RESOURCE msg=audit(1559565132.543:29430): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=net reason=start vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 old-net="?" new-net="52:54:00:22:d8:18" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=VIRT_RESOURCE msg=audit(1559565132.543:29431): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=rng reason=start vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 old-rng="?" new-rng="/dev/urandom" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=VIRT_RESOURCE msg=audit(1559565132.543:29432): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=mem reason=start vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 old-mem=0 new-mem=1048576 exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=VIRT_RESOURCE msg=audit(1559565132.543:29433): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=vcpu reason=start vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 old-vcpu=0 new-vcpu=8 exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
  type=VIRT_CONTROL msg=audit(1559565132.543:29434): pid=81775 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm op=start reason=booted vm="avocado-vt-vm1" uuid=2a946185-d7ad-4cc0-be1b-642b34591465 vm-pid=-1 exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=failed'

Indeed, after running 'setenforce 0' the guest can be started
successfully, so it would look like selinux-policy might need to be
amended so that the above is permitted. I'll report a separate bug
for that.

Incidentally, the machine I was testing this on originally had
SELinux in permissive mode, and that's why I could not reproduce
the issue there.

Comment 3 Andrea Bolognani 2019-06-03 17:15:11 UTC
Patch posted upstream.

  https://www.redhat.com/archives/libvir-list/2019-June/msg00054.html

Comment 4 Andrea Bolognani 2019-06-04 09:39:51 UTC
Fix merged upstream.

  commit b34fb1fb6f99628932ad68db1ce4985a06def17f
  Author: Andrea Bolognani <abologna>
  Date:   Mon Jun 3 18:45:27 2019 +0200

    util: Propagate numad failures correctly
    
    Right now, if numad fails, we raise an error but return an
    empty string to the caller instead of a NULL pointer, which
    means processing will continue and the user will see
    
      # virsh start guest
      error: Failed to start domain guest
      error: invalid argument: Failed to parse bitmap ''
    
    instead of a more reasonable
    
      # virsh start guest
      error: Failed to start domain guest
      error: operation failed: Failed to query numad for the advisory nodeset
    
    Make sure the user gets a better error message.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1716387
    
    Signed-off-by: Andrea Bolognani <abologna>
    Reviewed-by: Daniel Henrique Barboza <danielhb413>
    Reviewed-by: Ján Tomko <jtomko>

  v5.4.0-46-gb34fb1fb6f

Comment 7 Junxiang Li 2019-06-10 06:11:54 UTC
Reproduce:
env:
    # rpm -q libvirt
    libvirt-4.5.0-18.el7.ppc64le
step:
    1. Edit the guest xml with "<numatune><memory mode="strict" placement="auto" /></numatune>"
    2. Try to start the guest
result:
    report the following error message:
    error: Failed to start domain avocado-vt-vm1
    error: invalid argument: Failed to parse bitmap ''

Verify:
env:
    # rpm -q libvirt
    libvirt-4.5.0-19.el7.ppc64le
step:
    1. Edit the guest xml with "<numatune><memory mode="strict" placement="auto" /></numatune>"
    2. Try to start the guest
result:
    report the following error message:
    error: Failed to start domain avocado-vt-vm1
    error: cannot set CPU affinity on process 156584: Invalid argument

In summary, this problem has been fixed.

Comment 8 Andrea Bolognani 2019-06-10 08:11:18 UTC
(In reply to Junxiang Li from comment #7)
> Reproduce:
> env:
>     # rpm -q libvirt
>     libvirt-4.5.0-18.el7.ppc64le
> step:
>     1. Edit the guest xml with "<numatune><memory mode="strict"
> placement="auto" /></numatune>"
>     2. Try to start the guest
> result:
>     report the following error message:
>     error: Failed to start domain avocado-vt-vm1
>     error: invalid argument: Failed to parse bitmap ''
> 
> Verify:
> env:
>     # rpm -q libvirt
>     libvirt-4.5.0-19.el7.ppc64le
> step:
>     1. Edit the guest xml with "<numatune><memory mode="strict"
> placement="auto" /></numatune>"
>     2. Try to start the guest
> result:
>     report the following error message:
>     error: Failed to start domain avocado-vt-vm1
>     error: cannot set CPU affinity on process 156584: Invalid argument
> 
> In summary, this problem has been fixed.

This is not exactly the outcome I expected. Did you update libvirt
only between the two test runs?

As per the commit message included in Comment 4, libvirt should now
report

  error: operation failed: Failed to query numad for the advisory nodeset

whereas the error message you have above would indicate that numad
managed to run successfully the second time, and you're now hitting
Bug 1703661 (which makes sense, since that bug is fixed by the very
next libvirt build).

Can you confirm that you didn't update packages other than libvirt
between the two test runs, and that SELinux was in enforce mode in
both cases? It would also be nice to get libvirtd logs such as the
one included in Comment 2.

Comment 9 Andrea Bolognani 2019-06-10 15:33:41 UTC
I reprovisioned a machine with RHEL-7.7-20190403.0 and installed the
known problematic libvirt-4.5.0-15.el7, as described in Comment 0
and also as I remember doing to reproduce the issue before writing
Comment 2, but I can't get numad to fail despite having SELinux in
enforcing mode, so I'm just hitting Bug 1703661 as well.

I'm kind of out of ideas at the moment, so I did something nasty to
force the bug to reproduce:

  # cat >/usr/bin/numad <<EOF
  > #!/bin/sh
  > exit 1
  > EOF
  # restorecon /usr/bin/numad

Sure enough, the bug reproduces after that:

  # virsh start guest
  error: Failed to start domain guest
  error: invalid argument: Failed to parse bitmap ''

and goes away after upgrading to libvirt-4.5.0-20.el7.ppc64le:

  # virsh start guest
  error: Failed to start domain guest
  error: operation failed: Failed to query numad for the advisory nodeset

That's good enough to mark the bug as VERIFIED in my book.

It's weird that we can't seem to make the original, SELinux-related
failure show up again, but at this point chasing that specific ghost
doesn't seem like the best use of anyone's time.

Comment 10 Junxiang Li 2019-06-11 02:55:13 UTC
Due to the numad problem could not reproduce stable, so use the Comment 9's way to reproduce and verify it again.
Prepare:
# cat >/usr/bin/numad <<EOF
  > #!/bin/sh
  > exit 1
  > EOF
# restorecon /usr/bin/numad

Reproduce:
env:
    # rpm -q libvirt
    libvirt-4.5.0-18.el7.ppc64le
step:
    1. Edit the guest xml with "<numatune><memory mode="strict" placement="auto" /></numatune>"
    2. Try to start the guest
result:
    report the following error message:
    error: Failed to start domain test2
    error: invalid argument: Failed to parse bitmap ''

Verify:
env:
    # rpm -q libvirt
    libvirt-4.5.0-19.el7.ppc64le
step:
    1. Edit the guest xml with "<numatune><memory mode="strict" placement="auto" /></numatune>"
    2. Try to start the guest
result:
    report the following error message:
    error: Failed to start domain test2
    error: operation failed: Failed to query numad for the advisory nodeset

Reproduce again:
env(downgrade from 4.5.0-19):
    # rpm -q libvirt
    libvirt-4.5.0-18.el7.ppc64le
step:
    1. Edit the guest xml with "<numatune><memory mode="strict" placement="auto" /></numatune>"
    2. Try to start the guest
result:
    report the following error message:
    error: Failed to start domain test2
    error: invalid argument: Failed to parse bitmap ''

The bug is verified.

Comment 11 jiyan 2019-06-26 02:08:32 UTC
Also reproduced this bug on libvirt-4.5.0-18.el7.x86_64.

Version:
libvirt-4.5.0-18.el7.x86_64
kernel-3.10.0-1057.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64

Steps:
# cat /usr/bin/numad
#!/bin/sh
exit 1

# restorecon /usr/bin/numad

# virsh domstate avocado-vt-vm1
shut off

# virsh dumpxml avocado-vt-vm1 --inactive |grep "<numatune" -A2
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>

# virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: invalid argument: Failed to parse bitmap ''

Comment 12 jiyan 2019-06-26 02:39:04 UTC
Also verified this bug on libvirt-4.5.0-19.el7.x86_64

Version:
libvirt-4.5.0-19.el7.x86_64
kernel-3.10.0-1057.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64

Steps:
# virsh domstate avocado-vt-vm1
shut off

# virsh dumpxml avocado-vt-vm1 --inactive |grep "<numatune" -A2
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>

# virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: operation failed: Failed to query numad for the advisory nodeset

# yum downgrade libvirt* -y

# systemctl restart libvirtd

# rpm -qa libvirt
libvirt-4.5.0-18.el7.x86_64

# virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: invalid argument: Failed to parse bitmap ''

Comment 15 errata-xmlrpc 2019-08-06 13:14:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2294


Note You need to log in before you can comment on or make changes to this bug.