Bug 1462092 - The first time to hot-unplug vcpu failed after restart libvirtd during hotplug vcpu
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Jingjing Shao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-06-16 07:33 UTC by Jingjing Shao
Modified: 2018-04-10 10:50 UTC (History)

Fixed In Version: libvirt-3.8.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 10:48:37 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd.log (126.74 KB, text/plain), 2017-06-16 07:49 UTC, Jingjing Shao
libvirtd_part1.log (18.04 MB, text/plain), 2017-10-31 08:33 UTC, Jingjing Shao
libvirtd_part2.log (18.68 MB, text/plain), 2017-10-31 08:34 UTC, Jingjing Shao
guest.log (12.92 KB, text/plain), 2017-10-31 08:35 UTC, Jingjing Shao


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHEA-2018:0704  0        None      None    None     2018-04-10 10:50:02 UTC

Description Jingjing Shao 2017-06-16 07:33:41 UTC
Description of problem:
The first attempt to hot-unplug vcpus fails after libvirtd is restarted during a vcpu hotplug.

Version-Release number of selected component (if applicable):
libvirt-3.2.0-10.el7.x86_64

How reproducible:
100%

Steps to Reproduce:

1. # virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live           3

2. On one terminal,
#  virsh setvcpus r7.2  200

3. On the second terminal,
#  service libvirtd restart

4. On the first terminal,
#  virsh setvcpus r7.2  200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

5. #  service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.

6. # virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live          39

7. # virsh setvcpus r7.2  20
error: Failed to create controller cpu for group: No such file or directory

8. # virsh setvcpus r7.2  20
#
#

Actual results:
As shown in steps 7 and 8: the first hot-unplug attempt fails and the second one succeeds.

Expected results:
The first hot-unplug attempt should succeed.

Additional info:
The libvirtd.log is attached below.

Comment 2 Jingjing Shao 2017-06-16 07:49:54 UTC
Created attachment 1288272
libvirtd.log

Comment 3 Peter Krempa 2017-09-25 20:34:08 UTC
The problem is that if you restart libvirtd while it is still finishing a long-running job, systemd may decide to forcefully kill it:

Sep 25 22:16:51 andariel systemd[1]: libvirtd.service: State 'stop-sigterm' timed out. Killing.
Sep 25 22:16:51 andariel systemd[1]: libvirtd.service: Killing process 24770 (libvirtd) with signal SIGKILL.
Sep 25 22:16:51 andariel audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg
Sep 25 22:16:51 andariel systemd[1]: libvirtd.service: Main process exited, code=killed, status=9/KILL
Sep 25 22:16:51 andariel systemd[1]: Stopped Virtualization daemon.
Sep 25 22:16:51 andariel systemd[1]: libvirtd.service: Unit entered failed state.
Sep 25 22:16:51 andariel systemd[1]: libvirtd.service: Failed with result 'timeout'.
Sep 25 22:16:51 andariel systemd[1]: Starting Virtualization daemon...

As a result, libvirt may not finish creating the cgroups for the new vcpus, so a later unplug fails because the cgroup does not exist.
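
The sequence can be illustrated with a toy model. This is only a sketch under stated assumptions, not libvirt code: the `ToyDomain` class, its method names, and the use of plain directories in place of real cgroups are all hypothetical. It shows a hotplug that is interrupted after the vcpu is added but before its cgroup directory is created, followed by an unplug that treats the missing directory as an error.

```python
import os
import tempfile


class ToyDomain:
    """Toy model of a guest: a vcpu count plus one directory per vcpu.

    Hypothetical stand-in for illustration only; directories play the
    role of per-vcpu cgroups.
    """

    def __init__(self, cgroup_root, vcpus):
        self.cgroup_root = cgroup_root
        self.vcpus = 0
        for _ in range(vcpus):
            self.hotplug_vcpu()

    def _cgroup_path(self, idx):
        return os.path.join(self.cgroup_root, f"vcpu{idx}")

    def hotplug_vcpu(self, killed_before_cgroup=False):
        # Step 1: the vcpu is added to the guest.
        idx = self.vcpus
        self.vcpus += 1
        # Step 2: the daemon creates the cgroup, unless it was killed first.
        if killed_before_cgroup:
            return
        os.mkdir(self._cgroup_path(idx))

    def unplug_vcpu_strict(self):
        # The vcpu is removed first; a cgroup cleanup failure is then raised.
        self.vcpus -= 1
        os.rmdir(self._cgroup_path(self.vcpus))  # raises if the dir is missing


def demo():
    with tempfile.TemporaryDirectory() as root:
        dom = ToyDomain(root, vcpus=3)
        dom.hotplug_vcpu(killed_before_cgroup=True)  # daemon killed mid-job
        try:
            dom.unplug_vcpu_strict()
        except FileNotFoundError as exc:
            return dom.vcpus, type(exc).__name__
        return dom.vcpus, None
```

In this model `demo()` ends with the vcpu removed while the operation still reports a "No such file or directory" style failure, which mirrors the symptom in step 7 of the description.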

Comment 4 Peter Krempa 2017-09-27 06:56:08 UTC
Upstream no longer reports an error if the cgroup does not exist on vcpu unplug:

commit cf30a8cabd5943992e30c45efdd5fd7b82dd53cc
Author: Peter Krempa <pkrempa>
Date:   Mon Sep 25 22:34:44 2017 +0200

    qemu: hotplug: Ignore cgroup errors when hot-unplugging vcpus
    
    When the vcpu is successfully removed libvirt would remove the cgroup.
    In cases when removal of the cgroup fails libvirt would report an error.
    
    This does not make much sense, since the vcpu was removed and we can't
    really do anything with the cgroup. This patch silences the errors from
    cgroup removal.
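
The idea in the commit message can be sketched in a few lines. This is a minimal illustration, not the actual patch (which is C code inside libvirt): the function name `remove_vcpu_cgroup_tolerant`, the logger name, and the use of plain directories instead of cgroups are assumptions made for the example. The point is that once the vcpu is gone, a cleanup failure is logged and ignored rather than propagated to the caller.

```python
import logging
import os
import tempfile

log = logging.getLogger("toy-hotunplug")  # hypothetical logger name


def remove_vcpu_cgroup_tolerant(path):
    """Try to remove the cgroup directory; log and ignore any failure.

    Returns True if the directory was removed, False if removal failed.
    """
    try:
        os.rmdir(path)
        return True
    except OSError as exc:
        # The vcpu is already gone, so nothing useful can be done here.
        log.debug("ignoring cgroup removal failure for %s: %s", path, exc)
        return False


def demo():
    with tempfile.TemporaryDirectory() as root:
        present = os.path.join(root, "vcpu0")
        os.mkdir(present)
        missing = os.path.join(root, "vcpu1")  # never created
        return (
            remove_vcpu_cgroup_tolerant(present),
            remove_vcpu_cgroup_tolerant(missing),
        )
```

Neither call raises; the caller only learns whether cleanup happened, so the unplug itself is still reported as successful.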

Comment 6 Jingjing Shao 2017-10-30 14:29:56 UTC
Hi Peter,

I tested with the packages below, but the guest shuts down after libvirtd is restarted.

3.10.0-752.el7.x86_64
qemu-kvm-rhev-2.10.0-3.el7.x86_64
libvirt-3.8.0-1.el7.x86_64


1. # virsh vcpucount V
maximum      config       200
maximum      live         200
current      config         3
current      live           3

2. On one terminal,
#  virsh setvcpus V 200

3. On the second terminal,
#  service libvirtd restart

4. On the first terminal,
#  virsh setvcpus V  200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

5. #  service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

6. # virsh list  --all
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

Comment 7 Peter Krempa 2017-10-30 15:30:46 UTC
What's the reason? Please post debug logs and VM log.

Comment 8 Jingjing Shao 2017-10-31 08:32:12 UTC
Hi Peter,

I can reproduce the result from comment 6 only about 50% of the time, and I did not notice anything special in the steps when it reproduced.

So I am attaching the libvirtd.log and the guest log. Could you help check them? Thank you in advance.

Comment 9 Jingjing Shao 2017-10-31 08:33:34 UTC
Created attachment 1345809
libvirtd_part1.log

Comment 10 Jingjing Shao 2017-10-31 08:34:53 UTC
Created attachment 1345810
libvirtd_part2.log

Comment 11 Jingjing Shao 2017-10-31 08:35:32 UTC
Created attachment 1345811
guest.log

Comment 12 Jingjing Shao 2017-11-29 11:29:28 UTC
Tested with libvirt-3.9.0-3.virtcov.el7.x86_64 several times. I cannot reproduce the result from comment 6 and get the expected result shown below, so I am changing the status to VERIFIED.


1. # virsh vcpucount rhel
maximum      config       200
maximum      live         200
current      config         3
current      live           3

2. On one terminal,
#  virsh setvcpus rhel  200

3. On the second terminal,
#  service libvirtd restart

4. On the first terminal,
#  virsh setvcpus rhel  200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

5. #  service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.

6. # virsh vcpucount rhel
maximum      config       200
maximum      live         200
current      config         3
current      live          39

7. # virsh setvcpus rhel  20

8. # virsh vcpucount rhel
maximum      config       200
maximum      live         200
current      config         3
current      live          20
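
For anyone repeating this verification in a script, the final check can be automated by parsing the `virsh vcpucount` table. The following is a minimal sketch: the helper names `parse_vcpucount` and `live_count_matches` are hypothetical, and the parser assumes the three-column layout shown in the transcripts above. It works on captured text only and does not call virsh itself.

```python
def parse_vcpucount(output):
    """Parse `virsh vcpucount` table text into {(kind, scope): count}."""
    counts = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like "current  live  39"; skip prompts and blanks.
        if len(fields) == 3 and fields[2].isdigit():
            counts[(fields[0], fields[1])] = int(fields[2])
    return counts


def live_count_matches(output, expected):
    """Return True if the current live vcpu count equals `expected`."""
    return parse_vcpucount(output).get(("current", "live")) == expected


# Sample text copied from step 8 of the verification above.
SAMPLE = """\
maximum      config       200
maximum      live         200
current      config         3
current      live          20
"""
```

With the sample above, `live_count_matches(SAMPLE, 20)` holds, which corresponds to the hot-unplug in step 7 having taken effect.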

Comment 16 errata-xmlrpc 2018-04-10 10:48:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704

