Bug 1439452
| Summary: | guest turned shut off after restart libvirtd during hot-unplug vcpu | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Luyao Huang <lhuang> |
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
| Status: | CLOSED ERRATA | QA Contact: | Jingjing Shao <jishao> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.4 | CC: | dyuan, lhuang, pkrempa, rbalakri, xuzhang, yalzhang |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-3.2.0-4.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-02 00:05:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Fixed upstream:
commit 355f5ab998994d40e011cec491483506bbefe04f
Author: Peter Krempa <pkrempa>
Date: Thu Apr 13 14:22:16 2017 +0200
qemu: hotplug: Don't save status XML when monitor is closed
In the vcpu hotplug code if exit from the monitor failed we would still
attempt to save the status XML. When the daemon is terminated the
monitor socket is closed. In such case, the written status XML would not
contain the monitor path and thus be invalid.
Avoid this issue by only saving status XML on success of the monitor
command.
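
The symptom the commit message describes (a status XML written without the monitor path) can be checked for by hand. The following is a minimal sketch, not part of the fix: it assumes the monitor socket is recorded in the status XML as an element of the form `<monitor path='...' .../>`, and `has_monitor_path` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only. Assumption: the status XML stores the monitor socket as
# <monitor path='...' .../>. has_monitor_path is a hypothetical helper.
has_monitor_path() {
    # Exit status 0 when the file given as $1 contains a <monitor> element
    # with a non-empty path attribute, non-zero otherwise.
    grep -Eq "<monitor[[:space:]][^>]*path=['\"][^'\"]+['\"]" "$1"
}
```

Possible usage, with a hypothetical file name under the directory mentioned in the original report: `has_monitor_path /var/run/libvirt/qemu/r7.xml || echo "status XML has no monitor path"`.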
Hi Peter,
I tried to verify this issue. The guest stays running, but I hit another issue: libvirtd crashes when it is restarted during vCPU hot-plug.
libvirt-3.2.0-4.el7.x86_64
# virsh vcpucount V
maximum config 200
maximum live 200
current config 2
current live 186
# virsh setvcpus V 200
At the same time, open another terminal and restart libvirtd; the restart hangs.
# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service
^C
On the first terminal:
# virsh vcpucount V
error: failed to connect to the hypervisor
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
Check the libvirtd status
# service libvirtd status
Redirecting to /bin/systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: deactivating (stop-sigterm) since Thu 2017-05-11 15:42:30 CST; 14s ago
Docs: man:libvirtd(8)
http://libvirt.org
Main PID: 29168 (libvirtd)
CGroup: /system.slice/libvirtd.service
├─ 1266 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
├─ 1267 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
└─29168 /usr/sbin/libvirtd --listen
May 11 15:41:05 sriov2 systemd[1]: Starting Virtualization daemon...
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.604+0000: 29168: info : libvirt version: 3.2.0, package: 4.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-05-03-08:24...bos.redhat.com)
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.604+0000: 29168: info : hostname: sriov2
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.604+0000: 29168: debug : virLogParseOutputs:1730 : outputs=1:file:/var/log/libvirt/libvirtd.log
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.608+0000: 29168: debug : virLogParseOutput:1558 : output=1:file:/var/log/libvirt/libvirtd.log
May 11 15:41:05 sriov2 systemd[1]: Started Virtualization daemon.
May 11 15:41:06 sriov2 dnsmasq[1266]: read /etc/hosts - 3 addresses
May 11 15:41:06 sriov2 dnsmasq[1266]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
May 11 15:41:06 sriov2 dnsmasq-dhcp[1266]: read /var/lib/libvirt/dnsmasq/default.hostsfile
May 11 15:42:30 sriov2 systemd[1]: Stopping Virtualization daemon...
Hint: Some lines were ellipsized, use -l to show in full.
(In reply to Jingjing Shao from comment #4)
> Hi Peter,
>
> I try to verify this issue,the guest will be running, but I get another
> issue : libvirtd will crash when restart libvirtd during hot-plug vcpu

For crashes, please always attach backtrace and debug log.

>
> libvirt-3.2.0-4.el7.x86_64
>
> # virsh vcpucount V
> maximum config 200
> maximum live 200
> current config 2
> current live 186
>
>
> # virsh setvcpus V 200
>
> at the same time, open another terminal to restart libvirtd,it will be hung.
> # service libvirtd restart
> Redirecting to /bin/systemctl restart libvirtd.service
> ^C
>
> The first terminal,
> # virsh vcpucount V
> error: failed to connect to the hypervisor
> error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such
> file or directory
>
> Check the libvirtd status
>
> # service libvirtd status
> Redirecting to /bin/systemctl status libvirtd.service
> ● libvirtd.service - Virtualization daemon
> Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor
> preset: enabled)
> Active: deactivating (stop-sigterm) since Thu 2017-05-11 15:42:30 CST;
> 14s ago

This looks more like libvirtd is still waiting for the thread doing the hotplug operation to finish, and thus isn't accepting new connections.

> May 11 15:41:05 sriov2 systemd[1]: Started Virtualization daemon.
> May 11 15:41:06 sriov2 dnsmasq[1266]: read /etc/hosts - 3 addresses
> May 11 15:41:06 sriov2 dnsmasq[1266]: read
> /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
> May 11 15:41:06 sriov2 dnsmasq-dhcp[1266]: read
> /var/lib/libvirt/dnsmasq/default.hostsfile
> May 11 15:42:30 sriov2 systemd[1]: Stopping Virtualization daemon...

... as this message shows.

> Hint: Some lines were ellipsized, use -l to show in full.

If it indeed crashed, please attach the backtrace/logs as requested.
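
To tell a hang from a crash, a backtrace of every thread in the running daemon is usually the quickest evidence. A minimal sketch follows, assuming gdb is installed and the caller is allowed to attach to the process; `collect_backtrace` and the `GDB` override variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch only. collect_backtrace and the GDB variable are hypothetical.
# Arguments: $1 = PID to attach to, $2 = file that receives the backtrace.
collect_backtrace() {
    pid=$1
    out=$2
    # -batch makes gdb exit after running the -ex command;
    # "thread apply all bt" prints a backtrace for every thread.
    "${GDB:-gdb}" -p "$pid" -batch -ex 'thread apply all bt' > "$out" 2>&1
}
```

Possible usage: `collect_backtrace "$(pidof libvirtd)" /tmp/libvirtd-bt.txt`, where the output path is only an example. Installing the debuginfo packages first makes the frames readable.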
(In reply to Peter Krempa from comment #5)
> (In reply to Jingjing Shao from comment #4)
> > Hi Peter,
> >
> > I try to verify this issue,the guest will be running, but I get another
> > issue : libvirtd will crash when restart libvirtd during hot-plug vcpu
>
> For crashes, please always attach backtrace and debug log.

Sorry about that. I ran the test again and attached the libvirtd log.

# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 3

Open a second terminal:
# gdb -p `pidof libvirtd`
Missing separate debuginfos, use: debuginfo-install libvirt-daemon-3.2.0-4.el7.x86_64
(gdb) c
Continuing.

On the first terminal:
# virsh setvcpus r7.2 200

Open a third terminal:
# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

On the first terminal:
# virsh setvcpus r7.2 200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

Check the libvirtd status:
# service libvirtd status
Redirecting to /bin/systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: deactivating (stop-sigterm) since Fri 2017-05-12 20:29:37 CST; 49s ago
Docs: man:libvirtd(8)
http://libvirt.org
Main PID: 10384 (libvirtd)
CGroup: /system.slice/libvirtd.service
├─ 1266 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
├─ 1267 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
└─10384 /usr/sbin/libvirtd
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: debug : virFileIsSharedFSType:3391 : Check if path /nfs/r7.1.img with FS magic 26985 is shared
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: info : virSecuritySELinuxSetFileconHelper:1200 : Setting security context 'system_u:object_r:svirt_image_t:s0:c357,c...' not supported
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: debug : virFileClose:110 : Closed fd 4
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: debug : virFileClose:110 : Closed fd 29
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.026+0000: 10586: debug : virFileClose:110 : Closed fd 27
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.026+0000: 10586: info : virSecurityDACSetOwnershipInternal:556 : Setting DAC user and group on '/nfs/r7.1.img' to '107:107'
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.048+0000: 10586: info : virSecurityDACSetOwnershipInternal:556 : Setting DAC user and group on '<null>' to '107:107'
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.048+0000: 10586: debug : virFileClose:110 : Closed fd 4
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.048+0000: 10586: debug : virFileClose:110 : Closed fd 29
May 12 20:29:37 sriov2 systemd[1]: Stopping Virtualization daemon...
Hint: Some lines were ellipsized, use -l to show in full.

On the second terminal:
Program received signal SIGCONT, Continued.
[Switching to Thread 0x7facc02a4700 (LWP 10385)]
0x00007facccc14945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

Created attachment 1278175
libvirtd log for comment4

Nothing in the log or the gdb output suggests a crash.

Hi Peter,
Sorry again. libvirtd can be started successfully, but I also found another issue, shown below. After talking with lhuang, it may be caused by a cgroup error during the libvirtd restart. Will it be fixed? Can you help check it?

# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 3

On one terminal:
# virsh setvcpus r7.2 200

On a second terminal:
# service libvirtd restart

On the first terminal:
# virsh setvcpus r7.2 200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.

There is one more problem, though: when I hot-plug vCPUs after the steps above, the first attempt fails with an error, but a second attempt succeeds.

# virsh setvcpus r7.2 20
error: Failed to create controller cpu for group: No such file or directory

# virsh setvcpus r7.2 20
#
#

That looks like a separate issue. Please file a separate bug for this and attach the domain XML prior to issuing the 'setvcpus' after you restart libvirtd. Also please attach the debug log.

I tried it with the newest versions:
libvirt-3.2.0-6.el7.x86_64
qemu-kvm-rhev-2.9.0-6.el7.x86_64
I cannot reproduce the issue from comment 9, so I verified this bug as follows.

# virsh list --all
Id Name State
----------------------------------------------------
2 r7.2 running

# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 20

(1) hotplug with libvirtd restart

On one terminal:
# virsh setvcpus r7.2 200

On a second terminal:
# service libvirtd restart

On the first terminal:
# virsh setvcpus r7.2 200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.

# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 47

Continue to hotplug or hotunplug:
# virsh setvcpus r7.2 20
# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 20

Or:
# virsh setvcpus r7.2 200
# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 200

(2) hotunplug with libvirtd restart

# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 200

On one terminal:
# virsh setvcpus r7.2 20
# virsh list --all
Id Name State
----------------------------------------------------
2 r7.2 running

Continue to hotunplug.

On a second terminal:
# service libvirtd restart

On the first terminal:
# virsh setvcpus r7.2 20
error: Disconnected from qemu:///system due to I/O error
error: Cannot recv data: Connection reset by peer

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.

# virsh setvcpus r7.2 20
[root@sriov2 jishao]# virsh vcpucount r7.2
maximum config 200
maximum live 200
current config 3
current live 20

(In reply to Peter Krempa from comment #10)
> That looks like a separate issue. Please file a separate bug for this and
> attach the domain XML prior to issuing the 'setvcpus' after you restart
> libvirtd. Also please attach the debug log.

With libvirt-3.2.0-10.el7.x86_64 I can reproduce this issue again, so I filed a bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1462092

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

Description of problem:
guest turned shut off after restart libvirtd during hot-unplug vcpu

Version-Release number of selected component (if applicable):
libvirt-3.2.0-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest with many vcpus (to make the unplug time long enough)
# virsh vcpucount r7
maximum config 200
maximum live 200
current config 200
current live 200

2. unplug 180 vcpus
# virsh setvcpus r7 20

3. at the same time, open another terminal to restart libvirtd:
# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

4. recheck guest status:
# virsh list --all
Id Name State
----------------------------------------------------
- r7 shut off

5. check the libvirtd log:
libvirtd: 2017-04-06 03:26:17.628+0000: 18522: error : qemuDomainObjPrivateXMLParse:1908 : internal error: no monitor path

6. check the running guest xml in /var/run/libvirt/qemu:
<domstatus state='running' reason='booted' pid='17431'> <-----no monitor path
<domain type='kvm' id='2'>
<name>r7</name>
<uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
<maxMemory slots='16' unit='KiB'>15243264</maxMemory>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<vcpu placement='static' current='198'>200</vcpu>

7. the qemu process can still be found:
# ps aux|grep qemu
qemu 17431 18.5 1.4 3383116 464432 ? Sl 23:19 1:51 /usr/libexec/qemu-kvm -name guest=r7,debug-threads=on -S -object sec......

Actual results:
guest turned shut off after restart libvirtd during hot-unplug vcpu

Expected results:
guest still running

Additional info:
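
The reproduction steps above can be sketched as one script, so the restart reliably lands while the vCPU change is still in flight. This is an illustrative sketch only: `restart_during_setvcpus` and the `VIRSH`, `SYSTEMCTL` and `DELAY` override variables are hypothetical names, and the one-second default delay is a guess that may need tuning per host.

```shell
#!/bin/sh
# Sketch only. restart_during_setvcpus, VIRSH, SYSTEMCTL and DELAY are
# hypothetical names; the commands themselves are the ones used in the report.
# Arguments: $1 = domain name, $2 = target vCPU count.
restart_during_setvcpus() {
    dom=$1
    count=$2
    # Start the vCPU count change in the background.
    "${VIRSH:-virsh}" setvcpus "$dom" "$count" &
    setvcpus_pid=$!
    # Give the hot-(un)plug a moment to begin, then restart the daemon.
    sleep "${DELAY:-1}"
    "${SYSTEMCTL:-systemctl}" restart libvirtd.service
    # The client is expected to lose its connection, so ignore its status.
    wait "$setvcpus_pid" || true
    # Print the domain state; before the fix the report shows "shut off".
    "${VIRSH:-virsh}" domstate "$dom"
}
```

Possible usage on a test host: `restart_during_setvcpus r7 20`. The expected result from this report is that the guest is still running afterwards.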