Bug 1439452

| Field | Value |
|---|---|
| Summary | guest turned shut off after restart libvirtd during hot-unplug vcpu |
| Product | Red Hat Enterprise Linux 7 |
| Component | libvirt |
| Version | 7.4 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Hardware | x86_64 |
| OS | Linux |
| Target Milestone | rc |
| Fixed In Version | libvirt-3.2.0-4.el7 |
| Type | Bug |
| Reporter | Luyao Huang <lhuang> |
| Assignee | Peter Krempa <pkrempa> |
| QA Contact | Jingjing Shao <jishao> |
| CC | dyuan, lhuang, pkrempa, rbalakri, xuzhang, yalzhang |
| Last Closed | 2017-08-02 00:05:54 UTC |

Description
Luyao Huang
2017-04-06 03:30:41 UTC
Fixed upstream:

commit 355f5ab998994d40e011cec491483506bbefe04f
Author: Peter Krempa <pkrempa>
Date:   Thu Apr 13 14:22:16 2017 +0200

    qemu: hotplug: Don't save status XML when monitor is closed

    In the vcpu hotplug code if exit from the monitor failed we would still
    attempt to save the status XML. When the daemon is terminated the monitor
    socket is closed. In such case, the written status XML would not contain
    the monitor path and thus be invalid.

    Avoid this issue by only saving status XML on success of the monitor
    command.

Hi Peter,

I tried to verify this issue. The guest stays running, but I hit another issue: libvirtd crashes when it is restarted during vCPU hot-plug.

libvirt-3.2.0-4.el7.x86_64

# virsh vcpucount V
maximum      config       200
maximum      live         200
current      config         2
current      live         186

# virsh setvcpus V 200

At the same time, open another terminal and restart libvirtd; the restart hangs.

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service
^C

On the first terminal:

# virsh vcpucount V
error: failed to connect to the hypervisor
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

Check the libvirtd status:

# service libvirtd status
Redirecting to /bin/systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: deactivating (stop-sigterm) since Thu 2017-05-11 15:42:30 CST; 14s ago
     Docs: man:libvirtd(8)
           http://libvirt.org
 Main PID: 29168 (libvirtd)
   CGroup: /system.slice/libvirtd.service
           ├─ 1266 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─ 1267 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─29168 /usr/sbin/libvirtd --listen

May 11 15:41:05 sriov2 systemd[1]: Starting Virtualization daemon...
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.604+0000: 29168: info : libvirt version: 3.2.0, package: 4.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-05-03-08:24...bos.redhat.com)
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.604+0000: 29168: info : hostname: sriov2
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.604+0000: 29168: debug : virLogParseOutputs:1730 : outputs=1:file:/var/log/libvirt/libvirtd.log
May 11 15:41:05 sriov2 libvirtd[29168]: 2017-05-11 07:41:05.608+0000: 29168: debug : virLogParseOutput:1558 : output=1:file:/var/log/libvirt/libvirtd.log
May 11 15:41:05 sriov2 systemd[1]: Started Virtualization daemon.
May 11 15:41:06 sriov2 dnsmasq[1266]: read /etc/hosts - 3 addresses
May 11 15:41:06 sriov2 dnsmasq[1266]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
May 11 15:41:06 sriov2 dnsmasq-dhcp[1266]: read /var/lib/libvirt/dnsmasq/default.hostsfile
May 11 15:42:30 sriov2 systemd[1]: Stopping Virtualization daemon...
Hint: Some lines were ellipsized, use -l to show in full.

(In reply to Jingjing Shao from comment #4)
> Hi Peter,
>
> I try to verify this issue,the guest will be running, but I get another
> issue : libvirtd will crash when restart libvirtd during hot-plug vcpu

For crashes, please always attach backtrace and debug log.

>
> libvirt-3.2.0-4.el7.x86_64
>
> # virsh vcpucount V
> maximum config 200
> maximum live 200
> current config 2
> current live 186
>
>
> # virsh setvcpus V 200
>
> at the same time, open another terminal to restart libvirtd,it will be hung.
> # service libvirtd restart
> Redirecting to /bin/systemctl restart libvirtd.service
> ^C
>
> The first terminal,
> # virsh vcpucount V
> error: failed to connect to the hypervisor
> error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such
> file or directory
>
> Check the libvirtd status
>
> # service libvirtd status
> Redirecting to /bin/systemctl status libvirtd.service
> ● libvirtd.service - Virtualization daemon
> Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor
> preset: enabled)
> Active: deactivating (stop-sigterm) since Thu 2017-05-11 15:42:30 CST;
> 14s ago

This looks more like libvirtd is still waiting for the thread doing the hotplug operation to finish, and thus isn't accepting new connections.

> May 11 15:41:05 sriov2 systemd[1]: Started Virtualization daemon.
> May 11 15:41:06 sriov2 dnsmasq[1266]: read /etc/hosts - 3 addresses
> May 11 15:41:06 sriov2 dnsmasq[1266]: read
> /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
> May 11 15:41:06 sriov2 dnsmasq-dhcp[1266]: read
> /var/lib/libvirt/dnsmasq/default.hostsfile
> May 11 15:42:30 sriov2 systemd[1]: Stopping Virtualization daemon...

... as this message shows.

> Hint: Some lines were ellipsized, use -l to show in full.

If it indeed crashed, please attach the backtrace/logs as requested.

(In reply to Peter Krempa from comment #5)
> (In reply to Jingjing Shao from comment #4)
> > Hi Peter,
> >
> > I try to verify this issue,the guest will be running, but I get another
> > issue : libvirtd will crash when restart libvirtd during hot-plug vcpu
>
> For crashes, please always attach backtrace and debug log.

Sorry about that. I ran the test again and attached the libvirtd log.

# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live           3

Open a second terminal:

# gdb -p `pidof libvirtd`
Missing separate debuginfos, use: debuginfo-install libvirt-daemon-3.2.0-4.el7.x86_64
(gdb) c
Continuing.
On the first terminal:

# virsh setvcpus r7.2 200

Open a third terminal:

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

On the first terminal:

# virsh setvcpus r7.2 200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

Check the libvirtd status:

# service libvirtd status
Redirecting to /bin/systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: deactivating (stop-sigterm) since Fri 2017-05-12 20:29:37 CST; 49s ago
     Docs: man:libvirtd(8)
           http://libvirt.org
 Main PID: 10384 (libvirtd)
   CGroup: /system.slice/libvirtd.service
           ├─ 1266 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           ├─ 1267 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─10384 /usr/sbin/libvirtd

May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: debug : virFileIsSharedFSType:3391 : Check if path /nfs/r7.1.img with FS magic 26985 is shared
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: info : virSecuritySELinuxSetFileconHelper:1200 : Setting security context 'system_u:object_r:svirt_image_t:s0:c357,c...' not supported
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: debug : virFileClose:110 : Closed fd 4
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.024+0000: 10585: debug : virFileClose:110 : Closed fd 29
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.026+0000: 10586: debug : virFileClose:110 : Closed fd 27
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.026+0000: 10586: info : virSecurityDACSetOwnershipInternal:556 : Setting DAC user and group on '/nfs/r7.1.img' to '107:107'
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.048+0000: 10586: info : virSecurityDACSetOwnershipInternal:556 : Setting DAC user and group on '<null>' to '107:107'
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.048+0000: 10586: debug : virFileClose:110 : Closed fd 4
May 12 20:28:46 sriov2 libvirtd[10384]: 2017-05-12 12:28:46.048+0000: 10586: debug : virFileClose:110 : Closed fd 29
May 12 20:29:37 sriov2 systemd[1]: Stopping Virtualization daemon...
Hint: Some lines were ellipsized, use -l to show in full.

On the second terminal:

Program received signal SIGCONT, Continued.
[Switching to Thread 0x7facc02a4700 (LWP 10385)]
0x00007facccc14945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

Created attachment 1278175
libvirtd log for comment4

Nothing in the log or the gdb output suggests a crash.

Hi Peter,

Sorry again. libvirtd can be started successfully, but I also found another issue, shown below. After talking with lhuang, it may be caused by a cgroup error during the libvirtd restart. Will it be fixed? Can you help check it?
# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live           3

On one terminal:

# virsh setvcpus r7.2 200

On a second terminal:

# service libvirtd restart

On the first terminal:

# virsh setvcpus r7.2 200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully. However, when I hotplug vCPUs after the steps above, the first attempt fails with an error, and a second attempt succeeds:

# virsh setvcpus r7.2 20
error: Failed to create controller cpu for group: No such file or directory

# virsh setvcpus r7.2 20
#
#

That looks like a separate issue. Please file a separate bug for this and attach the domain XML prior to issuing the 'setvcpus' after you restart libvirtd. Also please attach the debug log.

I tried with the newest versions, libvirt-3.2.0-6.el7.x86_64 and qemu-kvm-rhev-2.9.0-6.el7.x86_64, but I cannot reproduce the issue from comment 9, so I verified this bug as below.

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     r7.2                           running

# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live          20

(1) hotplug with libvirtd restart

On one terminal:

# virsh setvcpus r7.2 200

On a second terminal:

# service libvirtd restart

On the first terminal:

# virsh setvcpus r7.2 200
error: Disconnected from qemu:///system due to keepalive timeout
error: internal error: connection closed due to keepalive timeout

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.
# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live          47

Continue to hotplug or hotunplug:

# virsh setvcpus r7.2 20

# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live          20

Or:

# virsh setvcpus r7.2 200

# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live         200

(2) hotunplug with libvirtd restart

# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live         200

On one terminal:

# virsh setvcpus r7.2 20

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     r7.2                           running

Continue to hotunplug.

On a second terminal:

# service libvirtd restart

On the first terminal:

# virsh setvcpus r7.2 20
error: Disconnected from qemu:///system due to I/O error
error: Cannot recv data: Connection reset by peer

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

libvirtd starts successfully.

# virsh setvcpus r7.2 20

[root@sriov2 jishao]# virsh vcpucount r7.2
maximum      config       200
maximum      live         200
current      config         3
current      live          20

(In reply to Peter Krempa from comment #10)
> That looks like a separate issue. Please file a separate bug for this and
> attach the domain XML prior to issuing the 'setvcpus' after you restart
> libvirtd. Also please attach the debug log.

With libvirt-3.2.0-10.el7.x86_64 I can reproduce this issue again, so I filed a bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1462092

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
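
The manual verification steps in this report (change the vCPU count, restart libvirtd while that is in flight, then check the daemon, the guest, and the live vCPU count) could be scripted roughly as follows. This is only a sketch and not part of the original report: the function names `current_live`, `daemon_state`, and `race_restart` and the one-second delay are illustrative assumptions. The commands it calls (`virsh setvcpus`, `virsh vcpucount`, `virsh domstate`, `systemctl restart`, `systemctl show -p ActiveState`) are the standard ones.

```shell
#!/bin/bash
# Sketch only: helpers for re-running the check from this bug on a test host.
# All function names here are hypothetical; they are not libvirt tooling.

# Print the "current live" vCPU count from `virsh vcpucount` output on stdin.
current_live() {
    awk '$1 == "current" && $2 == "live" { print $3 }'
}

# Print the ActiveState value from `systemctl show -p ActiveState` output on stdin.
# "deactivating" means systemd is still waiting for the daemon to stop,
# which is different from a crash.
daemon_state() {
    local line
    while IFS= read -r line; do
        case "$line" in
            ActiveState=*)
                printf '%s\n' "${line#ActiveState=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Restart libvirtd while a vCPU count change is in flight, then report the
# daemon state, the guest state, and the live vCPU count.
# The 1-second delay is an arbitrary assumption to let setvcpus get started.
race_restart() {
    local domain=$1 target=$2
    virsh setvcpus "$domain" "$target" &
    local setvcpus_pid=$!
    sleep 1
    systemctl restart libvirtd
    wait "$setvcpus_pid" || echo "setvcpus did not complete (the connection may drop on restart)"
    systemctl show -p ActiveState libvirtd | daemon_state
    virsh domstate "$domain"
    virsh vcpucount "$domain" | current_live
}
```

Sourcing the file only defines the functions. On a disposable test host, as root, something like `race_restart r7.2 200` would run one round; with the fix in place the guest should still be reported as running afterwards.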