Description of problem:
I start 512 guests in a loop. When starting the 396th guest, virsh start hangs without returning, but libvirtd is still running and, on another console, virsh list still works. Checking the guest status with virsh list (without --all or --inactive) shows that the 396th guest is in shut off state yet appears in the active domain list.
# virsh list |grep "396"
396 rhel6u1-x86_646 shut off
When I do
# virsh start rhel5u7-x86_6464
error: Domain is already active
# virsh destroy rhel6u1-x86_646
error: Failed to destroy domain rhel6u1-x86_646
error: Requested operation is not valid: domain is not running
After the destroy, the guest returns to the normal shut off state and can be started again.
Version-Release number of selected component (if applicable):
libvirt-0.9.4-17.el6.x86_64
kernel-2.6.32-206.el6.x86_64
qemu-kvm-0.12.1.2-2.196.el6.x86_64
How reproducible:
sometimes
Steps to Reproduce:
1. Start 512 guests with:
# for i in {1..512}; do virsh start guest$i; done
Actual results:
virsh start may hang on one guest's startup, and virsh list shows inconsistent info for that guest (shut off, yet listed among active domains)
Expected results:
virsh start should not hang, and virsh list should report guest state correctly
Additional info:
# free -g
                    total       used       free     shared    buffers     cached
Mem:                  992        865        127          0          2        688
-/+ buffers/cache:                174        818
Swap:                   0          0          0
# top -p `pidof libvirtd`
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7202 root      20   0  994m  32m 5236 S 26.8  0.0  6:46.61  libvirtd
I don't know if it is helpful, but libvirtd.log reports errors like:
23:30:12.323: 7202: error : qemuMonitorIO:583 : internal error End of file from monitor
23:31:19.956: 7202: error : qemuMonitorIO:583 : internal error End of file from monitor
10:08:59.096: 7202: error : virNetSocketReadWire:911 : End of file while reading data: Input/output error
10:09:00.844: 7202: error : virNetSocketReadWire:911 : End of file while reading data: Input/output error
(In reply to comment #0)
> Description of problem:
> # virsh list |grep "396"
> 396 rhel6u1-x86_646 shut off
>
> When do
> # virsh start rhel5u7-x86_6464
Here I meant the same guest, rhel6u1-x86_646; the command should be
# virsh start rhel6u1-x86_646
error: Domain is already active
> error: Domain is already active
>
> #virsh destroy rhel6u1-x86_646
> error: Failed to destroy domain rhel6u1-x86_646
> error: Requested operation is not valid: domain is not running
>
> After destroy, the guest return to normal shut off status and can be started
> again
>
The fact that the domain is listed means it has been added to the hash table of started domains, even though the start process has not yet progressed far enough to mark the domain as running. We have to drop the mutex to call into the domain monitor to verify that the domain started, which explains why there is a window in which a domain can show up in the active list while still being reported as shut off. But until I know the root cause of why the creation seems to hang, I'm not sure it is worth tweaking the code to try to prevent this data race.
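The window described in the comment above can be illustrated with a minimal, single-threaded sketch. This is not libvirt code; the table, function names, and phases are all invented for illustration, with the lock dropped between the two phases standing in for the monitor call:

```python
# Illustrative model of the reported race window (NOT libvirt code):
# a domain is inserted into the active-domain table before it is
# marked running, and the driver lock is dropped while waiting on the
# QEMU monitor, so a concurrent "virsh list" can observe a domain that
# is in the active list yet still "shut off".
from threading import Lock

driver_lock = Lock()
active_domains = {}  # name -> state; stands in for the started-domain hash table

def begin_start(name):
    """Phase 1: insert into the active table, still 'shut off'."""
    with driver_lock:
        active_domains[name] = "shut off"
    # The lock is now dropped while the driver talks to the QEMU monitor.

def finish_start(name):
    """Phase 2: the monitor confirmed the guest; mark it running."""
    with driver_lock:
        active_domains[name] = "running"

def virsh_list():
    """Stands in for 'virsh list' (active domains only)."""
    with driver_lock:
        return dict(active_domains)

begin_start("rhel6u1-x86_646")
snapshot = virsh_list()   # a list taken inside the window
print(snapshot)           # {'rhel6u1-x86_646': 'shut off'}
finish_start("rhel6u1-x86_646")
print(virsh_list())       # {'rhel6u1-x86_646': 'running'}
```

A `virsh list` issued between the two phases sees exactly the inconsistent state from the bug report: the guest is present in the active list but its state still reads shut off.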