Previously, after a storage domain connectivity failure, storage mailbox threads could fail in an unmanaged state, consuming threads from the thread pool and eventually locking the system.
Now, these threads are forcibly reclaimed so that the thread pool can reuse them later.
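One common way to implement this kind of reclamation is a watchdog that periodically scans the pool, abandons any worker that has been blocked on a task for too long, and spawns a fresh worker in its slot. Below is a minimal Python sketch of that pattern, assuming a simple task-queue pool; the names (WatchedWorker, Pool, reap_stuck_workers) and the timeout are illustrative, and this is not VDSM's actual implementation. Note that Python cannot kill a blocked thread, so a stuck worker is simply left behind as a daemon thread.

import queue
import threading
import time

# Minimal sketch of reclaiming stuck pool workers; illustrative only.
class WatchedWorker(threading.Thread):
    def __init__(self, tasks):
        super().__init__(daemon=True)
        self.tasks = tasks
        self.busy_since = None  # monotonic start time of the current task

    def run(self):
        while True:
            task = self.tasks.get()
            self.busy_since = time.monotonic()
            try:
                task()
            finally:
                self.busy_since = None

class Pool:
    def __init__(self, size, stuck_timeout=60.0):
        self.tasks = queue.Queue()
        self.stuck_timeout = stuck_timeout
        self.workers = [self._spawn() for _ in range(size)]

    def _spawn(self):
        worker = WatchedWorker(self.tasks)
        worker.start()
        return worker

    def submit(self, task):
        self.tasks.put(task)

    def reap_stuck_workers(self):
        # Call this periodically (e.g. from a timer thread). A worker that
        # has been blocked, for example on unresponsive storage, for longer
        # than stuck_timeout is abandoned (it stays alive as a daemon thread
        # but no longer occupies a pool slot) and a replacement is spawned,
        # so the pool cannot be exhausted by stuck I/O.
        now = time.monotonic()
        for i, worker in enumerate(self.workers):
            started = worker.busy_since
            if started is not None and now - started > self.stuck_timeout:
                self.workers[i] = self._spawn()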
Description (vvyazmin@redhat.com, 2012-11-04 12:58:09 UTC)
Created attachment 637986: logs (vdsm, rhevm, screenshots)
Description of problem: Thread leakage after failure of a storage domain
Version-Release number of selected component (if applicable):
RHEVM 3.1 - SI23
RHEVM: rhevm-3.1.0-25.el6ev.noarch
VDSM: vdsm-4.9.6-40.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.5.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.4.x86_64
SANLOCK: sanlock-2.3-4.el6_3.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Create an iSCSI DC with one host and one SD
2. Create a VM with an OS installed (in my case, a RHEL 6.3 VM with a VirtIO disk)
3. Run the VM
4. Create a live snapshot
5. Wait until the VM's disks are in “Locked” state
6. Block the SD connection via iptables (see the sketch after this list)
7. Wait until the host is in “Non Responsive” state
8. Remove the iptables restriction
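For step 6, one way to block and later restore the SD connection from the host is a pair of iptables rules dropping iSCSI traffic. A minimal sketch, assuming the storage target is reached over the default iSCSI port 3260 (adjust for your setup); the helper names are hypothetical, and the script must run as root:

import subprocess

ISCSI_PORT = "3260"  # default iSCSI target port; adjust if yours differs

def block_sd():
    # Drop all outgoing traffic to the iSCSI port (step 6).
    subprocess.check_call(
        ["iptables", "-A", "OUTPUT", "-p", "tcp",
         "--dport", ISCSI_PORT, "-j", "DROP"])

def unblock_sd():
    # Delete the rule again (step 8).
    subprocess.check_call(
        ["iptables", "-D", "OUTPUT", "-p", "tcp",
         "--dport", ISCSI_PORT, "-j", "DROP"])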
Actual results:
After 2 hours, no free threads remain
The VM's disks are stuck in “Locked” state
The snapshot is stuck in “Locked” state
Expected results:
The system should handle the SD disconnection and know how to recover from this state
Additional info:
[root@cougar08 ~]# ps auxH | awk '/vdsm/ {print $1}' | grep vdsm | wc -l
4096
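As a cross-check of the ps one-liner above, a small Python helper (hypothetical, not part of VDSM) can approximate the same count by reading /proc: it sums the Threads field of every process whose command line mentions vdsm.

import glob
import os

def count_vdsm_threads():
    total = 0
    for status_path in glob.glob("/proc/[0-9]*/status"):
        proc_dir = os.path.dirname(status_path)
        try:
            # Skip processes whose command line does not mention vdsm.
            with open(os.path.join(proc_dir, "cmdline"), "rb") as f:
                if b"vdsm" not in f.read():
                    continue
            # Sum the kernel's per-process thread count.
            with open(status_path) as f:
                for line in f:
                    if line.startswith("Threads:"):
                        total += int(line.split()[1])
                        break
        except OSError:
            continue  # process exited while we were scanning
    return total

if __name__ == "__main__":
    print(count_vdsm_threads())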
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-1508.html