| Summary: | fence_xvm in virtual cluster stops working: need to restart fence_virtd | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Gianluca Cecchi <gianluca.cecchi> |
| Component: | fence-virt | Assignee: | Ryan McCabe <rmccabe> |
| Status: | CLOSED WONTFIX | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.0 | CC: | cfeist, cluster-maint, djansa, fdinitto, heinzm, jbainbri, mgrac, michael.jansen, mwest, rmccabe, tlavigne |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | --- | Flags: | heinzm: needinfo- |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-12-15 07:24:33 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 756082 | | |
Description
Gianluca Cecchi
2011-04-27 15:07:45 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as an exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Interesting, so it is rejecting the request or unable to perform the operation. I wonder why.

Oh, sorry, misread. It's like it's no longer responding to requests, like you said.

Let me know if you need any configuration files on the guest and/or host side. At this moment it seems that the problem happens more with one particular host (rhev1).

From a firewall point of view, at the moment I inserted this line in the INPUT chain in /etc/sysconfig/iptables, just to allow all the traffic:

-I INPUT -d 225.0.0.12 -j ACCEPT

Dunno if there is a better/more restrictive one... There is no line in the FORWARD chain; I suppose it is not necessary?

General question: if guest1, which is running on host1, runs

# fence_xvm -H guest2 -k /etc/cluster/host2.key -ddd -o null

is it supposed to generate "strace" output on both fence_virtd processes, or only on the host2 one?

You'll see output on the fence_virtd side as well.

Sorry, but I've not understood your comment. Is it about the last question of my comment #5? Perhaps I didn't explain that question well:

1) Suppose I run on host1, where fence_virtd has pid PID1: strace -p PID1
2) Suppose I run on host2, where fence_virtd has pid PID2: strace -p PID2
3) Then on guest1, which is on host1, I run: # fence_xvm -H guest2 -k /etc/cluster/host2.key -ddd -o null

Will 3) generate output in both strace commands 1) and 2)?

(In reply to comment #7)
> Sorry, but I've not understood your comment. Is it about the last question of my comment #5? Perhaps I didn't explain that question well:
>
> 1) Suppose I run on host1, where fence_virtd has pid PID1: strace -p PID1
> 2) Suppose I run on host2, where fence_virtd has pid PID2: strace -p PID2
> 3) Then on guest1, which is on host1, I run: # fence_xvm -H guest2 -k /etc/cluster/host2.key -ddd -o null
>
> Will 3) generate output in both strace commands 1) and 2)?

It depends on how fence_xvm's forwarding is done. You will likely see some processing of the multicast packet sent from fence_xvm on both hosts, but then the fence_virtd process (on *one* host) will connect back to the fence_xvm instance, so you will only see that part on one host.

Hello there! I am also looking at using the cluster suite with KVM virtual machines, and I have encountered some problems with the fence_virtd daemon. My setup is not multicast: I use the serial plugin to talk from the VM to the underlying VM host, run cman (not rgmanager), and use the checkpoint plugin to do fencing between VMs running on different physical hosts.

The problem I run into is this: if I write a loop which fences one of a clustered pair of VMs (the fencing is just a fence_node command with -o reboot, run from a VM that is not part of the cluster), the fence_node command works for a while, but each time a fence request is sent around the corosync CPG the fence_virtd daemons open additional sockets (which I think are due to connections to libvirt that are not released), and eventually you hit the 1024-open-file limit and fence_virtd stops working. It takes quite a number of fence events, of course, to reach this stage. I was wondering whether the problem reported here involves an open file descriptor leak, too.
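As a hedged aside on the firewall question earlier in this thread: a tighter alternative to the blanket ACCEPT would be to match only the fence_virt traffic. This sketch assumes the stock defaults of multicast address 225.0.0.12 and port 1229 in /etc/fence_virt.conf; adjust if those were changed. On each host, for the multicast request coming from the guests:

# iptables -I INPUT -d 225.0.0.12 -p udp --dport 1229 -j ACCEPT

and on each guest, for the TCP connection that fence_virtd opens back to the fence_xvm client:

# iptables -I INPUT -p tcp --dport 1229 -j ACCEPT

The FORWARD chain should only come into play for traffic the host forwards between interfaces, not for packets delivered to fence_virtd itself.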
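On the strace question, a minimal sketch of watching both daemons at once, assuming a single fence_virtd process per host (the -e filter is only there to cut the output down to socket activity). On host1 and on host2, in separate terminals:

# strace -f -p $(pidof fence_virtd) -e trace=network

Then, from guest1:

# fence_xvm -H guest2 -k /etc/cluster/host2.key -ddd -o null

As described in the reply above, both traces should show the multicast request being processed, but only one of them should show the TCP connection back to the fence_xvm client.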
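To confirm the descriptor leak described in the last comment, a rough sketch, assuming a single fence_virtd process on each host; guest2 and the sleep intervals are placeholders, and the exact fence invocation and flags follow whatever the reporter's setup uses. From the machine outside the cluster, repeat the fence operation:

# while true; do fence_node guest2; sleep 120; done

On the host, watch the daemon's open file descriptor count:

# watch -n 10 'ls /proc/$(pidof fence_virtd)/fd | wc -l'

If the count climbs toward 1024 across fence events and never drops back, the libvirt connections are most likely not being released.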
Closing, since we have not been able to reproduce this issue. If this is still an issue with the current cluster packages, please feel free to re-open this bug.

This is still present in RHEL 8 with fence-virtd-0.4.0-7.el8.x86_64! Just recently I had to restart fence_virtd to get it working again on two RHEL 8 hosts.

Hi Heinz,

Thanks for your report. Are you able to provide reproducer steps which consistently make this happen? Ideally we would like them as detailed as possible, starting from a bare RHEL/CentOS install.

We did a significant amount of work trying to reproduce this when it was logged, but we could never get it to fail consistently. Our package maintainer could not get it to fail at all. I was doing High Availability technical support at that time; I had it failing in my test environment one day, but the next day it all worked fine and the problem vanished, never to be seen again.

Setting needinfo on you for this. Unfortunately, I'm afraid that without consistent steps to get this to fail, it will be impossible to fix.

Jamie

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
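A hedged note on the RHEL 8 workaround mentioned above: one quick way to tell whether fence_virtd is still answering is to ask it for its domain list from a guest, and only restart it on the host if that hangs. The key path shown is the conventional default and may differ in a given setup. From a guest:

# fence_xvm -k /etc/cluster/fence_xvm.key -o list

On the affected host, the restart that was reported to get things working again:

# systemctl restart fence_virtd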