| Field | Value |
|---|---|
| Summary | fence_xvm + fence_virtd hash-handling mismatch leads to fake fencing |
| Product | Red Hat Enterprise Linux 6 |
| Component | fence-virt |
| Version | 6.1 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Reporter | Jaroslav Kortus <jkortus> |
| Assignee | Lon Hohberger <lhh> |
| QA Contact | Cluster QE <mspqa-list> |
| CC | cluster-maint, djansa, mgrac |
| Target Milestone | rc |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | fence-virt-0.2.3-2.el6 |
| Doc Type | Bug Fix |
| Last Closed | 2011-12-06 11:38:06 UTC |

Doc Text:

Cause: A hash-handling mismatch.
Consequence: False success for fencing in some cases, with the potential to cause data corruption in live-hang scenarios.
Fix: Correct the hash-handling mismatch.
Result: No more false successes for fencing, thereby preserving data integrity.

fake fencing example:
$ fence_virtd -F -d 99 -f /etc/fence_virt.conf
Background mode disabled
Using /etc/fence_virt.conf
Debugging threshold is now 99
backends {
libvirt {
uri = "qemu:///system";
}
}
listeners {
multicast {
interface = "virbr1";
auth = "sha256";
hash = "sha256";
address = "225.0.0.12";
key_file = "/etc/cluster/fence_xvm.key";
}
}
fence_virtd {
debug = "99";
backend = "libvirt";
listener = "multicast";
}
Backend plugin: libvirt
Listener plugin: multicast
Searching /usr/lib64/fence-virt for plugins...
Searching for plugins in /usr/lib64/fence-virt
Loading plugin from /usr/lib64/fence-virt/multicast.so
Failed to map backend_plugin_version
Registered listener plugin multicast 1.0
Loading plugin from /usr/lib64/fence-virt/checkpoint.so
Registered backend plugin checkpoint 0.8
Loading plugin from /usr/lib64/fence-virt/libvirt.so
Registered backend plugin libvirt 0.1
3 plugins found
Available backends:
checkpoint 0.8
libvirt 0.1
Available listeners:
multicast 1.0
Debugging threshold is now 99
Using qemu:///system
Debugging threshold is now 99
Got /etc/cluster/fence_xvm.key for key_file
Got sha256 for hash
Got sha256 for auth
Got 225.0.0.12 for address
Got virbr1 for interface
Reading in key file /etc/cluster/fence_xvm.key into 0x206eb80 (4096 max size)
Stopped reading @ 17 bytes
Actual key length = 17 bytes
Setting up ipv4 multicast receive (225.0.0.12:1229)
Joining multicast group
ipv4_recv_sk: success, fd = 6
Request 2 seqno 776444 domain node02
Plain TCP request
ipv4_connect: Connecting to client
ipv4_connect: Success; fd = 7
Hash mismatch:
C = 8f5a3ddc9bdaff8440cb8f2c0482bc3b2f3e0be7a9bbdfbab6dc0097ab1e0739f383934161f463ccbd4785ebacaad8aee418c53204d10353906599158949e5b9
H = b8761d8db7bbb78c196d18c274517756ac289962a7c20a836a041f23b02982490000000000000000000000000000000000000000000000000000000000000000
R = 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Remote failed challenge
Could call back for fence request: Connection reset by peer
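
For context on the "Remote failed challenge" above: each side issues a challenge over the TCP callback, the peer answers with a digest computed from the challenge and the shared key file, and the issuer compares that answer against the digest it computed itself. The sketch below is not the fence-virt implementation (the exact construction and helper names there differ); assuming a simple digest-over-key-plus-challenge scheme, it only illustrates why a peer signing with sha512 can never satisfy a side expecting sha256.

/* Minimal sketch (NOT fence-virt code): why mixed hash settings fail the
 * challenge.  KEY and CHALLENGE are placeholders for the contents of
 * fence_xvm.key and the random challenge bytes.
 * Build with:  gcc sketch.c -lcrypto                                      */
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

#define KEY       "example-key"         /* stands in for fence_xvm.key      */
#define CHALLENGE "random-challenge"    /* stands in for /dev/urandom bytes */

int main(void)
{
    unsigned char expected[SHA512_DIGEST_LENGTH] = { 0 };
    unsigned char response[SHA512_DIGEST_LENGTH] = { 0 };
    unsigned char buf[sizeof(KEY) + sizeof(CHALLENGE)];
    size_t len = sizeof(KEY) + sizeof(CHALLENGE) - 2;   /* key + challenge   */

    memcpy(buf, KEY, sizeof(KEY) - 1);
    memcpy(buf + sizeof(KEY) - 1, CHALLENGE, sizeof(CHALLENGE) - 1);

    /* Daemon side: configured with hash = "sha256" (32-byte digest). */
    SHA256(buf, len, expected);

    /* Client side: invoked with -C sha512 (64-byte digest). */
    SHA512(buf, len, response);

    if (memcmp(expected, response, sizeof(expected)) != 0) {
        fprintf(stderr, "Remote failed challenge (digests differ)\n");
        return 1;   /* the agent must propagate this as a non-zero exit */
    }
    return 0;
}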
Manual attempt:
$ fence_xvm -H node02 -C sha512 -c sha512 -o reboot -t 10 -dddd -k /etc/cluster/fence_xvm.key
Debugging threshold is now 4
-- args @ 0x7fff24072060 --
args->domain = node02
args->op = 2
args->net.key_file = /etc/cluster/fence_xvm.key
args->net.hash = 3
args->net.addr = 225.0.0.12
args->net.auth = 3
args->net.port = 1229
args->net.ifindex = 0
args->net.family = 2
args->timeout = 10
args->retr_time = 20
args->flags = 0
args->debug = 4
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0x7fff24070f30 (4096 max size)
Stopped reading @ 17 bytes
Actual key length = 17 bytes
Adding IP 127.0.0.1 to list (family 2)
Adding IP 192.168.122.116 to list (family 2)
Adding IP 192.168.100.101 to list (family 2)
ipv4_listen: Setting up ipv4 listen socket
ipv4_listen: Success; fd = 3
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
Opening /dev/urandom
Sending to 225.0.0.12 via 127.0.0.1
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
Opening /dev/urandom
Sending to 225.0.0.12 via 192.168.122.116
Setting up ipv4 multicast send (225.0.0.12:1229)
Joining IP Multicast group (pass 1)
Joining IP Multicast group (pass 2)
Setting TTL to 2 for fd4
ipv4_send_sk: success, fd = 4
Opening /dev/urandom
Sending to 225.0.0.12 via 192.168.100.101
Waiting for connection from XVM host daemon.
Issuing TCP challenge
Hash mismatch:
C = b3f9592dc1322003f9dffc01e18d30bec7d0a4cffe5b7e86966d85fd06eb994280c9c603a2729d120fdd587261c7e2ac23179ec4d4ba285c35b3dcc5c4e1a0e2
H = 335822bbe84f125c9e2951e8cc090d9585d053c878085441a5844fd38a0db14cb4f8660ae5dfaa99abd4d0fda6448c5237a6d789151d3836050cc073c1a0b4b3
R = 762029f2195976077950a59d5ba12e56f318aaa51441d8ce2f81841d7e013d900000000000000000000000000000000000000000000000000000000000000000
Invalid response to challenge
(12:46:02) [root@node01:~]$ echo $?
0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Maybe this is the problem right here!
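
The exit status of 0 right after "Invalid response to challenge" points at the bug class: the failure is detected and logged, but not propagated to the caller. Below is a minimal sketch of that pattern; the names (do_challenge, ret) are hypothetical and do not correspond to the actual fence_xvm code paths.

/* Hypothetical sketch of the bug class: an authentication failure is
 * logged but the error is swallowed, so the agent still exits 0.        */
#include <stdio.h>

static int do_challenge(void)
{
    /* ... exchange challenge/response over TCP ... */
    int hashes_match = 0;               /* what the logs above show */

    if (!hashes_match) {
        fprintf(stderr, "Invalid response to challenge\n");
        return -1;                      /* failure is detected here ...     */
    }
    return 0;
}

int main(void)
{
    int ret = 0;

    if (do_challenge() < 0) {
        /* Buggy pattern: log and fall through, leaving ret == 0.
         * Fixed pattern: set a non-zero status so the caller sees it.     */
        ret = 1;
    }
    return ret;                         /* fence_node trusts this value    */
}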
$ cat /etc/cluster/cluster.conf | grep fence_xvm
        <fencedevice agent="fence_xvm" auth="sha512" hash="sha512" key_file="/etc/cluster/fence_xvm.key" name="xvm" timeout="5"/>

And now fence_node:
$ fence_node node02
fence node02 success
(fence_virtd output as above)
fence_xvm must not return success on hash mismatches.

With fix from upstream:

Hash mismatch:
C = 10c57b3860e26cd131c66c78c6521efa15525f57b77144d2849c247267ca98164e353406fa179cce36f732d872548532c5517e0de9c9c042fbb9105e0d3093b1
H = 32ba316f796adaf839e2954aaf50ddedada9b60f0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
R = 4d4dc594d49f77ad540d60191ef99a3d6a346ac297f0c155cdeb1ab17046cd370000000000000000000000000000000000000000000000000000000000000000
Invalid response to challenge
Operation failed
[root@ayanami client]# echo $?
1

Created attachment 516527 [details]
Full logs from test run
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: A hash-handling mismatch.
Consequence: False success for fencing in some cases, with the potential to cause data corruption in live-hang scenarios.
Fix: Correct the hash-handling mismatch.
Result: No more false successes for fencing, thereby preserving data integrity.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1566.html
Description of problem:

fence_xvm and fence_virtd both have options for setting the hashes used for verification. According to the man pages these should include "none", sha1, sha256 and sha512. I am setting them in matching pairs (sha256/sha256 for auth and signing), and sha256 seems to be the only one working. Moreover, some combinations work that I would expect to fail, for example sha256/sha256 in fence_virtd with sha256 (auth) / sha512 (signing) in fence_xvm. And last but not least, one combination (sha256/sha256 in fence_virtd, sha256 (auth) / sha512 (signing) in fence_xvm) fails when fence_xvm is called manually ("Remote failed challenge") but succeeds via fence_node, i.e. when the cluster decides to fence the node. This is the worst case: we cannot pretend something was fenced, it must really be dead :).

Version-Release number of selected component (if applicable):
fence-virt-0.2.1-8.el6.x86_64

How reproducible:
100%

Steps to Reproduce (auth hash first):
1. Set up matching pairs on both sides and see that it does not work.
2. Set up fence_virtd with sha256/sha256 and fence_xvm with sha256/sha512 and see it rebooting the domain.
3. Set up fence_virtd with sha256/sha256 and see fence_node returning success while the domain is not rebooted.

Actual results:
See above.

Expected results:
1. Matching pairs should always work.
2. Fencing with a different signing hash is not supposed to work (maybe just my expectation differs here, please clarify).
3. Return success if and only if the domain is really rebooting.

Additional info:

######### host node fence_virt.conf ##########
[root@marathon-03:~]$ cat /etc/fence_virt.conf
fence_virtd {
        listener = "multicast";
        backend = "libvirt";
}

listeners {
        multicast {
                key_file = "/etc/cluster/fence_xvm.key";
                address = "225.0.0.12";
                hash = "sha256";
                auth = "sha256";
                interface = "virbr1";
        }
}

backends {
        libvirt {
                uri = "qemu:///system";
        }
}

########## virtual node01 cluster.conf #############
[root@node01:~]$ cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="9" name="STSRHTS14642">
        <cman/>
        <fence_daemon clean_start="0" post_join_delay="20"/>
        <clusternodes>
                <clusternode name="node01" nodeid="1" votes="1">
                        <fence>
                                <method name="virt">
                                        <device domain="node01" name="xvm"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node02" nodeid="2" votes="1">
                        <fence>
                                <method name="virt">
                                        <device domain="node02" name="xvm"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node03" nodeid="3" votes="1">
                        <fence>
                                <method name="virt">
                                        <device domain="node03" name="xvm"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice agent="fence_xvm" auth="sha512" hash="sha512" key_file="/etc/cluster/fence_xvm.key" name="xvm" timeout="5"/>
        </fencedevices>
</cluster>

fence_virtd started as:
fence_virtd -F -d 99 -f /etc/fence_virt.conf
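
An aside on expected result 3: the only signal the caller ultimately gets from the agent is its exit status, which is why a false 0 is so dangerous in a live-hang scenario. The snippet below merely illustrates that contract (it is not how fenced or fence_node actually drive agents); the command line is copied from the manual attempt above.

/* Illustration only: a caller can only trust the agent's exit status.   */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(void)
{
    int status = system("fence_xvm -H node02 -C sha512 -c sha512 "
                        "-o reboot -t 10 -k /etc/cluster/fence_xvm.key");

    if (status == -1) {
        perror("system");
        return 2;
    }

    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* With the pre-fix agent this branch is reached even though the
         * challenge failed and the domain was never rebooted.           */
        printf("agent reported success -- node02 assumed dead\n");
        return 0;
    }

    printf("agent reported failure -- node02 must NOT be assumed dead\n");
    return 1;
}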