Bug 155393
Summary: | RFE: option to disable ARP checking on connections
---|---
Product: | [Retired] Red Hat Cluster Suite
Component: | clumanager
Version: | 3
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
Reporter: | Lon Hohberger <lhh>
Assignee: | Lon Hohberger <lhh>
QA Contact: | Cluster QE <mspqa-list>
CC: | bfox, ckloiber, cluster-maint, jbacik, kanderso, peggy.proffitt, rkenna
Hardware: | All
OS: | Linux
Fixed In Version: | RHBA-2005-676
Doc Type: | Bug Fix
Last Closed: | 2005-09-30 14:56:17 UTC
Bug Depends On: | 149311
Bug Blocks: | 162166
Description
Lon Hohberger
2005-04-19 21:40:12 UTC
Created attachment 114843 [details]
Patch which kills ARP checking
Created attachment 114845 [details]
Corrected patch: Kill ARP table checking
The introduction of this patch will have a couple of outcomes:

(a) Incorrectly configured service IPs which take over the default route due to incorrect subnet masks will make the cluster cease to function. Example: the main cluster IP is 192.168.0.1/24 and we add 192.168.0.2/16 as a service IP; bringing up that service IP will kill the cluster node's traffic. In part, this case was why we added the ARP check in the first place: so that a misconfiguration didn't cause a node outage.

(b) Incorrectly configured network setups will mysteriously start working again. The "Denied connect from Foo, not in subnet" messages will go away.

Lon, that patch works fine. I rebuilt clumanager-1.2.26.1 with the ARP patch, and it fixes the network tiebreaker bug as well. What we need now is an official hotfix package that we can support.

Adding PM for further evaluation.

Downgrading to initscripts-7.31.9.EL-1 also seems to fix this problem; there's a possible regression in the initscripts code which apparently causes problems with bonded routing. We're still evaluating this.

I'm going to add the option to toggle this, just in case we don't have a fix for the initscripts package any time soon. The change is low impact.

Created attachment 115636 [details]
Patch fully implementing RFE.
This adds a configuration key:
cluster%msgsvc_noarp
When set to "yes" or "1", the cluster software will not use its internal ARP
table to validate new connections (nor will it ask the kernel for any ARP
information). This should quash "not in subnet" messages regardless of the
version of initscripts installed.
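
As a rough illustration of the behaviour described above (not the actual clumanager code, which is C and consults its internal ARP table), the toggle can be sketched in Python. The function names and the use of a plain subnet-membership test in place of the ARP lookup are assumptions made for this sketch; only the key name and the accepted values "yes" and "1" come from this report.

```python
import ipaddress


def noarp_enabled(value):
    """Return True if the cluster%msgsvc_noarp value disables the check.

    Per this report, "yes" or "1" turns the check off. Treating any other
    value (or a missing key) as "check enabled" is an assumption.
    """
    return value is not None and value.strip() in ("yes", "1")


def peer_in_local_subnet(peer, local_interfaces):
    """Simplified stand-in for the ARP-table validation (assumption).

    local_interfaces is a list of "address/prefix" strings, for example
    ["192.168.0.1/24"].
    """
    addr = ipaddress.ip_address(peer)
    return any(
        addr in ipaddress.ip_interface(iface).network
        for iface in local_interfaces
    )


def accept_connection(peer, local_interfaces, noarp_value):
    """Accept a new connection, skipping validation when noarp is set."""
    if noarp_enabled(noarp_value):
        return True  # no "not in subnet" rejection is possible here
    return peer_in_local_subnet(peer, local_interfaces)


if __name__ == "__main__":
    ifaces = ["192.168.0.1/24"]
    print(accept_connection("10.0.0.5", ifaces, "0"))
    print(accept_connection("10.0.0.5", ifaces, "yes"))
```

With the key unset or set to "0"/"no", a peer outside every local subnet is refused in this sketch; with "yes" or "1" it is accepted without any lookup, which is why the misconfiguration in case (a) is no longer caught.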
Created attachment 115934 [details]
rebuilt clumanager package
Attaching a rebuilt clumanager package with this patch.
Note: this version is for testing the noarp patch only. It is NOT intended for
production systems at this point.
Lon, can you explain more about which file the user is supposed to set the "cluster%msgsvc_noarp" configuration key in?

Yes, the patch should have edited "man cludb".

To block ARP checking:
    cludb -p cluster%msgsvc_noarp 1
    cludb -p cluster%msgsvc_noarp yes
To re-enable it:
    cludb -p cluster%msgsvc_noarp 0
    cludb -p cluster%msgsvc_noarp no

Sorry - to be clear: you only have to do one of the above two things to change it, e.g. "1" is the same as "yes", and "0" is the same as "no".

I'm worried that this conflicts with another problem that Chris Kloiber saw. Is there any way we can have the customer(s) run "clurgmgrd -fd" and get the output up to the point where it says "Cluster I/F: xxxxxx"?

1.2.27pre1-1 is available from: http://people.redhat.com/lhh/packages.html
It should address this issue. This package should be used for testing only.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2005-676.html