Bug 1310529
| Summary: | Is there a way to disable the creation of check_writable.nodename.xxxxx hidden files? | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | nikhil kshirsagar <nkshirsa> |
| Component: | resource-agents | Assignee: | Oyvind Albrigtsen <oalbrigt> |
| Status: | CLOSED WONTFIX | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.6 | CC: | agk, cfeist, cluster-maint, cww, fdinitto, jkortus, jpokorny, mnovacek, nkshirsa, oalbrigt, rmccabe |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-06-20 15:10:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1414139 | | |
| Bug Blocks: | 1269194 | | |
Description nikhil kshirsagar 2016-02-22 07:11:20 UTC
What version of resource-agents is the customer using?

(In reply to nikhil from comment #0)
> Description of problem:
> Red Hat Linux HA Cluster is creating hidden .check_writable.nodename.xxxxx
> files and interfering with the customer application. These .check_writable
> files are on a Samba mount and are picked up by the Autosys client when
> created. Is it possible to disable this feature, or at least exclude
> specific directories underneath the mount point?

It's not clear what you mean by "or at least exclude specific directories underneath the mount point." The check is very specific: it only uses the top-level mount point, and the files are removed immediately:

```sh
if [ $rw -eq $YES ]; then
        file=$(mktemp "$mount_point/.check_writable.$(hostname).XXXXXX")
        if [ ! -e "$file" ]; then
                ocf_log err "${OCF_RESOURCE_INSTANCE}: is_alive: failed write test on [$mount_point]. Return code: $errcode"
                return $NO
        fi
        rm -f $file > /dev/null 2> /dev/null
fi
```

> Additional info:
> https://bugzilla.redhat.com/show_bug.cgi?id=1023099 has the patch for it,
> but it appears like the files are not deleted.

The files are deleted immediately after creation; see above.

I think the customer has a situation where these files are picked up by the Autosys client as soon as they are created, so it seems they want to disable the creation of these files entirely. I have asked for the version of resource-agents.

This check is only run when OCF_CHECK_LEVEL is 20 or higher; at lower levels the function returns early, before the write test:

```sh
[ $OCF_CHECK_LEVEL -lt 20 ] && return $YES
...
if [ $rw -eq $YES ]; then
        file=$(mktemp "$mount_point/.check_writable.$(hostname).XXXXXX")
```

Can you get the configuration from the customer as well? I'm having trouble reproducing it.

I think the customer has a situation where these files are picked up as soon as they are created by the Autosys client. Is there any particular information you need? It may be that, since the files are short-lived, their script ran at the exact moment a file was created and managed to copy it before it was deleted. I am confirming with them whether this was indeed the situation.

Step-by-step reproducer.

Before: add the service to /etc/cluster/cluster.conf:

```xml
<fs device="/dev/sdb" fstype="ext4" mountpoint="/mnt/fstest" name="filesystem"/>
```

```
# mount -o ro,remount /mnt/fstest/
# grep is_alive: /var/log/cluster/rgmanager.log
Nov 24 14:16:53 rgmanager [fs] fs:filesystem: is_alive: failed write test on [/mnt/fstest]. Return code: 0
```

The filesystem is remounted read-write automatically:

```
# mount
...
/dev/sdb on /mnt/fstest type ext4 (rw)
```

After: add write_check="off" to the service in /etc/cluster/cluster.conf:

```xml
<fs device="/dev/sdb" fstype="ext4" mountpoint="/mnt/fstest" name="filesystem" write_check="off"/>
```

```
# mount -o ro,remount /mnt/fstest/
# grep is_alive: /var/log/cluster/rgmanager.log
```

No new is_alive errors are reported, and the partition isn't remounted read-write:

```
# mount
...
/dev/sdb on /mnt/fstest type ext4 (ro)
```

Hold on a sec! Isn't it enough to explicitly set:

```xml
<action name="status" depth="20" timeout="0" interval="0"/>
<action name="monitor" depth="20" timeout="0" interval="0"/>
```

for any resource that defines the same actions implicitly with a nonzero interval and that you want to override, so that status (monitor) at level 20 is never triggered? Set = configure in cluster.conf.

Just rechecked that the approach in [comment 22] does indeed work.
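A minimal sketch of that override in cluster.conf, applied to the fs resource from the reproducer above. Placing the <action> elements as children of the resource they override is an assumption here, not something confirmed in this bug (see the pointer below on where actions should be defined):

```xml
<!-- Sketch only: assumes <action> overrides are defined as child elements
     of the resource they apply to (placement is an assumption, not taken
     from this bug report). -->
<fs device="/dev/sdb" fstype="ext4" mountpoint="/mnt/fstest" name="filesystem">
        <!-- Override the implicit depth-20 status/monitor actions with
             interval="0" so the level-20 write test that creates the
             .check_writable.* files never runs. -->
        <action name="status" depth="20" timeout="0" interval="0"/>
        <action name="monitor" depth="20" timeout="0" interval="0"/>
</fs>
```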
A few pointers to the other (luci) bug I was recently dealing with, which addressed actions in general:

- "monitor" action disregarded: [bug 1173942 comment 17], point 1b.
- where actions should be defined: [bug 1173942 comment 17], point 2.

So, to prevent recurring "status" invocations for a particular resource, just override the "status" action at depth=20, which by default uses an interval of 1 minute for fs or 10 minutes for clusterfs and netfs, with a custom action at that depth and the interval set to 0:

```
# ccs --addaction <RESOURCE> status depth=20 interval=0 --activate
```

Note that, IIRC, --activate is important here, as it effectively triggers the "push configuration to live cluster" logic. On that event, rgmanager performs an internal reload and notices that the depth=20 action should no longer recur. This can be tested by starting cman separately, as usual, and then running "rgmanager -f &", which logs some additional messages as rgmanager progresses through its event handling.

Hence I suppose this is just a matter of documentation, and the bug should be closed as NOTABUG.

Upon further investigation, I think there's something wrong with rgmanager: it behaves differently upon self-configuration at the initial start and upon a configuration reload triggered by the cluster-wide "propagate/reread configuration" signal. Using "rgmanager -f &" rather than "service rgmanager start":

- at the beginning: Replacing action 'status' depth 20: interval: 60->2
- upon the reread: Replacing action 'status' depth 20: interval: 60->0

Sounds like a bug in rgmanager.

re [comment 27]:
> Sounds like a bug in rgmanager.

Indeed: [bug 1414139]

Supposing that the rgmanager fix [comment 28] unblocks a very straightforward way to prevent it from running the implicit status operation for a particular resource (and at particular depths), as mentioned in [comment 25], I consider this bug eligible for a CLOSED WORKSFORME (or similar) resolution. I'll close it when the fix is available in RHEL 6.

Red Hat Enterprise Linux 6 transitioned to the Production 3 Phase on May 10, 2017. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available. The official life cycle policy can be reviewed here: http://redhat.com/rhel/lifecycle

This issue does not appear to meet the inclusion criteria for Production Phase 3 and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL: https://access.redhat.com
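For concreteness, a hypothetical invocation of the ccs override discussed above. The node name is an assumption, the "filesystem" resource name is taken from the reproducer, and -h targets a cluster node so that --activate can push the change to the running cluster:

```sh
# Hypothetical invocation (node name is an assumption); "filesystem" is
# the fs resource from the reproducer. --activate pushes the updated
# configuration to the live cluster so rgmanager reloads it.
ccs -h node1.example.com --addaction filesystem status depth=20 interval=0 --activate
```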