Bug 1246149
| Summary: | Allow disabling resource-discovery at resource creation time | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | michal novacek <mnovacek> |
| Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> |
| Status: | NEW --- | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 8.0 | CC: | agk, bperkins, christianelwin.romein, cluster-maint, fdinitto, kgaillot, michele, mnovacek, sbradley, tojeline |
| Target Milestone: | rc | Keywords: | FutureFeature, Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Feature Request |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
I'm seeing an error from haproxy about not being able to bind to a socket:

Jul 23 12:01:48 virt-094 haproxy-systemd-wrapper: [ALERT] 203/120148 (16806) : Starting frontend vip: cannot bind socket [10.34.71.198:80]

and then a few lines later it looks like the IP address is set up. Can you add a constraint for the vip to start before haproxy and see if that solves the issue?

https://github.com/beekhof/osp-ha-deploy/blob/master/pcmk/lb.scenario#L50

When using haproxy in clone mode you need to allow haproxy to bind to a non-local IP.

Adding the colocation constraint for haproxy-clone and vip correctly starts haproxy on the node where vip runs, and haproxy moves with vip when vip is moved. On the other node (where vip is not running), haproxy does not start and logs the 'Cannot bind to socket' message even though net.ipv4.ip_nonlocal_bind=1 is set. What would you recommend checking next?
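For reference, a rough sketch of the constraints discussed above in pcs syntax; the resource IDs vip and haproxy-clone are taken from this thread, while the explicit INFINITY colocation score is an assumption:

# start the virtual IP before haproxy, and keep haproxy on the node holding the VIP
pcs constraint order vip then haproxy-clone
pcs constraint colocation add haproxy-clone with vip INFINITY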
I have found a reproducer.
1/ have cluster configured and running (1)
2/ run commands.sh
3/ see haproxy-clone not started
4/ run 'pcs resource cleanup haproxy-clone'
5/ watch haproxy-clone started
(1)
Cluster Name: STSRHTS20027
Corosync Nodes:
virt-094 virt-095 virt-096 virt-097
Pacemaker Nodes:
virt-094 virt-095 virt-096 virt-097
Resources:
Clone: dlm-clone
Meta Attrs: interleave=true ordered=true
Resource: dlm (class=ocf provider=pacemaker type=controld)
Operations: start interval=0s timeout=90 (dlm-start-timeout-90)
stop interval=0s timeout=100 (dlm-stop-timeout-100)
monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
Clone: clvmd-clone
Meta Attrs: interleave=true ordered=true
Resource: clvmd (class=ocf provider=heartbeat type=clvm)
Attributes: with_cmirrord=1
Operations: start interval=0s timeout=90 (clvmd-start-timeout-90)
stop interval=0s timeout=90 (clvmd-stop-timeout-90)
monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
Stonith Devices:
Resource: fence-virt-094 (class=stonith type=fence_xvm)
Attributes: action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-094 pcmk_host_map=virt-094:virt-094.cluster-qe.lab.eng.brq.redhat.com
Operations: monitor interval=60s (fence-virt-094-monitor-interval-60s)
Resource: fence-virt-095 (class=stonith type=fence_xvm)
Attributes: action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-095 pcmk_host_map=virt-095:virt-095.cluster-qe.lab.eng.brq.redhat.com
Operations: monitor interval=60s (fence-virt-095-monitor-interval-60s)
Resource: fence-virt-096 (class=stonith type=fence_xvm)
Attributes: action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-096 pcmk_host_map=virt-096:virt-096.cluster-qe.lab.eng.brq.redhat.com
Operations: monitor interval=60s (fence-virt-096-monitor-interval-60s)
Resource: fence-virt-097 (class=stonith type=fence_xvm)
Attributes: action=reboot debug=1 pcmk_host_check=static-list pcmk_host_list=virt-097 pcmk_host_map=virt-097:virt-097.cluster-qe.lab.eng.brq.redhat.com
Operations: monitor interval=60s (fence-virt-097-monitor-interval-60s)
Fencing Levels:
Location Constraints:
Ordering Constraints:
start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
Colocation Constraints:
clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: STSRHTS20027
dc-version: 1.1.12-44eb2dd
have-watchdog: false
last-lrm-refresh: 1437746699
no-quorum-policy: freeze
[root@virt-095 ~]# pcs status
Cluster name: STSRHTS20027
Last updated: Fri Jul 24 16:08:29 2015 Last change: Fri Jul 24 16:07:43 2015
Stack: corosync
Current DC: virt-097 (version 1.1.12-44eb2dd) - partition with quorum
4 nodes and 12 resources configured
Online: [ virt-094 virt-095 virt-096 virt-097 ]
Full list of resources:
fence-virt-094 (stonith:fence_xvm): Started virt-094
fence-virt-095 (stonith:fence_xvm): Started virt-095
fence-virt-096 (stonith:fence_xvm): Started virt-096
fence-virt-097 (stonith:fence_xvm): Started virt-097
Clone Set: dlm-clone [dlm]
Started: [ virt-094 virt-095 virt-096 virt-097 ]
Clone Set: clvmd-clone [clvmd]
Started: [ virt-094 virt-095 virt-096 virt-097 ]
PCSD Status:
virt-094: Online
virt-095: Online
virt-096: Online
virt-097: Online
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Created attachment 1055794 [details]
reproducer commands
Created attachment 1055795 [details]
pcs cluster report output
You really need to fix your sysctl as I mentioned before.

cat /proc/sys/net/ipv4/ip_nonlocal_bind
0
[root@virt-094 ~]#
[root@virt-095 ~]# cat /proc/sys/net/ipv4/ip_nonlocal_bind
1

Michal,

Have you had a chance to make the change from comment #9?

Thanks,
Chris

Yes, and the problem still stands. I added ip_nonlocal_bind to the new reproducer script so it is clear that I use it. I'm not sure, though, that this is a pcs problem; it seems more likely to be a resource-agent issue.

Created attachment 1059823 [details]
added ip_nonlocal_bind to reproducer script
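Since a cloned haproxy has to bind to the VIP address on nodes that do not currently hold it, net.ipv4.ip_nonlocal_bind needs to be 1 on every node that may run haproxy. A minimal sketch of making the setting persistent (the file name under /etc/sysctl.d/ is an arbitrary choice, not something taken from this bug):

# persist the setting and apply it immediately on this node
echo 'net.ipv4.ip_nonlocal_bind = 1' > /etc/sysctl.d/90-haproxy-nonlocal-bind.conf
sysctl --system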
The following versions of components were used when writing comment #11:

pcs-0.9.137-13.el7_1.3.x86_64
pacemaker-1.1.13-6.el7.x86_64
resource-agents-3.9.5-50.el7.x86_64

I do not think pcs is the right component to blame here, as it serves merely as a tool for creating the CIB and running resource cleanup in pacemaker. Moving to resource-agents for further investigation.

Moving to pacemaker. If anything, systemd resources are internal to pcmk.

The reproducer script adds the resources, pushes that to the cluster, *then* adds the constraints. This means that the cluster will initially schedule the resources without the constraints, and services might be started on undesired nodes. It would be better to do all the commands in a single file, then push that one file to the cluster (or at least, put any constraints related to a resource in the same file that creates it, so they go into effect immediately).

I haven't had time to thoroughly analyze the logs yet. If haproxy initially failed to start on all nodes, that would explain the behavior.

Keep in mind that even if constraints forbid a resource from running on a particular node, by default pacemaker will still run a one-time monitor (probe) on the node to ensure that the resource is indeed not running there. So if the software is not installed on that node, the probe can fail and cause problems.

(In reply to Ken Gaillot from comment #17)
> The reproducer script adds the resources, pushes that to the cluster, *then*
> adds the constraints. This means that the cluster will initially schedule
> the resources without the constraints, and services might be started on
> undesired nodes. It would be better to do all the commands in a single file,
> then push that one file to the cluster (or at least, put any constraints
> related to a resource in the same file that creates it, so they go into
> effect immediately).
> ...

This might be the problem. We really do push the resources in one batch and then all the constraints for the cluster in another. And yes, haproxy is installed only on the two nodes (out of four) where it is supposed to run according to the constraints. Unfortunately, this behavior would not be easy to change in our testing framework. Is there another way this can be done "the right way"? Something like putting all nodes into standby mode, then pushing all resources and constraints, and then taking the nodes out of standby?

> I haven't had time to thoroughly analyze the logs yet. If haproxy initially
> failed to start on all nodes, that would explain the behavior.
>
> Keep in mind that even if constraints forbid a resource from running on a
> particular node, by default pacemaker will still run a one-time monitor
> (probe) on the node to ensure that the resource is indeed not running there.
> So if the software is not installed on that node, the probe can fail and
> cause problems.

Michal,
Your "prefers" constraints are fine, but instead of "avoids", you need the advanced constraint command:
pcs constraint location add vip-avoids-node2 vip ${nodes[2]} -INFINITY resource-discovery=never
for each resource/node combination ("vip-avoids-node2" is an arbitrary ID). This is identical to "avoids", except that resource-discovery=never disables startup probes on that node. This is desirable when the software is not installed on all nodes.
Unfortunately, you still have a problem before the constraint is created. Neither standby mode, maintenance mode, nor disabling the resource at creation will prevent startup probes. I don't know of a way around that; the ideal is really to push the resource creation and constraints together. Of course, you could just cleanup after creating the constraints.
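One way to push the resource creation and constraints together is to build the changes against an offline copy of the CIB and push it in a single step. A rough sketch with pcs follows; the resource options, constraint ID, and node name are placeholders for illustration, not the actual reproducer contents:

# work against a local copy of the CIB instead of the live cluster
pcs cluster cib deploy.xml
# add the resource and its constraints to the local copy only (-f)
pcs -f deploy.xml resource create haproxy systemd:haproxy --clone clone-max=2
pcs -f deploy.xml constraint location add haproxy-avoids-node3 haproxy-clone virt-096 -INFINITY resource-discovery=never
# push everything at once, so the constraints are in effect when the resource is first scheduled
pcs cluster cib-push deploy.xml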
As mentioned in Comment 19, modifying the constraints will help, but pushing the resource creation and constraints together is currently the only way to completely avoid the issue. Leaving this BZ open as a feature request to allow disabling resource-discovery at resource creation time, which will not be addressed in the 7.3 timeframe but will be evaluated for 7.4.

This will not be implemented in the 7.4 timeframe.

Due to time constraints, this will not make 7.5.

Moving to RHEL 8, as new features will no longer be added to RHEL 7 as of 7.8.
Created attachment 1055413 [details]
pcs cluster report output

Description of problem:

I have a strange problem with the haproxy resource agent. In our automated scenario I set up a systemd:haproxy clone and start (enable) it. It is configured to start only two instances on a four-node cluster (clone-max=2) and is restricted by constraints to start on two specific nodes only.

What happens is that sometimes haproxy-clone will not start at all. It is necessary to run 'pcs resource cleanup haproxy-clone', after which it starts and all is happy ever after.

haproxy can be started with 'systemctl start haproxy' and with 'pcs resource debug-start haproxy' in the case where it is not started after 'pcs resource enable haproxy-clone'.

I'd like you to have a look at the attached crm report to see whether there is something related to this problem that you can spot (because I can't), and/or help me track it down.

Version-Release number of selected component (if applicable):
pcs-0.9.137-13.el7_1.3.x86_64
pacemaker-1.1.13-5.el7.x86_64

How reproducible:
most of the time

Steps to Reproduce:
1. pcs resource enable haproxy-clone

Actual results:
haproxy clone is not started unless 'pcs resource cleanup haproxy-clone' is run

Expected results:
haproxy clone started
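For illustration only, a hedged sketch of the kind of configuration the description refers to; the resource ID, the clone option, and the choice of which two nodes run haproxy are assumptions, and the real configuration is in the attached cluster report:

pcs resource create haproxy systemd:haproxy --clone clone-max=2
pcs constraint location haproxy-clone prefers virt-094 virt-095
pcs constraint location haproxy-clone avoids virt-096 virt-097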