Bug 1508373 - Correct handling of bundles
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.5
Assignee: Ken Gaillot
QA Contact: Ofer Blaut
URL:
Whiteboard:
Depends On:
Blocks: 1509871
Reported: 2017-11-01 10:26 UTC by Andrew Beekhof
Modified: 2018-04-10 15:34 UTC

Fixed In Version: pacemaker-1.1.18-4.el7
Doc Type: No Doc Update
Doc Text:
Previously, multiple problems occurred when pacemaker handled the new "bundle" feature. As a consequence, scheduling actions related to bundles led to unpredictable outcomes. With this update, these problems have been fixed, and scheduling with bundles now works as expected.
Clone Of:
Clones: 1509871
Environment:
Last Closed: 2018-04-10 15:32:51 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2018:0860 | 0 | None | None | None | 2018-04-10 15:34:11 UTC

Description Andrew Beekhof 2017-11-01 10:26:51 UTC
Description of problem:

A number of correctness issues were encountered and patched while using bundles to simulate OSP installs with large numbers of IHA computes.

- [pacemaker] allow resources to stop before their state is known everywhere
- [pacemaker] ensure remote nodes get probed
- [pacemaker] Do not probe connection resources until the container is active
- [pacemaker] Exclude resources and nodes from the symmetric_default constraint in some circumstances
- [pacemaker] Do not always expire failed operations of nested remotes 
- [bundles] turn on stderr logging so 'docker logs' works
- [bundles] correctly populate allowed_nodes
- [bundles] Only wait for other containers on the same node to be probed 
- [bundles] There is no need for port mapping directives when net=host is specified 

These will all need to be patched in time for OSP12 GA.
Even the ones that look like optimizations prevent the cluster from getting into 'stuck' states where it cannot make progress towards service recovery.

Comment 2 Andrew Beekhof 2017-11-01 10:36:24 UTC
http://github.com/beekhof/pacemaker
 '~' represents a related but optional patch


+ 7bc58b4: Fix: PE: Have bundles log to stderr so that 'docker logs' works 
~ bc4228d: Test: PE: Resources are allowed to stop before their state is known everywhere 
+ b322110: Fix: PE: Resources are allowed to stop before their state is known everywhere 
~ 058d45e: Fix: PE: Use the node we already have and know isnt NULL 
~ 02defd0: PE: Flag resources that are acting as remote nodes 
~ bec14be: Test: PE: Bare metal remotes _can_ run resources now and must be probed 
+ cafc6b1: Fix: PE: Bare metal remotes _can_ run resources now and must be probed 
~ bee6a66: Test: PE: Bundles only need to wait for other containers on the same node to be probed 
+ 03d40c0: Fix: PE: Bundles only need to wait for other containers on the same node to be probed 
~ 929ed9b: Test: PE: There is no need for port mapping directives when net=host is specified 
+ b4321a7: Fix: PE: There is no need for port mapping directives when net=host is specified 
+ 40aaa36: Fix: PE: Do not always expire failed operations of nested remotes 
+ e1e81ae: Fix: PE: Consolidate REMOTE_CONTAINER_HACK logic 
~ f764f36: Test: PE: Exclude resources and nodes from the symmetric_default constraint in some circumstances 
+ cacbac0: Fix: PE: Exclude resources and nodes from the symmetric_default constraint in some circumstances 
~ b2ca8d5: Test: PE: Do not probe connection resources until the container is active 
+ c3d4ec0: Fix: PE: Do not probe connection resources until the container is active 
+ 2bf3f0b: Log: PE: Detailed resource information should include connection resource state 
~ d1d5643: Log: PE: Trace logging for generated bundle resource xml 
~ ccc8944: Log: PE: Remove overly noisey developmental logging 
~ bb5a731: Log: PE: Improved comment 
~ c0180b4: Test: PE: Only pass requests for promote/demote flags onto the bundle's child 
+ 4ee68a2: Fix: PE: Only pass requests for promote/demote flags onto the bundle's child
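
A minimal sketch for splitting a patch list like the one above into required and optional entries. It assumes '+' marks a patch that must be backported; the comment only defines '~', so that reading is a guess. The names Patch and parse_patch_list are hypothetical helpers for illustration, not part of pacemaker.

```python
import re
from typing import List, NamedTuple


class Patch(NamedTuple):
    required: bool  # True for '+', False for '~' (assumed meaning of '+')
    sha: str        # abbreviated commit hash as listed in the comment
    subject: str    # commit subject line


# Marker, 7-character abbreviated hash, colon, then the subject.
LINE_RE = re.compile(r"^([+~])\s+([0-9a-f]{7}):\s+(.*\S)\s*$")


def parse_patch_list(text: str) -> List[Patch]:
    """Parse '+ sha: subject' / '~ sha: subject' lines, skipping anything else."""
    patches = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            marker, sha, subject = match.groups()
            patches.append(Patch(marker == "+", sha, subject))
    return patches


# First three entries copied from the list above.
sample = """\
+ 7bc58b4: Fix: PE: Have bundles log to stderr so that 'docker logs' works
~ bc4228d: Test: PE: Resources are allowed to stop before their state is known everywhere
+ b322110: Fix: PE: Resources are allowed to stop before their state is known everywhere
"""

patches = parse_patch_list(sample)
required = [p.sha for p in patches if p.required]
optional = [p.sha for p in patches if not p.required]
print("required:", required)
print("optional:", optional)
```

The resulting hashes could then be fed to a cherry-pick or review script; lines that do not match the marker format (blank lines, the legend) are ignored.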

Comment 3 Andrew Beekhof 2017-11-01 10:46:35 UTC
Oh, and:

+ b0ca9a1: Fix: Tools: Allow crm_resource to operate on anonymous clones in unknown states

Comment 4 Andrew Beekhof 2017-11-02 05:18:11 UTC
Also:

~ 1fa28f0: Test: PE: Improved logging of reasons for stop/restart actions 
~ 837adae: Log: PE: Improved logging of reasons for stop/restart actions 
+ 3a34fed: Fix: PE: Allow all resources to stop prior to probes completing 
+ 15208f7: Fix: PE: Correctly defer processing of resources inside containers

Comment 11 errata-xmlrpc 2018-04-10 15:32:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860

