
Bug 1560731

Summary: Adding a bind-mount to a bundle doesn't restart the associated container
Product: Red Hat Enterprise Linux 7
Reporter: Damien Ciabrini <dciabrin>
Component: pacemaker
Assignee: Andrew Beekhof <abeekhof>
Status: CLOSED NOTABUG
QA Contact: cluster-qe <cluster-qe>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.5
CC: abeekhof, cluster-maint, dciabrin, michele
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-27 08:22:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
crm_report before and after bind-mount change (flags: none)

Description Damien Ciabrini 2018-03-26 20:59:12 UTC
Created attachment 1413351 [details]
crm_report before and after bind-mount change

Description of problem:
When a bind-mount is added to a bundle, pacemaker does not react to the configuration change and does not automatically restart the associated container.

Version-Release number of selected component (if applicable):
1.1.18-11.el7-2b07d5c5a9

How reproducible:
Always


Steps to Reproduce:
1. create a bundle in the cluster (e.g. start from an OSP 13 deployment)
2. change the bind-mount configuration:
pcs resource bundle update galera-bundle storage-map add id=mysql-foo source-dir=/foo target-dir=/foo options=rw
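
The check the steps above imply is comparing the container before and after the update. A minimal sketch, assuming the bundle and container names from this report; the `container_id` helper (which extracts the first column of a `docker ps` line) is an illustrative assumption, not pcs or pacemaker tooling:

```shell
# Hypothetical helper: pull the container ID (first column) out of a
# `docker ps` output line, so IDs before and after the update can be compared.
container_id() {
  awk '{print $1}' <<<"$1"
}

# On a live cluster (not runnable here) the comparison would look like:
#   before=$(container_id "$(docker ps | grep galera-bundle-docker-2)")
#   pcs resource bundle update galera-bundle storage-map add \
#       id=mysql-foo source-dir=/foo target-dir=/foo options=rw
#   after=$(container_id "$(docker ps | grep galera-bundle-docker-2)")
#   [ "$before" != "$after" ] && echo "container was recreated"
```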


Actual results:
the container started for the bundle is not restarted by pacemaker:
5518d1d4a776        192.168.24.1:8787/rhosp13/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   25 hours ago        Up 25 hours            
                       galera-bundle-docker-2

Expected results:
pacemaker should have deleted the container and recreated it with the appropriate bind-mount

Additional info:
Attached crm_report

Comment 2 Andrew Beekhof 2018-03-27 02:29:02 UTC
It appears we at least intended to do a restart:

Mar 26 20:34:05 controller-1 pengine[19471]:   notice:  * Restart    galera-bundle-docker-0 ( controller-1 )   due to resource definition change

The crmd is also under the impression it happened:

Mar 26 20:34:13 controller-1 crmd[19472]:   notice: Initiating stop operation galera-bundle-docker-2_stop_0 on controller-0
Mar 26 20:34:23 controller-1 crmd[19472]:   notice: Initiating stop operation galera-bundle-docker-1_stop_0 on controller-2
Mar 26 20:34:34 controller-1 crmd[19472]:   notice: Initiating stop operation galera-bundle-docker-0_stop_0 locally on controller-1

And on the one node we have logs for we see it completed:

Mar 26 20:34:44 controller-1 crmd[19472]:   notice: Result of stop operation for galera-bundle-docker-0 on controller-1: 0 (ok)

Which is confirmed by docker:

Mar 26 20:34:44 controller-1 dockerd-current[18334]: time="2018-03-26T16:34:44.128626799-04:00" level=debug msg="Sending kill signal 9 to container 10ba9787f9c2150b3dd4f9cd92227a635ce64ca216de3f635b7c0c844229c757"
Mar 26 20:34:44 controller-1 dockerd-current[18334]: time="2018-03-26T16:34:44.210568911-04:00" level=debug msg="containerd: process exited" id=10ba9787f9c2150b3dd4f9cd92227a635ce64ca216de3f635b7c0c844229c757 pid=init status=137 systemPid=48206
Mar 26 20:34:44 controller-1 dockerd-current[18334]: time="2018-03-26T16:34:44.215002739-04:00" level=error msg="containerd: deleting container" error="exit status 1: \"container 10ba9787f9c2150b3dd4f9cd92227a635ce64ca216de3f635b7c0c844229c757 does not exist\\none or more of the container deletions failed\\n\""

And later we see the creation:

Mar 26 20:34:44 controller-1 dockerd-current[18334]: time="2018-03-26T16:34:44.490344517-04:00" level=debug msg="Calling POST /v1.26/containers/create?name=galera-bundle-docker-0"


Could you attach journal.log from the other nodes or look for comparable logs on controller-{0,2} please?
I wonder if the delete+create within a short interval is confusing the docker output.

Comment 3 Damien Ciabrini 2018-03-27 08:22:54 UTC
Oops, sorry, I obviously did something wrong: I inspected the state of the galera container right after the "pcs resource update" command.

The command is asynchronous: it first has to stop the galera resource itself, then the container. I didn't wait long enough, which is why I got confused and thought pacemaker wasn't behaving as expected.
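
Since the update is asynchronous, one way to avoid this mistake is to poll until the container is actually replaced rather than checking immediately. A sketch under that assumption; the `wait_for_change` helper and the 120-second timeout are illustrative, not part of pcs or pacemaker:

```shell
# Hypothetical helper: re-run a command until its output changes or a
# timeout (in seconds) expires; prints "changed" on success, "timeout" otherwise.
wait_for_change() {
  local cmd="$1" timeout="${2:-60}" initial current i
  initial="$(eval "$cmd")"
  for ((i = 0; i < timeout; i++)); do
    sleep 1
    current="$(eval "$cmd")"
    if [ "$current" != "$initial" ]; then
      echo "changed"
      return 0
    fi
  done
  echo "timeout"
  return 1
}

# Against the bundle in this report (requires a live cluster):
#   wait_for_change "docker ps -qf name=galera-bundle-docker-2" 120
```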

I reran the test and confirmed that things are working as expected. Closing this bug now.