Bug 1422090
| Summary: | Rpcbind does not work in the container | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Mohamed Ashiq <mliyazud> |
| Component: | rhgs-server-container | Assignee: | Mohamed Ashiq <mliyazud> |
| Status: | CLOSED ERRATA | QA Contact: | Prasanth <pprakash> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | cns-3.4 | CC: | asriram, bleanhar, ekuric, hchiramm, julim, kramdoss, madam, mliyazud, pprakash, prasanna.kalever, rcyriac, rhs-bugs, rtalur, sankarshan, skoduri, srmukher, weshi |
| Target Milestone: | --- | ||
| Target Release: | CNS 3.6 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Previously, all services in the Red Hat Gluster Storage container connected to the rpcbind service inside the container. With this update, every service connects to the rpcbind service on the host node.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-10-11 06:58:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1427806 | ||
| Bug Blocks: | 1433735, 1445447, 1445448 | ||
|
Description
Mohamed Ashiq
2017-02-14 12:50:44 UTC
Relates to RHEL7 BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1427806

*** Bug 1457617 has been marked as a duplicate of this bug. ***

*** Bug 1457126 has been marked as a duplicate of this bug. ***

According to findings in the RHEL BZ #1427806, this is solved as follows:
1) The rpcbind service has to run on the host (nodes that are supposed to run gluster containers).
2) We need a small change to the gluster container docker file.
This change is already done. It will be shipped with the next gluster container image build.

Ok. With a little more analysis, the situation seems like this:
1) A change in the host (OCP/RHEL) has led to rpcbind being started
by default, whereas it was not started by default before.
2) Originally the gluster containers started rpcbind, but in the
CNS builds this dependency was actually removed for the rhgs
3.2.0 release, in February, since gluster containers don't need
rpcbind.
==> I.e. CNS 3.5 images should not have a problem!
==> I don't know how OCP QE could have run into this issue.
3) Now with preparation for CNS 3.6, the new gluster-blockd component
in the rhgs containers does require rpcbind. Hence testing with
the new CNS 3.6 containers, we hit the problem due to the changed
Host behavior.
==> The solution is to *not* start rpcbind in the container ever
and always rely on rpcbind running on the host.
==> Changed CNS 3.6 gluster builds expected tomorrow (July 26)
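
Since the proposed fix relies on rpcbind running on the host, a quick reachability check can help before deploying the gluster pods. The following is only a minimal sketch in Python: the function name `portmapper_reachable` is hypothetical, and it assumes the check is run on the node itself (or from a pod sharing the host network namespace). Port 111 is the well-known rpcbind port. A successful TCP connect only shows that something is listening there, not that the listener is actually rpcbind.

```python
import socket


def portmapper_reachable(host="127.0.0.1", port=111, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: 111 is the well-known rpcbind (portmapper) port.
    This does not speak the RPC protocol, it only tests the TCP connect.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

Calling `portmapper_reachable()` on a node where rpcbind has not been started would be expected to return `False`; a stricter check would query the service itself, for example with `rpcinfo -p`.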
Summary questions for Brenton:
* Is it true that OCP 3.6 has the changed behavior of always
starting rpcbind on the host?
* How has OCP QE possibly hit this issue for OCP 3.6?
Have they possibly been using upstream images instead of RHGS/CNS images?
(In reply to Michael Adam from comment #8)
> Summary questions for Brenton:
>
> * Is it true that OCP 3.6 has the changed behavior of always
> starting rpcbind on the host?

According to the code, rpcbind is only started in the openshift_storage_nfs_lvm role, which means that in the CNS case it won't be started.

$ grep -nir "rpcbind"
roles/openshift_storage_nfs_lvm/tasks/nfs.yml:6:- name: Start rpcbind
roles/openshift_storage_nfs_lvm/tasks/nfs.yml:8: name: rpcbind

>
> * How has OCP QE possibly hit this issue for OCP 3.6?
> Have they possibly been using upstream images instead of RHGS/CNS images?

Our QE is using brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7; the tag is latest by default. In BZ #1457126, latest was 3.3.0-7, which has the problem. Now latest is 3.3.0-9, which works well.

Thank you :)

With the latest CNS 3.6 builds, rpcbind now runs on the host instead of in the containers. Verified in build cns-deploy-5.0.0-34.el7rhgs.x86_64.

The following steps seem to be a prerequisite now, and they have been documented in our CNS 3.6 guide [1] as well:

#########
Execute the following commands to enable and run rpcbind on all the nodes hosting the gluster pod:
# systemctl add-wants multi-user rpcbind.service
# systemctl enable rpcbind.service
# systemctl start rpcbind.service
#########

[1] https://access.qa.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html-single/container-native_storage_for_openshift_container_platform/#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-Setting_the_environment-Preparing_RHOE

Doc text looks good to me.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2877 |
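
The rpcbind prerequisite quoted in the verification comment could be scripted per node. The following is a minimal sketch in Python, not part of the CNS tooling: the names `RPCBIND_COMMANDS` and `prepare_rpcbind` are hypothetical, and the script is assumed to run as root on each node hosting a gluster pod. The three commands are copied as quoted in this report. By default it only reports what it would run.

```python
import subprocess

# The three commands quoted in this bug report, in the same order.
RPCBIND_COMMANDS = [
    ["systemctl", "add-wants", "multi-user", "rpcbind.service"],
    ["systemctl", "enable", "rpcbind.service"],
    ["systemctl", "start", "rpcbind.service"],
]


def prepare_rpcbind(dry_run=True, runner=subprocess.run):
    """Run (or, in dry-run mode, just list) the rpcbind setup commands.

    Hypothetical helper. `runner` is injectable so the function can be
    exercised without touching systemd; with check=True the default
    runner raises CalledProcessError on the first failing command.
    """
    handled = []
    for cmd in RPCBIND_COMMANDS:
        if not dry_run:
            runner(cmd, check=True)
        handled.append(" ".join(cmd))
    return handled
```

Calling `prepare_rpcbind()` returns the command lines without executing anything; `prepare_rpcbind(dry_run=False)` would actually invoke systemctl and stop at the first failure.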