Bug 1461367
| Summary: | Addition of mds node to an existing cluster fails | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | shilpa <smanjara> |
| Component: | Ceph-Ansible | Assignee: | Sébastien Han <shan> |
| Status: | CLOSED WORKSFORME | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
| Severity: | medium | Docs Contact: | Erin Donnelly <edonnell> |
| Priority: | urgent | ||
| Version: | 3.0 | CC: | adeza, aschoen, ceph-eng-bugs, edonnell, flucifre, gmeno, hnallurv, icolle, kdreyer, nthomas, sankarshan, seb, shan, smanjara |
| Target Milestone: | rc | ||
| Target Release: | 3.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Known Issue |
| Doc Text: |
.Adding an MDS to an existing cluster fails
Adding a Ceph Metadata Server (MDS) to an existing cluster fails with the error:
----
osd_pool_default_pg_num is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mon/tasks/create_mds_filesystems.yml
----
As a consequence, an attempt to create an MDS pool fails.
To work around this issue, add the `osd_pool_default_pg_num` parameter to `ceph_conf_overrides` in the `/usr/share/ceph-ansible/group_vars/all.yml` file, for example:
----
ceph_conf_overrides:
global:
osd_pool_default_pg_num: 64
----
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-09-15 13:10:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1437916 | ||
Sebastien, what specific change upstream fixed this BZ?

Ken, this is fixed in https://github.com/ceph/ceph-ansible/commit/ea68fbaaaee38b1a39b1f093e0faf5f897a466b0

That commit is already in the version Shilpa was running above (ceph-ansible-2.2.11). See "git tag --contains ea68fbaaaee38b1a39b1f093e0faf5f897a466b0". What else do we need to fix this?

@Ken, to be honest, I don't know; we don't see this error in the CI.
@Shilpa, could you please try again and let me know if you still see this issue?
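For anyone retracing the version check mentioned in the comments above, it boils down to a plain git query against a ceph-ansible clone; a minimal sketch (the grep filter is only an illustrative narrowing, not part of the original comment):
----
# From a clone of https://github.com/ceph/ceph-ansible:
# list every release tag whose history already contains the fix commit
git tag --contains ea68fbaaaee38b1a39b1f093e0faf5f897a466b0

# optionally narrow the listing to the 2.2.x series Shilpa was running
git tag --contains ea68fbaaaee38b1a39b1f093e0faf5f897a466b0 | grep 2.2
----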
I tried to reproduce, without success. I first deployed an initial cluster with 3 mons and 3 osds. Then I added an MDS node and re-ran Ansible. Success.
As you can see I successfully passed this task:
TASK [ceph-mon : create filesystem pools] **********************************************************************************************************************************************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible/roles/ceph-mon/tasks/create_mds_filesystems.yml:6
ok: [mon2] => (item=cephfs_data) => {"changed": false, "cmd": ["ceph", "--cluster", "test", "osd", "pool", "create", "cephfs_data", "8"], "delta": "0:00:01.564608", "end": "2017-08-31 08:27:25.223125", "item": "cephfs_data", "rc": 0, "start": "2017-08-31 08:27:23.658517", "stderr": "pool 'cephfs_data' created", "stderr_lines": ["pool 'cephfs_data' created"], "stdout": "", "stdout_lines": []}
ok: [mon2] => (item=cephfs_metadata) => {"changed": false, "cmd": ["ceph", "--cluster", "test", "osd", "pool", "create", "cephfs_metadata", "8"], "delta": "0:00:01.035994", "end": "2017-08-31 08:27:29.731975", "item": "cephfs_metadata", "rc": 0, "start": "2017-08-31 08:27:28.695981", "stderr": "pool 'cephfs_metadata' created", "stderr_lines": ["pool 'cephfs_metadata' created"], "stdout": "", "stdout_lines": []}
See the final results:
jenkins-build@ceph-builders:~/build/workspace/ceph-ansible/tests/functional/centos/7/bluestore$ vagrant ssh mon0 -c "sudo ceph --cluster test -s"
  cluster:
    id:     5a51b9d9-b110-4a5f-b73c-b5dcf63552a1
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 3 daemons, quorum ceph-mon0,ceph-mon1,ceph-mon2
    mgr: no daemons active
    mds: cephfs-1/1/1 up {0=ceph-mds0=up:active}
    osd: 1 osds: 1 up, 1 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:
Connection to 192.168.121.133 closed.
jenkins-build@ceph-builders:~/build/workspace/ceph-ansible/tests/functional/centos/7/bluestore$ vagrant ssh mon0 -c "sudo ceph --cluster test fs ls"
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
Connection to 192.168.121.133 closed.
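One extra check that could be run at this point (not part of the original comment; shown only as a suggestion) is the MDS state summary, which should report the same single active MDS (ceph-mds0) already visible in the `ceph -s` output above:
----
vagrant ssh mon0 -c "sudo ceph --cluster test mds stat"
----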
I'm tempted to close this bug, but I'll wait for you to report back first.
Thanks in advance.
Given that I haven't got any response and that I cannot reproduce, I'm closing this. Feel free to re-open.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
Description of problem:
Addition of mds server to an existing cluster fails with the following error:

TASK [ceph-mon : create filesystem pools] **************************************
fatal: [magna096]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'osd_pool_default_pg_num' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mon/tasks/create_mds_filesystems.yml': line 6, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# since those check are performed by the ceph-common role\n- name: create filesystem pools\n ^ here\n"}

Version-Release number of selected component (if applicable):
ceph-ansible-2.2.11-1.el7scon.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a cluster first with ceph-ansible.
2. Once the cluster is up, run ansible again to add an mds server.

Actual results:
MDS pool creation fails.
"The error was: 'osd_pool_default_pg_num' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mon/tasks/create_mds_filesystems.yml"

Additional info:
As a workaround, added 'osd_pool_default_pg_num' to group_vars/all.yml in the "CONFIG OVERRIDE" section:

ceph_conf_overrides:
  global:
    osd_pool_default_pg_num: 64

After this, ansible successfully installed the mds server.
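For reference, ceph-ansible renders keys under `ceph_conf_overrides.global` into the `[global]` section of the generated Ceph configuration file, so with the workaround in place the option should show up in the rendered config roughly as follows (a sketch, not captured from the reporter's cluster):
----
# /etc/ceph/ceph.conf (excerpt as rendered by ceph-ansible)
[global]
osd_pool_default_pg_num = 64
----
More generally, an Ansible task can guard against this class of undefined-variable failure with the Jinja2 `default()` filter, e.g. `{{ osd_pool_default_pg_num | default(64) }}`; whether the upstream commit referenced in the comments takes that approach is not shown here.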