Bug 1329190
Summary: Erasure coded pool creation succeeds in backend but request/{id} returns an error
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: Calamari
Calamari sub component: Back-end
Version: 2.0
Target Milestone: rc
Target Release: 2.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Keywords: Reopened
Reporter: Shubhendu Tripathi <shtripat>
Assignee: Christina Meno <gmeno>
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
CC: ceph-eng-bugs, gmeno, hnallurv, japplewh, kdreyer, nthomas, shtripat
Fixed In Version: RHEL: calamari-server-1.4.4-1.el7cp; Ubuntu: calamari_1.4.4-2redhat1xenial
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-08-23 19:36:47 UTC
Bug Blocks: 1335525, 1339480, 1343229
Attachments: crush map (attachment 1170062)
Description
Shubhendu Tripathi
2016-04-21 11:29:43 UTC
Thank you for the great bug report. I appreciate the steps to reproduce. It looks like calamari is not validating the set of fields that are applicable to EC pool creation. The reason it partly succeeds is that pool creation is a two-step process in calamari:

1. we create the pool with the fields that must be present;
2. we set any properties that are optional and have defaults.

So you are seeing the failures of step two. I propose we put it in 2.1 because it's not critical IMHO; as I understand it, size is not applicable to an EC pool because that is controlled by the erasure code profile. Can you ignore this specific error in the 2.0 release?

Gregory, the issue is that we have generic code which looks at the task output, and if there is an errMsg populated we just return the same error to the calling code. This generic code gets invoked for any POST call to calamari, so I feel it is certainly not good to ignore it, as it could be a genuine error in some other case and we would miss it.

What types of properties are optional as in Gregory's step 2?

@Jeff, size is an optional field for EC pool creation. But we cannot ignore the error message, as the same function is used for all other POST requests.

@Shubhendu yes, I agree - I was just trying to understand the flow.

What is the next step on this BZ?

I need to make a new validating serializer for erasure coded pools here: https://github.com/ceph/calamari/blob/1.4/rest-api/calamari_rest/serializers/v2.py#L75

In progress upstream, expect to finish by May 20th. https://github.com/ceph/calamari/pull/451

As soon as the tests pass I'll merge and tag, ~30 min.

Tagged as v1.4.0-rc13. Would you please build this downstream?

I will do that first thing tomorrow.

With this fix, do we now need to NOT pass the "size" parameter for EC pool creation? With the existing code in USM (which passes the size parameter), EC pool creation now fails with "400 BAD REQUEST". I tweaked the USM code a bit not to pass the size parameter, but this returns "500 INTERNAL SERVER ERROR". Kindly suggest...

Looks like a bug. I've got a fix https://github.com/ceph/calamari/pull/456 upstream.

What version of Calamari will contain this change?

v1.4.1

I modified the USM code to not pass the size parameter for EC pools now. Calamari creates the EC pool in the backend, but the async task for it still does not report as "completed". It still says "submitted", as below, and so after a minute USM times out saying the EC pool creation failed:

-----------------------
{
    "id": "1a9bfe1e-31d4-44ec-bc6d-7c7b8148acc9",
    "state": "submitted",
    "error": false,
    "error_message": "",
    "headline": "Creating pool 'ecpool'",
    "status": "Waiting for OSD map epoch 83",
    "requested_at": "2016-06-17T12:36:42.714802+00:00",
    "completed_at": null
}
-----------------------

Shubhendu, calamari is waiting on several conditions, one of which is the OSD map reaching a specific epoch. Additionally, it waits for all PGs in the resulting pool to leave state 'creating' before completing. Would you please share the ceph -w output before and after trying to create this pool? Also, it would be really helpful to see the CRUSH map of the cluster you're using to create this pool. I've tested this out and it works for me.
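Because a request only leaves the "submitted" state once the conditions just described are met, a caller such as USM effectively has to poll the request/{id} endpoint rather than treat a slow transition as a failure. A minimal polling sketch follows; the endpoint layout and response fields mirror the request output shown later in this report, while the timeout/interval values, auth handling, and the idea of treating any non-"submitted" state as terminal are assumptions for illustration, not calamari defaults.

```python
import time
import requests


def wait_for_request(base_url, fsid, request_id, auth, timeout=300, interval=5):
    """Poll a calamari request until it leaves 'submitted' or the timeout expires.

    Assumes the /api/v2/cluster/<fsid>/request/<id> layout shown in this
    report; timeout/interval are illustrative values, not calamari defaults.
    """
    url = "{0}/api/v2/cluster/{1}/request/{2}".format(base_url, fsid, request_id)
    deadline = time.time() + timeout
    while True:
        req = requests.get(url, auth=auth).json()
        if req["state"] != "submitted":
            # On success 'error' stays False and 'error_message' stays empty.
            return not req["error"], req["error_message"]
        if time.time() >= deadline:
            raise RuntimeError("request still '{0}' after {1}s: {2}".format(
                req["state"], timeout, req["status"]))
        time.sleep(interval)
```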
Created attachment 1170062 [details]
crush map
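For reference on the validating-serializer fix discussed earlier (serializers/v2.py, PR 451): the essence is to reject replicated-only fields such as size when type is erasure before the two-step creation starts, instead of letting step two fail. The sketch below is illustrative only and not the actual calamari code; it assumes a Django REST Framework style serializer and borrows field names from the pool-creation payload quoted elsewhere in this report.

```python
from rest_framework import serializers


class PoolSerializer(serializers.Serializer):
    """Illustrative sketch only -- not calamari's real serializer."""
    name = serializers.CharField()
    pg_num = serializers.IntegerField()
    type = serializers.ChoiceField(choices=["replicated", "erasure"])
    size = serializers.IntegerField(required=False)
    erasure_code_profile = serializers.CharField(required=False)
    crush_ruleset = serializers.IntegerField(required=False)

    def validate(self, data):
        # 'size' is governed by the erasure code profile, so reject it up
        # front for EC pools instead of failing in step two of creation.
        if data.get("type") == "erasure" and "size" in data:
            raise serializers.ValidationError(
                "'size' is not applicable to erasure coded pools")
        return data
```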
Gregory, I started ceph -w before invoking the erasure pool creation, and below is the output from when the EC pool creation request timed out in USM:

=================================================================
[root@dhcp46-204 ~]# ceph -w
    cluster d3e3f8b7-fa1e-48da-adeb-7c76178cc1e7
     health HEALTH_OK
     monmap e1: 1 mons at {dhcp46-204=10.70.46.204:6789/0}
            election epoch 3, quorum 0 dhcp46-204
     osdmap e75: 5 osds: 5 up, 5 in
            flags sortbitwise
      pgmap v243: 1024 pgs, 2 pools, 228 bytes data, 8 objects
            185 MB used, 149 GB / 149 GB avail
                1024 active+clean

2016-06-21 10:31:02.879125 mon.0 [INF] pgmap v244: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail; 0 B/s rd, 0 op/s
2016-06-21 10:31:03.920859 mon.0 [INF] pgmap v245: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail; 0 B/s rd, 0 op/s
2016-06-21 10:31:04.949836 mon.0 [INF] pgmap v246: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail; 7436 B/s rd, 9 op/s
2016-06-21 10:31:07.981476 mon.0 [INF] pgmap v247: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail; 4008 B/s rd, 5 op/s
2016-06-21 10:31:09.000499 mon.0 [INF] pgmap v248: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail; 6569 B/s rd, 8 op/s
2016-06-21 10:31:10.053953 mon.0 [INF] pgmap v249: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail; 6985 B/s rd, 8 op/s
2016-06-21 10:32:18.697233 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd=[{"pool_type": "erasure", "prefix": "osd pool create", "pg_num": 512, "erasure_code_profile": "default", "format": "json", "pool": "ecpool0"}]: dispatch
2016-06-21 10:32:19.141610 mon.0 [INF] osdmap e76: 5 osds: 5 up, 5 in
2016-06-21 10:32:19.165852 mon.0 [INF] pgmap v250: 1024 pgs: 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:19.774692 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd='[{"pool_type": "erasure", "prefix": "osd pool create", "pg_num": 512, "erasure_code_profile": "default", "format": "json", "pool": "ecpool0"}]': finished
2016-06-21 10:32:19.834641 mon.0 [INF] osdmap e77: 5 osds: 5 up, 5 in
2016-06-21 10:32:19.835175 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd=[{"var": "crush_ruleset", "prefix": "osd pool set", "pool": "ecpool0", "val": 1, "format": "json"}]: dispatch
2016-06-21 10:32:19.853713 mon.0 [INF] pgmap v251: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:20.803552 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd='[{"var": "crush_ruleset", "prefix": "osd pool set", "pool": "ecpool0", "val": 1, "format": "json"}]': finished
2016-06-21 10:32:20.823650 mon.0 [INF] osdmap e78: 5 osds: 5 up, 5 in
2016-06-21 10:32:20.824220 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd=[{"field": "max_bytes", "prefix": "osd pool set-quota", "pool": "ecpool0", "val": "0", "format": "json"}]: dispatch
2016-06-21 10:32:20.857157 mon.0 [INF] pgmap v252: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:21.832442 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd='[{"field": "max_bytes", "prefix": "osd pool set-quota", "pool": "ecpool0", "val": "0", "format": "json"}]': finished
2016-06-21 10:32:21.839068 mon.0 [INF] osdmap e79: 5 osds: 5 up, 5 in
2016-06-21 10:32:21.839995 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd=[{"field": "max_objects", "prefix": "osd pool set-quota", "pool": "ecpool0", "val": "0", "format": "json"}]: dispatch
2016-06-21 10:32:21.845359 mon.0 [INF] pgmap v253: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:22.817904 mon.0 [INF] from='client.? 10.70.46.204:0/4228501899' entity='client.admin' cmd='[{"field": "max_objects", "prefix": "osd pool set-quota", "pool": "ecpool0", "val": "0", "format": "json"}]': finished
2016-06-21 10:32:22.848405 mon.0 [INF] osdmap e80: 5 osds: 5 up, 5 in
2016-06-21 10:32:22.888835 mon.0 [INF] pgmap v254: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:23.986663 mon.0 [INF] pgmap v255: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:25.001435 mon.0 [INF] pgmap v256: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 185 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:27.287198 mon.0 [INF] pgmap v257: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 186 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:28.300172 mon.0 [INF] pgmap v258: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 186 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:30.332188 mon.0 [INF] pgmap v259: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 186 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:32.368658 mon.0 [INF] pgmap v260: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 186 MB used, 149 GB / 149 GB avail
2016-06-21 10:32:33.390175 mon.0 [INF] pgmap v261: 1536 pgs: 512 creating, 1024 active+clean; 228 bytes data, 186 MB used, 149 GB / 149 GB avail
=================================================================

Also, the request output from the calamari Django UI is as below:

===========================================================
GET /api/v2/cluster/d3e3f8b7-fa1e-48da-adeb-7c76178cc1e7/request/c0cd4702-e232-4f45-9ea9-0df668a06036
HTTP 200 OK
Vary: Accept
Content-Type: text/html; charset=utf-8
Allow: GET, HEAD, OPTIONS

{
    "id": "c0cd4702-e232-4f45-9ea9-0df668a06036",
    "state": "submitted",
    "error": false,
    "error_message": "",
    "headline": "Creating pool 'ecpool0'",
    "status": "Waiting for OSD map epoch 80",
    "requested_at": "2016-06-21T05:02:18.630201+00:00",
    "completed_at": null
}
====================================================================

The cluster health degrades to HEALTH_ERR after executing the EC pool creation, as shown below:

=============================
[root@dhcp46-204 ~]# ceph -s
    cluster d3e3f8b7-fa1e-48da-adeb-7c76178cc1e7
     health HEALTH_ERR
            512 pgs are stuck inactive for more than 300 seconds
            512 pgs stuck inactive
            too many PGs per OSD (307 > max 300)
     monmap e1: 1 mons at {dhcp46-204=10.70.46.204:6789/0}
            election epoch 3, quorum 0 dhcp46-204
     osdmap e80: 5 osds: 5 up, 5 in
            flags sortbitwise
      pgmap v261: 1536 pgs, 3 pools, 228 bytes data, 8 objects
            186 MB used, 149 GB / 149 GB avail
                1024 active+clean
                 512 creating
============================

The crush map for the cluster is attached.
As asked, below is the data payload we use for EC pool creation:

{
    "name": <name of pool>,
    "quota_max_objects": <int value>,
    "quota_max_bytes": <int value>,
    "pg_num": <pgs>,
    "pgp_num": <pgs>,
    "type": "erasure",
    "erasure_code_profile": <EC profile name which we create at cluster creation>,
    "crush_ruleset": <ruleset no>
}

[root@dhcp46-204 ~]# crushtool -i crush.map --test --show-bad-mappings --rule 2 --num-rep 1 --min-x 1 --max-x 10
bad mapping rule 2 x 1 num_rep 1 result [2147483647]
bad mapping rule 2 x 2 num_rep 1 result [2147483647]
bad mapping rule 2 x 3 num_rep 1 result [2147483647]
bad mapping rule 2 x 4 num_rep 1 result [2147483647]
bad mapping rule 2 x 5 num_rep 1 result [2147483647]
bad mapping rule 2 x 6 num_rep 1 result [2147483647]
bad mapping rule 2 x 7 num_rep 1 result [2147483647]
bad mapping rule 2 x 8 num_rep 1 result [2147483647]
bad mapping rule 2 x 9 num_rep 1 result [2147483647]
bad mapping rule 2 x 10 num_rep 1 result [2147483647]

crushtool is saying that this rule:

rule erasure-code {
    ruleset 2
    type erasure
    min_size 3
    max_size 3
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step chooseleaf indep 0 type host
    step emit
}

won't map to any OSDs. In this particular rule you're starting with the default node:

root default {
    id -1    # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0   # rjenkins1
    item dhcp46-150 weight 0.000
    item dhcp46-151 weight 0.000
    item dhcp46-27 weight 0.000
    item dhcp46-8 weight 0.000
}

and trying to choose pool_num hosts and then get OSDs from those subtrees. None of the host nodes listed in default have OSDs, so this fails. This is not a bug with calamari; you've made a crush map that contains some semantic errors. Perhaps in 2.1 calamari could help more here. To make this work, you could change the erasure-code rule to start at the root node general.

@Nishanth, I feel some changes are required while creating crush nodes for the cluster.

The problem here is that calamari creates the crush rule itself and uses it as the ruleset for the erasure coded pool being created, rather than using the ruleset supplied by USM along with the request. Why is it done so? Does calamari create a crush rule per EC pool? Ideally calamari should use the crush rule provided by USM rather than create its own.

Nishanth and Shubhendu, please pursue the workaround:

1. Create the erasure coded pool and wait for the request endpoint to indicate that it's waiting to complete.
2. PATCH the pool to update the crush ruleset; creation should succeed.

In the meantime I'll attempt to fix the issue that create won't honor the crush_ruleset field during pool creation.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html
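For completeness, here is a rough sketch of the create-then-PATCH workaround suggested above. Only the idea of PATCHing crush_ruleset after creation comes from the comment; the /pool endpoint paths, the response and pool-list field names, and the auth handling are assumptions for illustration, not verified calamari API details.

```python
import requests


def create_ec_pool_with_ruleset(base_url, fsid, auth, name, pg_num, profile, ruleset):
    """Workaround sketch: create the EC pool, then PATCH crush_ruleset afterwards.

    Endpoint layout mirrors /api/v2/cluster/<fsid>/... as seen in this report;
    the /pool paths and the 'id'/'name' pool fields are assumptions.
    """
    api = "{0}/api/v2/cluster/{1}".format(base_url, fsid)

    # Step 1: create the erasure coded pool without relying on crush_ruleset
    # being honored at creation time.
    resp = requests.post(api + "/pool", auth=auth, json={
        "name": name,
        "type": "erasure",
        "pg_num": pg_num,
        "erasure_code_profile": profile,
    })
    resp.raise_for_status()

    # Step 2: look the new pool up and PATCH it onto the intended EC ruleset.
    pools = requests.get(api + "/pool", auth=auth).json()
    pool_id = next(p["id"] for p in pools if p["name"] == name)
    patch = requests.patch(api + "/pool/{0}".format(pool_id), auth=auth,
                           json={"crush_ruleset": ruleset})
    patch.raise_for_status()
```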