Bug 1413152
Summary: Error registering Ceph nodes when importing external Ceph cluster into Storage Console

Product: [Red Hat Storage] Red Hat Storage Console
Component: node-monitoring
Status: CLOSED NOTABUG
Severity: unspecified
Priority: unspecified
Version: 2
Target Milestone: ---
Target Release: 3
Hardware: Unspecified
OS: Unspecified

Reporter: Alan Bishop <alan_bishop>
Assignee: anmol babu <anbabu>
QA Contact: sds-qe-bugs
CC: alan_bishop, arkady_kanevsky, cdevine, christopher_dearborn, dahorak, dcain, John_walsh, kasmith, kurt_hey, ltrilety, mkarnik, morazi, nthomas, randy_perryman, rkanade, sankarshan, smerrow, sreichar

Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2017-01-19 20:44:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Bug Blocks: 1335596, 1356451
Description
Alan Bishop 2017-01-13 18:30:06 UTC

I have an sosreport for the Storage Console VM that I'll post as soon as possible (currently having network issues).

Alan Bishop
Created attachment 1240506 [details]
sosreport for Storage Console VM
At first look, this seems to me like a problem with a duplicated machine-id across the storage nodes. Could you please check whether the content of the file '/etc/machine-id' is different on each storage node?

Alan Bishop (comment 5)
The machine IDs are unique because I made them so. The first time I tried importing an external cluster they weren't unique (due to Bug #1270860), and that was apparent because a number of import tasks failed with the error, "Unable to add details of node: <Node's FQDN> to DB, error: Node with id:<Node's ID> already exists." I was alerted to the duplicate machine ID problem, which I resolved by generating new IDs for each node in the OSP overcloud. Then I re-deployed a fresh Storage Console VM and tried to import the cluster again. The task failure messages no longer occur now that the machine IDs are unique, but I'm still seeing the symptom in the bug description.

Lubos Trilety (comment 6)
I have one question regarding the IDs. After you changed the machine IDs, did you perform https://access.redhat.com/documentation/en/red-hat-storage-console/2.0/single/administration-guide/#troubleshooting_nodes_configuration - issue 2? Note that you could remove all keys from the console machine with the 'salt -D' command. By the way, thanks for the sosreport, but the most important logs are not there; the contents of /var/log/skyring and /var/log/salt could be useful too.

(In reply to Alan Bishop from comment #5)
> The machine IDs are unique because I made them so. [...]

Hi Alan, can you please respond to comment 6, which I have just removed the private flag from? Thanks

Alan Bishop
(In reply to Lubos Trilety from comment #6)
No, I did not follow those steps (I think you mean issue 3, not 2). I had just finished resolving the duplicate machine-id problem, and then I tried another import. Unfortunately the original setup has been torn down, but I nearly have a clean replacement setup to try the import operation again. That is, I have a totally fresh OSP and Ceph deployment, and a fresh Storage Console VM. I'll try another import and report back.

Alan Bishop
I think that was it! (Issue 3 in https://access.redhat.com/documentation/en/red-hat-storage-console/2.0/single/administration-guide/#troubleshooting_nodes_configuration.) This time I modified the machine IDs to ensure they're unique *before* installing the storage console agent, and the import task was successful and all nodes are properly registered. I think my original trouble was that I updated the machine IDs *after* installing the console agent, and didn't know I needed to execute the corrective action outlined in the troubleshooting guide. Thanks, Lubos! Closing as NOTABUG.
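The duplicate machine-id check discussed in this thread can be sketched as a small shell pipeline. This is only an illustration: the node names and ID values below are hypothetical sample data standing in for what `cat /etc/machine-id` would return on each storage node, and the ssh loop shown in the comment is one possible way to collect the real values.

```shell
# Hypothetical sample of "<node> <machine-id>" pairs. On a real deployment
# you might collect them with something like:
#   for n in $NODES; do echo "$n $(ssh "$n" cat /etc/machine-id)"; done
collected='overcloud-node1 3f4a9c0e8d1b4e52a6c7d8e9f0a1b2c3
overcloud-node2 3f4a9c0e8d1b4e52a6c7d8e9f0a1b2c3
overcloud-node3 77aa55cc33ee11009988776655443322'

# Print every machine-id that appears on more than one node. Any ID listed
# here would need to be regenerated on the affected nodes *before* the
# console agent is installed, per the troubleshooting guide referenced above.
echo "$collected" | awk '{print $2}' | sort | uniq -d
```

In this sample the first two nodes share an ID, so the pipeline prints that ID once; empty output means every node's machine-id is unique.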