Bug 1024326
Summary: Unable to create second JON server without storage node on HA setup

Product: [JBoss] JBoss Operations Network
Component: High Availability
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: JON 3.2
Target Milestone: ER05
Target Release: JON 3.2.0
Hardware: Unspecified
OS: Unspecified
Reporter: Jeeva Kandasamy <jkandasa>
Assignee: Stefan Negrea <snegrea>
QA Contact: Mike Foley <mfoley>
CC: jkandasa, jsanda
Doc Type: Bug Fix
Type: Bug
Bug Blocks: 1012435
Comment (John Sanda):

Installing a server without a co-located storage node is valid. The only two requirements are that each storage node is co-located with an agent and that each server can communicate (via CQL) with each storage node. I do not think that the latter holds in this case. In the log provided by Jeeva I see:

17:31:03,111 ERROR [org.rhq.enterprise.server.installer.InstallerServiceImpl] Could not complete storage cluster schema installation: All host(s) tried for query failed (tried: localhost/127.0.0.1 ([localhost/127.0.0.1] Cannot connect)): com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1 ([localhost/127.0.0.1] Cannot connect))

This indicates that the storage node is bound to localhost; consequently, only the first server, which is co-located with the storage node, can communicate with it. Jeeva, take a look at rhq-storage-installer.log on your first server machine. You should see a warning message like, "This Storage Node is bound to the loopback address <storage_address>. It will not be able to communicate with Storage Nodes on other machines, and it can only receive client requests from this machine." If/when we confirm that this is the issue, then I think we can close this out.

Comment 3 (John Sanda):

While I think Jeeva did have an environment issue, Stefan also found a problem in the server installer. The installer fetches the storage cluster ports from the database, but it does not fetch the storage node addresses. Stefan is working on the fix, so I am reassigning to him. Note, though, that even with the fix the storage node should not be using localhost.

Comment 4 (Jeeva Kandasamy):

(In reply to John Sanda from comment #3)
> While I think Jeeva did have an environment issue, Stefan also found the
> problem in the server installer. The installer fetches the storage cluster
> ports from the database, but it does not fetch the storage node addresses.
> Stefan is working on the fix so I am reassigning to him. Note though that
> even with the fix, the storage node should not be using localhost.

True, the storage node IP is not stored in the "rhq_system_config" table. If I update the storage node IP manually, it is reset to the localhost IP.

Comment 5 (Jeeva Kandasamy):

Created attachment 817788 [details]
Storage Node data in "rhq_system_config"

Storage node details in the PostgreSQL database (table: rhq_system_config). The storage node IP is missing. If we have more than one storage node, we have to maintain all the node IPs somehow. Screenshot is attached.
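The loopback warning discussed above comes from the storage node's underlying Cassandra configuration: a node whose bind addresses resolve to 127.0.0.1 cannot accept CQL connections from other machines. A minimal sketch of the relevant cassandra.yaml settings follows; the address 10.0.0.11 and the idea of editing cassandra.yaml directly are illustrative assumptions, not steps taken from this bug (the RHQ storage installer normally manages these values).

```yaml
# cassandra.yaml (storage node) -- illustrative values, not from this bug.
# Bind to a routable address, not the loopback, so that other
# JON servers can open CQL connections to this node.
listen_address: 10.0.0.11   # inter-node (gossip) communication
rpc_address: 10.0.0.11      # client (CQL) connections
```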
Comment 6 (John Sanda):

Storage node endpoints are stored in the rhq_storage_node table. The endpoint addresses are visible from the storage node admin UI. Jeeva, can you please provide your rhq-storage-installer.log file.

Comment 7 (Jeeva Kandasamy):

(In reply to John Sanda from comment #6)
> Storage node endpoints are stored in the rhq_storage_node table. The
> endpoint addresses are visible from the storage node admin UI. Jeeva, can
> you please provide your rhq-storage-installer.log file.

Yes, it's there; I just checked the rhq_storage_node table. I face this issue only if I do not enter the storage node IP in the rhq-server.properties file. As mentioned in comment #3, the IP address of the storage node should be taken automatically from the PostgreSQL database. Earlier I thought the storage node IP would also be in the 'rhq_system_config' table. If I provide the storage node IP in rhq-server.properties, it works fine.

Comment (Stefan Negrea):

The storage node information was not retrieved from the database like all the other storage cluster settings. Added code to retrieve the storage node addresses from the respective table as a comma-separated list.

release/jon3.2.x branch commit:
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?h=release/jon3.2.x&id=c8f4da3b4219226a28b54c8aa28bedec97321c85

Please retest ...

Comment:

Moving to ON_QA for test with new brew build.

Comment (Jeeva Kandasamy):

Verified. On the HA setup, the second JON server takes the storage node(s) IP from the PostgreSQL database.

Version: 3.2.0.ER5
Build Number: 2cb2bc9:225c796
GWT Version: 2.5.0
SmartGWT Version: 3.0p
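The fix described above retrieves the storage node addresses from the database and passes them on as a comma-separated list. A rough, self-contained sketch of that idea follows; the class, method names, and hard-coded addresses are assumptions for illustration only, not the actual RHQ installer code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the installer fix: instead of defaulting to
// localhost, read every storage node endpoint (in RHQ, from the
// rhq_storage_node table) and hand the server a comma-separated
// address list.
public class StorageNodeAddresses {

    // In the real installer these values would come from a database
    // query; here they are hard-coded for illustration.
    static List<String> fetchAddresses() {
        return Arrays.asList("10.0.0.11", "10.0.0.12");
    }

    // Join the addresses into a comma-separated list, the form a
    // property such as rhq.storage.nodes expects.
    static String toCommaSeparatedList(List<String> addresses) {
        return addresses.stream().collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String nodes = toCommaSeparatedList(fetchAddresses());
        // prints: rhq.storage.nodes=10.0.0.11,10.0.0.12
        System.out.println("rhq.storage.nodes=" + nodes);
    }
}
```

With a list built this way, a second server installed without a co-located storage node no longer falls back to localhost for its CQL contact points.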
Created attachment 817061 [details]
log file

Description of problem:
I'm unable to create a second JON server without a storage node on a JON HA setup. Command I executed:

[jenkins@rhel6-vm bin]$ ./rhqctl install --server --agent

It throws the exception:

17:31:03,111 ERROR [org.rhq.enterprise.server.installer.InstallerServiceImpl] Could not complete storage cluster schema installation: All host(s) tried for query failed (tried: localhost/127.0.0.1 ([localhost/127.0.0.1] Cannot connect)): com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1 ([localhost/127.0.0.1] Cannot connect))

Version-Release number of selected component (if applicable):
JBoss Operations Network
Version: 3.2.0.ER4
Build Number: e413566:057b211
GWT Version: 2.5.0
SmartGWT Version: 3.0p

How reproducible:
always

Steps to Reproduce:
1. Set up (install) a second JON server on an HA setup without a storage node

Additional info:
Detailed log message is attached.