Bug 1096927
Summary: | JON storage node does not get properly configured if rhqctl install is run a second time to complete a failed install | ||
---|---|---|---|
Product: | [JBoss] JBoss Operations Network | Reporter: | Larry O'Leary <loleary> |
Component: | Installer | Assignee: | John Mazzitelli <mazz> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Filip Brychta <fbrychta> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | JON 3.2 | CC: | ahovsepy, fbrychta, mazz |
Target Milestone: | DR01 | Flags: | jmorgan: needinfo? |
Target Release: | JON 3.3.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-12-11 14:03:33 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Larry O'Leary
2014-05-12 17:33:41 UTC
Note that with the fix to bug #1089757 in place, the replication procedures specified in this issue will no longer cause the install to fail. It's therefore possible that the fix to that issue will also fix this one (though I have not verified that yet). Is there another replication procedure that causes this to happen, other than the failure of the storage node to shut down at the end? Once the fix to that other bug is in place, that failure will no longer cause the rollback to happen, so this BZ's error won't happen. But I suspect there might be other conditions where this might happen (for example, what did you mean by "re-run the installer ... or to adjust configuration parameters"? I'm not aware of being able to re-run the installer after it has successfully run).

The reproducer steps here simply simulate an actual delayed shutdown of the storage node upon install, for example if it is running on a slower machine or any other shutdown failure occurs. This is different than what is happening in the other bug. In the other bug, the shutdown is really successful but, due to the bug, we still say it was unsuccessful. In this case, the shutdown is really not working.

As for configuration changes + re-run, I am referring to issues like "unable to bind to address", "invalid port" and "SELinux in enforcing mode". In those cases, the install will fail and the user would update their configuration and re-run.

I found one issue that may or may not be related (I can't see right now that it is related, but I'm going to commit it with this BZ number associated with it since I saw it while trying to replicate this issue). If you restart the installer and jboss.bind.address.management wasn't set, we need to fall back to jboss.bind.address (we don't in the broken code; I will fix that now).

git commit to master: 6fccc15

I have not been able to replicate on master, which has the latest fix to the installer from that previous BZ.
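For readers following the thread, the fallback from jboss.bind.address.management to jboss.bind.address described above might look roughly like the sketch below. This is not the code from commit 6fccc15. The class name `BindAddressFallback` and the helper `resolveManagementAddress` are hypothetical, and reading the settings from a `java.util.Properties` object is an assumption made for illustration. Only the two property names come from this report.

```java
import java.util.Properties;

public class BindAddressFallback {
    // Property names as they appear in this bug report.
    static final String BIND_ADDRESS = "jboss.bind.address";
    static final String MGMT_BIND_ADDRESS = "jboss.bind.address.management";

    // Hypothetical helper: returns the management bind address, falling back
    // to the general bind address when the management one is missing or blank.
    // Returns null if neither property is set.
    static String resolveManagementAddress(Properties props) {
        String mgmt = props.getProperty(MGMT_BIND_ADDRESS);
        if (mgmt == null || mgmt.trim().isEmpty()) {
            return props.getProperty(BIND_ADDRESS);
        }
        return mgmt.trim();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Only the general bind address is configured in this example.
        props.setProperty(BIND_ADDRESS, "192.168.1.10");
        System.out.println(resolveManagementAddress(props));
    }
}
```

With only jboss.bind.address set, the helper returns that value. Once jboss.bind.address.management is set as well, the management value takes precedence.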
I also tried hitting control-C at different times during the installer run to see whether the UNDO steps are not getting performed properly, but I've not seen the problem. I'll try running on JON 3.2.0 and try to determine why it happens there.

I was able to replicate this easily on 3.2.0 using the replication procedures. However, I did not see this error on a master build. I will see if I can verify that it is the fix from the other BZ that fixed the issue.

OK, I rebuilt the release/3.2.x branch and reproduced the error - verified the problem exists in that branch.

Then I cherry-picked these two (and corrected some minor conflicts) into the release/3.2.x branch:

3a7cec6
845cc38

These are documented in bug #1089757.

Re-ran the test - the installation doesn't fail with any abort/undo. The installations are complete. You can start up everything and you won't get the error mentioned in the description.

(In reply to John Mazzitelli from comment #7)
> Then I cherry-picked these two (and corrected some minor conflicts) into the
> release/3.2.x branch:
>
> 3a7cec6
> 845cc38

I forgot: as part of the conflict resolution, I removed an import that should not have been removed. So I committed this to fix that:

87c31fc24abea9b281f4852f43e276046535eb09

Moving to ON_QA as available to test with brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993

I tried to invoke a revert of the installation in many different ways (incorrect properties, ctrl+c during different phases of installation) and I was not able to reproduce the issue.

Version : 3.3.0.DR01
Build Number : 6468454:dda0a47