| Summary: | Domain controller fails to restart due to an inconsistent rollback of a redeploy | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Osamu Nagano <onagano> | ||||
| Component: | Domain Management | Assignee: | Brian Stansberry <brian.stansberry> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Petr Kremensky <pkremens> | ||||
| Severity: | unspecified | Docs Contact: | Russell Dickenson <rdickens> | ||||
| Priority: | unspecified | ||||||
| Version: | 6.0.1 | CC: | asaji, emuckenh, lcosti, myarboro | ||||
| Target Milestone: | CR1 | ||||||
| Target Release: | EAP 6.2.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: |
The handler for the `full-replace-deployment` operation includes logic that deletes deployment content which was added as part of an operation that is being rolled back. This logic did not check whether the added content was the same as the existing content, so when it was, the existing content was incorrectly deleted.
As a result of this situation, if the same content is redeployed in a managed domain using the `deploy --force` CLI command, and if the redeploy failed for any reason (for example, because a depended-upon service such as a datasource is missing on a server), then the deployment would also fail and the content would be removed from all hosts as part of the rollback process. However, the existing configuration item for the deployment would remain, and if the host was restarted, an attempt to deploy non-existent content would be made, resulting in a failure to boot.
This issue has been fixed in this release of JBoss EAP 6. The rollback logic now recognizes when the content is unchanged and does not remove it as part of the rollback process.
As a result, the rollback will leave the domain in a consistent state equivalent to what it was before the redeploy attempt was made, and the content will remain available on all hosts along with the configuration referencing the content.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2013-12-15 16:14:25 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
Created attachment 814847 [details]
reproducer
The out-of-the-box domain.xml and host.xml are assumed. The attachment includes a WAR file, ./target/case00948611repro.war, which depends on "java:jboss/datasources/ExampleDS2".
1. Start the domain.
"server-three" in "other-server-group" is not started at this point.
2. In the CLI, create the datasource in the "full" profile as follows.
--
/profile=full/subsystem=datasources/data-source=ExampleDS2:add(jndi-name="java:jboss/datasources/ExampleDS2",connection-url="jdbc:h2:mem:test2;DB_CLOSE_DELAY=-1",driver-name="h2",user-name="sa",password="sa")
/profile=full/subsystem=datasources/data-source=ExampleDS2:enable(persistent=true)
--
3. Deploy the war file to all server groups.
--
deploy /path/to/case00948611repro.war --all-server-groups
--
"other-server-group" uses the "full-ha" profile, so it does not have the datasource. The deployment still succeeds because "server-three" has not been started.
4. Start "server-three".
--
/host=master/server-config=server-three:start
--
It will start up but the deployment will fail.
5. Redeploy to all the server groups.
--
deploy /path/to/case00948611repro.war --force
--
It will fail because "server-three" rejects it. The content is deleted, but the entries in domain.xml are not.
6. Restart the domain.
It refuses to start.
*** Bug 1021760 has been marked as a duplicate of this bug. ***
Brian Stansberry <brian.stansberry> made a comment on jira WFLY-2352
I didn't see this was already in JIRA when I created WFLY-2420.
Brian Stansberry <brian.stansberry> updated the status of jira WFLY-2352 to Resolved
This issue was verified using the 6.2.0.CR1 preview bits. |
Description of problem:
When you redeploy (or deploy with the --force option) an application whose content hash is unchanged, and a required dependency such as a datasource injected into the application has been lost for some reason, the redeploy operation deletes the content under the $JBOSS_HOME/domain/data/content directory but does not delete the entries in domain.xml. The domain controller then fails to start up because the content is not found in that directory. This is likely to happen when you frequently change settings while other servers are shut down.
The domain controller fails to restart with the following log messages.
--
12:04:10,623 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) JBAS014613: Operation ("add") failed - address: ([("deployment" => "exampleapp.war")]) - failure description: "JBAS010876: No deployment content with hash c1306bc4855f4ed9914c9616f2b999c5c62a79d3 is available in the deployment content repository for deployment 'exampleapp.war'. This is a fatal boot error. To correct the problem, either restart with the --admin-only switch set and use the CLI to install the missing content or remove it from the configuration, or remove the deployment from the xml configuraiton file and restart."
12:04:10,628 FATAL [org.jboss.as.host.controller] (Controller Boot Thread) JBAS010933: Host Controller boot has failed in an unrecoverable manner; exiting. See previous messages for details.
--
The domain controller should be independent of such a server-specific issue, and there should be a way to fix this via the CLI or Console rather than by manually editing domain.xml.
Additional info:
Reproducing steps will follow. The behaviour is the same with WildFly 8.0.0.Alpha. A similar discussion can be found in the following bug, but that is a case of undeploy-deploy, not a redeploy.
https://bugzilla.redhat.com/show_bug.cgi?id=901159
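For illustration only, the guard described in the Doc Text could look roughly like the sketch below. This is not the actual EAP or WildFly code: the class, the `ContentRepository` stand-in, and the method names are all hypothetical, and hashes are modelled as plain hex strings. It only shows the idea that a rollback should skip deleting content whose hash equals the hash the deployment already referenced.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the EAP source.
final class ContentRollbackSketch {

    /** Stand-in for the deployment content repository, keyed by hex hash. */
    static final class ContentRepository {
        private final Map<String, byte[]> contentByHash = new HashMap<>();

        void add(String hash, byte[] content) {
            contentByHash.put(hash, content);
        }

        void remove(String hash) {
            contentByHash.remove(hash);
        }

        boolean has(String hash) {
            return contentByHash.containsKey(hash);
        }
    }

    /**
     * Rollback step for a failed full replace.
     *
     * @param oldHash hash the deployment referenced before the operation (null if none)
     * @param newHash hash of the content supplied by the operation being rolled back
     */
    static void rollbackFullReplace(ContentRepository repo, String oldHash, String newHash) {
        // Behaviour reported in this bug: the new content was removed unconditionally,
        // which also deleted the existing content when both hashes were equal.
        // Guard: only remove content that really was new to the repository.
        if (newHash != null && !newHash.equals(oldHash)) {
            repo.remove(newHash);
        }
    }
}
```

With this guard, rolling back a redeploy of identical content leaves the repository untouched, so the entry in domain.xml still points at content that exists.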