Bug 1021763 - Domain controller fails to restart due to an inconsistent rollback of a redeploy
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Domain Management
Version: 6.0.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: CR1
Target Release: EAP 6.2.0
Assigned To: Brian Stansberry
QA Contact: Petr Kremensky
Docs Contact: Russell Dickenson
Duplicates: 1021760
Depends On:
Blocks:

Reported: 2013-10-21 23:01 EDT by Osamu Nagano
Modified: 2016-09-27 20:29 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The handler for the `full-replace-deployment` operation includes logic that deletes deployment content added as part of an operation that is being rolled back. This logic did not check whether the added content was the same as the existing content, so when it was, the existing content was incorrectly deleted. As a result, if the same content was redeployed in a managed domain using the `deploy --force` CLI command, and the redeploy failed for any reason (for example, because a depended-upon service such as a datasource was missing on a server), the content was removed from all hosts as part of the rollback process. However, the existing configuration item for the deployment remained, and if the host was restarted, an attempt was made to deploy the non-existent content, resulting in a failure to boot. This issue has been fixed in this release of JBoss EAP 6. The rollback logic now recognizes when the content is unchanged and does not remove it as part of the rollback process. As a result, the rollback leaves the domain in a consistent state equivalent to what it was before the redeploy attempt, and the content remains available on all hosts along with the configuration referencing it.
Last Closed: 2013-12-15 11:14:25 EST
Type: Bug

Attachments
reproducer (12.58 KB, application/zip)
2013-10-21 23:20 EDT, Osamu Nagano


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
JBoss Issue Tracker | WFLY-2352 | Major | Closed | Domain controller fails to restart due to an inconsistent rollback of a redeploy | 2017-12-21 00:10 EST

Description Osamu Nagano 2013-10-21 23:01:18 EDT
Description of problem:
When you redeploy (or deploy with the --force option) an application whose content hash is unchanged, and a required dependency, such as a datasource injected into the application, is missing for some reason, the redeploy operation deletes the content under the $JBOSS_HOME/domain/data/content directory but does not delete the corresponding entries in domain.xml. The domain controller then fails to start because the content is no longer in that directory. This is likely to happen when you change settings frequently while other servers are shutting down.

The domain controller fails to restart with the following log messages.
--
12:04:10,623 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) JBAS014613: Operation ("add") failed - address: ([("deployment" => "exampleapp.war")]) - failure description: "JBAS010876: No deployment content with hash c1306bc4855f4ed9914c9616f2b999c5c62a79d3 is available in the deployment content repository for deployment 'exampleapp.war'. This is a fatal boot error. To correct the problem, either restart with the --admin-only switch set and use the CLI to install the missing content or remove it from the configuration, or remove the deployment from the xml configuraiton file and restart."
12:04:10,628 FATAL [org.jboss.as.host.controller] (Controller Boot Thread) JBAS010933: Host Controller boot has failed in an unrecoverable manner; exiting. See previous messages for details.
--

The domain controller should be independent of such a server-specific issue, and there should be a way to fix this via the CLI or Console rather than by manually editing domain.xml.
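
As a stopgap before restarting, the mismatch described above can be detected by comparing the hashes referenced in domain.xml with the files in the content directory. The following is only a minimal sketch, not part of EAP. It assumes that each top-level deployment in domain.xml carries a `content` child element with a `sha1` attribute, and that the repository stores each item at `<first two hex characters>/<remaining characters>/content`. Both are assumptions about a typical EAP 6 installation rather than facts stated in this report, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def content_path(content_root: Path, sha1: str) -> Path:
    """Assumed repository layout: <root>/<first 2 hex chars>/<rest>/content."""
    return content_root / sha1[:2] / sha1[2:] / "content"


def local_name(tag: str) -> str:
    """Strip the XML namespace, e.g. '{urn:jboss:domain:1.5}content' -> 'content'."""
    return tag.rsplit("}", 1)[-1]


def missing_deployments(domain_xml: Path, content_root: Path):
    """Return (deployment name, sha1) pairs whose content file is absent."""
    missing = []
    for element in ET.parse(domain_xml).iter():
        if local_name(element.tag) != "deployment":
            continue
        # Server-group deployment references have no <content> child,
        # so only the top-level definitions are checked here.
        for child in element:
            sha1 = child.get("sha1")
            if local_name(child.tag) == "content" and sha1:
                if not content_path(content_root, sha1).is_file():
                    missing.append((element.get("name"), sha1))
    return missing
```

For example, `missing_deployments(Path(jboss_home) / "domain/configuration/domain.xml", Path(jboss_home) / "domain/data/content")`, where `jboss_home` is a placeholder for the installation directory, would list the deployments that would trigger JBAS010876 at boot if the assumptions above hold.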


Additional info:
Steps to reproduce will follow.  The behaviour is the same with WildFly 8.0.0.Alpha.  A similar discussion can be found in the following bug, but that is a case of undeploy followed by deploy, not a redeploy.
https://bugzilla.redhat.com/show_bug.cgi?id=901159
Comment 1 Osamu Nagano 2013-10-21 23:20:23 EDT
Created attachment 814847
reproducer

The out-of-the-box domain.xml and host.xml are assumed.  The attachment includes a war file, ./target/case00948611repro.war, which depends on "java:jboss/datasources/ExampleDS2".

1. Start the domain.
"server-three" in "other-server-group" is not started at this point.

2. In the CLI, create the datasource in the "full" profile as follows.
--
/profile=full/subsystem=datasources/data-source=ExampleDS2:add(jndi-name="java:jboss/datasources/ExampleDS2",connection-url="jdbc:h2:mem:test2;DB_CLOSE_DELAY=-1",driver-name="h2",user-name="sa",password="sa")
/profile=full/subsystem=datasources/data-source=ExampleDS2:enable(persistent=true)
--

3. Deploy the war file to all server groups.
--
deploy /path/to/case00948611repro.war --all-server-groups
--
"other-server-group" uses the "full-ha" profile, so it does not have the dependency.  The deployment still succeeds because "server-three" has not been started.

4. Start "server-three".
--
/host=master/server-config=server-three:start
--
It will start up but the deployment will fail.

5. Redeploy to all the server groups.
--
deploy /path/to/case00948611repro.war --force
--
It will fail because "server-three" rejects it.  The contents are deleted but entries in domain.xml are not.

6. Restart the domain.
It refuses to start.
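
The inconsistency produced by steps 5 and 6 can be modelled in a few lines. This is only a toy sketch of the rollback behaviour described in the Doc Text, not the actual handler code: the `ContentRepository` class, the `full_replace` function, and the `check_same_content` flag are hypothetical names, the repository is an in-memory dictionary, and `config` stands in for the deployment entries in domain.xml.

```python
import hashlib


class ContentRepository:
    """Toy stand-in for domain/data/content, keyed by SHA-1 hex digest."""

    def __init__(self):
        self._blobs = {}

    def add(self, data: bytes) -> str:
        digest = hashlib.sha1(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def remove(self, digest: str) -> None:
        self._blobs.pop(digest, None)

    def has(self, digest: str) -> bool:
        return digest in self._blobs


def full_replace(repo, config, name, data, servers_accept, check_same_content):
    """Replace deployment `name`; roll back if any server rejects it.

    `config` maps deployment name -> content hash (the domain.xml entry).
    `check_same_content=False` models the behaviour reported in this bug.
    Returns True when the replace is committed, False when rolled back.
    """
    old_hash = config.get(name)
    new_hash = repo.add(data)
    config[name] = new_hash
    if servers_accept:
        # Commit: the previous content is no longer referenced.
        if old_hash is not None and old_hash != new_hash:
            repo.remove(old_hash)
        return True
    # Rollback: restore the previous configuration entry.
    if old_hash is None:
        del config[name]
    else:
        config[name] = old_hash
    # Discard the content added by this operation, unless (with the check
    # enabled) it is the very content the restored configuration refers to.
    if not check_same_content or new_hash != old_hash:
        repo.remove(new_hash)
    return False


def boots(repo, config) -> bool:
    """The domain controller boots only if every configured hash has content."""
    return all(repo.has(digest) for digest in config.values())
```

Calling `full_replace` once with `servers_accept=True` (step 3) and then again with the same bytes and `servers_accept=False` (step 5) leaves `boots(repo, config)` false when `check_same_content` is off and true when it is on, which mirrors the difference between the reported behaviour and the fix.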
Comment 2 Brian Stansberry 2013-10-24 22:01:40 EDT
*** Bug 1021760 has been marked as a duplicate of this bug. ***
Comment 5 JBoss JIRA Server 2013-11-04 11:18:44 EST
Brian Stansberry <brian.stansberry@redhat.com> made a comment on jira WFLY-2352

I didn't see this was already in JIRA when I created WFLY-2420.
Comment 6 JBoss JIRA Server 2013-11-04 11:18:56 EST
Brian Stansberry <brian.stansberry@redhat.com> updated the status of jira WFLY-2352 to Resolved
Comment 7 Petr Kremensky 2013-11-11 06:42:05 EST
This issue was verified using the 6.2.0.CR1 preview bits.
