Bug 1295999 - [GSS](6.4.z) Cumulative patch fails to apply when current patches modules are not included in CP with: java.io.SyncFailedException: JBAS016855: copied content does not match expected hash for item: ModuleItem...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Patching
Version: 6.4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: CR1
Target Release: EAP 6.4.8
Assignee: Vaclav Tunka
QA Contact: Jan Martiska
URL:
Whiteboard:
Depends On:
Blocks: eap648-payload
Reported: 2016-01-06 04:25 UTC by Lyle Wang
Modified: 2019-09-12 09:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-17 12:38:29 UTC
Type: Bug
Embargoed:


Attachments
testing one-off patch (902.46 KB, patch)
2016-03-08 11:14 UTC, Jiří Bílek
testing CP patch (32.65 KB, patch)
2016-03-08 11:15 UTC, Jiří Bílek


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | JBEAP-2669 | 0 | Critical | Closed | installing CP over one-off fails if the modules patched by the one-off are not patched in the CP | 2020-02-03 23:51:40 UTC
Red Hat Issue Tracker | WFCORE-1288 | 0 | Critical | Resolved | installing CP over one-off fails if the modules patched by the one-off are not patched in the CP | 2020-02-03 23:51:40 UTC
Red Hat Knowledge Base (Solution) | 2110431 | 0 | None | None | None | 2016-01-06 22:22:39 UTC

Description Lyle Wang 2016-01-06 04:25:47 UTC
Description of problem:

JBDS 9.0.0 [1] includes an EAP 6.4.0 distribution that already has the CVE-2015-7501 patch applied.
Trying to install the 6.4.4 or 6.4.5 cumulative patch on this EAP distribution fails with error JBAS016855.


[1] - https://access.redhat.com/jbossnetwork/restricted/softwareDetail.html?softwareId=41511&product=jbossdeveloperstudio&version=9.0.0&downloadType=distributions


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install JBDS 9.0.0 with EAP embedded using "JBDS-9.0.0/jboss-devstudio-9.0.0.GA-CVE-2015-7501-installer-eap.jar"
2. Download EAP 6.4 Update 05 (or Update 04)
3. Try to install the cumulative patch onto embedded EAP 6.4.0, using CLI command: "patch apply /path/to/jboss-eap-6.4.5-patch.zip"


Actual results:
Getting error:
java.io.SyncFailedException: JBAS016855: copied content does not match expected hash for item: ModuleItem{org.apache.commons.collections:main}

Expected results:
Patch is expected to be installed successfully.

Additional info:
The following lines are printed in the EAP log when the error is returned:

11:57:11,286 ERROR [org.jboss.as.patching] (management-handler-thread - 5) JBAS016802: failed to undo change for: 'MiscContentItem{bin/client/jboss-cli-client.jar}'
11:57:11,365 ERROR [org.jboss.as.patching] (management-handler-thread - 5) JBAS016802: failed to undo change for: 'MiscContentItem{bin/client/jboss-client.jar}'
11:57:11,366 ERROR [org.jboss.as.patching] (management-handler-thread - 5) JBAS016802: failed to undo change for: 'MiscContentItem{bin/standalone.sh}'
11:57:11,366 ERROR [org.jboss.as.patching] (management-handler-thread - 5) JBAS016802: failed to undo change for: 'MiscContentItem{docs/schema/jboss-web_7_2.xsd}'
11:57:11,367 ERROR [org.jboss.as.patching] (management-handler-thread - 5) JBAS016802: failed to undo change for: 'MiscContentItem{version.txt}'

Comment 1 Lyle Wang 2016-01-06 04:27:47 UTC
A workaround is available: https://access.redhat.com/solutions/2110431

Comment 2 Brad Maxwell 2016-01-06 22:10:05 UTC
Updating: this bug is not specific to any one patch.
The issue appears to happen when a patch that modifies module X has been applied, and a Cumulative Patch that does not include module X is then applied on top of it.

CP 5 does include module X, but because CP 1 does not, the application fails with: java.io.SyncFailedException: JBAS016855: copied content does not match expected hash for item: ModuleItem

The CP should undo any previous patches.
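
To illustrate the kind of failure described above, here is a minimal sketch of a hash check that rejects content an earlier patch has already modified. It is not the EAP patching code: the function names, the exception class and the use of SHA-1 are all assumptions made for the example.

```python
import hashlib


class HashMismatch(Exception):
    """Raised when installed content is not what the patch was built against."""


def content_hash(data: bytes) -> str:
    # SHA-1 is an arbitrary choice for this sketch.
    return hashlib.sha1(data).hexdigest()


def verify_item(item: str, installed: bytes, expected_hash: str) -> None:
    # Refuse to continue if the installed content differs from what the
    # (hypothetical) patch metadata says should be there.
    if content_hash(installed) != expected_hash:
        raise HashMismatch(f"content does not match expected hash for item: {item}")


ga_content = b"module X as shipped in GA"
one_off_content = b"module X as modified by an earlier patch"

# The hypothetical CP was built against the GA content of module X.
expected = content_hash(ga_content)

verify_item("ModuleItem{X:main}", ga_content, expected)  # passes silently

try:
    # Module X was changed by an earlier patch the CP knows nothing about.
    verify_item("ModuleItem{X:main}", one_off_content, expected)
except HashMismatch as err:
    print(err)
```

The point of the sketch is only that a CP which does not carry module X has no way to account for a module X that an earlier patch already changed.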

Comment 6 Jiří Bílek 2016-03-08 11:14:30 UTC
Created attachment 1134098
testing one-off patch

Comment 7 Jiří Bílek 2016-03-08 11:15:03 UTC
Created attachment 1134099
testing CP patch

Comment 8 Jiří Bílek 2016-03-08 11:15:46 UTC
I attached some patches.
Try:
    use 6.4.0.GA with 6.4.7.CP patch or 6.4.7 full build
    apply jboss-eap-6.4.7.one-off.zip
    apply jboss-eap-6.4.8.CP.zip
    run ./standalone.sh or ./jboss-cli.sh

In my case the following error is shown:
$ ./jboss-cli.sh 
Exception in thread "main" org.jboss.modules.ModuleLoadError: Error loading module from /home/jbilek/Plocha/verifikace/bz1295999/6.4.0.GA/modules/system/layers/base/org/jboss/as/server/main/module.xml
        at org.jboss.modules.ModuleLoadException.toError(ModuleLoadException.java:78)
        ...
Caused by: org.jboss.modules.xml.XmlPullParserException: Failed to add resource root 'jboss-as-server-7.5.0.Final-redhat-21.jar' at path 'jboss-as-server-7.5.0.Final-redhat-21.jar' (position: END_TAG seen ... <resource-root path="jboss-as-server-7.5.0.Final-redhat-21.jar"/>... @33:74) caused by: java.util.zip.ZipException: error in opening zip file
        at org.jboss.modules.ModuleXmlParser.parseResourceRoot(ModuleXmlParser.java:797)
        ...

It looks like the one-off (which contains the org.jboss.as.server module) is not correctly rolled back when the mock "6.4.8" CP is applied. EAP should use the org.jboss.as.server module from the 6.4.7 overlay, but apparently it tries to load the module from the default location, and the JAR there is crippled (deliberately invalidated), so it doesn't work.

Comment 9 Alexey Loubyansky 2016-03-08 19:16:42 UTC
Right, good catch, thanks! EAP7 doesn't invalidate JARs, so it works there. And in EAP6 the committed test is not actually run, AFAICS, since the patching testsuite does not actually exist.

Comment 10 Alexey Loubyansky 2016-03-08 19:19:58 UTC
Correction: the patching testsuite exists, but it is set up at a different location than in EAP7. So the test was not added to the right place.

Comment 11 Jan Martiska 2016-03-09 08:47:24 UTC
But in that case, EAP 7 is incorrect too, no?

0. have a base distribution (GA)
1. install a CP1 which updates module A
2. install a one-off patch which updates module A even further
3. install a CP2 which DOESN'T update module A

EAP should then load module A in the version from CP1, but that doesn't happen: it loads the module from the base location system/layers/base/$MODULE_NAME. This seemingly works correctly because the jar is not crippled, but it loads the wrong jar!

I'll update JBEAP-2669 with my findings
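
The four steps above can be modelled with a small lookup function. This is only a sketch under assumptions: the `resolve` function, the overlay names and the "first active overlay wins" rule are illustrative and are not taken from the JBoss Modules or patching code.

```python
def resolve(module, base, overlays, active):
    """Return the content a module is loaded from.

    `active` lists overlay names in lookup order; the first active overlay
    that carries the module wins, otherwise the base location is used.
    """
    for name in active:
        if module in overlays[name]:
            return overlays[name][module]
    return base[module]


base = {"A": "A from GA"}
overlays = {
    "cp1": {"A": "A from CP1"},
    "oneoff": {"A": "A from one-off"},
    "cp2": {},  # CP2 does not update module A
}

# Steps 1-2: CP1 plus the one-off on top of it.
assert resolve("A", base, overlays, ["oneoff", "cp1"]) == "A from one-off"

# Step 3, reported behaviour: only the CP2 overlay is consulted afterwards,
# so the lookup falls through to the base location.
assert resolve("A", base, overlays, ["cp2"]) == "A from GA"

# Step 3, expected behaviour: module A is still served in its CP1 version.
assert resolve("A", base, overlays, ["cp2", "cp1"]) == "A from CP1"
```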

Comment 12 JBoss JIRA Server 2016-03-09 08:54:49 UTC
Jan Martiska <jmartisk> updated the status of jira JBEAP-2669 to Reopened

Comment 13 Alexey Loubyansky 2016-03-09 14:05:35 UTC
I think it's actually another issue. Just tested your scenario. It appears you can even skip the one-off application and simply apply cp2 after cp1 and the cp1 overlay will be lost.

Just in case, I took the branch with my first fix for this issue WFCORE-1288 and tested this scenario with my fix reverted. It still fails, so it's not something introduced by the PR for WFCORE-1288.

Comment 14 Alexey Loubyansky 2016-03-09 14:41:44 UTC
I'm wondering whether it has ever worked. It does not appear to be a regression from the major changes merged this summer.

Comment 15 Alexey Loubyansky 2016-03-09 15:23:42 UTC
I think it might have been implemented this way on purpose originally.

Back in the day when I was not working on patching, I was asked to contribute a feature of aging out the patching history. The result was a management operation /core-service=patching:ageout-history. The description of the operation in management model reads: "Removes a part of the patching history which is considered too old and not useful any more."

Now "too old and not useful any more" meant (at least at that time) anything preceding the last applied CP and one-off patches applied on top of it.

The scenario we are talking about here does not fit this assumption. I think the assumption originally was that each CP would carry all the deltas necessary to update the original release to the latest CP.
This is not exactly the case with our CPs for EAP6.x. We ship bundles of separate CPs, not an atomic CP.

This means we actually cannot safely recommend the ageout-history operation to clean up the disk, since it is based on the wrong assumption.

In EAP7 the situation is practically the same at this point. EAP7 supports a new CP format which is actually close to that assumption, in that a CP is a delta between the original final release and the target CP version. But if the original version has already been patched with one of the CPs and a later CP is then applied, it will be applied on top of the first CP. And we are back where we started.

Comment 16 Alexey Loubyansky 2016-03-09 19:16:03 UTC
So, I downloaded the 6.4.0 release and applied 6.4.6.CP (which is a bundle of 6.4.1.CP, 6.4.2.CP, ... , 6.4.6.CP). After applying this bundle the only active overlay is 6.4.6.CP. So, stuff from the previous CPs is lost. But, apparently, that's fine.

Here is how the CPs are built on the example of 6.4.1.CP and 6.4.2.CP. If 6.4.2.CP needs to patch the content brought in by 6.4.1.CP, it will expect the content from 6.4.1.CP physically installed. Naturally.

But if 6.4.1.CP patched something and 6.4.2.CP does not bring in any update for it, 6.4.2.CP will include the update from 6.4.1.CP, i.e. it will simply duplicate it.

So, Jan, that's why we never had to deal with the use-case you expected to work. I completely agree with you, your use-case totally makes sense. It's just worked around by the (confusing) way our CPs are built (which is also of course a consequence of the way our patching works).

And ageout-history can still be recommended.

Now, what to do about the use-case when CP2 does not duplicate CP1? I don't think I want to change anything in the current implementation regarding this, although I do admit this use-case makes total sense to me.
The way to fix this would be to add the CP1 overlay to the .overlays file (you can do that manually, btw). But then you can't use the ageout-history operation, as it will remove the content you rely on being there.

This patching implementation is in maintenance mode and will be completely replaced in future versions. It doesn't make sense to invest in making it perfect as long as it appears to be working and it's bearable.

I will still be looking into the invalidation issue for EAP6.

Comment 17 Jan Martiska 2016-03-10 09:42:00 UTC
AFAIK the mechanism usually works like this:
- patch CP1 patches module A
-- so the module A will be in overlay-cp1
- patch CP2 doesn't patch module A
-- so the patching mechanism, when creating overlay-cp2, COPIES module A from overlay-cp1 to overlay-cp2 (and cripples the jar in overlay-cp1). Only overlay-cp2 needs to be active then, and nothing is lost.

This is the way EAP 6.4.x does it. CP2 BUNDLE zip contains CP1 and CP2 patches bundled together, but only CP1 contains module A, CP2 doesn't have to, because of the copying mechanism I described above. Therefore only one overlay needs to be active at a time.

However, in this special case, we noticed that if
- there was CP1 with a one-off patch for module A, and
- CP2 is being installed which doesn't include A, then
- the one-off is invalidated and module A from overlay-CP1 should be copied to overlay-CP2, but it is NOT, so the update is lost
So it looks like this "lost" update occurs only in this special case. In normal cases the module is copied to the new overlay.

Therefore I think that the fix for this BZ is insufficient, because the patching mechanism doesn't properly update CP(x)->CP(x+1) while invalidating the one-off for CP(x).
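
The copying mechanism described in this comment can be sketched as follows. This is a simplified model, not the actual implementation: `build_overlay` and the dictionary representation of overlays are hypothetical, and the crippling of the old overlay's jars is left out.

```python
def build_overlay(new_cp, previous_overlay):
    """Build the overlay for CP(x+1) from its own content plus copied modules."""
    overlay = dict(new_cp)
    for module, content in previous_overlay.items():
        # Carry over anything CP(x+1) does not replace itself.
        overlay.setdefault(module, content)
    return overlay


overlay_cp1 = {"A": "A from CP1", "B": "B from CP1"}
cp2 = {"B": "B from CP2"}  # CP2 patches B but not A

overlay_cp2 = build_overlay(cp2, overlay_cp1)

# Module A is copied forward from overlay-cp1; module B comes from CP2 itself.
assert overlay_cp2 == {"A": "A from CP1", "B": "B from CP2"}
```

In this model a one-off's content is never passed to `build_overlay`, which corresponds to the one-off being invalidated while the CP1 version of module A is still carried forward. That is the outcome this comment argues should happen in the special case as well.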

Comment 18 Jan Martiska 2016-03-10 09:52:58 UTC
In EAP 7 the issue is the same - both EAP6/7 try to load the "lost" module from the base location, the only difference is that EAP 7 succeeds in loading (the wrong) module, because the jars are not crippled. But loading the wrong module is maybe even worse than not loading anything and a crash at startup.

Comment 19 Alexey Loubyansky 2016-03-10 21:49:58 UTC
Yes, there is in fact the port-forward that you described and that I overlooked. I apologize for that.
The patch.xml stored in the patching history is not the original one from the patch file; it has the port-forward mixed in, and that is what I based my assumptions about the original CP content on.

I am looking into this.

Comment 20 Jan Martiska 2016-03-14 09:34:59 UTC
After a discussion in the QA team, we're not sure how to actually proceed here.
What should EAP do when there is CP(N) installed with a one-off patch which updates module X and the user wants to install CP(N+1) which doesn't contain module X?

The options are:

1. Proceed with the installation, invalidate the one-off patch, the module X will be loaded from CP(N) (or the highest N which contains the module)
-- the problem here is that the user will 'lose' the one-off patch - which is unexpected
-- we would probably have to re-create the one-off for the new CP so that users will be able to continue with the fix.

2. Don't proceed with the installation; instead, warn the user that he would lose the fix from the one-off patch. If he really wants to do this, tell him to uninstall the one-off patch first, and then the CP can be applied.
-- we would probably have to re-create the one-off for the new CP so that users will be able to continue with the fix.

3. Proceed with the installation, without invalidating the one-off patch (at least not completely), the module X will be loaded from the one-off patch.
-- I think it would be the largest change to the codebase, and most difficult to test 
-- No need to re-create the one-off, but there's a danger that the one-off will not work properly, because it will actually be installed on a different version than the one it was generated (and tested) for

We could also really make sure on the process side that all CPs will contain all one-off patches.
-- But the problem is that we sometimes want to 'backport' one-offs to older versions when the next CP for them is already out, in which case this approach will not work.
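
As an illustration of option 2, a pre-installation check could look roughly like the sketch below. Nothing like this is confirmed to exist in the patching tool; `one_off_fixes_at_risk`, `precheck` and the returned messages are hypothetical.

```python
def one_off_fixes_at_risk(one_off_modules, new_cp_modules):
    """Return modules patched by an installed one-off that the new CP does not carry."""
    return sorted(set(one_off_modules) - set(new_cp_modules))


def precheck(one_off_modules, new_cp_modules):
    """Decide whether the CP installation should proceed (option 2 above)."""
    at_risk = one_off_fixes_at_risk(one_off_modules, new_cp_modules)
    if at_risk:
        message = (
            "one-off fixes would be lost for: " + ", ".join(at_risk)
            + "; roll back the one-off first"
        )
        return ("refuse", message)
    return ("proceed", "")


# CP(N+1) does not contain module X, so the installation is refused.
print(precheck(["X"], ["Y", "Z"]))

# CP(N+1) carries module X itself, so nothing is lost.
print(precheck(["X"], ["X", "Y"]))
```

The same set difference could equally drive option 1, where it would produce a warning rather than a refusal.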

Brad, what do you think would be the best approach?

Comment 21 Alexey Loubyansky 2016-03-14 10:26:52 UTC
1. Proceed with the installation, invalidate the one-off patch, the module X will be loaded from CP(N) (or the highest N which contains the module)
-- the problem here is that the user will 'lose' the one-off patch - which is unexpected
-- we would probably have to re-create the one-off for the new CP so that users will be able to continue with the fix.

That was the decision for this case when the current implementation was in development. It shouldn't actually be unexpected.

I'd be very much interested to collect your requirements wrt patching and provisioning to take them into account. Let's schedule a call sometime?

Comment 22 Alexey Loubyansky 2016-03-14 14:35:33 UTC
For 6.4.7 CP the original PR for this issue has to be rolled back.

The PR allows a CP to be applied over a one-off patch applied to the BASE, but when rolling back the one-off, it doesn't re-enable the original module in the BASE.
The workaround would be to manually roll back the one-off before applying the CP.

Comment 24 Vaclav Tunka 2016-03-14 14:48:09 UTC
Alexey, can you please ACK the rollback of the PR applied to EAP 6.4.7, and thus the re-spin to CR3, so we have explicit confirmation?
https://github.com/jbossas/jboss-eap/pull/2676

Comment 25 Alexey Loubyansky 2016-03-14 14:50:56 UTC
Yes, rolling back will be the right thing to do for EAP 6.4.7.

Comment 29 Alexey Loubyansky 2016-03-24 21:45:26 UTC
I personally wouldn't mind adding such warnings. As I understand though, this kind of request has to go through Jira and approval first.

Back to the original issue, I think I finally have a fix for it. I've tested a few variations of the scenario we discussed here and it appears to be working. But still, before I start sending PRs, I would appreciate it if you guys tested it too. Would that be possible?
The commit for 6.4.x branch is here https://github.com/aloubyansky/jboss-eap/commit/016aae0229ca937e0d9938e7b1ebc0486b7600e8

Thanks.

Comment 30 Jan Martiska 2016-03-29 05:26:10 UTC
Sorry I was on PTO last week, I'll try to look at it today.

Comment 31 Jan Martiska 2016-03-30 09:34:19 UTC
Hmm, that still doesn't seem to work for me.

My testing scenario:
cp1 updates modules org.dom4j and org.jboss.as.product
oneoff is built for cp1 and updates module org.hibernate
cp2 updates module org.jboss.as.product only and is supposed to be installed over cp1

When I apply cp1, then oneoff, then cp2, and restart, EAP fails to boot because it tries to load org.hibernate from the base location (no overlay), and the jars there are crippled.

I uploaded the testing bits here: http://download.eng.brq.redhat.com/scratch/jmartisk/bz1295999/ (jboss-eap.zip is the distro built from your commit)
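
The testing scenario above can be replayed against a toy model that tracks crippled base jars. This is a sketch under assumptions and not the EAP implementation: the `Install` class, its method names and the `restore_base` flag are hypothetical, and the model keeps every overlay searchable as a stand-in for the copying of modules between CP overlays.

```python
class Install:
    """Toy model of an installation: base jars plus a stack of overlays."""

    def __init__(self, modules):
        self.base = {m: {"content": f"{m} (base)", "crippled": False} for m in modules}
        self.overlays = []  # (patch name, {module: content}), oldest first

    def apply(self, name, modules):
        for m in modules:
            self.base[m]["crippled"] = True  # the base jar is invalidated
        self.overlays.append((name, {m: f"{m} ({name})" for m in modules}))

    def rollback(self, name, restore_base=True):
        self.overlays = [o for o in self.overlays if o[0] != name]
        if restore_base:
            # Re-enable base jars that no remaining overlay replaces.
            patched = {m for _, mods in self.overlays for m in mods}
            for m, jar in self.base.items():
                jar["crippled"] = m in patched

    def load(self, module):
        for _, mods in reversed(self.overlays):  # newest overlay first
            if module in mods:
                return mods[module]
        jar = self.base[module]
        if jar["crippled"]:
            raise RuntimeError(f"cannot load {module}: base jar is crippled")
        return jar["content"]


def scenario(restore_base):
    eap = Install(["org.dom4j", "org.jboss.as.product", "org.hibernate"])
    eap.apply("cp1", ["org.dom4j", "org.jboss.as.product"])
    eap.apply("oneoff", ["org.hibernate"])
    eap.rollback("oneoff", restore_base=restore_base)  # cp2 invalidates the one-off
    eap.apply("cp2", ["org.jboss.as.product"])
    return eap


# Expected behaviour: the base jar of org.hibernate is usable again.
fixed = scenario(restore_base=True)
assert fixed.load("org.hibernate") == "org.hibernate (base)"
assert fixed.load("org.dom4j") == "org.dom4j (cp1)"
assert fixed.load("org.jboss.as.product") == "org.jboss.as.product (cp2)"

# Reported behaviour: the one-off is dropped but the base jar stays crippled.
broken = scenario(restore_base=False)
try:
    broken.load("org.hibernate")
except RuntimeError as err:
    print(err)
```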

Comment 32 Alexey Loubyansky 2016-03-30 11:56:53 UTC
Yes, I see that, thanks.

Comment 33 Alexey Loubyansky 2016-03-30 19:43:19 UTC
I added a test case for this scenario too. Could you please give this branch another try? https://github.com/aloubyansky/jboss-eap/tree/BZ1295999
Thanks a lot.

Comment 34 Jan Martiska 2016-03-31 11:45:42 UTC
Yes, this one seems to work fine now.

Comment 35 Alexey Loubyansky 2016-03-31 11:54:55 UTC
Thanks! Which branch should I submit the PR for?

Comment 36 Vaclav Tunka 2016-03-31 12:09:42 UTC
6.4.x branch please. Thanks!

Comment 37 Alexey Loubyansky 2016-03-31 13:27:42 UTC
The PR for 6.4.x branch is https://github.com/jbossas/jboss-eap/pull/2736

Comment 39 Jiří Bílek 2016-05-10 06:09:28 UTC
Verified with EAP 6.4.8.CP.CR2

Comment 40 JBoss JIRA Server 2016-08-23 11:38:42 UTC
Jiri Pallich <jpallich> updated the status of jira JBEAP-2669 to Closed

Comment 41 Petr Penicka 2017-01-17 12:38:29 UTC
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.

