Bug 1433668

Summary: File modifications made in /etc in middle layers are not carried over to the latest layer after multiple upgrades
Product: [oVirt] ovirt-node Reporter: Huijuan Zhao <huzhao>
Component: Installation & Update Assignee: Ryan Barry <rbarry>
Status: CLOSED CURRENTRELEASE QA Contact: Huijuan Zhao <huzhao>
Severity: high Docs Contact:
Priority: high    
Version: 4.1 CC: bugs, cshao, dguo, dougsland, jiawu, leiwang, mgoldboi, mjankula, qiyuan, rbarry, sbonazzo, weiwang, yaniwang, ycui, yzhao
Target Milestone: ovirt-4.1.1-1 Keywords: Rebase
Target Release: 4.1 Flags: rule-engine: ovirt-4.1+
rule-engine: blocker+
mgoldboi: planning_ack+
sbonazzo: devel_ack+
ycui: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: imgbased-0.9.19-0.1.el7ev Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-04-21 09:43:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Node RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Sosreport and all logs in /var/log/ and /tmp from host none

Description Huijuan Zhao 2017-03-19 07:30:49 UTC
Created attachment 1264558 [details]
Sosreport and all logs in /var/log/ and /tmp from host

Description of problem:
After upgrading RHVH multiple times (4.0 GA -> 4.0.4 -> 4.0.5 -> 4.0.6 Async -> 4.1), file modifications made in /etc in the 4.0.4 and 4.0.5 layers are not carried over to the 4.1 layer.
All modifications in /etc in any layer should be updated to newer layer during upgrade.

Version-Release number of selected component (if applicable):
Build 1:
redhat-virtualization-host-4.0-20160817.0
Build 2:
redhat-virtualization-host-4.0-20160919.0
Build 3:
redhat-virtualization-host-4.0-20161116.1
Build 4:
redhat-virtualization-host-4.0-20170201.0
Build 5:
redhat-virtualization-host-4.1-20170314.0



How reproducible:
100%

Steps to Reproduce:
1. Clean install RHVH build1 redhat-virtualization-host-4.0-20160817.0
2. Reboot and login build1(rhvh-4.0-20160817.0), create new file in /etc
For example, create new file /etc/huzhao:
-------------------------
# cat /etc/huzhao
huzhao 0817
-------------------------
3. Download redhat-virtualization-host-image-update-4.0-20160919.0.el7_2.noarch.rpm, and update rhvh to build2 in rhvh side:
# yum install redhat-virtualization-host-image-update-4.0-20160919.0.el7_2.noarch.rpm

4. Reboot and login build2(rhvh-4.0-20160919.0), check file /etc/huzhao:
-------------------------
# cat /etc/huzhao
huzhao 0817
-------------------------
Modify it(add one line):
-------------------------
# cat /etc/huzhao
huzhao 0817
huzhao 0919
-------------------------
5. Download redhat-virtualization-host-image-update-4.0-20161116.1.el7_3.noarch.rpm, and update rhvh to build3 in rhvh side:
# yum install redhat-virtualization-host-image-update-4.0-20161116.1.el7_3.noarch.rpm

6. Reboot and login build3(rhvh-4.0-20161116.1), check file /etc/huzhao:
-------------------------
# cat /etc/huzhao
huzhao 0817
huzhao 0919
-------------------------
Modify it(add one line):
-------------------------
# cat /etc/huzhao
huzhao 0817
huzhao 0919
huzhao 1116
-------------------------
7. Download redhat-virtualization-host-image-update-4.0-20170201.0.el7_3.noarch.rpm, and update rhvh to build4 in rhvh side:
# yum install redhat-virtualization-host-image-update-4.0-20170201.0.el7_3.noarch.rpm

8. Reboot and login build4(rhvh-4.0-20170201.0), check file /etc/huzhao and imgbase layout:
-------------------------
# cat /etc/huzhao
huzhao 0817
huzhao 0919
huzhao 1116
-------------------------
-------------------------
# imgbase layout
rhvh-4.0-0.20160817.0
 +- rhvh-4.0-0.20160817.0+1
rhvh-4.0-0.20160919.0
 +- rhvh-4.0-0.20160919.0+1
rhvh-4.0-0.20161116.0
 +- rhvh-4.0-0.20161116.0+1
rhvh-4.0-0.20170201.0
 +- rhvh-4.0-0.20170201.0+1
-------------------------
9. Setup local repos in build4(rhvh-4.0-0.20170201.0), and update to build5(rhvh-4.1-20170314.0)
# yum update
10. Check imgbase layout before reboot rhvh:
-------------------------
# imgbase layout
rhvh-4.0-0.20170201.0
 +- rhvh-4.0-0.20170201.0+1
rhvh-4.1-0.20170315.0
 +- rhvh-4.1-0.20170315.0+1
-------------------------
11. Reboot and login build5 (rhvh-4.1-20170314.0), check file /etc/huzhao



Actual results:
In step 11, file /etc/huzhao is:
-------------------------
# cat /etc/huzhao
huzhao 0817
-------------------------
The modifications in build2 and build3 are not updated to latest layer.


Expected results:
In step 11, file /etc/huzhao should be:
-------------------------
# cat /etc/huzhao
huzhao 0817
huzhao 0919
huzhao 1116
-------------------------
The modifications in build2 and build3 should be updated to latest layer.


Additional info: 
The file modifications in /var in build2 and build3 can be updated to latest layer.

Comment 1 Ying Cui 2017-03-20 02:49:56 UTC
The contents of /etc should be kept for each image because /etc is in the writable layer, but according to the bug description, contents under /etc are missing after upgrading to 4.1. From a customer's perspective, QE considers this a blocker.

Comment 2 Ryan Barry 2017-03-20 06:42:51 UTC
Can you define "the contents under /etc are missing"? This has already been verified as part of rhbz#1417534

In general, the problem with the middle layer update is multifaceted, and the fix for rhbz#1417534 makes imgbased go back in time and pretend that imgbased always did the "right thing" by only keeping unmodified configuration files.

Here, we can compare the hash of (for example):

0916 - /etc/hosts vs 0916 /usr/share/factory/etc/hosts

These differ, so we figure that /etc/hosts has been modified, and we copy it forward.
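
A minimal sketch of that comparison, assuming SHA-256 as the hash and a hypothetical helper named `is_modified`; the actual imgbased code may be organized differently:

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_modified(layer_root, rel_path):
    """Hypothetical helper: compare <layer>/etc/<rel_path> with the
    pristine copy under <layer>/usr/share/factory/etc/<rel_path>."""
    root = Path(layer_root)
    current = root / "etc" / rel_path
    factory = root / "usr/share/factory/etc" / rel_path
    if not current.exists():
        return False  # nothing to copy forward
    if not factory.exists():
        return True   # created on the host, not shipped in the image
    return file_hash(current) != file_hash(factory)
```

Under these assumptions, `is_modified("/", "hosts")` would report whether /etc/hosts on the running layer differs from its factory copy.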

Originally (until 4.0.7/4.1.1), imgbased copied ALL of /etc

To remediate this (and get the system back to a point where unmodified configuration files, for example on a system upgraded from 4.0.3->4.0.6->4.0.7, are actually still unmodified and keep the system value), imgbased will now essentially look at the difference and copy.

Using /etc/vdsm/logger.conf for example, since this changes frequently, and some changes were created and then removed, imagine the following:

4.0.3 -> logger.conf was not modified
4.0.5 -> logger.conf changed in the image, but imgbased bulk copied the file from 4.0.3, so it's now considered modified.
4.0.6 -> logger.conf changed in the image, but imgbased bulk copied the file from 4.0.3, so it's now considered modified.
4.0.7 -> logger.conf changed in the image, but imgbased bulk copied again

4.0.7 now has logger.conf and a number of other files from 4.0.3 in /etc which should not be present in their modified versions.

To fix this appropriately (and to fix it in previous layers), imgbased must go back in layers, and say:


4.0.3 -> logger.conf was not modified
4.0.5 -> logger.conf changed in the image, so keep the new one
4.0.6 -> logger.conf changed in the image, so keep the new one
4.0.7 -> logger.conf changed in the image, and the running image has it

In this case (as an analogue to /etc/hosts), we cannot compare the file in 4.0.5 to /usr/share/factory/etc/hosts in 4.0.5, since it would have a different hash (being from 4.0.3). To resolve this, it's repeating the changes imgbased *should* have made.
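
One way to picture the replay, as a sketch only: treat a file as untouched if its hash matches the factory copy of its own layer or of any older layer. The tuple layout, the placeholder hashes, and the name `really_modified` are assumptions for illustration, not imgbased's actual data model.

```python
def really_modified(layers, index):
    """Return True if the /etc copy in layers[index] matches no factory
    copy shipped by that layer or any older one.

    `layers` is an oldest-first list of (name, factory_hash, etc_hash)
    tuples; the layout is an assumption made for this sketch.
    """
    known_pristine = {factory for _, factory, _ in layers[: index + 1]}
    _, _, etc_hash = layers[index]
    return etc_hash not in known_pristine


# logger.conf changed in every image, but the old bulk copy kept carrying
# the 4.0.3 version forward into each new layer's /etc.
layers = [
    ("4.0.3", "f403", "f403"),
    ("4.0.5", "f405", "f403"),
    ("4.0.6", "f406", "f403"),
]
naive = [etc != factory for _, factory, etc in layers]
replayed = [really_modified(layers, i) for i in range(len(layers))]
print(naive)     # [False, True, True]
print(replayed)  # [False, False, False]
```

The naive per-layer comparison flags the bulk-copied file as modified in 4.0.5 and 4.0.6, while the replayed check recognizes it as the pristine 4.0.3 file.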

The only appropriate resolution here is to say:

/etc/hosts in 0919 differs from /etc/hosts in 0817 *and* has a newer timestamp *and* the hash for /etc/hosts in 0817 is not the same as /usr/share/factory/etc/hosts in 0817 *and* the hash for /etc/hosts in 0919 is not the same as the hash of /usr/share/factory/etc/hosts in 0919, so we should keep that.
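
Read as a predicate over an older and a newer layer, the rule could look roughly like this. The `FileState` class, its field names, and the placeholder hashes are hypothetical and exist only for the sketch:

```python
from dataclasses import dataclass


@dataclass
class FileState:
    """State of one /etc file in one layer (field names are hypothetical)."""
    etc_hash: str       # hash of the layer's /etc copy
    factory_hash: str   # hash of the layer's /usr/share/factory/etc copy
    mtime: float        # modification time of the /etc copy


def keep_newer(older: FileState, newer: FileState) -> bool:
    """Keep the newer layer's copy only if all four conditions hold."""
    return (
        newer.etc_hash != older.etc_hash          # the two layers differ
        and newer.mtime > older.mtime             # newer layer was edited later
        and older.etc_hash != older.factory_hash  # modified in the older layer
        and newer.etc_hash != newer.factory_hash  # modified in the newer layer
    )


hosts_0817 = FileState(etc_hash="a1", factory_hash="f0817", mtime=100.0)
hosts_0919 = FileState(etc_hash="a2", factory_hash="f0919", mtime=200.0)
print(keep_newer(hosts_0817, hosts_0919))  # True
```

If the file had not been edited again in the newer layer (same `etc_hash` as in the older one), the predicate returns False and the older copy would be the one carried forward.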

This is probably an acceptable workaround, but can still potentially fail. If, for example:

Boot into 0817
Modify /etc/hosts
Upgrade to 0919
Modify /etc/hosts
Upgrade to 1116
Modify /etc/hosts
Boot back to 0919
Upgrade

Which /etc/hosts should be taken? 

Similar to another bug, imgbased must now assume that layers newer than the NVR which is being upgraded from contain invalid configuration somehow and should be ignored (there should be a separate bug for this).
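
A rough sketch of that filtering, assuming layer names as printed by `imgbase layout` and a simplified numeric ordering (this is not a full RPM version comparison, and both function names are hypothetical):

```python
import re


def nvr_key(name):
    """Turn 'rhvh-4.0-0.20160919.0' into a sortable tuple of integers."""
    return tuple(int(part) for part in re.findall(r"\d+", name))


def layers_to_consider(layers, booted):
    """Keep only layers that are not newer than the booted one, oldest first."""
    limit = nvr_key(booted)
    return sorted(
        (name for name in layers if nvr_key(name) <= limit),
        key=nvr_key,
    )


layers = [
    "rhvh-4.0-0.20161116.0",
    "rhvh-4.0-0.20160817.0",
    "rhvh-4.0-0.20160919.0",
]
print(layers_to_consider(layers, "rhvh-4.0-0.20160919.0"))
# ['rhvh-4.0-0.20160817.0', 'rhvh-4.0-0.20160919.0']
```

In the scenario above, booting back to 0919 and upgrading would then draw configuration only from 0817 and 0919, and the 1116 copy of /etc/hosts would be ignored.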

In general, the suggestion ("All modifications in /etc in any layer should be updated to newer layer during upgrade.") cannot be resolved. If for no other reason, imgbased cannot be expected to know what is and is not valid syntax for every file in /etc. 

We must make a decision of what to keep, and timestamp seems like the best option, while also being aware that

Comment 3 Ying Cui 2017-03-20 09:27:59 UTC
(In reply to Ryan Barry from comment #2)
> Can you define "the contents under /etc are missing"? This has already been
> verified as part of rhbz#1417534

It means that the files under /etc which were _created_ into middle layers are missing after upgrading to latest 4.1 builds. It should not lose any data among several upgrades.

Comment 4 Ryan Barry 2017-03-20 17:15:45 UTC
(In reply to Ying Cui from comment #3)
> (In reply to Ryan Barry from comment #2)
> > Can you define "the contents under /etc are missing"? This has already been
> > verified as part of rhbz#1417534
> 
> It means that the files under /etc which were _created_ into middle layers
> are missing after upgrading to latest 4.1 builds. It should not lose any
> data among several upgrades.

This is very misleading.

For example:

Install 0916
touch /etc/test.0916
Upgrade to 1012
touch /etc/test.1012
Upgrade to 1116
touch /etc/test.1116
Upgrade to 0317

All files are actually present in /etc on the final image, as expected.

What's missing now is modifications made in subsequent layers, which is still serious, but can be handled with timestamp checking as a safety. There are still ways in which this can fail, though.

I'm hesitant to implement a 3-way merge, since it's possible for configurations to be conflicting or otherwise broken, which is why I'm defaulting to timestamps
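
For illustration, the timestamp default amounts to taking one whole copy rather than merging. The list-of-pairs shape and the name `pick_by_timestamp` are assumptions for this sketch:

```python
def pick_by_timestamp(copies):
    """Return the content of the copy with the latest modification time.

    `copies` is a list of (mtime, content) pairs, one per layer; this
    shape is an assumption made for the sketch.
    """
    _, content = max(copies, key=lambda pair: pair[0])
    return content


copies = [
    (1.0, "huzhao 0817\n"),
    (2.0, "huzhao 0817\nhuzhao 0919\n"),
    (3.0, "huzhao 0817\nhuzhao 0919\nhuzhao 1116\n"),
]
print(pick_by_timestamp(copies), end="")
```

No lines are combined across layers here: the winning copy replaces the file wholesale, which avoids producing a merged configuration that none of the layers ever contained.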

Comment 6 Huijuan Zhao 2017-04-05 09:22:23 UTC
Test version:
Build 1:
redhat-virtualization-host-4.0-20160817.0
Build 2:
redhat-virtualization-host-4.0-20160919.0
Build 3:
redhat-virtualization-host-4.0-20161116.1
Build 4:
redhat-virtualization-host-4.0-20170201.0
Build 5:
redhat-virtualization-host-4.1-20170403.0
imgbased-0.9.20-0.1.el7ev.noarch

Test steps:
Same as comment 0

Test results:
In step 11, file /etc/huzhao is:
-------------------------
# cat /etc/huzhao
huzhao 0817
huzhao 0919
huzhao 1116
-------------------------
The modifications in middle layers are updated to latest layer.

So this bug is fixed in imgbased-0.9.20-0.1.el7ev.noarch; changing the status to VERIFIED.

Comment 7 Ryan Barry 2017-04-20 17:54:45 UTC
*** Bug 1443957 has been marked as a duplicate of this bug. ***