Bug 1340382 - Upgrade to latest build ovirt-node ngn 4.0 failed via "yum update"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-node
Classification: oVirt
Component: Installation & Update
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-4.0.1
Target Release: ---
Assignee: Ryan Barry
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1323941 1326728 1334874
Reported: 2016-05-27 08:48 UTC by Huijuan Zhao
Modified: 2016-07-19 06:25 UTC

Fixed In Version: ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
Clone Of:
Environment:
Last Closed: 2016-07-19 06:25:53 UTC
oVirt Team: Node
Embargoed:
rule-engine: ovirt-4.0.z+
rule-engine: blocker+
ycui: testing_plan_complete?
rule-engine: planning_ack+
fdeutsch: devel_ack+
cshao: testing_ack+


Attachments
/var/log logs in ovirt-node (5.99 MB, application/x-gzip)
2016-05-27 08:48 UTC, Huijuan Zhao
/tmp log in ovirt-node (1.64 KB, application/x-gzip)
2016-05-27 08:49 UTC, Huijuan Zhao
repo files in ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso (2.96 KB, application/x-gzip)
2016-06-23 02:55 UTC, Huijuan Zhao
Debug info during upgrade (31.36 KB, application/vnd.oasis.opendocument.text)
2016-06-30 09:03 UTC, Huijuan Zhao
all logs during upgrade (6.27 MB, application/x-gzip)
2016-06-30 09:23 UTC, Huijuan Zhao


Links
System        ID     Private  Priority   Status     Summary                                                 Last Updated
oVirt gerrit  58467  0        master     ABANDONED  configure.ac: Use snapshot repo for ovirt-release       2020-03-17 20:47:40 UTC
oVirt gerrit  58495  0        master     MERGED     release: Build nightly to allow nightly node builds     2020-03-17 20:47:40 UTC
oVirt gerrit  58497  0        ovirt-4.0  ABANDONED  configure.ac: Use snapshot repo for ovirt-release       2020-03-17 20:47:40 UTC
oVirt gerrit  58781  0        master     MERGED     spec: remove epoch macros                               2020-03-17 20:47:40 UTC
oVirt gerrit  58787  0        ovirt-4.0  MERGED     spec: remove epoch macros                               2020-03-17 20:47:40 UTC
oVirt gerrit  59374  0        master     MERGED     node: Add markers to the repos we want to keep enabled  2020-03-17 20:47:41 UTC
oVirt gerrit  59375  0        master     MERGED     postprocess: Enable repos with a marker                 2020-03-17 20:47:41 UTC
oVirt gerrit  59390  0        ovirt-4.0  MERGED     node: Add markers to the repos we want to keep enabled  2020-03-17 20:47:41 UTC
oVirt gerrit  59432  0        master     MERGED     node: Fix repo enablement                               2020-03-17 20:47:41 UTC
oVirt gerrit  59433  0        ovirt-4.0  MERGED     node: Fix repo enablement                               2020-03-17 20:47:41 UTC
oVirt gerrit  59509  0        None       MERGED     node: Fix pkg typo                                      2020-03-17 20:47:42 UTC

Description Huijuan Zhao 2016-05-27 08:48:57 UTC
Created attachment 1162394 [details]
/var/log logs in ovirt-node

Description of problem:
Upgrading from ovirt-node ngn 3.6.6/3.6.5 to the latest ngn 3.6.7 build via "yum update" fails.

Version-Release number of selected component (if applicable):
ovirt-node-ng-installer-ovirt-3.6-2016052300.iso
imgbased-0.6-0.201605111421git8828a69.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-3.6.6-0.3.rc2.el7.noarch
ovirt-release-host-node-3.6.6-0.3.rc2.el7.noarch
ovirt-release36-snapshot-3.6.6-0.3.rc2.noarch
centos-release-7-2.1511.el7.centos.2.10.x86_64

upgrade to:
ovirt-node-ng-image-update-3.6.7-0.0.rc1.el7.noarch.rpm        
ovirt-node-ng-installer-ovirt-3.6-2016052700

How reproducible:
100%


Steps to Reproduce:
1. Install node ng 4.0 (ovirt-3.6 branch).
2. Reboot, log in to the host, and upgrade:
   # yum update


Actual results:
1. After step 2, the upgrade to the latest build fails:

Loaded plugins: fastestmirror, imgbased-warning
Warning: yum operations are not persisted across upgrades!
ovirt-node-ng-ovirt-3.6                                  | 2.9 kB     00:00    
ovirt-node-ng-ovirt-3.6/primary_db                         | 2.7 kB   00:00    
Loading mirror speeds from cached hostfile
No packages marked for update


Expected results:
1. After step 2, the upgrade should succeed.

Additional info:
Upgrading from ovirt-node-ng-installer-ovirt-3.6-2016041900.iso to the latest build (ovirt-node-ng-installer-ovirt-3.6-2016052700) also fails.

Comment 1 Huijuan Zhao 2016-05-27 08:49:47 UTC
Created attachment 1162395 [details]
/tmp log in ovirt-node

Comment 2 Red Hat Bugzilla Rules Engine 2016-05-31 09:09:56 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Douglas Schilling Landgraf 2016-05-31 20:41:09 UTC
Looks like I have reproduced the report. Investigating.

Comment 4 Douglas Schilling Landgraf 2016-06-01 19:16:31 UTC
We have requested unversioned rpms for ovirt-release* in the ovirt repos. Without them we won't be able to make yum update work again.

Here is the request from Fabian: https://ovirt-jira.atlassian.net/browse/OVIRT-555#add-comment

As soon as we have it, we can update our ngn configure.ac with the new ovirt-release rpm schema.

Comment 5 Fabian Deutsch 2016-06-01 19:23:48 UTC
Also for reference: different branches need different steps for the final solution.

In general: an image NVR changes (and thus an update becomes available) when the ovirt-release (and indirectly the placeholder) NVR changes.

In stable branches this change happens whenever the release maintainer does a new release; a yum update should then work.

In nightly branches this is currently broken, because nightly branches need to include a build timestamp in the NVR to allow nightly updates (the version and release do not _really_ change overnight, because no spec change is made, but the NVR should include the build timestamp so that an image with a higher NVR can be built).

Summary:
For stable branches this should already be working in the 4.0 and master repos.
But it's broken for the -snapshot repos of master and 4.0.

In addition we need to fix ngn to pull the yum updates from the official repos and not from Jenkins.
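
Editor's note: the NVR reasoning in this comment can be illustrated with a small sketch. It is a simplified approximation of RPM's segment-wise version comparison, not the real rpmvercmp (epochs, "~" and "^" are ignored), and the helper names are hypothetical. The version-release strings are the placeholder values quoted elsewhere in this report.

```python
import re


def split_segments(text):
    """Split a version or release string into runs of digits or letters."""
    return re.findall(r"\d+|[a-zA-Z]+", text)


def compare_strings(a, b):
    """Return -1, 0 or 1; simplified sketch of RPM's segment comparison."""
    seg_a, seg_b = split_segments(a), split_segments(b)
    for x, y in zip(seg_a, seg_b):
        if x.isdigit() and y.isdigit():
            x, y = int(x), int(y)  # numeric segments compare as integers
        elif x.isdigit() != y.isdigit():
            # A numeric segment is treated as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        if x != y:
            return 1 if x > y else -1
    # All shared segments are equal: the string with extra segments is newer.
    return (len(seg_a) > len(seg_b)) - (len(seg_a) < len(seg_b))


def compare_vr(vr_a, vr_b):
    """Compare two 'version-release' strings (no epoch handling)."""
    ver_a, _, rel_a = vr_a.partition("-")
    ver_b, _, rel_b = vr_b.partition("-")
    return compare_strings(ver_a, ver_b) or compare_strings(rel_a, rel_b)


# Same spec, no timestamp: the rebuilt image is not newer, so nothing updates.
print(compare_vr("4.0.0-5", "4.0.0-5"))
# With a build timestamp in the release, a later nightly compares higher.
print(compare_vr("4.0.0-5.201606240219", "4.0.0-5.201606200219"))
```

Under these assumptions, a nightly rebuild only becomes an update candidate once the timestamp is part of the release field.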

Comment 6 Fabian Deutsch 2016-06-01 19:36:22 UTC
Moving to 4.0, because we don't provide an update for now in 3.6

Comment 7 Ying Cui 2016-06-07 09:44:32 UTC
Do we need to care about this bug for RHEV 4.0 beta 1?
Either way, it is a blocker+ bug, so we have to request a fix ASAP to unblock QE testing on the 4.0 branch. Thanks.

Comment 8 Fabian Deutsch 2016-06-15 18:40:53 UTC
An update: this was done to enable updates:

- Images are now pulled from ovirt repos
- Images are now built for each repo (stable, pre, snapshot)
- wrapper-rpm-nvr == inner placeholder nvr (relevant to prevent upgrade to itself)

The remaining gap is to enable the relevant repos.

This was done in the ovirt-release-host-node file, but imgbase postprocess disables all repos at the end.
We need to find a mechanism to prevent imgbase from disabling the important repos.
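
Editor's note: the linked gerrit changes mention adding markers to the repos that should stay enabled. A minimal sketch of that idea follows. It is not the imgbased implementation: the marker text "# keep-enabled" and the function name are hypothetical, and the sketch assumes the marker appears before the "enabled" line of its repo section.

```python
MARKER = "# keep-enabled"  # hypothetical marker, not the one used by oVirt


def postprocess_repo_text(text, marker=MARKER):
    """Disable every repo section except those carrying the marker."""
    result = []
    keep = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            keep = False  # a new repo section starts unmarked
        elif stripped == marker:
            keep = True  # this section asked to stay enabled
        elif stripped.replace(" ", "").startswith("enabled="):
            line = "enabled=1" if keep else "enabled=0"
        result.append(line)
    return "\n".join(result) + "\n"


sample = """[ovirt-4.0-snapshot]
name=oVirt 4.0 snapshot
# keep-enabled
enabled=1

[ovirt-4.0-epel]
name=EPEL
enabled=1
"""

print(postprocess_repo_text(sample))
```

The text is processed line by line rather than with an INI parser so that the comment marker survives the rewrite.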

Comment 9 Huijuan Zhao 2016-06-23 02:49:11 UTC
1. There is a typo in oVirt gerrit change 59433, line 301: "iovirt-node-ng-image".

2. The test failed with ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso

Test version:
ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
imgbased-0.7.0-0.201606170910git3cb1db2.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-5.201606200219.el7.noarch
ovirt-release-host-node-4.0.0-5.el7.noarch
centos-release-7-2.1511.el7.centos.2.10.x86_64
ovirt-release40-snapshot-4.0.0-5.noarch

Test steps:
1. Install ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
2. Log in to rhev-h and upgrade:
# yum update

Test results:
After step 2, the update fails; the repo is not correct.

[root@dhcp-8-252 yum.repos.d]# yum update
Loaded plugins: fastestmirror, imgbased-warning
Warning: yum operations are not persisted across upgrades!
centos-ovirt40-candidate                                                                                                                       | 3.4 kB  00:00:00     
ovirt-4.0-centos-gluster37                                                                                                                     | 2.9 kB  00:00:00     
ovirt-4.0-epel/x86_64/metalink                                                                                                                 | 3.6 kB  00:00:00     
ovirt-4.0-epel                                                                                                                                 | 4.3 kB  00:00:00     
ovirt-4.0-patternfly1-noarch-epel                                                                                                              | 3.0 kB  00:00:00     
http://www.gtlib.gatech.edu/pub/oVirt/pub/ovirt-4.0-snapshot/rpm/el7/repodata/repomd.xml: [Errno 12] Timeout on http://www.gtlib.gatech.edu/pub/oVirt/pub/ovirt-4.0-snapshot/rpm/el7/repodata/repomd.xml: (28, 'Connection timed out after 30110 milliseconds')
Trying other mirror.
ovirt-4.0-snapshot                                                                                                                             | 2.9 kB  00:00:00     
ovirt-4.0-snapshot-static                                                                                                                      | 2.9 kB  00:00:00     
virtio-win-stable                                                                                                                              | 3.0 kB  00:00:00     
(1/9): centos-ovirt40-candidate/primary_db                                                                                                     | 7.4 kB  00:00:00     
(2/9): ovirt-4.0-centos-gluster37/x86_64/primary_db                                                                                            |  38 kB  00:00:01     
(3/9): ovirt-4.0-patternfly1-noarch-epel/x86_64/primary_db                                                                                     | 2.2 kB  00:00:00     
ovirt-4.0-epel/x86_64/primary_ FAILED                                                                                               ]  74 kB/s | 114 kB  00:01:08 ETA 
http://ftp.kddilabs.jp/Linux/packages/fedora/epel/7/x86_64/repodata/a6000ded4f43675213e102043f019618a20310fa87c0700ea64ade3bc747117a-primary.sqlite.xz: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
To address this issue please refer to the below knowledge base article 

https://access.redhat.com/articles/1320623

If above article doesn't help to resolve this issue please create a bug on https://bugs.centos.org/

(4/9): ovirt-4.0-snapshot-static/7/primary_db                                                                                                  |  14 kB  00:00:00     
(5/9): ovirt-4.0-snapshot/7/primary_db                                                                                                         | 110 kB  00:00:01     
(6/9): ovirt-4.0-epel/x86_64/updateinfo                                                                                                        | 572 kB  00:00:03     
(7/9): ovirt-4.0-epel/x86_64/group_gz                                                                                                          | 170 kB  00:00:03     
(8/9): virtio-win-stable/primary_db                                                                                                            | 2.0 kB  00:00:01     
ovirt-4.0-epel/x86_64/primary_ FAILED                                          
https://free.nchc.org.tw/fedora-epel/7/x86_64/repodata/a6000ded4f43675213e102043f019618a20310fa87c0700ea64ade3bc747117a-primary.sqlite.xz: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.
(9/9): ovirt-4.0-epel/x86_64/primary_db                                                                                                        | 4.2 MB  00:00:10     
Determining fastest mirrors
 * ovirt-4.0-epel: ftp.kddilabs.jp
 * ovirt-4.0-snapshot: resources.ovirt.org
 * ovirt-4.0-snapshot-static: resources.ovirt.org
No packages marked for update


So I will change the status to ASSIGNED.

Additional info:
Fabian, I wonder how customers will upgrade d/s ngn via repo: will they set up repodata themselves, or register to RHSM to use the Red Hat repodata? I would like to confirm how QE should test upgrade via repo. And what about u/s upgrade? Thanks in advance.

Comment 10 Huijuan Zhao 2016-06-23 02:55:06 UTC
Created attachment 1171183 [details]
repo files in ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso

Comment 11 Red Hat Bugzilla Rules Engine 2016-06-23 02:58:07 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 12 Ying Cui 2016-06-23 03:47:02 UTC
> Additional info:
> Fabian, I wonder how customers will upgrade d/s ngn via repo: will they
> set up repodata themselves, or register to RHSM to use the Red Hat repodata?
> I would like to confirm how QE should test upgrade via repo. And what about
> u/s upgrade? Thanks in advance.

For d/s, the user will normally register the RHVH host to RHSM, subscribe to the pool ID for RHVH, sync the repo, and then run yum update. But QE cannot run this test now, because Jira ticket RCM-3124 is in progress and bug 1343997 is in ASSIGNED status.
For Satellite, the user needs to sync the Satellite repo with RHSM, register the host to Satellite, and then run yum update.
Based on the above, a workaround: we can create a local yum repo, download the packages into it, point the repo source at it, and then run yum update.

For u/s ovirt-node-ng, the repo should be set from resources.ovirt.org.

We are blocked by repo issues on both d/s and u/s and need to escalate to unblock yum update testing.
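
Editor's note: the local-repo workaround described above could be prepared along these lines. This is a sketch under assumptions: the repo ID "local-ngn" and the directory /var/tmp/ngn-repo are hypothetical, the directory is assumed to already hold the downloaded packages plus repodata (for example generated with createrepo), and gpgcheck is switched off only because this is a throwaway test repo.

```python
import configparser
import io


def local_repo_file(repo_id, directory):
    """Return the text of a yum .repo file pointing at a local directory."""
    parser = configparser.ConfigParser()
    parser[repo_id] = {
        "name": "Local test repo " + repo_id,
        "baseurl": "file://" + directory,  # absolute path gives file:///...
        "enabled": "1",
        "gpgcheck": "0",  # test-only setting; keep signature checks in production
    }
    buffer = io.StringIO()
    # Write key=value without spaces, the style used in yum repo files.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical repo ID and path, for illustration only.
print(local_repo_file("local-ngn", "/var/tmp/ngn-repo"))
```

The resulting text would be saved under /etc/yum.repos.d/ with a .repo extension before running yum update.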

Comment 13 Ying Cui 2016-06-28 12:15:38 UTC
We have to escalate this issue to unblock upgrade testing.

Comment 14 Ryan Barry 2016-06-28 13:33:53 UTC
This typo was fixed upstream.

Can you please test?

Comment 15 Huijuan Zhao 2016-06-30 09:00:26 UTC
Upgraded via "yum update" from ovirt-node-ng-installer-ovirt-4.0-2016062004.iso today. The upgrade succeeds, with conditions:
1. There is too much debug info.
2. I have to enable the repos first, because none of the repos are enabled by default.
3. After the upgrade, "imgbase check" reports an ERROR.

Below is detailed info:

Test version:
1. Before update:
ovirt-node-ng-installer-ovirt-4.0-2016062004.iso
imgbased-0.7.0-0.201606081307gitfb92e93.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-1.el7.noarch
2. After update:
ovirt-node-ng-4.0.0-0.20160624.0
imgbased-0.7.0-0.201606081307gitfb92e93.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-5.201606240219.el7.noarch

Test steps:
1. Install ovirt-node-ng-installer-ovirt-4.0-2016062004.iso via kickstart file.
2. Reboot, log in to NGN, and enable the repos below:
CentOS-Base.repo  
CentOS-Debuginfo.repo    
cockpit-preview-epel-7.repo  
ovirt-4.0-pre.repo
ovirt-4.0-pre-dependencies.repo
3. Upgrade NGN via "yum update".
4. Reboot, log in to the new build ovirt-node-ng-4.0.0-0.20160624.0, and check its status:
# imgbase check

Test result:
1. After step 3, there is a lot of debug info during the update. For details, please refer to the attachment.
2. After step 4, there are some ERRORs:
[huzhao@huzhao ~]$ ssh root.148.10
root.148.10's password: 
Last login: Thu Jun 30 04:24:30 2016

  imgbase status: DEGRADED
  Please check the status manually using `imgbase check`

[root@dell-pet105-02 ~]# imgbase check
Status: FAILED
Mount points ... FAILED - This can happen if the installation was performed incorrectly
  Separate /var ... OK
  Discard is used ... ERROR
    Exception in '<function check_discard at 0x1d16848>': '/var'
Basic storage ... OK
  Initialized VG ... OK
  Initialized Thin Pool ... OK
  Initialized LVs ... OK
Thin storage ... OK
  Checking available space in thinpool ... OK
  Checking thinpool auto-extend ... OK

[root@dell-pet105-02 ~]# imgbase layout
ovirt-node-ng-4.0.0-0.20160620.0
 +- ovirt-node-ng-4.0.0-0.20160620.0+1
ovirt-node-ng-4.0.0-0.20160624.0
 +- ovirt-node-ng-4.0.0-0.20160624.0+1

The excessive debug info is tracked by Bug 1333776, but what about the other two issues, the repos and "imgbase check"?

Comment 16 Huijuan Zhao 2016-06-30 09:03:43 UTC
Created attachment 1174420 [details]
Debug info during upgrade

Comment 17 Huijuan Zhao 2016-06-30 09:23:29 UTC
Created attachment 1174424 [details]
all logs during upgrade

Comment 18 Ying Cui 2016-06-30 10:24:36 UTC
Thanks Huijuan. 
According to comment 15, we may need to split this into several bugs.

Issue 1: After running # yum repolist all, it is not clear which repo IDs should be enabled by default. Also, some repo names under ovirt repo IDs should be updated to match the ovirt description: every ovirt repo's name should start with ovirt.

e.g
repo ID                         repo Name
ovirt-4.0-epel/x86_64           Extra Packages for Enterprise Linux 7 - x86_64

Here should be 
repo ID                         repo Name
ovirt-4.0-epel/x86_64           oVirt-4.0-Extra Packages for CentOS 7 - x86_64

Issue 2: 
In attachment 1174420:
<snip>
sed: can't read /usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/network/bridge.py: No such file or directory
</snip>
Does this affect hosted-engine-setup functionality? A bug needs to be reported to follow up.

Issue 3: imgbase check FAILED in comment 15. This is not related to the upgrade; the error is still present after a clean installation.

# imgbase check
Status: FAILED
Mount points ... FAILED - This can happen if the installation was performed incorrectly
  Separate /var ... OK
  Discard is used ... ERROR
    Exception in '<function check_discard at 0x1d16848>': '/var'
Basic storage ... OK
  Initialized VG ... OK
  Initialized Thin Pool ... OK
  Initialized LVs ... OK
Thin storage ... OK
  Checking available space in thinpool ... OK
  Checking thinpool auto-extend ... OK 

# findmnt | grep discard
└─/var                           /dev/mapper/rhel_dell--pet105--02-var                                    ext4        rw,relatime,seclabel,discard,stripe=32,data=ordered

In anaconda-ks.cfg
<snip>
# Partition clearing information
clearpart --all --initlabel
# Disk partitioning information
part / --fstype="ext4" --size=3072 --fsoptions="discard"
</snip>
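
Editor's note: a discard check of the kind reported in Issue 3 can be sketched as below. This is not imgbased's check_discard; the function names are hypothetical and the input is assumed to be in /proc/mounts format. The bare '/var' in the error message resembles what a failed dictionary lookup prints, but the report does not confirm the cause, so the sketch simply returns None for an unknown mount point instead of raising.

```python
def mount_options(mounts_text):
    """Map each mount point to its set of options (/proc/mounts format)."""
    table = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            # Fields: device, mount point, filesystem type, options, ...
            table[fields[1]] = set(fields[3].split(","))
    return table


def uses_discard(table, mount_point):
    """Return True or False, or None when the mount point is not listed."""
    options = table.get(mount_point)
    if options is None:
        return None
    return "discard" in options


# Mount line modelled on the findmnt output quoted in this comment.
sample_mounts = (
    "/dev/mapper/rhel_dell--pet105--02-var /var ext4 "
    "rw,relatime,seclabel,discard,stripe=32,data=ordered 0 0\n"
)

table = mount_options(sample_mounts)
print(uses_discard(table, "/var"))
print(uses_discard(table, "/home"))
```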

Issue 4: Too much debug info output, which is Bug 1333776.

Comment 19 Huijuan Zhao 2016-07-01 15:41:43 UTC
Thanks for ycui's analysis and suggestions.
I reported new bug 1352098, bug 1352100 and bug 1352103 to track the issues in comment 18.
Based on comment 15 and comment 18, I will verify this bug and change the status to VERIFIED.

Comment 20 Sandro Bonazzola 2016-07-19 06:25:53 UTC
Since the problem described in this bug report should be
resolved in oVirt 4.0.1 released on July 19th 2016, it has been closed with a
resolution of CURRENT RELEASE.

For information on the release, and how to update to this release, follow the link below.

If the solution does not work for you, open a new bug report.

http://www.ovirt.org/release/4.0.1/

