Created attachment 1162394 [details]
/var/log logs in ovirt-node
Description of problem:
Upgrade from ovirt-node ngn 3.6.6/3.6.5 to the latest build ngn 3.6.7 failed via "yum update".
Version-Release number of selected component (if applicable):
ovirt-node-ng-installer-ovirt-3.6-2016052300.iso
imgbased-0.6-0.201605111421git8828a69.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-3.6.6-0.3.rc2.el7.noarch
ovirt-release-host-node-3.6.6-0.3.rc2.el7.noarch
ovirt-release36-snapshot-3.6.6-0.3.rc2.noarch
centos-release-7-2.1511.el7.centos.2.10.x86_64
upgrade to:
ovirt-node-ng-image-update-3.6.7-0.0.rc1.el7.noarch.rpm
ovirt-node-ng-installer-ovirt-3.6-2016052700
How reproducible:
100%
Steps to Reproduce:
1. Install node ng 4.0(ovirt-3.6 branch)
2. Reboot and log in to the host, then upgrade:
# yum update
Actual results:
1. After step 2, the upgrade to the latest build failed:
Loaded plugins: fastestmirror, imgbased-warning
Warning: yum operations are not persisted across upgrades!
ovirt-node-ng-ovirt-3.6 | 2.9 kB 00:00
ovirt-node-ng-ovirt-3.6/primary_db | 2.7 kB 00:00
Loading mirror speeds from cached hostfile
No packages marked for update
Expected results:
1. After step 2, the upgrade should succeed.
Additional info:
Upgrade from ovirt-node-ng-installer-ovirt-3.6-2016041900.iso to the latest build (ovirt-node-ng-installer-ovirt-3.6-2016052700) also failed.
Created attachment 1162395 [details] /tmp log in ovirt-node
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Looks like I have reproduced the report. Investigating.
We have requested unversioned rpms for ovirt-release* in the ovirt repos. Without them we won't be able to make yum update work again. Here is the request from Fabian: https://ovirt-jira.atlassian.net/browse/OVIRT-555#add-comment As soon as we have them, we can update our ngn configure.ac with the new ovirt-release rpm schema.
Also, for reference: different branches need different steps for the final solution. In general, an image NVR changes (and thus an update becomes available) when the ovirt-release (and, indirectly, the placeholder) NVR changes. In stable branches this change happens whenever the release maintainer does a new release; a yum update should then work. In nightly branches this is currently broken, because nightly branches need to include a build timestamp in the NVR to allow nightly updates (the version and release do not _really_ change overnight, because no spec change is made, but the NVR should include the build stamp so that an image with a higher NVR can be built). Summary: for stable branches this should already be working in the 4.0 and master repos, but it is broken for the -snapshot repos of master and 4.0. In addition, we need to fix ngn to pull the yum updates from the official repos and not from Jenkins.
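To illustrate why the build timestamp matters, here is a minimal sketch in Python. It is a simplified, hypothetical comparison of release strings and not rpm's actual algorithm; the helper names `split_segments` and `compare_labels` are made up for this example. The release strings are modeled on the placeholder packages mentioned in this bug.

```python
import re


def split_segments(label):
    """Split a version or release string into numeric and alphabetic segments."""
    return [int(s) if s.isdigit() else s
            for s in re.findall(r"\d+|[A-Za-z]+", label)]


def compare_labels(a, b):
    """Return -1, 0 or 1 if a is older than, equal to, or newer than b."""
    seg_a, seg_b = split_segments(a), split_segments(b)
    for x, y in zip(seg_a, seg_b):
        if x == y:
            continue
        x_num, y_num = isinstance(x, int), isinstance(y, int)
        if x_num != y_num:
            # Assumption of this sketch: a numeric segment sorts as newer
            # than an alphabetic one.
            return 1 if x_num else -1
        return -1 if x < y else 1
    # All shared segments are equal: the label with more segments is newer.
    return (len(seg_a) > len(seg_b)) - (len(seg_a) < len(seg_b))


# Same spec and no timestamp: two nightly builds look identical.
print(compare_labels("5.el7", "5.el7"))  # prints 0
# With a build timestamp in the release, the later nightly sorts as newer.
print(compare_labels("5.201606200219.el7", "5.201606240219.el7"))  # prints -1
```

With equal labels there is nothing to update to, which matches the "No packages marked for update" result; only the timestamped release gives the later build a higher NVR.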
Moving to 4.0, because we don't provide an update for now in 3.6
Do we need to care about this bug for RHEV 4.0 beta 1? Either way, it is a blocker+ bug, so we have to request that it be fixed ASAP to unblock QE testing on the 4.0 branch. Thanks.
An update: this was done to enable updates:
- Images are now pulled from ovirt repos
- Images are now built for each repo (stable, pre, snapshot)
- wrapper-rpm-nvr == inner placeholder nvr (relevant to prevent upgrade to itself)
The remaining gap is to enable the relevant repos. This was done in the ovirt-release-host-node file, but imgbase postprocess disables all repos at the end. We need to find a mechanism to prevent imgbase from disabling the important repos.
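One possible shape for such a mechanism is sketched below in Python. This is not imgbased's actual postprocess code: the function name `disable_repos_except`, the keep-list approach, and the sample repo text (including the `example.invalid` URLs) are all assumptions made for illustration. It only relies on the fact that yum `.repo` files are INI-style and carry an `enabled` option per repo section.

```python
import configparser
import io

# Hypothetical sample content; the repo IDs appear in this bug, the URLs do not.
SAMPLE_REPO = """\
[ovirt-4.0-snapshot]
name=oVirt 4.0 snapshot (placeholder)
baseurl=http://repo.example.invalid/ovirt-4.0-snapshot/
enabled=1

[ovirt-4.0-epel]
name=EPEL (placeholder)
baseurl=http://repo.example.invalid/epel/
enabled=1
"""


def disable_repos_except(repo_text, keep):
    """Return repo_text with every repo disabled except the IDs in keep."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(repo_text)
    for section in parser.sections():
        parser[section]["enabled"] = "1" if section in keep else "0"
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


print(disable_repos_except(SAMPLE_REPO, {"ovirt-4.0-snapshot"}))
```

The keep-list would hold the repo that carries the image update rpm, so that "yum update" still finds it after all other repos are turned off.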
1. There is typo in oVirt gerrit ID 59433, line 301 "iovirt-node-ng-image".
2. Tested failed in ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
Test version:
ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
imgbased-0.7.0-0.201606170910git3cb1db2.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-5.201606200219.el7.noarch
ovirt-release-host-node-4.0.0-5.el7.noarch
centos-release-7-2.1511.el7.centos.2.10.x86_64
ovirt-release40-snapshot-4.0.0-5.noarch
Test steps:
1. Install ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
2. Login rhev-h and upgrade:
# yum update
Test results:
After step2, failed to update, the repo is not correct
[root@dhcp-8-252 yum.repos.d]# yum update
Loaded plugins: fastestmirror, imgbased-warning
Warning: yum operations are not persisted across upgrades!
centos-ovirt40-candidate | 3.4 kB 00:00:00
ovirt-4.0-centos-gluster37 | 2.9 kB 00:00:00
ovirt-4.0-epel/x86_64/metalink | 3.6 kB 00:00:00
ovirt-4.0-epel | 4.3 kB 00:00:00
ovirt-4.0-patternfly1-noarch-epel | 3.0 kB 00:00:00
http://www.gtlib.gatech.edu/pub/oVirt/pub/ovirt-4.0-snapshot/rpm/el7/repodata/repomd.xml: [Errno 12] Timeout on http://www.gtlib.gatech.edu/pub/oVirt/pub/ovirt-4.0-snapshot/rpm/el7/repodata/repomd.xml: (28, 'Connection timed out after 30110 milliseconds')
Trying other mirror.
ovirt-4.0-snapshot | 2.9 kB 00:00:00
ovirt-4.0-snapshot-static | 2.9 kB 00:00:00
virtio-win-stable | 3.0 kB 00:00:00
(1/9): centos-ovirt40-candidate/primary_db | 7.4 kB 00:00:00
(2/9): ovirt-4.0-centos-gluster37/x86_64/primary_db | 38 kB 00:00:01
(3/9): ovirt-4.0-patternfly1-noarch-epel/x86_64/primary_db | 2.2 kB 00:00:00
ovirt-4.0-epel/x86_64/primary_ FAILED ] 74 kB/s | 114 kB 00:01:08 ETA
http://ftp.kddilabs.jp/Linux/packages/fedora/epel/7/x86_64/repodata/a6000ded4f43675213e102043f019618a20310fa87c0700ea64ade3bc747117a-primary.sqlite.xz: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
To address this issue please refer to the below knowledge base article
https://access.redhat.com/articles/1320623
If above article doesn't help to resolve this issue please create a bug on https://bugs.centos.org/
(4/9): ovirt-4.0-snapshot-static/7/primary_db | 14 kB 00:00:00
(5/9): ovirt-4.0-snapshot/7/primary_db | 110 kB 00:00:01
(6/9): ovirt-4.0-epel/x86_64/updateinfo | 572 kB 00:00:03
(7/9): ovirt-4.0-epel/x86_64/group_gz | 170 kB 00:00:03
(8/9): virtio-win-stable/primary_db | 2.0 kB 00:00:01
ovirt-4.0-epel/x86_64/primary_ FAILED
https://free.nchc.org.tw/fedora-epel/7/x86_64/repodata/a6000ded4f43675213e102043f019618a20310fa87c0700ea64ade3bc747117a-primary.sqlite.xz: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.
(9/9): ovirt-4.0-epel/x86_64/primary_db | 4.2 MB 00:00:10
Determining fastest mirrors
 * ovirt-4.0-epel: ftp.kddilabs.jp
 * ovirt-4.0-snapshot: resources.ovirt.org
 * ovirt-4.0-snapshot-static: resources.ovirt.org
No packages marked for update
So I will change the status to ASSIGNED.
Additional info:
Fabian, I wonder how will customers upgrade d/s ngn via repo, will they setup repodata themselves or register to RHSM to use red hat repodata? I would like to confirm how should QE test upgrade via repo. And what about u/s upgrade? Thanks in advance.
Created attachment 1171183 [details] repo files in ovirt-node-ng-installer-ovirt-4.0-snapshot-2016062108.iso
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
> Additional info:
> Fabian, I wonder how will customers upgrade d/s ngn via repo, will they
> setup repodata themselves or register to RHSM to use red hat repodata? I
> would like to confirm how should QE test upgrade via repo. And what about
> u/s upgrade? Thanks in advance.
For d/s, the user normally registers the RHVH host to RHSM, subscribes to the pool ID for RHVH, syncs the repo, and then runs yum update. But QE cannot run this test now, because Jira ticket RCM-3124 is still in process and bug 1343997 is still assigned.
For Satellite, the user needs to sync the repo from RHSM to Satellite, register the host to Satellite, and then run yum update.
Based on the above, a workaround: create a local yum repo, download the packages into it, point the repo source at it, and then run yum update.
For u/s ovirt-node-ng, the repo should be set to resources.ovirt.org.
We are blocked by repo issues on both d/s and u/s, and need to escalate to unblock yum update testing.
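The local-repo workaround above could look roughly like the following Python sketch, which only generates the text of a `.repo` file. The repo ID, the directory path, and the helper name `local_repo_stanza` are hypothetical; `baseurl`, `enabled` and `gpgcheck` are standard yum repo options, and disabling `gpgcheck` is an assumption that would only be acceptable on a test host. The directory would also need repodata, typically generated with createrepo.

```python
def local_repo_stanza(repo_id, directory):
    """Build the text of a .repo file that points yum at a local directory."""
    return "\n".join([
        f"[{repo_id}]",
        f"name=Local test repo {repo_id}",
        f"baseurl=file://{directory}",
        "enabled=1",
        # Assumption: signature checking is skipped for QE testing only.
        "gpgcheck=0",
        "",
    ])


# Hypothetical repo ID and path, chosen for this example.
print(local_repo_stanza("local-ngn-update", "/var/tmp/local-ngn-update"))
```

The resulting text would be saved under /etc/yum.repos.d/ before running "yum update".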
We have to escalate this issue to unblock upgrade testing.
This typo was fixed upstream. Can you please test?
Tested upgrade via "yum update" from ovirt-node-ng-installer-ovirt-4.0-2016062004.iso today. The upgrade succeeds, with conditions:
1. There is too much debug info.
2. I have to enable the repos first, because none of the repos are enabled by default.
3. After the upgrade, "imgbase check" reports an ERROR.
Below is the detailed info:
Test version:
1. Before update:
ovirt-node-ng-installer-ovirt-4.0-2016062004.iso
imgbased-0.7.0-0.201606081307gitfb92e93.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-1.el7.noarch
2. After update:
ovirt-node-ng-4.0.0-0.20160624.0
imgbased-0.7.0-0.201606081307gitfb92e93.el7.centos.noarch
ovirt-node-ng-image-update-placeholder-4.0.0-5.201606240219.el7.noarch
Test steps:
1. Install ovirt-node-ng-installer-ovirt-4.0-2016062004.iso via kickstart file.
2. Reboot and log in to NGN, then enable the repos below:
CentOS-Base.repo
CentOS-Debuginfo.repo
cockpit-preview-epel-7.repo
ovirt-4.0-pre.repo
ovirt-4.0-pre-dependencies.repo
3. Upgrade NGN via "yum update"
4. Reboot and log in to the new build ovirt-node-ng-4.0.0-0.20160624.0, then check the info:
# imgbase check
Test result:
1. After step 3, there is a lot of debug info during the update. For details, please refer to the attachment.
2. After step 4, there are some ERRORs:
[huzhao@huzhao ~]$ ssh root.148.10
root.148.10's password:
Last login: Thu Jun 30 04:24:30 2016
imgbase status: DEGRADED
Please check the status manually using `imgbase check`
[root@dell-pet105-02 ~]# imgbase check
Status: FAILED
Mount points ... FAILED - This can happen if the installation was performed incorrectly
Separate /var ... OK
Discard is used ... ERROR
Exception in '<function check_discard at 0x1d16848>': '/var'
Basic storage ... OK
Initialized VG ... OK
Initialized Thin Pool ... OK
Initialized LVs ... OK
Thin storage ... OK
Checking available space in thinpool ... OK
Checking thinpool auto-extend ... OK
[root@dell-pet105-02 ~]# imgbase layout
ovirt-node-ng-4.0.0-0.20160620.0
 +- ovirt-node-ng-4.0.0-0.20160620.0+1
ovirt-node-ng-4.0.0-0.20160624.0
 +- ovirt-node-ng-4.0.0-0.20160624.0+1
The debug info issue is tracked by Bug 1333776, but what about the other 2 issues, the repos and "imgbase check"?
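For anyone automating this verification, a small Python sketch like the one below could pull the non-OK entries out of saved "imgbase check" output. It is not part of imgbased; the function name `failing_checks` is hypothetical, and it assumes each check is printed on its own line in the form "name ... STATUS", as the output in this comment suggests.

```python
import re

# Assumption: one check per line, formatted as "<name> ... <STATUS>".
CHECK_LINE = re.compile(r"^\s*(?P<name>.+?) \.\.\. (?P<status>OK|FAILED|ERROR)\b")

SAMPLE_OUTPUT = """\
Status: FAILED
Mount points ... FAILED - This can happen if the installation was performed incorrectly
Separate /var ... OK
Discard is used ... ERROR
Exception in '<function check_discard at 0x1d16848>': '/var'
Basic storage ... OK
Initialized VG ... OK
Thin storage ... OK
"""


def failing_checks(output):
    """Return (check name, status) pairs for every check that is not OK."""
    failures = []
    for line in output.splitlines():
        match = CHECK_LINE.match(line)
        if match and match.group("status") != "OK":
            failures.append((match.group("name"), match.group("status")))
    return failures


for name, status in failing_checks(SAMPLE_OUTPUT):
    print(f"{status}: {name}")
```

On the sample above this reports the "Mount points" and "Discard is used" checks, the same two entries called out in this comment.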
Created attachment 1174420 [details] Debug info during upgrade
Created attachment 1174424 [details] all logs during upgrade
Thanks Huijuan. According to comment 15, we may split this into several bugs.
Issue 1:
When running "# yum repolist all", it is not clear which repo IDs should be enabled by default. Also, the repo names of some ovirt repo IDs should be updated to match the oVirt description: every oVirt repo name should start with "oVirt". For example, currently:
repo ID                  repo Name
ovirt-4.0-epel/x86_64    Extra Packages for Enterprise Linux 7 - x86_64
It should be:
repo ID                  repo Name
ovirt-4.0-epel/x86_64    oVirt-4.0-Extra Packages for CentOS 7 - x86_64
Issue 2:
In attachment 1174420 [details]
<snip>
sed: can't read /usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/network/bridge.py: No such file or directory
</snip>
Does this have a functional impact on hosted-engine-setup? A bug needs to be reported to follow up.
Issue 3:
The "imgbase check" FAILED status in comment 15 is not related to the upgrade; the same error appears after a clean installation.
# imgbase check
Status: FAILED
Mount points ... FAILED - This can happen if the installation was performed incorrectly
Separate /var ... OK
Discard is used ... ERROR
Exception in '<function check_discard at 0x1d16848>': '/var'
Basic storage ... OK
Initialized VG ... OK
Initialized Thin Pool ... OK
Initialized LVs ... OK
Thin storage ... OK
Checking available space in thinpool ... OK
Checking thinpool auto-extend ... OK
# findmnt | grep discard
└─/var /dev/mapper/rhel_dell--pet105--02-var ext4 rw,relatime,seclabel,discard,stripe=32,data=ordered
In anaconda-ks.cfg
<snip>
# Partition clearing information
clearpart --all --initlabel
# Disk partitioning information
part / --fstype="ext4" --size=3072 --fsoptions="discard"
</snip>
Issue 4:
Too much debug info output, which is Bug 1333776.
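To cross-check the discard finding from Issue 3 without reading the findmnt output by eye, a sketch along these lines could be used. It is written in Python with hypothetical helper names, and it assumes the default findmnt column order (TARGET, SOURCE, FSTYPE, OPTIONS) with the options as the last whitespace-separated field; the sample line is the one quoted above.

```python
def mount_options(findmnt_line):
    """Return the mount options from one line of default findmnt output."""
    # Strip the tree-drawing prefix findmnt puts in front of nested mounts.
    fields = findmnt_line.lstrip("│├└─ ").split()
    return fields[-1].split(",") if fields else []


def has_option(findmnt_line, option):
    """Tell whether the given mount option is set on that line."""
    return option in mount_options(findmnt_line)


LINE = ("└─/var /dev/mapper/rhel_dell--pet105--02-var ext4 "
        "rw,relatime,seclabel,discard,stripe=32,data=ordered")

print(has_option(LINE, "discard"))
```

This only inspects the text of one line; it makes no claim about how imgbase's own check_discard works or why it raised on '/var'.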
Thanks to ycui for the analysis and suggestions. I reported new bug 1352098, bug 1352100 and bug 1352103 to track the issues in comment 18. Based on comment 15 and comment 18, I will verify this bug and change the status to VERIFIED.
Since the problem described in this bug report should be resolved in oVirt 4.0.1 released on July 19th 2016, it has been closed with a resolution of CURRENT RELEASE. For information on the release, and how to update to this release, follow the link below. If the solution does not work for you, open a new bug report. http://www.ovirt.org/release/4.0.1/