Bug 1619969 - beaker-repo-update leaves corrupted packages, inconsistent repodata if a package changes while keeping the same NVR
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: general
Version: 25
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 25.6
Assignee: Dan Callaghan
QA Contact: Roman Joost
URL:
Whiteboard:
Duplicates: 1625423
Depends On:
Blocks:
Reported: 2018-08-22 03:19 UTC by Dan Callaghan
Modified: 2018-09-16 23:00 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-03 23:31:36 UTC
Embargoed:


Attachments
bad and regenerated repodata (3.36 MB, application/tar)
2018-08-28 00:51 UTC, Dan Callaghan
beaker-repo-update run log (185.62 KB, text/plain)
2018-08-28 07:41 UTC, Roman Joost

Description Dan Callaghan 2018-08-22 03:19:48 UTC
As part of the efforts to move all our package builds from our "temporary" team-specific Koji instance into Brew, I recently updated the RHEL4-6 harness repos to come from Brew:

https://git.beaker-project.org/cgit/beaker-project.org/commit/?id=1adc63869b3bd0d9bae30a2c2f71633a81aafd68

One consequence of that change is that a large number of packages in those repos have kept the same NVR while *not* being byte-for-byte identical, because they have actually been rebuilt in a slightly different environment (although from identical sources).

I ran beaker-repo-update in our beaker-devel environment shortly after changing the repos over, to pull down the new packages.

However, it seems that doing so has left the /var/www/beaker/harness directories with corrupted repodata. Specifically, I noticed this error in a job where restraint was trying to install beakerlib-1.17-14.el6bkr:

http://beaker-devel.app.eng.bos.redhat.com/harness/RedHatEnterpriseLinux6/beakerlib-1.17-14.el6bkr.noarch.rpm: [Errno -1] Package does not match intended download. Suggestion: run yum --enablerepo=beaker-harness clean metadata

When I inspected the repodata I saw that it did indeed list a slightly different size and different SHA1 checksum for beakerlib-1.17-14.el6bkr.noarch.rpm compared with what exists on disk. I think that it was the old size and checksum for the build from our temporary Koji (can't be certain unfortunately because I didn't keep a note of the incorrect checksum).

Probably, when it produces the repodata, beaker-repo-update incorrectly assumes that a file whose NVR has not changed also has an unchanged checksum.
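
The check that would catch this is small. Below is a minimal sketch, not the beaker-repo-update code: it compares a file on disk against the size and SHA-1 recorded for it, rather than trusting the file name. The function names `file_sha1` and `matches_metadata` are hypothetical, chosen for illustration.

```python
import hashlib
import os


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, expected_sha1, expected_size):
    """True only if both the size and the SHA-1 agree with the recorded values.

    Hypothetical helper: the expected values would come from the repodata.
    """
    if os.path.getsize(path) != expected_size:
        return False  # cheap check first, avoids hashing obviously wrong files
    return file_sha1(path) == expected_sha1
```

Two builds with the same NVR share a file name, so a check like this is the only way to tell them apart once they are on disk.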

Comment 1 Dan Callaghan 2018-08-22 03:20:51 UTC
Workaround is to explicitly recreate the repodata for all harness repos on your Beaker server:

for d in /var/www/beaker/harness/* ; do ( cd "$d" && createrepo_c --checksum sha --no-database . ) ; done

Comment 2 Dan Callaghan 2018-08-28 00:04:57 UTC
beaker-repo-update messed up the repos on beaker-devel again yesterday -- even the RHEL5 ones, where there should have been no silent NVR switcheroos (finished those for RHEL5 last week).

So there is definitely something wrong with the repodata being produced by beaker-repo-update. I'm not sure what. I will grab the bad repodata that's currently on there now and attach it here, for comparison with corresponding good repodata.

Comment 3 Dan Callaghan 2018-08-28 00:48:11 UTC
Seen in: https://beaker-devel.app.eng.bos.redhat.com/recipes/25446

http://beaker-devel.app.eng.bos.redhat.com/harness/RedHatEnterpriseLinuxServer5/rhts-test-env-4.74-1.el5bkr.noarch.rpm: [Errno -1] Package does not match intended download 

Diffing the bad vs. regenerated repodata I do indeed see that this package changed while keeping its NVR the same:

   <name>rhts-test-env</name>
   <arch>noarch</arch>
   <version epoch="0" ver="4.74" rel="1.el5bkr"/>
-  <checksum type="sha" pkgid="YES">baa9aed8ace0df55b1d3c6d8ac6922a2a73183dc</checksum>
+  <checksum type="sha" pkgid="YES">24c7f435f7dc3a19474ca673050220c33119f923</checksum>
   <summary>Testing API</summary>
   <description>This package contains components of the test system used when running
 tests, either on a developer's workstation, or within a lab.</description>
   <packager>Koji</packager>
   <url></url>
-  <time file="1517974945" build="1517968224"/>
-  <size package="45056" installed="119750" archive="124456"/>
+  <time file="1534813682" build="1517968224"/>
+  <size package="45077" installed="119750" archive="124456"/>
   <location href="rhts-test-env-4.74-1.el5bkr.noarch.rpm"/>
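
For comparing repodata like this programmatically, here is a rough sketch that pulls the recorded checksum and package size out of a primary.xml document. It assumes the text has already been decompressed (the file is normally shipped as primary.xml.gz), and `package_records` is a hypothetical name, not anything in Beaker.

```python
import xml.etree.ElementTree as ET

# Namespace used by the <metadata> and <package> elements in primary.xml.
COMMON_NS = "{http://linux.duke.edu/metadata/common}"


def package_records(primary_xml_text):
    """Map each package's location href to its recorded (checksum, size)."""
    root = ET.fromstring(primary_xml_text)
    records = {}
    for pkg in root.iter(COMMON_NS + "package"):
        href = pkg.find(COMMON_NS + "location").get("href")
        checksum = pkg.find(COMMON_NS + "checksum").text
        size = int(pkg.find(COMMON_NS + "size").get("package"))
        records[href] = (checksum, size)
    return records
```

Running this over the bad and the regenerated primary.xml and diffing the two dictionaries would list every package whose checksum changed under the same file name.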

I wonder if our Jenkins is doing something wrong...

Comment 4 Dan Callaghan 2018-08-28 00:51:02 UTC
Created attachment 1479105
bad and regenerated repodata

Comment 5 Dan Callaghan 2018-08-28 01:12:19 UTC
I went back through all the console logs of our beaker-redhat-yum-repos jobs on Jenkins. The last time it touched rhts-test-env-4.74-1.el5bkr was job #2121 which ran for 1 hour 12 minutes starting 20 August 2018 23:58:11 UTC.

That was the job for the commit "switch to Brew for RHEL4-6 harness repos".

So it doesn't seem like Jenkins has done anything wrong.

Indeed, the file on disk on beaker-devel has a modtime of 21 August 01:08 UTC which lines up with the above.

The only mystery here is why this keeps going backwards. I first hit this problem last week (22 August according to the bug timestamp) and regenerated the repodata for all repos as per comment 1 as a workaround. Then yesterday evening (27 August) I re-ran beaker-repo-update and somehow it changed the repodata back to use the incorrect checksum.

Comment 6 Dan Callaghan 2018-08-28 01:16:52 UTC
Here is something suspicious though. The checksum of the file on disk in /var/www/beaker/harness/RedHatEnterpriseLinuxServer5/ matches NEITHER the old build from beakerkoji NOR the new build from Brew:

$ shasum rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-b*
24c7f435f7dc3a19474ca673050220c33119f923  rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beaker-devel
baa9aed8ace0df55b1d3c6d8ac6922a2a73183dc  rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beakerkoji
ca223594bcf85f1cd2e3f41431d9bc68b33853b3  rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-brew

It seems like the package on disk on beaker-devel is somehow corrupted. rpm -q --info shows that it's still the old build according to its RPM header:

$ rpm -q --info -p rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beakerkoji | grep Build
Build Date  : Wed 07 Feb 2018 11:50:24 AEST
Build Host  : test4.dcallagh.beakerdevs.lab.eng.bne.redhat.com
$ rpm -q --info -p rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-brew | grep Build
Build Date  : Fri 27 Apr 2018 15:16:44 AEST
Build Host  : ppc-030.build.eng.bos.redhat.com
$ rpm -q --info -p rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beaker-devel | grep Build
Build Date  : Wed 07 Feb 2018 11:50:24 AEST
Build Host  : test4.dcallagh.beakerdevs.lab.eng.bne.redhat.com

but its size matches the (slightly larger) new build from Brew:

-rw-r--r--.  1 dcallagh dcallagh   45077 Aug 21 11:08 rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beaker-devel
-rw-r--r--.  1 dcallagh dcallagh   45056 Feb  7  2018 rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beakerkoji
-rw-r--r--.  1 dcallagh dcallagh   45077 Apr 27 15:16 rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-brew

Comment 7 Dan Callaghan 2018-08-28 01:24:24 UTC
I was worried that we might have a mistake in our build-yum-repos.py script that runs in Jenkins, or the rsync command that runs at the end. I thought it might have accidentally left some mangled file on disk. But it seems not to be the case.

Both /var/lib/jenkins/workspace/beaker-redhat-yum-repos/beakerrepos/harness/RedHatEnterpriseLinux5/rhts-test-env-4.74-1.el5bkr.noarch.rpm on the Jenkins slave, and /net/aoe-cluster.lab.bos.redhat.com/exports/beakerrepos/harness/RedHatEnterpriseLinux5/rhts-test-env-4.74-1.el5bkr.noarch.rpm (the download mirror) have the expected size and checksum ca223594bcf85f1cd2e3f41431d9bc68b33853b3.

The corrupted package only appears on disk in /var/www/beaker/harness on beaker-devel, meaning this must be a bug in beaker-repo-update.

Comment 8 Dan Callaghan 2018-08-28 01:44:35 UTC
I just noticed /var/www was full on beaker-devel. So beaker-repo-update was failing to write but not actually erroring out. I am not sure if that could be the cause of the corrupted downloads although it certainly won't help (and certainly beaker-repo-update is not smart enough to notice that a previous run produced corrupted packages either).
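
A full filesystem is cheap to detect up front. The following is only a sketch of such a pre-flight check, assuming the caller knows roughly how many bytes it is about to write; `has_room_for` is a hypothetical helper and is not part of beaker-repo-update.

```python
import shutil


def has_room_for(directory, needed_bytes):
    """True if the filesystem holding `directory` has at least `needed_bytes` free."""
    return shutil.disk_usage(directory).free >= needed_bytes
```

A check like this only narrows the window, since the disk can still fill up mid-download, so write errors would still need to be surfaced rather than swallowed.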

Comment 9 Dan Callaghan 2018-08-28 02:55:11 UTC
The yum code called by beaker-repo-update appears to be using cache data in /var/tmp/yum-* in spite of all the attempts in beaker-repo-update to prevent that. So that might explain how it sometimes manages to go "back in time", if there is an error and it falls back to an old cached copy of its metadata.

Comment 10 Dan Callaghan 2018-08-28 03:48:10 UTC
https://gerrit.beaker-project.org/#/c/beaker/+/6281 tests: use http:// instead of file:// for beaker-repo-update
https://gerrit.beaker-project.org/#/c/beaker/+/6282 beaker-repo-update: handle missing OS majors as a special case
https://gerrit.beaker-project.org/#/c/beaker/+/6283 beaker-repo-update: verify package checksums when downloading

Comment 11 Dan Callaghan 2018-08-28 03:53:35 UTC
While testing this, I also noticed that if an incomplete file exists on disk, beaker-repo-update (or rather the Yum/urlgrabber code we are calling into) just downloads the file again and *appends* it to whatever junk is already there. :-(

At least with the above patch, the checksum verification will catch any bugs like that in future.
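
One common way to rule out the append problem entirely is to never write into the destination file directly. The sketch below is an illustration of that pattern, not what the patch does: it streams the download into a temporary file in the same directory, verifies the SHA-1, and only then renames it over the destination. `store_verified` and its parameters are hypothetical, and `chunks` stands in for whatever the download library yields.

```python
import hashlib
import os
import tempfile


def store_verified(chunks, dest_path, expected_sha1):
    """Write `chunks` to `dest_path` only if their SHA-1 matches `expected_sha1`.

    Any existing file at `dest_path` is replaced as a whole, never appended to.
    """
    digest = hashlib.sha1()
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Same directory as the destination, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                digest.update(chunk)
                tmp.write(chunk)
        if digest.hexdigest() != expected_sha1:
            raise ValueError("checksum mismatch for %s" % dest_path)
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the partial file on any failure, then re-raise.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that `tempfile.mkstemp` creates the file readable by its owner only, so a real tool serving these files over HTTP would also have to set the mode before the rename.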

Comment 13 Roman Joost 2018-08-28 07:41:45 UTC
Created attachment 1479152
beaker-repo-update run log

Verified on beaker-devel. While running, beaker-repo-update logged several entries like the following:

2018-08-28 07:37:06,394 bkr.server.tools.repo_update INFO Unlinking bad package /var/www/beaker/harness/RedHatStorageSoftwareAppliance3/beaker-system-scan-debuginfo-2.3-3.el6bkr.x86_64.rpm
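
For readers who want a feel for what such a log line corresponds to, here is a rough sketch of the general idea: remove any package whose on-disk checksum disagrees with the expected one, so that it gets fetched again. This is an assumption about the shape of the behaviour, not the code in bkr.server.tools.repo_update; `unlink_bad_packages` and its arguments are hypothetical.

```python
import hashlib
import logging
import os

log = logging.getLogger(__name__)


def unlink_bad_packages(repo_dir, expected_sha1_by_name):
    """Delete files in `repo_dir` whose SHA-1 differs from the expected value.

    Returns the list of file names that were removed. Files that are
    missing from disk are skipped; they simply need downloading.
    """
    removed = []
    for name, expected in sorted(expected_sha1_by_name.items()):
        path = os.path.join(repo_dir, name)
        if not os.path.exists(path):
            continue
        with open(path, "rb") as f:
            actual = hashlib.sha1(f.read()).hexdigest()
        if actual != expected:
            log.info("Unlinking bad package %s", path)
            os.unlink(path)
            removed.append(name)
    return removed
```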

Comment 14 Roman Joost 2018-09-03 23:31:36 UTC
This has been released with Beaker 25.6.

Release Notes: https://beaker-project.org/docs/whats-new/release-25.html#beaker-25-6

Comment 15 Dan Callaghan 2018-09-16 23:00:17 UTC
*** Bug 1625423 has been marked as a duplicate of this bug. ***

