As part of the efforts to move all our package builds from our "temporary" team-specific Koji instance into Brew, I recently updated the RHEL4-6 harness repos to come from Brew:
https://git.beaker-project.org/cgit/beaker-project.org/commit/?id=1adc63869b3bd0d9bae30a2c2f71633a81aafd68

One consequence of that change is that a large number of packages in those repos have kept the same NVR while *not* being byte-for-byte identical, because they have actually been rebuilt in a slightly different environment (although from identical sources).

I ran beaker-repo-update in our beaker-devel environment shortly after changing the repos over, to pull down the new packages. However it seems that doing so has left the /var/www/beaker/harness directories with corrupted repodata.

Specifically I noticed this error in a job, where restraint was trying to install beakerlib-1.17-14.el6bkr:

http://beaker-devel.app.eng.bos.redhat.com/harness/RedHatEnterpriseLinux6/beakerlib-1.17-14.el6bkr.noarch.rpm: [Errno -1] Package does not match intended download. Suggestion: run yum --enablerepo=beaker-harness clean metadata

When I inspected the repodata I saw that it did indeed list a slightly different size and different SHA1 checksum for beakerlib-1.17-14.el6bkr.noarch.rpm compared with what exists on disk. I think that it was the old size and checksum for the build from our temporary Koji (can't be certain unfortunately because I didn't keep a note of the incorrect checksum).

Probably beaker-repo-update is incorrectly assuming when it produces the repodata that if a file has not changed its NVR then it has also not changed its checksum.
The workaround is to explicitly recreate the repodata for all harness repos on your Beaker server:

for d in /var/www/beaker/harness/* ; do ( cd "$d" && createrepo_c --checksum sha --no-database . ) ; done
beaker-repo-update messed up the repos on beaker-devel again yesterday -- even the RHEL5 ones, where there should have been no silent NVR switcheroos (finished those for RHEL5 last week).

So there is definitely something wrong with the repodata being produced by beaker-repo-update. I'm not sure what. I will grab the bad repodata that's currently on there and attach it here, for comparison with the corresponding good repodata.
Seen in: https://beaker-devel.app.eng.bos.redhat.com/recipes/25446

http://beaker-devel.app.eng.bos.redhat.com/harness/RedHatEnterpriseLinuxServer5/rhts-test-env-4.74-1.el5bkr.noarch.rpm: [Errno -1] Package does not match intended download

Diffing the bad vs. regenerated repodata I do indeed see that this package changed while keeping its NVR the same:

   <name>rhts-test-env</name>
   <arch>noarch</arch>
   <version epoch="0" ver="4.74" rel="1.el5bkr"/>
-  <checksum type="sha" pkgid="YES">baa9aed8ace0df55b1d3c6d8ac6922a2a73183dc</checksum>
+  <checksum type="sha" pkgid="YES">24c7f435f7dc3a19474ca673050220c33119f923</checksum>
   <summary>Testing API</summary>
   <description>This package contains components of the test system used when running tests, either on a developer's workstation, or within a lab.</description>
   <packager>Koji</packager>
   <url></url>
-  <time file="1517974945" build="1517968224"/>
-  <size package="45056" installed="119750" archive="124456"/>
+  <time file="1534813682" build="1517968224"/>
+  <size package="45077" installed="119750" archive="124456"/>
   <location href="rhts-test-env-4.74-1.el5bkr.noarch.rpm"/>

I wonder if our Jenkins is doing something wrong...
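For anyone wanting to check a harness repo for this kind of mismatch without regenerating it, here is a minimal sketch that compares the size and checksum recorded in primary.xml against the files on disk. It is not part of beaker-repo-update; the function names are made up for illustration, and it assumes the metadata uses SHA-1 checksums (type "sha" or "sha1", as in the diff above) and only checks those.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

# XML namespace used by createrepo-style primary.xml metadata.
COMMON_NS = "{http://linux.duke.edu/metadata/common}"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(primary_xml, repo_dir):
    """Return (href, problem) pairs for packages whose on-disk file
    disagrees with the metadata. problem is "missing", "size" or "checksum".

    primary_xml is the decompressed content of primary.xml (str or bytes).
    Hypothetical helper: only SHA-1 checksums are verified.
    """
    mismatches = []
    root = ET.fromstring(primary_xml)
    for pkg in root.iter(COMMON_NS + "package"):
        href = pkg.find(COMMON_NS + "location").get("href")
        checksum = pkg.find(COMMON_NS + "checksum")
        size = int(pkg.find(COMMON_NS + "size").get("package"))
        path = os.path.join(repo_dir, href)
        if not os.path.exists(path):
            mismatches.append((href, "missing"))
        elif os.path.getsize(path) != size:
            mismatches.append((href, "size"))
        elif (checksum.get("type") in ("sha", "sha1")
              and sha1_of_file(path) != checksum.text.strip()):
            mismatches.append((href, "checksum"))
    return mismatches
```

To use it, read the primary.xml.gz named in repodata/repomd.xml with gzip.open(...).read() and pass the result in along with the repo directory. An empty list means the metadata and the files agree.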
Created attachment 1479105 [details] bad and regenerated repodata
I went back through all the console logs of our beaker-redhat-yum-repos jobs on Jenkins. The last time it touched rhts-test-env-4.74-1.el5bkr was job #2121, which ran for 1 hour 12 minutes starting 20 August 2018 23:58:11 UTC. That was the job for the commit "switch to Brew for RHEL4-6 harness repos". So it doesn't seem like Jenkins has done anything wrong.

Indeed, the file on disk on beaker-devel has a modtime of 21 August 01:08 UTC, which lines up with the above.

The only mystery here is why this keeps going backwards. I first hit this problem last week (22 August according to the bug timestamp) and regenerated the repodata for all repos as per comment 1 as a workaround. Then yesterday evening (27 August) I re-ran beaker-repo-update and somehow it changed the repodata back to use the incorrect checksum.
Here is something suspicious though. The checksum of the file on disk in /var/www/beaker/harness/RedHatEnterpriseLinuxServer5/ matches NEITHER the old build from beakerkoji NOR the new build from Brew:

$ shasum rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-b*
24c7f435f7dc3a19474ca673050220c33119f923  rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beaker-devel
baa9aed8ace0df55b1d3c6d8ac6922a2a73183dc  rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beakerkoji
ca223594bcf85f1cd2e3f41431d9bc68b33853b3  rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-brew

It seems like the package on disk on beaker-devel is somehow corrupted. rpm -q --info shows that it's still the old build according to its RPM header:

$ rpm -q --info -p rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beakerkoji | grep Build
Build Date  : Wed 07 Feb 2018 11:50:24 AEST
Build Host  : test4.dcallagh.beakerdevs.lab.eng.bne.redhat.com
$ rpm -q --info -p rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-brew | grep Build
Build Date  : Fri 27 Apr 2018 15:16:44 AEST
Build Host  : ppc-030.build.eng.bos.redhat.com
$ rpm -q --info -p rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beaker-devel | grep Build
Build Date  : Wed 07 Feb 2018 11:50:24 AEST
Build Host  : test4.dcallagh.beakerdevs.lab.eng.bne.redhat.com

but its size matches the (slightly larger) new build from Brew:

-rw-r--r--. 1 dcallagh dcallagh 45077 Aug 21 11:08 rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beaker-devel
-rw-r--r--. 1 dcallagh dcallagh 45056 Feb  7  2018 rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-beakerkoji
-rw-r--r--. 1 dcallagh dcallagh 45077 Apr 27 15:16 rhts-test-env-4.74-1.el5bkr.noarch.rpm.from-brew
I was worried that we might have a mistake in our build-yum-repos.py script that runs in Jenkins, or the rsync command that runs at the end. I thought it might have accidentally left some mangled file on disk. But it seems not to be the case. Both /var/lib/jenkins/workspace/beaker-redhat-yum-repos/beakerrepos/harness/RedHatEnterpriseLinux5/rhts-test-env-4.74-1.el5bkr.noarch.rpm on the Jenkins slave, and /net/aoe-cluster.lab.bos.redhat.com/exports/beakerrepos/harness/RedHatEnterpriseLinux5/rhts-test-env-4.74-1.el5bkr.noarch.rpm (the download mirror) have the expected size and checksum ca223594bcf85f1cd2e3f41431d9bc68b33853b3. The corrupted package only appears on disk in /var/www/beaker/harness on beaker-devel, meaning this must be a bug in beaker-repo-update.
I just noticed /var/www was full on beaker-devel. So beaker-repo-update was failing to write but not actually erroring out. I am not sure if that could be the cause of the corrupted downloads although it certainly won't help (and certainly beaker-repo-update is not smart enough to notice that a previous run produced corrupted packages either).
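One cheap guard against the full-disk case would be to check free space up front and fail loudly instead of carrying on. The following is only a sketch of that idea, not something beaker-repo-update is known to do; the exception class, function name, and threshold parameter are all hypothetical.

```python
import shutil


class InsufficientSpaceError(Exception):
    """Raised when the target filesystem has too little free space."""


def require_free_space(directory, needed_bytes):
    """Raise InsufficientSpaceError unless the filesystem holding
    `directory` has at least `needed_bytes` free. Returns the free bytes.

    Hypothetical pre-flight check; the caller chooses the threshold.
    """
    free = shutil.disk_usage(directory).free
    if free < needed_bytes:
        raise InsufficientSpaceError(
            "%s has %d bytes free, need at least %d"
            % (directory, free, needed_bytes))
    return free
```

A check like this only narrows the window, since the disk can still fill up mid-run, so write errors and short writes would still need to be treated as fatal.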
The yum code called by beaker-repo-update appears to be using cache data in /var/tmp/yum-* in spite of all the attempts in beaker-repo-update to prevent that. So that might explain how it sometimes manages to go "back in time", if there is an error and it falls back to an old cached copy of its metadata.
https://gerrit.beaker-project.org/#/c/beaker/+/6281
tests: use http:// instead of file:// for beaker-repo-update

https://gerrit.beaker-project.org/#/c/beaker/+/6282
beaker-repo-update: handle missing OS majors as a special case

https://gerrit.beaker-project.org/#/c/beaker/+/6283
beaker-repo-update: verify package checksums when downloading
While testing this I also noticed that if an incomplete file exists on disk, beaker-repo-update (or rather the Yum/urlgrabber code we are calling into) would just download the file again and *append* it to whatever junk was there on disk. :-(

At least with the above patch, the checksum verification will catch any bugs like that in future.
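To illustrate the general technique (not the actual patch, which goes through the Yum/urlgrabber code), here is a minimal sketch of a download step that can never append to leftover junk: it writes to a fresh temporary file, verifies the SHA-1, and only then renames it into place. The function names and the iterable-of-chunks interface are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def store_verified(chunks, dest_path, expected_sha1):
    """Write `chunks` (an iterable of bytes) to a new temp file next to
    dest_path, verify its SHA-1, then atomically rename it over dest_path.
    An existing dest_path is left untouched if verification fails.
    """
    dest_dir = os.path.dirname(dest_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    digest = hashlib.sha1()
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                out.write(chunk)
                digest.update(chunk)
        if digest.hexdigest() != expected_sha1:
            raise ValueError("checksum mismatch for %s" % dest_path)
        # mkstemp creates the file owner-only; make it world-readable
        # so a web server could serve it (an assumption about the setup).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, dest_path)
    except Exception:
        # Never leave a partial file behind for a later run to trip over.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def unlink_if_bad(path, expected_sha1):
    """Remove `path` if its SHA-1 differs from expected_sha1.
    Returns True if the file was removed, False if it was fine.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() == expected_sha1:
        return False
    os.unlink(path)
    return True
```

The second helper covers files that are already on disk from an earlier bad run: anything that fails verification is removed so that it gets downloaded again from scratch.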
Created attachment 1479152 [details]
beaker-repo-update run log

Verified on beaker-devel. While running, beaker-repo-update logged several entries like this one:

2018-08-28 07:37:06,394 bkr.server.tools.repo_update INFO Unlinking bad package /var/www/beaker/harness/RedHatStorageSoftwareAppliance3/beaker-system-scan-debuginfo-2.3-3.el6bkr.x86_64.rpm
This has been released with Beaker 25.6. Release Notes: https://beaker-project.org/docs/whats-new/release-25.html#beaker-25-6
*** Bug 1625423 has been marked as a duplicate of this bug. ***