Plague needs some kind of locking to prevent build failures or incorrect builds during signing or rsync.

Buildhosts use two repositories during builds:
1) download.fedora.redhat.com
2) extras64.linux.duke.edu for packages that are just-built but not yet signed.

Build failure can happen if mock is downloading packages to populate the buildroot during the signing process. If the downloaded headers suddenly no longer match the packages because signing modified the packages, the build job can fail. Build failure can also happen if the builders attempt to download repodata during the repocreate step of the sign & push process.

Incorrect builds can happen during the long period between repocreate and rsync, while repoview is running. This is because just-built packages have already been moved out of the needsign repository in preparation for syncing to the master mirror. Buildhosts pulling repodata from both locations at this moment will attempt to install BuildRequires from an incomplete repository.

Solutions?
==========
It is clear that we need some kind of "locked" or "paused" state where builds wait for the repository to be consistent. dcbw suggested either
1) Wait until all builds are complete, then lock so that sign & rsync can happen.
2) Cancel all current builds so that sign & rsync can happen immediately, then automatically requeue everything afterward.

My opinion is that #1 is most desirable, but only if signing can be a "fire and forget" operation where you type in the passphrase at the beginning and the time-consuming tasks then complete without needing any interaction minutes later. Otherwise #2 is an OK option.

The rsync part can be made faster by moving the build master into the PHX colo; however, the vast majority of the delay would be during repoview.
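
To make option #1 concrete, here is a minimal sketch of a "drain, then lock" gate. It is not Plague code: the class name `RepoGate` and all of its methods are hypothetical, and it assumes a single build server process that tracks running builds in memory. New builds are held back as soon as a lock is requested, the lock is granted once the running builds have finished, and waiting builds resume on unlock.

```python
import threading


class RepoGate:
    """Hypothetical gate: builds run concurrently, sign & push runs alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_builds = 0
        self._lock_pending = False  # a lock was requested, builds are draining
        self._locked = False        # sign & rsync currently owns the repo

    def start_build(self, timeout=None):
        """Wait until the repo is usable; return False if the wait timed out."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not (self._locked or self._lock_pending), timeout
            )
            if ok:
                self._active_builds += 1
            return ok

    def finish_build(self):
        """Record that a build ended and wake anyone waiting for the drain."""
        with self._cond:
            self._active_builds -= 1
            self._cond.notify_all()

    def lock_for_push(self):
        """Block until all running builds are done, then take the lock."""
        with self._cond:
            # Only one lock holder or requester at a time.
            while self._locked or self._lock_pending:
                self._cond.wait()
            self._lock_pending = True
            while self._active_builds > 0:
                self._cond.wait()
            self._lock_pending = False
            self._locked = True

    def unlock(self):
        """Release the lock so queued builds can start again."""
        with self._cond:
            self._locked = False
            self._cond.notify_all()
```

With a gate like this, the sign & push script would call `lock_for_push()` before signing and `unlock()` after rsync, so the passphrase prompt would still need to come after the drain unless signing can be made non-interactive as described above.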
-> FE/plague
Pretty sure this is fixed; the plague client supports lockfiles on each repo directory now.
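
For readers unfamiliar with the pattern, a per-directory lockfile can be sketched roughly as below. This is a generic illustration, not how Plague implements it: the file name `.repo.lock` and both function names are made up, and it assumes the writer and the readers see the same filesystem.

```python
import contextlib
import os
import time

LOCKFILE_NAME = ".repo.lock"  # hypothetical name, not taken from Plague


@contextlib.contextmanager
def repo_locked(repo_dir):
    """Hold an exclusive lockfile in repo_dir while the repo is being modified."""
    path = os.path.join(repo_dir, LOCKFILE_NAME)
    # O_CREAT | O_EXCL fails with FileExistsError if the lockfile already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")  # record the owner for debugging
        yield path
    finally:
        os.unlink(path)


def wait_until_unlocked(repo_dir, poll_seconds=5.0, timeout=None):
    """Poll until no lockfile exists in repo_dir; return False on timeout."""
    path = os.path.join(repo_dir, LOCKFILE_NAME)
    deadline = None if timeout is None else time.monotonic() + timeout
    while os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True
```

Note that a check like `wait_until_unlocked` only keeps new readers out. It does nothing for a build that started reading before the lockfile appeared, which is the case the "wait until all builds are complete" option in the original report is meant to cover.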