Bug 164966

Summary:	plague-client kill <job> not working on "needsign"
Product:	[Retired] Fedora Infrastructure	Reporter:	Ralf Corsepius <rc040203>
Component:	extras buildsys	Assignee:	Seth Vidal <skvidal>
Status:	CLOSED RAWHIDE	QA Contact:	Jeremy Katz <katzj>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	unspecified	CC:	bugs.michael, dcbw, katzj, zing
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2005-08-05 16:01:32 UTC	Type:	---

Description Ralf Corsepius 2005-08-03 04:59:58 UTC

Description of problem:
plague-client kill <job>
doesn't seem to have any effect on jobs in "needsign"-queue.

Background: At the moment, several different versions of the same package for
the same distribution queue up in "needsign". I wanted to purge jobs that I
consider obsolete and do not want to see released from "needsign".

# plague-client list email ralf status needsign
316: Inventor (Inventor-2_1_5-12_fc5)  ralf   needsign
        hammer3.fedora.redhat.com(x86_64): e890f6d8af2128e730f34f18f74b5a06ec46dd48 done/done
        hammer1.fedora.redhat.com(i386): 56e801c8cb9469ff243f533b6bca0132032406bb done/done
        ppc1.fedora.redhat.com(ppc): 56ce7ac69a5e1f3c2855a9b13249c197fb693fea done/done

344: Inventor (Inventor-2_1_5-13_fc5)  ralf   needsign
        ppc1.fedora.redhat.com(ppc): 714658290607e21e14d850b746a0c5d6b55b3f06 done/done
        hammer1.fedora.redhat.com(i386): d8595f91a3a718a3bfa17ee0dee07615281285ca done/done
        hammer2.fedora.redhat.com(x86_64): 2b347b0b48d50fd9caa69078c0121629f6ac9af3 done/done

# plague-client kill 316
Job 316 does not exist.

Version-Release number of selected component (if applicable):
Seth's plague-client-0.2

How reproducible:
Deterministic

Comment 1 Michael Schwendt 2005-08-03 21:50:11 UTC

This feature may be important for withdrawing incomplete builds, e.g. updates
with inter-package dependencies such as ABI changes. That is, the first package
built fine, and the second depends on the first one but failed to build. The
packager ought to be able to block such an incomplete update, either by
withdrawing a package from the "needsign" queue or by putting it on hold.

Comment 2 Dan Williams 2005-08-05 16:01:32 UTC

michael: Interesting...  however 'needsign' denotes that the package has
_already_ been copied to the repository, and createrepo has been run.  So you've
already got a problem, because builders may be using the package to build other
packages already.  For all intents and purposes, 'needsign' packages are
finished, done, and built.  I don't think we'll be able to pull packages without
some manual intervention...

ralf: 'needsign' packages can only be moved to finished by a person who can sign
packages, as the needsign state essentially tracks packages that need signing. 
To be technical, both the builds for Inventor did complete and both do need
signing.  I'm going to mentally defer this until we modify the build system to
be smarter about RPM versions, distro targets, and package interrelationships.

But for the immediate time, I'll modify the kill command so that you can't kill
jobs that have already been added to the repo.
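
A minimal sketch of the guard described above, assuming a job table that maps
job IDs to state names. The function `kill_job`, the exception `KillRefused`,
and the "building" and "killed" states are hypothetical names for illustration;
only "needsign" and "finished" appear in this report, and none of this is taken
from plague's actual source.

```python
class KillRefused(Exception):
    """Raised when a job has already been added to the repository."""


# States in which the built packages are already in the repository
# (assumption based on this report, not on plague's code).
IN_REPO_STATES = frozenset({"needsign", "finished"})


def kill_job(jobs, uid):
    """Mark job `uid` as killed unless its packages are already in the repo."""
    if uid not in jobs:
        raise KeyError(f"Job {uid} does not exist.")
    status = jobs[uid]
    if status in IN_REPO_STATES:
        # Say why the kill is refused instead of claiming the job is missing.
        raise KillRefused(
            f"Job {uid} is already in the repository ({status}) "
            "and cannot be killed."
        )
    jobs[uid] = "killed"
    return jobs[uid]
```

With such a guard, a kill request for job 316 would be refused with an explicit
reason rather than answered with "Job 316 does not exist."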

Comment 3 Ralf Corsepius 2005-08-05 17:00:51 UTC

(In reply to comment #2)
> ralf: 'needsign' packages can only be moved to finished by a person who 
> can sign packages, as the needsign state essentially tracks packages that
> need signing. 
> To be technical, both the builds for Inventor did complete and both do need
> signing.  I'm going to mentally defer this until we modify the build system to
> be smarter about RPM versions, distro targets, and package interrelationships.
Don't get hung up on the name "needsign" or on the current implementation.

My point is: Package maintainers _must_ (!) have a way to review and
withdraw/confirm their _own_ packages after "successful builds".

The current implementation doesn't provide any means for such intervention.
To me, this is a severe deficiency that will cause serious irritation in the
long term.

Some examples:
* A packager modifies an rpm.spec, and after the build system has successfully
built his package, he finds out the package has a serious defect on _one_
architecture. In this case, a reasonable packager will want to withdraw his
package before it has been released, to have time to fix the problem.

* A packager wants to try something inside his rpm spec. He will normally
approach this problem incrementally, i.e. make one modification to the spec,
request a build, and if the build is successful, apply the next modification.
He will normally want only the last iteration of his builds to be released.


> But for the immediate time, I'll modify the kill command so that you can't
> kill jobs that have already been added to the repo.
I am not sure I understand what this means or if this is what I think is needed,
nor am I sure I am satisfied with you having closed this PR.

Comment 4 Michael Schwendt 2005-08-05 17:03:26 UTC

There are _two_ repositories:

 * The build system's _private_ repository.
 * The _public_ repository accessed by the world.

We may not permit rolling back to an earlier package release in
the public repository. But it makes sense to withdraw (or defer)
a built package from being signed and made available to the world.

Forcing packagers into a situation where they cannot verify the
binaries produced by the build system is a less than ideal situation.

I'm not arguing that it would be used frequently. I just see that
it would be a helpful, powerful feature. Without it, as a last resort
and with more effort, only somebody with access to the buildsys'
repository could withdraw packages from the needsign queue manually
and prevent further damage.

Consider what happened with the libcddb/libcdio update. The first
package built fine, the second one failed because the first one
broke the ABI and hence broke the dependencies in the buildsys'
private repository. The first one made it into "needsign", and
there was no way out. No way for the packager to request that
the incomplete pair of update packages should not be published
until both packages are ready to be shipped in "needsign".

An intermediate state between "built" and "needsign" would be
interesting, where the packager can acknowledge the move from
"built" queue to "needsign" queue.

Alternatively, a way out, like "needsign" -> "on hold". Such a
package will be available in the buildsys' private repository,
but will not be signed and published unless somebody requests it.
And it won't be published accidentally while somebody pushes out
the other packages waiting in the needsign queue.
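
The two alternatives proposed in this comment can be sketched as a small
transition table. This is only an illustration under stated assumptions: the
state names come from this comment, while `ALLOWED`, `transition`, and
`signable` are hypothetical and do not describe how plague stores or moves
jobs.

```python
# Hypothetical transition table for the states discussed in this comment.
ALLOWED = {
    "built": {"needsign"},                # packager acknowledges the build
    "needsign": {"on hold", "finished"},  # withdraw for now, or sign and publish
    "on hold": {"needsign"},              # somebody requests publication again
    "finished": set(),                    # published; no way back
}


def transition(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move job from {current!r} to {target!r}")
    return target


def signable(jobs):
    """Job IDs a signer would pick up; jobs that are "on hold" are skipped."""
    return [uid for uid, state in sorted(jobs.items()) if state == "needsign"]
```

In this sketch a held job stays in the job table, mirroring the idea that the
package remains in the buildsys' private repository, but it is not pushed out
together with the rest of the needsign queue.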

Comment 5 Dan Williams 2005-08-05 18:34:02 UTC

The problem with this is that if a package is in 'needsign', it's already being
used by other builders.  If you want to "withdraw" or "hold" a package, then you
have to roll back _all_ jobs that have been built since the package that you are
withdrawing.  That's not feasible since you are affecting more people than just
yourself.

I'm not going to implement a solution that gives you a window to withdraw the
package or anything; we need to hash out whether or not this is really needed,
and if so, how to actually do it.

There are two conflicting issues here:
1) Getting built packages into the private repo as fast as possible so that
other packages that depend on the just-built one can actually be built
2) Rolling back already-built packages that have already been pushed to the
private repo

I'd argue that #1 here is more important, and it's in direct conflict with what
you're requesting here.

> My point is: Package maintainers _must_ (!) have a way to review
> and withdraw/confirm their _own_ packages after "successful builds".

This can be solved by a simple "scratch" target, like "development-testing" or
something like that which pulls from the official repo but does not contribute
to it.  Packages could get built here, tested by the developers, and then
submitted to the real target to get into the real repository.

In some sense, if you submit a bad package to the repo, it's your fault and it's
not the build system's job to babysit that.  However, attempting to avoid these
problems in the first place does make life easier for everyone, which I don't
dispute.
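
The "scratch" target suggested in this comment could look roughly like the
following. This is a sketch under assumptions: `Target`, `PrivateRepo`,
`finish_build`, and the `contributes` flag are hypothetical names, and
"development" is used only as an example of a real target.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    contributes: bool  # False marks a scratch target


@dataclass
class PrivateRepo:
    packages: list = field(default_factory=list)


def finish_build(target, package, repo):
    """Add a finished build to the repo unless the target is a scratch target."""
    if target.contributes:
        repo.packages.append(package)
    return target.contributes


repo = PrivateRepo()
scratch = Target("development-testing", contributes=False)
real = Target("development", contributes=True)

finish_build(scratch, "Inventor-2_1_5-12_fc5", repo)  # built and testable, not added
finish_build(real, "Inventor-2_1_5-13_fc5", repo)     # added to the private repo
```

Because a scratch build never enters the repository, other builders cannot pick
it up, so nothing has to be rolled back if the packager discards it.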

Comment 6 Michael Schwendt 2005-08-05 19:12:38 UTC

> This can be solved by a simple "scratch" target,
> like "development-testing"

Well, we haven't had anything like "updates-testing" for a long time.
So, if you can bring them back, that would be great, too.

Comment 7 Seth Vidal 2005-08-05 19:21:47 UTC

as I mentioned to dan earlier on irc. We should not be filing a build system rfe
for something that is essentially policy. What y'all are looking for is a QA
policy about releasing built packages. That's what all of the above boils down to.

if that's the case, then it needs to be brought to fesco and decided there; then
the code can be written to implement the policy.

I think the ideas you've mentioned above are great and fine but they really need
to pass through the committee if only b/c it is implicitly determining QA policy.

Comment 8 Michael Schwendt 2005-08-05 19:29:43 UTC

> then it needs to be brought to fesco

Will happen, but probably not sooner than Sunday [unless somebody forwards this
earlier].

Comment 9 Ralf Corsepius 2005-08-06 01:43:00 UTC

(In reply to comment #7)
> as I mentioned to dan earlier on irc. We should not be filing a build system rfe
> for something that is essentially policy. What y'all are looking for is a QA
> policy about releasing built packages. That's what all of the above boils 
> down to.
Well, it's a chicken-and-egg problem.

Is it "a defect in the build system" or is "the build system a reflection of the
build policy"?

IMO, it is both: The build system implementors failed to consider the case of
packagers wanting to retract packages, and the build policy lacks
post-build/pre-release QA.

BTW: The package release policy also lacks any rule on post-release QA. It's
only a matter of time until this becomes a problem as well.