Bug 870694 - excessive hourly cron job
Summary: excessive hourly cron job
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: ghc
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jens Petersen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-10-28 01:30 UTC by Paul Wouters
Modified: 2012-11-28 07:31 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-11-28 07:31:30 UTC
Type: Bug
Embargoed:


Description Paul Wouters 2012-10-28 01:30:49 UTC
Description of problem:
ghc installs /etc/cron.hourly/ghc-doc-index. I find this problematic for various reasons.

First, it seems this "documentation" can only come from package installs, so it would be better placed in a ghc spec macro or %post job.

Second, if there are really good reasons to keep this in, hourly is _way_ too often. I would suggest weekly at most.

Third, it seemed to blow up for me too:

Date: Sat, 27 Oct 2012 20:01:02
From: Cron Daemon <root.ca>
To: root.ca
Subject: Cron <root@bofh> run-parts /etc/cron.hourly

/etc/cron.hourly/ghc-doc-index:

--read-interface=Cabal-1.14.0,Cabal-1.14.0/Cabal.haddock --read-interface=array-0.4.0.0,array-0.4.0.0/array.haddock --read-interface=base-4.5.0.0,base-4.5.0.0/base.haddock --read-interface=bytestring-0.9.2.1,bytestring-0.9.2.1/bytestring.haddock --read-interface=containers-0.4.2.1,containers-0.4.2.1/containers.haddock --read-interface=deepseq-1.3.0.0,deepseq-1.3.0.0/deepseq.haddock --read-interface=directory-1.1.0.2,directory-1.1.0.2/directory.haddock --read-interface=filepath-1.3.0.0,filepath-1.3.0.0/filepath.haddock --read-interface=ghc-prim-0.2.0.0,ghc-prim-0.2.0.0/ghc-prim.haddock --read-interface=hashable-1.1.2.3,hashable-1.1.2.3/hashable.haddock --read-interface=hashmap-1.0.0.2,hashmap-1.0.0.2/hashmap.haddock --read-interface=integer-gmp-0.4.0.0,integer-gmp-0.4.0.0/integer-gmp.haddock --read-interface=mtl-2.1.1,mtl-2.1.1/mtl.haddock --read-interface=old-locale-1.0.0.4,old-locale-1.0.0.4/old-locale.haddock --read-interface=old-time-1.1.0.0,old-time-1.1.0.0/old-time.haddock --read-interface=parallel-3.2.0.2,parallel-3.2.0.2/parallel.haddock --read-interface=parsec-3.1.2,parsec-3.1.2/parsec.haddock --read-interface=pretty-1.1.1.0,pretty-1.1.1.0/pretty.haddock --read-interface=process-1.1.0.1,process-1.1.0.1/process.haddock --read-interface=text-0.11.2.0,text-0.11.2.0/text.haddock --read-interface=transformers-0.3.0.0,transformers-0.3.0.0/transformers.haddock --read-interface=unix-2.5.1.0,unix-2.5.1.0/unix.haddock --read-interface=unordered-containers-0.2.1.0,unordered-containers-0.2.1.0/unordered-containers.haddock

In short, I would strongly suggest removing this cronjob altogether.

Comment 1 Garrett Holmstrom 2012-10-28 04:32:03 UTC
You (the maintainer) can get rid of the delay between package installation and doc-readiness by running the index update in every package's %post instead.  If the number of times it might run in a large rpm transaction bothers you, you can instead use %posttrans and a temporary file to run it at most once per transaction:

1. Add a new /var/lib/rpm-state/ghc directory to the ghc package.
2. Add a new, %ghosted /var/lib/rpm-state/ghc/update-doc-index file to the ghc package.  (This specific file name is merely illustrative.)
3. Have every Haskell package's %post and %preun script touch /var/lib/rpm-state/ghc/update-doc-index.
4. Add a %posttrans section to every Haskell package that checks if /var/lib/rpm-state/ghc/update-doc-index exists, and if it does, it deletes that file and then updates the doc index.  This is a great candidate for an rpm macro.
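
For illustration, the flag-file steps above can be sketched as a small shell simulation. This is only a sketch and not the actual ghc-rpm-macros code: STATE_DIR is a temporary directory standing in for /var/lib/rpm-state/ghc so that it runs unprivileged, and reindex, mark_dirty and posttrans are hypothetical names for what the scriptlets would do.

```shell
#!/bin/sh
# Sketch of the flag-file pattern: scriptlets only mark the index as stale,
# and %posttrans rebuilds it at most once per transaction.
# Assumption: STATE_DIR stands in for /var/lib/rpm-state/ghc.
STATE_DIR=$(mktemp -d)
FLAG="$STATE_DIR/update-doc-index"
REINDEX_COUNT=0

# Hypothetical stand-in for the real (slow) doc index generator.
reindex() {
    REINDEX_COUNT=$((REINDEX_COUNT + 1))
}

# What each Haskell package's %post and %preun would do: touch the flag.
mark_dirty() {
    touch "$FLAG"
}

# What each Haskell package's %posttrans would do: rebuild only if flagged,
# deleting the flag first so later packages in the transaction skip the work.
posttrans() {
    if [ -f "$FLAG" ]; then
        rm -f "$FLAG"
        reindex
    fi
}

# Simulate one transaction that installs three packages:
# all %post scriptlets run first, then all %posttrans scriptlets.
mark_dirty; mark_dirty; mark_dirty
posttrans; posttrans; posttrans

echo "reindex ran $REINDEX_COUNT time(s)"
rm -rf "$STATE_DIR"
```

However many packages are in the transaction, only the first %posttrans finds the flag, so the expensive step runs once.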

How to deal with a transaction that removes ghc from the system completely is left as an exercise to the reader.

Relevant packaging standard:  http://fedoraproject.org/wiki/Packaging:ScriptletSnippets#Saving_state_between_scriptlets

Comment 2 Jens Petersen 2012-10-29 05:23:41 UTC
Thanks for bringing this up - for a long time I have not been completely happy with
the cronjob hack either, but I hadn't found time to look into a better solution yet.

Paul: I wonder why it failed for you?  Is the above the complete cron mail?

Let me make a few comments on the current cronjob:

- it was indeed introduced because running in %post slowed down
  installation of ghc-*-devel very considerably when there are many libs
  installed/being installed

- the cronjob runs as root which is also highly undesirable

- the cronjob is hourly but uses basically 0 cpu unless
  any ghc-*-devel packages have been newly installed/updated.

I will move to using /var/lib/rpm-state/ghc/ as soon as possible.
Running once per rpm transaction in %posttrans should be no problem.

Comment 4 Jens Petersen 2012-10-30 01:03:53 UTC
Ok, I forgot about %postun, otherwise ghc-rpm-macros seems
to be working well for me.

The only problem I see then is that %posttrans is not run
for removal, so maybe we have to run the reindex for each %postun,
which will slow down removals, but maybe that is ok for now.
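
A rough sketch of what such a %postun fallback could look like, simulated in plain shell. It assumes that upgrades are already covered by the new package's %posttrans, so only a final removal needs to reindex directly; reindex and postun are hypothetical names. The one rpm fact relied on is that %postun receives the number of remaining installed instances as $1 (0 on erase, 1 on upgrade).

```shell
#!/bin/sh
# Hypothetical stand-in for the real doc index generator.
REINDEX_COUNT=0
reindex() {
    REINDEX_COUNT=$((REINDEX_COUNT + 1))
}

# Body of a hypothetical %postun scriptlet.
# rpm passes the number of instances of the package left installed as $1.
postun() {
    remaining=$1
    if [ "$remaining" -eq 0 ]; then
        # Final removal: no %posttrans will follow, so reindex right away.
        reindex
    fi
    # On upgrade (remaining >= 1) do nothing here; this sketch assumes
    # the new package's %posttrans takes care of the rebuild.
}

postun 1   # upgrade: old version removed, new version remains
postun 0   # erase: package removed completely

echo "reindex ran $REINDEX_COUNT time(s)"
```

This still reindexes once per removed package, so removing many libraries in one transaction stays slow, as noted above.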

Comment 5 Jens Petersen 2012-10-30 02:11:05 UTC
BTW I am seeing the output from gen_contents_index mentioned in comment 0.
Perhaps I need to silence it - not sure why that is happening.

Of course even running gen_contents_index only once does slow down
ghc-*-devel installs/updates quite a bit.  I am almost wondering
whether it would be better to just not generate the doc index, though
it is nice to have, or to try to hack up some Debian-style 'at' job?

Comment 6 Jens Petersen 2012-10-30 02:21:40 UTC
(In reply to comment #5)
> BTW I am seeing the output from gen_contents_index mentioned in comment 0.

Ah that seems to be due to a change in the ghc-7.4 script I hadn't noticed.

Comment 7 Fedora Update System 2012-10-31 01:09:12 UTC
ghc-7.4.1-6.4.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/ghc-7.4.1-6.4.fc18

Comment 8 Fedora Update System 2012-11-09 04:45:08 UTC
ghc-7.4.1-6.4.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 9 Jens Petersen 2012-11-16 08:24:16 UTC
I started thinking about using 'at' instead of 'cron' but it is
a bit more complicated.  For now I reverted the rpm-state changes
in git master and moved the cronjob to a new optional subpackage
ghc-doc-index.  It may be backported later to F18.

