865650 – Assorted problems with packages scripts in texlive packages and an impression of an infinite loop


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	texlive
Sub Component:
Version:	rawhide
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Jindrich Novy
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2012-10-12 02:05 UTC by Michal Jaegermann
Modified:	2013-07-02 23:57 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-01-12 00:36:44 UTC
Type:	Bug
Embargoed:
Dependent Products:


Description Michal Jaegermann 2012-10-12 02:05:17 UTC

Description of problem:

I had to leave for half an hour while a 'yum update' transaction was running, and when I returned I found this displayed on screen:

....
  Cleanup    : glibc-2.16.90-22.fc19                                  2128/2130 
  Cleanup    : libgcc-4.7.2-3.fc19                                    2129/2130 

and nothing was moving forward.  Closer examination revealed that files
/var/tmp/rpm-tmp.... were showing up in quick succession, each and every one
with the following content:

[ -e /var/run/texlive/run-texhash ] && [ -e /usr/bin/texhash ] && /usr/bin/texhash 2> /dev/null; rm -f /var/run/texlive/run-texhash
[ -e /usr/bin/mtxrun ] && export TEXMF=/usr/share/texlive/texmf-dist; export TEXMFCNF=/usr/share/texlive/texmf/web2c; export TEXMFCACHE=/var/lib/texmf; /usr/bin/mtxrun --generate &> /dev/null
:

It is not visible here, but there is no terminating newline.
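
A minimal sketch of how the missing newline could be checked, and of what a plain shell does with such a file. The temporary file is a hypothetical stand-in for one of the /var/tmp/rpm-tmp.... files; whether rpm itself treats the missing newline specially is not established here.

```shell
#!/bin/sh
# Hypothetical stand-in for a scriptlet file; the last line (':') has no newline.
f=$(mktemp)
printf 'echo one\n:' > "$f"

# Command substitution strips a trailing newline, so a non-empty result
# means the final byte is something other than a newline.
if [ -n "$(tail -c 1 "$f")" ]; then
    last_byte_state="no trailing newline"
else
    last_byte_state="ends with newline"
fi
echo "$last_byte_state"

# A POSIX shell still executes an unterminated last line.
out=$(sh "$f")
status=$?
echo "output: $out, exit status: $status"
rm -f "$f"
```

If this holds for the shell rpm invokes, the missing newline alone would not explain the looping.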

Check in /var/log/yum.log showed:

Oct 11 18:45:48 1:texlive-swebib-0.svn15878-2.20120926_r27815.noarch: ts_done name in te is libstdc++ should be 1:texlive-swebib-0.svn15878-2.20120926_r27815.noarch

followed by 524 similar "should be" lines.

Attempts to 'pkill -f mtxrun' resulted only in something like this:

/var/tmp/rpm-tmp.wc6ewo: line 2:  2708 Terminated              /usr/bin/mtxrun --generate &>/dev/null
/var/tmp/rpm-tmp.SZX9BR: line 2:  2712 Terminated              /usr/bin/mtxrun --generate &>/dev/null
/var/tmp/rpm-tmp.LG3GXA: line 2:  2727 Terminated              /usr/bin/mtxrun --generate &>/dev/null
/var/tmp/rpm-tmp.aCQqNj: line 2:  2732 Terminated              /usr/bin/mtxrun --generate &>/dev/null

and the transaction kept looping as before.  Only killing the whole yum job
provided a way out.  That elicited:

warning: %posttrans(texlive-patgen-1:2.3.svn26689-3.20120926_r27815.noarch) scriptlet failed, signal 15

After that, re-running the contents of an outstanding /var/tmp/rpm-tmp....
was not a problem.  Still, yum-complete-transaction found one apparently outstanding transaction element but decided that it was "empty".
In particular, the posttrans scriptlet for a new 'kernel' package, which was installed in this transaction too, was not executed and I had to run it myself.

Version-Release number of selected component (if applicable):
texlive-2012-3.20120926_r27815.fc19
yum-3.4.3-43.fc19

How reproducible:
no idea


Additional info:
I really cannot tell if texlive packages are responsible for this or if this is a yum screwup.  In particular, I do not know whether the missing newline in posttrans is significant.
Please reassign as needed.

I have occasionally seen "... ts_done name in te is ... should be" in yum logs in the past, but so far the transaction always terminated and yum recovered.

Comment 1 Michal Jaegermann 2012-11-08 23:05:28 UTC

This bug is still present in texlive-2012-5.20121024_r28063.fc19 although it apparently moved from texlive-patgen to texlive-texdef.

This bug not only stops the whole transaction dead, with no obvious way out, but killing mtxrun also causes other "posttrans" scripts to be skipped.
Because of that, for example, the initramfs for a new kernel is not created and the grub configuration is not updated.  Yes, one can rectify these problems "manually" - if caught in time.

Killing mtxrun this time generates the following error messages:

/var/tmp/rpm-tmp.GUItjV: line 2: 23580 Terminated              /usr/bin/mtxrun --generate &>/dev/null
/var/tmp/rpm-tmp.HcBDXj: line 2: 23585 Terminated              /usr/bin/mtxrun --generate &>/dev/null
^Cwarning: %posttrans(texlive-texdef-1:1.7b.svn26420-5.fc19.noarch) scriptlet failed, signal 2

but short of terminating the transaction like that, the power switch remains the only, very bad, option.

Comment 2 Michal Jaegermann 2012-11-10 00:15:34 UTC

I was not careful enough, again, and ran an update to texlive-2012-6.20121107_r28202.fc19 together with assorted other rawhide updates. After all the cleanups the machine once again "dropped off the cliff", giving a pretty good impression of an infinite loop in a yum transaction while being busy all the time, to the extent that typing was quite difficult.

Closer investigation suggests that this would probably finish some day, but it is hard to tell when.  I know from previous experience that half an hour is not enough.  That looks far too excessive.  A few minutes would likely already be too long.

AFAICS the trouble is as follows.  At this moment I have "only" 929 texlive packages installed (and a full installation would surely require more). 748 of these sport "posttrans" scripts, which in general look like this:

[ -e /var/run/texlive/run-texhash ] && [ -e /usr/bin/texhash ] && /usr/bin/texhash 2> /dev/null; rm -f /var/run/texlive/run-texhash
[ -e /usr/bin/mtxrun ] && export TEXMF=/usr/share/texlive/texmf-dist; export TEXMFCNF=/usr/share/texlive/texmf/web2c; export TEXMFCACHE=/var/lib/texmf; /usr/bin/mtxrun --generate &> /dev/null

with minimal additions to that block here and there.

Just running a script which extracts all these scriptlets takes 1m4.659s of real time on my test installation.  The above means that an update runs '/usr/bin/mtxrun --generate' at least 748 times for me, over and over again.  This takes a lot of time and cycles.

The first line of this block is guarded by:
[ -e /var/run/texlive/run-texhash ] && [ -e /usr/bin/texhash ]
which creates a further set of problems of varying weight.
- There is no /var/run/texlive/ so earlier
      touch /var/run/texlive/run-texhash
    ends up with:
touch: cannot touch ‘/var/run/texlive/run-texhash’: No such file or directory
The above really should be
    mkdir -p /var/run/texlive && touch /var/run/texlive/run-texhash
- /var/run was made into a symlink to /run so accordingly one should use /run/texlive/run-texhash instead
- The same as above applies to other "flag" run-... files
- [ -e /usr/bin/texhash ] above should really be [ -x /usr/bin/texhash ] and similarly [ -x /usr/bin/mtxrun ]
- mtxrun should have its own guard flag so that all the operations from the "mtxrun" line are performed _once_ instead of over and over in a looong loop.
- In a posttrans for texlive-base we have "unguarded":

/usr/bin/texhash 2> /dev/null
/usr/bin/updmap-sys &> /dev/null
/usr/bin/fmtutil-sys --all &> /dev/null

That is fine.  Maybe we want to run these here no matter what; but in that case maybe the corresponding run-... files should be removed after a success?
There is no need to repeat these operations again.
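
The guard-flag idea for mtxrun can be sketched as follows. This is only an illustration under stated assumptions: the flag name run-mtxrun is hypothetical, a scratch directory stands in for /run/texlive, and a counter function stands in for '/usr/bin/mtxrun --generate'. It is not the mechanism that was eventually packaged.

```shell
#!/bin/sh
# Scratch directory standing in for /run/texlive (assumption: no root needed).
flagdir=$(mktemp -d)
runs=0

# Stand-in for the expensive '/usr/bin/mtxrun --generate' call.
expensive_generate() { runs=$((runs + 1)); }

# What a %post scriptlet would do: request the work (hypothetical flag name).
request_generate() { mkdir -p "$flagdir" && touch "$flagdir/run-mtxrun"; }

# What every %posttrans scriptlet would do: run only if still requested,
# and drop the flag once the work has succeeded.
posttrans() {
    if [ -e "$flagdir/run-mtxrun" ]; then
        expensive_generate && rm -f "$flagdir/run-mtxrun"
    fi
}

request_generate
i=0
while [ "$i" -lt 748 ]; do   # 748 scriptlets, as counted in this comment
    posttrans
    i=$((i + 1))
done
echo "expensive step ran $runs time(s)"
rmdir "$flagdir"
```

With the flag in place the remaining 747 invocations reduce to a single file-existence test each.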


Various font packages constantly edit /usr/share/texlive/texmf/web2c/updmap.cfg without really changing anything except for rare occasions. What for? Instead of

if [ $1 -gt 0 ] ; then
sed -i '/^Map bera.map/d' /usr/share/texlive/texmf/web2c/updmap.cfg
echo "Map bera.map" >> /usr/share/texlive/texmf/web2c/updmap.cfg
touch /var/run/texlive/run-updmap
fi; :

something like this can be used:

if [ $1 -gt 0 ] ; then
  if ! grep -q '^Map bera.map' /usr/share/texlive/texmf/web2c/updmap.cfg ; then
    echo "Map bera.map" >> /usr/share/texlive/texmf/web2c/updmap.cfg
    mkdir -p /var/run/texlive/ && touch /var/run/texlive/run-updmap
  fi
fi; :

Possibly setting 'run-updmap' can be done unconditionally.  Maybe safer?

One can also use "flag" directories so setting such flag could be reduced to

   mkdir -p /run/texlive/run-updmap

with corresponding changes when unsetting it.
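
A small sketch of what "flag" directories could look like in practice. The helper names (set_flag, flag_is_set, clear_flag) are made up for illustration, and a scratch directory stands in for /run/texlive.

```shell
#!/bin/sh
flagroot=$(mktemp -d)   # stands in for /run/texlive

# One 'mkdir -p' creates the parent and the flag, and is safe to repeat.
set_flag()    { mkdir -p "$flagroot/$1"; }
flag_is_set() { [ -d "$flagroot/$1" ]; }
# 'rm -f' does not remove directories, so unsetting becomes 'rmdir'.
clear_flag()  { rmdir "$flagroot/$1" 2>/dev/null || :; }

set_flag run-updmap
set_flag run-updmap                 # second call is a harmless no-op
if flag_is_set run-updmap; then state_after_set=set; else state_after_set=clear; fi

clear_flag run-updmap
if flag_is_set run-updmap; then state_after_clear=set; else state_after_clear=clear; fi

echo "after set: $state_after_set, after clear: $state_after_clear"
rmdir "$flagroot"
```

Existing '[ -e ... ]' guards would keep working, since '-e' is also true for directories.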

Hopefully such adjustments would prevent a good approximation of a crash.  There are also other important operations, like wrapping up kernel updates, which are waiting in a "posttrans queue".

Comment 3 Jindrich Novy 2012-11-11 19:11:39 UTC

The main reason for the slow-downs is the mtxrun execution multiple times in %posttrans. Creation of /var/run/texlive is actually done first in any %post/%postun scriptlet before any touching files there. %posttrans is run at the very end of the transaction, when it is 100% sure that /var/run/texlive exists.

I added a mechanism so that mtxrun is run only once, so the updates should now run much faster.

The proposed change for the map files you proposed is broken because there _has_ to be the map entry in updmap.cfg after installing the package shipping map files.

Hopefully all will work as expected now. Thanks for comments.

Comment 4 Michal Jaegermann 2012-11-11 20:33:00 UTC

(In reply to comment #3)
> The main reason for the slow-downs is the mtxrun execution multiple times in
> %posttrans.

Yes, that is what I wrote. :-)

> Creation of /var/run/texlive is actually done first in any
> %post/%postun scriptlet before any touching files there.

OK. If this is always the case ...  I did not find anything, likely because
/var/run/ disappears on a reboot.
 
> The proposed change for the map files you proposed is broken because there
> _has_ to be the map entry in updmap.cfg after installing the package
> shipping map files.

Eh?  Right now you have:
  - delete some map entry if there and rewrite updmap.cfg _every_ time
    ('sed -i ...' does such rewrite behind scenes)
  - add back the map entry that was possibly just deleted; if it was not there, nothing was deleted, but updmap.cfg was rewritten anyway _before_ adding.
A suggested change is:
  - check for a map entry and do not bother if already present
  - add such map entry if it was missing

So where is this bug?  You would have to explain that to me slowly.
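
The two variants can be compared on a scratch copy. This is only a sketch: a temporary file stands in for updmap.cfg, and GNU sed is assumed for 'sed -i' without a suffix. Starting from a file without the entry, each variant is applied twice and the matching lines are counted.

```shell
#!/bin/sh
cfg=$(mktemp)            # stand-in for updmap.cfg
entry='Map bera.map'

current() {              # delete, then append (what the scriptlets do now)
    sed -i '/^Map bera.map/d' "$cfg"
    echo "$entry" >> "$cfg"
}
proposed() {             # append only when the entry is missing
    grep -q '^Map bera.map' "$cfg" || echo "$entry" >> "$cfg"
}
count() { grep -c '^Map bera.map' "$cfg"; }

: > "$cfg"; current;  current;  n_current=$(count)
: > "$cfg"; proposed; proposed; n_proposed=$(count)
echo "current: $n_current entry, proposed: $n_proposed entry"
rm -f "$cfg"
```

In both cases the entry is present exactly once afterwards; the visible difference is that the first variant rewrites the file on every call and moves the entry to the end.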

Moreover, the scripts do not even bother to coalesce rewrites, so you have (below is a quote from texlive-arphic scripts; just an example):

sed -i '/^Map bkaiu.map/d' /usr/share/texlive/texmf/web2c/updmap.cfg
sed -i '/^Map bsmiu.map/d' /usr/share/texlive/texmf/web2c/updmap.cfg
sed -i '/^Map gbsnu.map/d' /usr/share/texlive/texmf/web2c/updmap.cfg
sed -i '/^Map gkaiu.map/d' /usr/share/texlive/texmf/web2c/updmap.cfg

instead of

sed -i -e '/^Map bkaiu.map/d' \
       -e '/^Map bsmiu.map/d' \
       -e '/^Map gbsnu.map/d' \
       -e '/^Map gkaiu.map/d' \
   /usr/share/texlive/texmf/web2c/updmap.cfg

OTOH in the general scheme of things this spurious churn is probably not that significant, so I am not going to lose sleep over it, but if you do that some hundreds of times on an update, and you do, it may become noticeable.

Comment 5 Michal Jaegermann 2012-11-17 22:23:13 UTC

The situation is indeed improved with texlive-2012-8.20121115_r28267.fc19, if not exactly great.  I ran a transaction which was updating (or adding due to dependencies) only 936 'texlive*' packages plus docbook-utils-pdf (forced by dependencies) and I looked at some timings.  On my test installation it took over ten and a half minutes of wall-clock time for all "posttrans" scripts to finish.

That is a definite improvement over the previous situation BUT ...
Apparently the first script which runs in that queue is this:

/usr/bin/texhash 2> /dev/null
/usr/bin/updmap-sys &> /dev/null
/usr/bin/fmtutil-sys --all &> /dev/null
rm -rf /var/lib/texmf/web2c/*
if [ -x /usr/sbin/selinuxenabled ] && /usr/sbin/selinuxenabled; then
[ -x /sbin/restorecon ] && /sbin/restorecon -R /var/lib/texmf/
fi

It looks like it is coming from texlive-base.  So first you create a bunch of new fmt files with '/usr/bin/fmtutil-sys --all &> /dev/null' and in the next line you throw away all this work with 'rm -rf /var/lib/texmf/web2c/*'.  That looks definitely backwards to me.  The situation is rectified because a flag in /run/texlive is not removed, so fmtutil-sys will hopefully run again, but why do all this work twice?

OTOH the old, no longer needed and possibly confusing, stuff in /usr/share/texmf/web2c is left alone.

Is there any guarantee that posttrans from texlive-base will indeed run first, and why not check whether runs are needed and remove the flags after completion? In other words, why not do something like the following?

[ -e /var/run/texlive/run-texhash ] && /usr/bin/texhash 2> /dev/null && \
   rm -f /var/run/texlive/run-texhash
[ -e /var/run/texlive/run-updmap ] && /usr/bin/updmap-sys &> /dev/null && \
   rm -f /var/run/texlive/run-updmap
if [ -e /var/run/texlive/run-fmtutil ] ; then
   rm -rf /var/lib/texmf/web2c/*
   /usr/bin/fmtutil-sys --all &> /dev/null && \
      rm -f /var/run/texlive/run-fmtutil
fi

If the order of posttrans scripts is "random" then currently you may get into a situation where, after 'rm -rf /var/lib/texmf/web2c/*', nothing will run fmtutil again and all format files will be gone.
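
The "run if flagged, clear the flag only on success" pattern can be tried out in isolation. In this sketch the helper run_if_flagged is made up for illustration, a scratch directory stands in for /var/run/texlive, and 'true' / 'false' stand in for a succeeding texhash and a failing updmap-sys.

```shell
#!/bin/sh
flagdir=$(mktemp -d)     # stands in for /var/run/texlive

# Run the command given after the flag name only if the flag exists;
# remove the flag only when that command succeeds.
run_if_flagged() {
    flag=$flagdir/$1; shift
    [ -e "$flag" ] || return 0
    "$@" && rm -f "$flag"
}

touch "$flagdir/run-texhash" "$flagdir/run-updmap"
run_if_flagged run-texhash true  || :   # stand-in for a successful texhash
run_if_flagged run-updmap  false || :   # stand-in for a failing updmap-sys

if [ -e "$flagdir/run-texhash" ]; then texhash_flag=kept; else texhash_flag=cleared; fi
if [ -e "$flagdir/run-updmap" ];  then updmap_flag=kept;  else updmap_flag=cleared;  fi
echo "texhash: $texhash_flag, updmap: $updmap_flag"
rm -rf "$flagdir"
```

A flag that survives a failed run leaves the work to whichever scriptlet runs next, so the result does not depend on the order of the posttrans scripts.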

BTW - the phase between the time when the last flag in /var/lib/texmf/ is set and the time when posttrans scripts start to run is also very far from lightning fast.

Comment 6 Jindrich Novy 2012-11-20 20:44:38 UTC

Right. Since the texlive-base now ships only the basic filesystem structure + licenses it doesn't make sense to run texhash and friends. Especially because all calls are assured by subpackages containing relevant CTAN content. It is just a relict from times when all was done once by texlive-base after installation.

Comment 7 Michal Jaegermann 2012-11-21 01:24:52 UTC

(In reply to comment #6)
> It is just a relict from times when all was done once by
> texlive-base after installation.

OK. I still think that dropping another relic from a texlive-2007 installation in texlive-base, i.e. the whole /usr/share/texmf/web2c, would be a good idea which is nearly "free".  It seems to fit there.

Comment 8 Fedora Update System 2012-12-09 12:04:42 UTC

texlive-2012-10.20121205_r28449.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/texlive-2012-10.20121205_r28449.fc18

Comment 9 Fedora Update System 2012-12-10 06:57:47 UTC

Package texlive-2012-10.20121205_r28449.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing texlive-2012-10.20121205_r28449.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-20041/texlive-2012-10.20121205_r28449.fc18
then log in and leave karma (feedback).

Comment 10 Fedora Update System 2013-01-12 00:36:47 UTC

texlive-2012-10.20121205_r28449.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
