Description of problem:

Texlive has an absurd number of sub-packages, and these sub-packages keep getting pulled in as dependencies by other packages. I have never specifically installed any texlive packages, yet my machine has accumulated *1,013* of them.

Among other problems, this makes any 'yum update' involving texlive hideously slow, because each sub-package takes a certain amount of time to install, clean up, and then verify. My latest update took an hour and ten minutes, during which I couldn't do anything productive with yum, such as installing additional packages. (For example, yum appears to take roughly a second per package in the cleanup phase of texlive updates, which adds up quickly with this many packages.)

In fact texlive accounts for about a fifth of my packages in total (1,013 out of 5,316), and this is on a machine with a lot of Fedora packages installed. I think this is clearly excessive and that texlive sub-packages should be merged so that there are fewer of them, even if this theoretically results in people using slightly more disk space.
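For anyone who wants to check the same ratio on their own machine, here is a rough sketch. It assumes the installed package names are available one per line (for example from `rpm -qa --qf '%{NAME}\n'`) and that every texlive sub-package name starts with `texlive`. The function name and the sample list are made up for illustration.

```python
def texlive_share(package_names):
    """Return (texlive_count, total_count, fraction) for a list of package names."""
    names = [n.strip() for n in package_names if n.strip()]
    # Assumption: texlive sub-packages are identified by the "texlive" name prefix.
    tex = [n for n in names if n.startswith("texlive")]
    total = len(names)
    fraction = len(tex) / total if total else 0.0
    return len(tex), total, fraction


# Hypothetical sample; on a real system the names would come from rpm.
sample = ["bash", "texlive-base", "texlive-dvipng", "python-matplotlib", "texlive-latex"]
count, total, fraction = texlive_share(sample)
print(f"{count} of {total} packages are texlive ({fraction:.0%})")

# The figures from the report: 1,013 of 5,316 installed packages.
print(f"reported share: {1013 / 5316:.1%}")
```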
Not sure where these 1013 packages come from, but the total number of packages in TeX Live is about 4500. The large number comes from the texlive package set being packaged from CTAN. Most of the packages are only about 10-40 kB in size, so I see no real problem here. The finer granularity actually means you end up with less space consumed by texlive. In the teTeX days everything was packaged monolithically; having only one package sounds good, but with texlive that one package would install the whole 2.5G of data. That could be much worse IMO.
We currently have 13284 packages in F18. ~4500 of them are texlive. That means that texlive makes up one third of all our package metadata, probably more because of the long package names and the many (inter) dependencies. This causes a lot of traffic for *all users* every time they run yum, regardless of whether they actually use texlive or not. I'm a big fan of modular packaging and I hate dependency chains; that's why I own bug 661442. But even I have to admit that the packaging of texlive is ridiculous.
Not to forget we ship 4500 copies of gpl.txt and other licenses.
If you don't think that there's a bug, time a package update to the TexLive packages that you get from having, say, python-matplotlib installed (which requires dvipng, which pulls in quite a lot of TexLive packages). On my machine both the update and cleanup phase seemed to not be able to go faster than about a package a second. Several hundred tiny packages thus turn into minutes of waiting and waiting and waiting. If TexLive packages grow to the level that they did on my machine with 1000 packages, you eat up huge amounts of time due to those tiny packages and wind up with hour-long 'yum upgrade' runs. Packages are not free and some of the costs are not related to the package size.
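To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes a fixed cost of about one second per package in each of two phases (update and cleanup), which is only the observation reported above, not a measured property of yum. The function and its parameters are hypothetical.

```python
def estimated_transaction_seconds(n_packages, seconds_per_phase=1.0, phases=2):
    """Rough lower bound: every package pays a fixed cost in every phase."""
    return n_packages * seconds_per_phase * phases


for n in (300, 1013):
    secs = estimated_transaction_seconds(n)
    print(f"{n} packages: at least {secs / 60:.0f} minutes")
```

The point of the model is that the total scales with the number of packages, not with their size, so many 10-40 kB packages cost far more time than one package of the same combined size.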
(In reply to comment #2)
> We currently have 13284 packages in F18. ~4500 of them are texlive. That
> means that texlive makes up one third of all our package metadata, probably
> more because of the long package names and the many (inter) dependencies.
> This causes a lot of traffic for *all users* every time they run yum,
> regardless of whether they actually use texlive or not.

We have around 13232 source packages, only one of which is texlive. There are a lot more than 13K binary packages. I agree it's completely absurd though.
(In reply to comment #3)
> Not to forget we ship 4500 copies of gpl.txt and other licenses.

At the very least there should be a single texlive-license package that all the others depend on, similar to what bind does.
If I used the cpu time needed for updating texlive to mine for bitcoins I'd already be rich by now.
(In reply to comment #5)
> We have around 13232 source packages, only one of which is texlive.
> There are a lot more than 13K binary packages.

OK, if my counting is right, we have close to 34600 binary packages in Fedora 18, ~4500 of them for texlive alone.
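As a quick check of how much the choice of denominator matters, the sketch below recomputes the texlive share using only the approximate figures quoted in this thread. The variable names are illustrative.

```python
texlive_binary = 4500      # approximate texlive binary package count, from the thread
source_packages = 13284    # figure quoted in comment #2 (a source package count)
binary_packages = 34600    # approximate binary package count from the recount above

# Comparing binary texlive packages against the source count overstates the share.
share_of_source_count = texlive_binary / source_packages
share_of_binary_count = texlive_binary / binary_packages

print(f"against source count: {share_of_source_count:.1%}")
print(f"against binary count: {share_of_binary_count:.1%}")
```

Measured against binary packages, the share is closer to one eighth than one third, which is still a large fraction for a single source package.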
Why is this closed as NOTABUG? There must be a bug in the specfile of the texlive source package.
Hi Christoph,

(In reply to comment #3)
> Not to forget we ship 4500 copies of gpl.txt and other licenses.

Actually, we don't. Please note that the packaging is designed to avoid such waste. All the licenses are stored in the texlive-base package, which is always installed when you have any of the TeX packages installed. The packages themselves just contain a symlink to the relevant license stored in /usr/share/texlive/licenses.
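The layout described above can be imitated in a scratch directory to see why it avoids duplicate license files. This is only a sketch: the directory names below are hypothetical stand-ins created under a temporary directory, and where the real packages place their symlinks is not stated in this thread.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-in for /usr/share/texlive/licenses.
    licenses = os.path.join(root, "licenses")
    os.makedirs(licenses)
    gpl = os.path.join(licenses, "gpl.txt")
    with open(gpl, "w") as f:
        f.write("license text placeholder\n")

    # Hypothetical per-package directories, each holding only a symlink.
    links = []
    for pkg in ("texlive-foo", "texlive-bar", "texlive-baz"):
        doc_dir = os.path.join(root, "doc", pkg)
        os.makedirs(doc_dir)
        link = os.path.join(doc_dir, "gpl.txt")
        os.symlink(gpl, link)
        links.append(link)

    # Every link resolves to the same single file on disk.
    targets = {os.path.realpath(link) for link in links}
    regular_copies = [p for p in links if not os.path.islink(p)]
    print(f"{len(links)} packages, {len(targets)} real license file, "
          f"{len(regular_copies)} copies")
```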
Just a note for anyone who considers the packaging incomprehensible. There are not only the packages themselves but also:

1) schemes (texlive-scheme-*) packages
2) collections (texlive-collection-*) packages

Both of these are just metapackages (designed by TeX Live's upstream) that let you choose an installation of a particular size or functionality, e.g. texlive-scheme-medium, texlive-scheme-tetex, etc. These package sets do overlap. They are meant for users who don't understand much of TeX but want a particular type of deployment.

The collections are finer-grained schemes for people who understand TeX a bit and need some particular feature, e.g. texlive-collection-publishers, texlive-collection-langitalian, etc. There are currently 83 such collections in texlive.

Why not package TeX Live into collections as specified by upstream? Because their package sets overlap. So to provide all this functionality (maintained by upstream) to the end user, we need per-package granularity, with one package per CTAN archive.
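The overlap argument can be illustrated with a toy example. The collection and package names below are invented; the real contents are defined by TeX Live upstream. The sketch shows that when sets overlap, bundling each collection as one binary package would ship the shared members more than once, whereas metapackages that merely depend on per-package RPMs do not.

```python
from itertools import combinations

# Hypothetical contents; real collections are defined by TeX Live upstream.
collections = {
    "texlive-collection-a": {"pkg-one", "pkg-two", "pkg-three"},
    "texlive-collection-b": {"pkg-two", "pkg-four"},
    "texlive-collection-c": {"pkg-three", "pkg-four", "pkg-five"},
}

# Find packages that more than one collection lists.
shared = set()
for (name_a, a), (name_b, b) in combinations(collections.items(), 2):
    overlap = a & b
    if overlap:
        print(f"{name_a} and {name_b} share {sorted(overlap)}")
        shared |= overlap

# Distinct packages versus the total if each collection bundled its own copies.
unique_packages = set().union(*collections.values())
merged_payload = sum(len(members) for members in collections.values())
print(f"{len(unique_packages)} distinct packages, "
      f"{merged_payload} if every collection bundled its members")
```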
(In reply to comment #11)
> Just a note for anyone who considers the packaging incomprehensible. There
> are not only the packages themselves but also:
>
> 1) schemes (texlive-scheme-*) packages
> 2) collections (texlive-collection-*) packages
>
> Both of these are just metapackages (designed by TeX Live's upstream) that
> let you choose an installation of a particular size or functionality,
> e.g. texlive-scheme-medium, texlive-scheme-tetex, etc. These package sets
> do overlap. They are meant for users who don't understand much of TeX but
> want a particular type of deployment.
>
> The collections are finer-grained schemes for people who understand TeX
> a bit and need some particular feature, e.g. texlive-collection-publishers,
> texlive-collection-langitalian, etc. There are currently 83 such collections
> in texlive.
>
> Why not package TeX Live into collections as specified by upstream?
> Because their package sets overlap. So to provide all this functionality
> (maintained by upstream) to the end user, we need per-package
> granularity, with one package per CTAN archive.

That all sounds reasonable, but is it documented in an easy-to-find location in the wiki so that people can be referred to it?
(In reply to comment #12)
> (In reply to comment #11)
> That all sounds reasonable, but is it documented in an easy-to-find location
> in the wiki so that people can be referred to it?

It is noted on the http://fedoraproject.org/wiki/Features/TeXLive feature page, but indeed I need to start a better page about it, as that one is quite obsolete now.
I edited this page to reflect the current status of TeX Live packaging and to provide an intro for end users: https://fedoraproject.org/wiki/Features/TeXLive#Benefit_to_Fedora