Bug 1695946

Summary: html2ps wants to install texlive packages that are NOT generally needed
Product: [Fedora] Fedora Reporter: antofthy <Anthony.Thyssen>
Component: html2ps Assignee: Petr Pisar <ppisar>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: unspecified    
Version: rawhide CC: james.34.99smith, kasal, ppisar, rdieter, than
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: html2ps-1.0-0.50.b7.fc39 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-08-01 07:40:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description antofthy 2019-04-03 23:22:26 UTC
Description of problem:

The html2ps package only needs dvips and TeX for very special cases.
The "html2ps" code itself does not require these packages. It will use them in a very rare special case if they are available, but it still works without them.

This causes a very large number of unneeded packages to be installed, wasting a lot of disk space on something that would never be used.

This is also the cause of the similar problem with a2ps (bug number 1577008),
which is meant to be a small utility program
but which does make use of html2ps when the user wants to print HTML.


Version-Release number of selected component (if applicable):

html2ps-1.0-0.31


How reproducible:

Always

Steps to Reproduce:
1. Install a2ps or html2ps

Actual results:
200+ unneeded texlive packages are also installed

Expected results:
A minimal install for a small utility.

Additional info:

Comment 1 antofthy 2019-04-03 23:23:51 UTC
Unable to specify html2ps as the component! Specified texlive as the closest match, though it shouldn't be.

Comment 2 James 2021-01-04 17:16:31 UTC Comment hidden (spam)
Comment 3 Petr Pisar 2023-07-31 15:15:38 UTC
I know it's a huge dependency.

Texlive is used for rendering MathML:

    if(&math2sym($math)) {
      $_=$beg.$sym.$end;
    } elsif($package{'TeX'} && $package{'dvips'}) {
      ...
      `tex $scr.tex`;
      `dvips -E -o $scr.ps $scr.dvi`;
      ...
    } else {
      $math=~s/<math$R//i;
      $_=$beg.$math.$end;
    }

math2sym() returns false unless every entity could be converted directly to PostScript text:

    sub math2sym {
      local($_)=@_;
      s/<math$R//gi;
      for $char (keys %symb) {s/&($char)(;|$|(?=\W))/\\$symb{$char}/g};
      $stat=!/([&<][a-zA-Z]|[_^{])/;
      s/[a-zA-Z\s]*[a-zA-Z][a-zA-Z\s]*/)ES()I($&)ES()SY(/g;
      s/(\\200|\\201|\\202)/)RO($&)ES(/g;
      $sym=")SY($_)ES(";
      $stat;
    }

If it returns false, Texlive is used when configured; otherwise the half-rendered MathML text is simply echoed into the output.

So Texlive is used only for fine rendering of mathematical symbols that are too complex for direct conversion, and a text-only fallback exists.
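
The three branches described above can be illustrated with a small shell sketch. It is not html2ps code: has_leftover_markup and render_mode are hypothetical helpers, and TEX_ENABLED and DVIPS_ENABLED are assumed stand-ins for the $package{'TeX'} and $package{'dvips'} flags. It mirrors only the final status test of math2sym() and skips the entity substitution that happens before it, so any "&name" counts as unresolved here.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of html2ps.

# Assumed stand-ins for the $package{'TeX'} and $package{'dvips'} flags.
TEX_ENABLED=${TEX_ENABLED:-0}
DVIPS_ENABLED=${DVIPS_ENABLED:-0}

# Mirrors only the final status test of math2sym(): succeed when the text
# still contains an entity or tag start, or one of the characters _ ^ {.
has_leftover_markup() {
  printf '%s\n' "$1" | grep -Eq '[&<][a-zA-Z]|[_^{]'
}

# Print which of the three branches would handle the expression.
render_mode() {
  if ! has_leftover_markup "$1"; then
    echo symbols   # everything became plain PostScript text
  elif [ "$TEX_ENABLED" = 1 ] && [ "$DVIPS_ENABLED" = 1 ]; then
    echo tex       # the tex + dvips branch
  else
    echo text      # half-rendered text is echoed into the output
  fi
}

for expr in 'a + b' 'x^2' '&foo; + 1'; do
  printf '%-10s -> %s\n' "$expr" "$(render_mode "$expr")"
done
```

With both flags left at 0, only the first expression takes the direct branch and the other two fall back to plain text, which is the behaviour a TeX-free install would show.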

I tried to convert a simple HTML + MathML example <https://en.wikipedia.org/wiki/MathML#Embedding_MathML_in_HTML.2FXHTML_files>. With Texlive, the output cannot be interpreted by Ghostscript (it reports a /limitcheck error), though the intermediate PostScript coming from dvips is interpreted by Ghostscript fine. Probably something is wrong with embedding it inside the main PostScript document. I conclude that the Texlive mode of operation is broken.
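
A check like the one described above could be scripted roughly as follows. This is a sketch under assumptions, not the exact commands used in this comment: ps_renders is a hypothetical helper name, the file names are placeholders, and it relies on Ghostscript exiting with a non-zero status when it hits an error such as /limitcheck in batch mode.

```shell
#!/bin/sh
# Sketch only: ps_renders is a hypothetical helper name.
# Succeeds when Ghostscript interprets the whole file without an error.
ps_renders() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  # The nullpage device discards the rendered pages; only the exit
  # status of gs matters here.
  gs -q -dNOPAUSE -dBATCH -sDEVICE=nullpage "$1" >/dev/null
}

# Example use (file names are placeholders):
#   html2ps example.html > example.ps
#   if ps_renders example.ps; then echo "interprets fine"; else echo "Ghostscript reported an error"; fi
```

Running the same helper on the intermediate file produced by dvips and on the final html2ps output would show which of the two Ghostscript rejects.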

I will try to disable the dependency on tex and dvips in html2ps.

Comment 4 antofthy 2023-08-01 00:11:49 UTC
It is great that someone is finally looking at this old problem.
For years I have been installing "a2ps"
 dnf install a2ps
and then forcibly removing the unneeded TexLive packages
 dnf remove --noautoremove texlive\*
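
To see which declared dependencies pull in TeX in the first place, the package's requirement list can be filtered. This is only a sketch: tex_related is a hypothetical filter, and the pattern is a rough heuristic that assumes TeX-related requirements contain "tex" or "dvips" in their names.

```shell
#!/bin/sh
# Sketch only: tex_related is a hypothetical filter, and the pattern is a
# heuristic that may also match unrelated names containing "tex".
tex_related() {
  grep -i -E 'tex|dvips'
}

# Example use, reading a requirement list on standard input:
#   rpm -q --requires html2ps | tex_related      # installed package
#   dnf repoquery --requires html2ps | tex_related   # package in the repositories
```

An empty result from either query would suggest that the package no longer declares a TeX dependency.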