Bug 767985 - Review Request: man2html - Convert man pages to HTML
Summary: Review Request: man2html - Convert man pages to HTML
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Michael Schwendt
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-12-15 13:12 UTC by T.C. Hollingsworth
Modified: 2012-07-26 03:58 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-26 03:56:25 UTC
Type: ---
Embargoed:
bugs.michael: fedora-review+
gwync: fedora-cvs+


Attachments (Terms of Use)

Description T.C. Hollingsworth 2011-12-15 13:12:31 UTC
Spec URL: http://tchol.org/fedora/rpm2html.spec
SRPM URL: http://tchol.org/fedora/man2html-1.6-1.g.fc16.src.rpm
Koji scratch build: http://koji.fedoraproject.org/koji/taskinfo?taskID=3586845
Description: man2html is a man page to HTML converter.

% rpmlint SPECS/man2html.spec 
# doesn't use GNU autoconf
SPECS/man2html.spec:116: W: configure-without-libdir-spec
# rpmlint doesn't recognize %{patches}
SPECS/man2html.spec: W: patch-not-applied Patch3: man2html-cgi.patch
SPECS/man2html.spec: W: patch-not-applied Patch1025: man2html-all-args.patch
#<snip>
0 packages and 1 specfiles checked; 0 errors, 21 warnings.

% rpmlint RPMS/x86_64/man2html*   
man2html.x86_64: W: manual-page-warning /usr/share/man/man1/hman.1.gz 7: warning: macro `LO' not defined
# cache directory for CGI scripts
man2html.x86_64: E: non-standard-dir-perm /var/cache/man2html 0775L
man2html-core.x86_64: E: incorrect-fsf-address /usr/share/doc/man2html-core-1.6/COPYING
man2html-core.x86_64: W: manual-page-warning /usr/share/man/man1/man2html.1.gz 6: warning: macro `LO' not defined
man2html-debuginfo.x86_64: E: incorrect-fsf-address /usr/src/debug/man-1.6g/debian/sources/manwhatis.c
3 packages and 0 specfiles checked; 3 errors, 2 warnings.

Comment 1 T.C. Hollingsworth 2011-12-20 14:09:48 UTC
Sorry, the above Spec URL is incorrect.  It should be:
http://tchol.org/fedora/man2html.spec

Comment 2 Matthias Runge 2012-03-02 20:55:48 UTC
In order to get sponsored into the packager group, please follow 
https://fedoraproject.org/wiki/How_to_get_sponsored_into_the_packager_group#Convincing_someone_to_sponsor_you

You should do some informal reviews (and note the bug numbers e.g. here, as a reference for a potential sponsor).

Comment 3 Michael Schwendt 2012-05-07 20:20:45 UTC
> %setup -q -n man-%{version}%{posttag}
> tar -zxf %SOURCE1

Hint: %setup can extract multiple tarballs, too:

%setup -q -n man-%{version}%{posttag} -a1


> %files
> %{_libdir}/../lib/man2html

Really unusual. Nothing forces you to use %_libdir, especially not if the value of this variable is not passed into the source code's build framework as an option. So, let's see:

> %build
> # not autoconf
> ./configure -d +fhs

$ grep libdir configure
$

That custom configure script understands several options, however, and defaults to -prefix=/usr and then derives other paths from that prefix. It hardcodes a confdir="${confprefix}/lib" path, for example, and the Debian sources hardcode /usr/lib, too.  => Using %_libdir makes no sense.

  /usr/lib/man2html

The spec file would also be more readable when making explicit that a directory is to be included and not a single file. A trailing slash does the trick:

  /usr/lib/man2html/


* https://fedoraproject.org/wiki/Packaging:Guidelines#Compiler_flags

At least the CGI executables are not built with %optflags yet.


* https://fedoraproject.org/wiki/Packaging:Guidelines#All_patches_should_have_an_upstream_bug_link_or_comment

What's the status with regard to that?

Comment 4 T.C. Hollingsworth 2012-05-07 20:51:59 UTC
(In reply to comment #3)
> > %setup -q -n man-%{version}%{posttag}
> > tar -zxf %SOURCE1
> 
> Hint: %setup can extract multiple tarballs, too:
> 
> %setup -q -n man-%{version}%{posttag} -a1

Thanks for the tip.

> 
> > %files
> > %{_libdir}/../lib/man2html
> 
> Really unusual. Nothing forces you to use %_libdir, especially not if the value
> of this variable is not passed into the source code's build framework as an
> option. So, let's see:
> 
> > %build
> > # not autoconf
> > ./configure -d +fhs
> 
> $ grep libdir configure
> $
> 
> That custom configure script understands several options, however, and defaults
> to -prefix=/usr and then derives other paths from that prefix. It hardcodes a
> confdir="${confprefix}/lib" path, for example, and the Debian sources hardcode
> /usr/lib, too.  => Using %_libdir makes no sense.

I wasn't sure the proper way to express "/usr/lib" even on x86_64, so I looked at how systemd did it (that was the first package that came to mind that also needed it).  Back at the time I wrote that spec file, it did it the same way:
http://pkgs.fedoraproject.org/gitweb/?p=systemd.git;a=blob;f=systemd.spec;h=b5affa0d9b5294f591088f62de7c0f7fd28afe8d;hb=refs/heads/f15#l369

However, it's since switched to using %{_prefix}/lib and I recall a discussion on devel that mentioned that that's the way to go too.  I'll fix it.

>   /usr/lib/man2html
> 
> The spec file would also be more readable when making explicit that a directory
> is to be included and not a single file. A trailing slash does the trick:
> 
>   /usr/lib/man2html/
> 
> 
> * https://fedoraproject.org/wiki/Packaging:Guidelines#Compiler_flags
> 
> At least the CGI executables are not built with %optflags yet.

I'll fix these two as well.

> 
> *
> https://fedoraproject.org/wiki/Packaging:Guidelines#All_patches_should_have_an_upstream_bug_link_or_comment
> 
> What's the status with regard to that?

Upstream is very dead; this is essentially the Debian fork of man2html.  I have a comment about what each patch does (most are bugfixes to manpage parsing) and provided the patch number from Debian where relevant.

Comment 5 T.C. Hollingsworth 2012-05-07 20:55:24 UTC
(In reply to comment #2)
> In order to get sponsored into the packager group, please follow 
> https://fedoraproject.org/wiki/How_to_get_sponsored_into_the_packager_group#Convincing_someone_to_sponsor_you
> 
> You should do some informal reviews (and note the bug numbers e.g. here, as a
> reference for a potential sponsor).

Sorry, missed the bugmail with this.  Here are some I've done in the past:
753577 815018 768894 755890 756435 728837 769029

I'll try and do some more this week.

Comment 6 Michael Schwendt 2012-05-07 21:21:22 UTC
> I recall a discussion on devel 

The result of it is covered by
http://fedoraproject.org/wiki/Packaging:Guidelines#Macros

The full background about the small benefit of path macros in some cases is a bit longer than what can be found on that page.

Comment 7 T.C. Hollingsworth 2012-05-09 10:00:10 UTC
Thanks for taking this!  I've updated it to address all the above concerns and fix a bug:

Spec: http://tchol.org/fedora/man2html.spec
SRPM: http://tchol.org/fedora/man2html-1.6-2.g.fc17.src.rpm
Scratch build: http://koji.fedoraproject.org/koji/taskinfo?taskID=4064432


% rpmlint SPECS/man2html.spec 
# these two addressed in our conversation above
SPECS/man2html.spec:118: W: configure-without-libdir-spec
SPECS/man2html.spec:148: E: hardcoded-library-path in %{_prefix}/lib/man2html/
# rpmlint doesn't understand %patches
SPECS/man2html.spec: W: patch-not-applied Patch3: man2html-cgi.patch
<snip list of every patch>
0 packages and 1 specfiles checked; 1 errors, 21 warnings.


% rpmlint RPMS/x86_64/man2html-1.6-2.g.fc17.x86_64.rpm RPMS/x86_64/man2html-core-1.6-2.g.fc17.x86_64.rpm 
# this is documented as a groff macro, not sure why rpmlint doesn't lik it
man2html.x86_64: W: manual-page-warning /usr/share/man/man1/hman.1.gz 7: warning: macro `LO' not defined
# cache directory for cgi scripts
man2html.x86_64: E: non-standard-dir-perm /var/cache/man2html 0775L
# unlikely to be fixed, upstream is dead
man2html-core.x86_64: E: incorrect-fsf-address /usr/share/doc/man2html-core-1.6/COPYING
# see above
man2html-core.x86_64: W: manual-page-warning /usr/share/man/man1/man2html.1.gz 6: warning: macro `LO' not defined
2 packages and 0 specfiles checked; 2 errors, 2 warnings.

Comment 8 Ralf Corsepius 2012-05-09 13:46:12 UTC
The utf8-converted French man2html man-page appears mis-converted.
AFAIS, the original man-page seems latin1-encoded to me, not latin2 encoded.

I haven't checked the Italian man-pages, but I'd assume the same applies to them.

Comment 9 Michael Schwendt 2012-05-09 17:00:06 UTC
Agreed. "iconv -f latin2 -t utf-8" for both the French and the Italian manual gives bad results => invalid words. Even the Italian translator's name is messed up.

Comment 10 T.C. Hollingsworth 2012-05-09 18:33:16 UTC
(In reply to comment #8)
> The utf8-converted French man2html man-page appears mis-converted.
> AFAIS, the original man-page seems latin1-encoded to me, not latin2 encoded.
> 
> I haven't checked the Italian man-pages, but I'd assume the same applies to
> them.

Yeah, that must have been a typo.  Fixed.

Spec: http://tchol.org/fedora/man2html.spec
SRPM: http://tchol.org/fedora/man2html-1.6-3.g.fc17.src.rpm
Scratch build: http://koji.fedoraproject.org/koji/taskinfo?taskID=4065636

rpmlint output is identical.

Comment 11 Michael Schwendt 2012-05-10 12:34:27 UTC
So, here are a couple of findings not limited to the items on the ReviewGuidelines page. This thing is non-trivial to review, but I had expected that.

[...]

* The debian/NEWS file mentions a "man2html-base" package instead of "man2html-core". Indeed, the Debian packages search lists it as "man2html-base". That suggests following
https://fedoraproject.org/wiki/Packaging:NamingGuidelines#General_Naming


* Licensing, minor issues:
  Files man2html/man2html.c and debian/sources/man2html.cgi.c do not
  explicitly refer to GPL, just:

    > Permission is granted to distribute, modify and use this program
    > as long as this comment is not removed or changed.

  The utils.c file only mentions "GPL". Following
  https://fedoraproject.org/wiki/Packaging:LicensingGuidelines#GPL_and_LGPL
  and
  https://fedoraproject.org/wiki/Licensing#Good_Licenses
  that would result in a license tag: GPL+ 

  File manwhatis.c contains a GPLv2 (or later) header.

  So, no big issue. License clarification would be NTH:
  https://fedoraproject.org/wiki/Packaging:LicensingGuidelines#License_Clarification


* Format string warnings in build output! More often than not these are
worth taking a look at, also to avoid surprises on big endian platforms.


* Run-time testing:

$ hman 7 locale
/usr/bin/hman: line 90: lynx: command not found

So, if "lynx" isn't a requirement, the browser ought to be configurable. Let's see:

$ export MANHTMLPAGER=firefox ; hman 7 locale
--> http://localhost/cgi-bin/man/man2html?7+locale

"The requested URL /cgi-bin/man/man2html was not found on this server."

Now returning to lynx after a "yum -y install lynx":

$ hman 7 locale

"Alert!: Executable link rejected due to location or path.

lynx: Can't access startfile lynxcgi:/usr/lib/man2html/cgi-bin/man/man2html?7+locale"


* Fedora related patches and explanations don't seem to be accurate. For example:

man2html-dirs.patch
+sharedir = $(DESTDIR)$(PREFIX)/usr/share/man2html

A path that is not used anywhere in the package. The spec file comment mentions /var/www/cgi-bin for cgi.


* As expected, due to SELinux, httpd is confined as much as not to allow the CGI scripts to access the MAN search paths. That's a blocker, but still only a
SHOULD in the Review Guidelines:

| SHOULD: The reviewer should test that the package functions as
| described. A package should not segfault instead of running, for example.

Currently, the only way to get the scripts to work at all is to change
their file context to httpd_unconfined_script_exec_t.

Neither the Packaging Guidelines nor the Review Guidelines contain
any section on SELinux. I've found just:
https://fedoraproject.org/wiki/PackagingDrafts/SELinux


Comments/suggestions?

Comment 12 T.C. Hollingsworth 2012-05-11 01:34:06 UTC
(In reply to comment #11)
> So, here are a couple of findings not limited to the items on the
> ReviewGuidelines page. This thing is non-trivial to review, but I had expected
> that.
> 
> [...]
> 
> * The debian/NEWS file mentions a "man2html-base" package instead of
> "man2html-core". Indeed, the Debian packages search lists it as
> "man2html-base". That suggests following
> https://fedoraproject.org/wiki/Packaging:NamingGuidelines#General_Naming

Generally, Fedora uses "-core" subpackages for the same reasons/purposes Debian uses "-base" subpackages.  I used "-core" to be consistent with other Fedora packages, but I can change it if consistency with the Debian package is preferred.
 
> * Licensing, minor issues:
>   Files man2html/man2html.c and debian/sources/man2html.cgi.c do not
>   explicitly refer to GPL, just:
> 
>     > Permission is granted to distribute, modify and use this program
>     > as long as this comment is not removed or changed.

This looks like "Copyright Only" [1] to me, but I'll mail the legal list for guidance.

>   The utils.c file only mentions "GPL". Following
>   https://fedoraproject.org/wiki/Packaging:LicensingGuidelines#GPL_and_LGPL
>   and
>   https://fedoraproject.org/wiki/Licensing#Good_Licenses
>   that would result in a license tag: GPL+ 
> 
>   File manwhatis.c contains a GPLv2 (or later) header.
> 
>   So, no big issue. License clarification would be NTH:
>  
> https://fedoraproject.org/wiki/Packaging:LicensingGuidelines#License_Clarification
> 
> 
> * Format string warnings in build output! More often than not these are
> worth taking a look at, also to avoid surprises on big endian platforms.

Okay, I'll see what's up with these.

> * Run-time testing:
> 
> $ hman 7 locale
> /usr/bin/hman: line 90: lynx: command not found
> 
> So, if "lynx" isn't a requirement, the browser ought to be configurable. Let's
> see:
> 
> $ export MANHTMLPAGER=firefox ; hman 7 locale
> --> http://localhost/cgi-bin/man/man2html?7+locale
> 
> "The requested URL /cgi-bin/man/man2html was not found on this server."

I'll patch hman to fix this one.

> Now returning to lynx after a "yum -y install lynx":
> 
> $ hman 7 locale
> 
> "Alert!: Executable link rejected due to location or path.
> 
> lynx: Can't access startfile
> lynxcgi:/usr/lib/man2html/cgi-bin/man/man2html?7+locale"

LYNXCGI is disabled by default for security reasons.  (See /etc/lynx.cfg at line 964 for the details.)  I could ask the lynx maintainer to enable local LYNXCGI and whitelist man2html, but I'm not sure of the security implications of that.

IMHO it would be better to default hman to calling `xdg-open` and leave LYNXCGI configuration to interested users who are comfortable with the security issues.
 
> * Fedora related patches and explanations don't seem to be accurate. For
> example:
> 
> man2html-dirs.patch
> +sharedir = $(DESTDIR)$(PREFIX)/usr/share/man2html
> 
> A path that is not used anywhere in the package. The spec file comment mentions
> /var/www/cgi-bin for cgi.

Sorry, both are leftover from previous renditions of the packaging.  

/usr/share/man2html would have been used by the original man2html CGI scripts to store template files.  Debian's version of the CGI scripts (which I switched to because they are more robust and less buggy) don't require it.

The latter is because I ended up patching away Debian's use of cgi-bin and instead just use an Apache conf file to make http://localhost/man/ work, because mucking about in /var/www isn't allowed in Fedora.

I'll clean up this patch and the comment.

> * As expected, due to SELinux, httpd is confined as much as not to allow the
> CGI scripts to access the MAN search paths. That's a blocker, but still only a
> SHOULD in the Review Guidelines:
> 
> | SHOULD: The reviewer should test that the package functions as
> | described. A package should not segfault instead of running, for example.
> 
> Currently, the only way to get the scripts to work at all is to change
> their file context to httpd_unconfined_script_exec_t.
> 
> Neither the Packaging Guidelines nor the Review Guidelines contain
> any section on SELinux. I've found just:
> https://fedoraproject.org/wiki/PackagingDrafts/SELinux

Thanks, I'll call `semanage fcontext` in the scriptlets for now so it works and file a bug against selinux-policy so it can be fixed properly.

Comment 13 Michael Schwendt 2012-05-11 09:24:21 UTC
# yum list \*-base|wc -l
53
# yum list \*-core|wc -l
83

Just was curious about it. ;)

[...]

> LYNXCGI is disabled by default for security reasons.
> (See /etc/lynx.cfg at line 964 for the details.) 

Then it's not suitable as a default. "elinks" has been the much more common text-based browser anyway. Also in Fedora.

> IMHO it would be better to default hman to calling `xdg-open`

Sounds plausible.

Comment 14 T.C. Hollingsworth 2012-05-13 23:20:53 UTC
This fixes everything mentioned previously:

Spec: http://tchol.org/fedora/man2html.spec
SRPM: http://tchol.org/fedora/man2html-1.6-4.g.fc17.src.rpm
Scratch build: http://koji.fedoraproject.org/koji/taskinfo?taskID=4074599

rpmlint output is identical except for a new "hardcoded-library-path" warning for the selinux %post(un) scriptlets.

Comment 15 T.C. Hollingsworth 2012-05-13 23:32:40 UTC
I filed bug 821286 for proper SELinux support but didn't block this against it since httpd_unconfined_script_exec_t fixes it in the interim.

Comment 16 Michael Schwendt 2012-05-16 17:29:10 UTC
From the diff between -3 and -4:

+
+%post
+#clear out the cache directory so all future pages are regenerated with the new
 build
+rm -f %{_prefix}/cache/man2html/* || :
+
+

%{_prefix} here is wrong. You did want %{_localstatedir}, but you could use /var instead and everywhere else (not just in the patch files).

Again, macros here only add value, if you substituted their values also in the source code. That isn't done for /usr and /var, and not /etc either, so hardcoding /usr and /var (and derived paths) would make sense and would be acceptable, too. Rule of thumb: If you "rpmbuild --rebuild --define "foo bar" …" the src.rpm, the redefined macro values ought to find their way into the built rpms, too. If that isn't the case, the builds are broken due to a mismatch between paths in %files section, scriptlets, and in files included in the package.

The wrong prefix can be fixed in Fedora package git, of course.

Welcome to the packager group!

[...]

Apart from that, installing the builds of man2html-1.6-4.g.fc17.src.rpm now makes the three CGI executables work by default. hman also works great now using xdg-open. (Neat with an already open Firefox, isn't it?). man2html-core succeeds, too. Also the -h option.


APPROVED

Comment 17 T.C. Hollingsworth 2012-05-16 22:47:06 UTC
New Package SCM Request 
======================= 
Package Name: man2html
Short Description: Convert man pages to HTML 
Owners: patches 
Branches: f16 f17 el5 el6 
InitialCC:

Comment 18 Gwyn Ciesla 2012-05-17 02:16:03 UTC
Git done (by process-git-requests).

Comment 19 Fedora Update System 2012-05-18 07:39:08 UTC
man2html-1.6-5.g.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/man2html-1.6-5.g.fc17

Comment 20 Fedora Update System 2012-05-18 07:41:47 UTC
man2html-1.6-5.g.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/man2html-1.6-5.g.fc16

Comment 21 Fedora Update System 2012-05-18 07:47:58 UTC
man2html-1.6-5.g.el6 has been submitted as an update for Fedora EPEL 6.
https://admin.fedoraproject.org/updates/man2html-1.6-5.g.el6

Comment 22 Fedora Update System 2012-05-18 20:25:40 UTC
man2html-1.6-5.g.fc17 has been pushed to the Fedora 17 testing repository.

Comment 23 Fedora Update System 2012-07-26 03:56:25 UTC
man2html-1.6-7.g.fc16 has been pushed to the Fedora 16 stable repository.

Comment 24 Fedora Update System 2012-07-26 03:58:18 UTC
man2html-1.6-8.g.fc17 has been pushed to the Fedora 17 stable repository.


Note You need to log in before you can comment on or make changes to this bug.