Bug 542045

Summary: Review Request: php-htmlpurifier - standards-compliant HTML filter library
Product: [Fedora] Fedora
Component: Package Review
Version: rawhide
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium
Hardware: All
OS: Linux
Reporter: David Nalley <david>
Assignee: Pavel Alexeev <pahan>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: awilliam, fedora, fedora-package-review, gwync, ignatenko, james.hogarth, pahan, redhat-bugzilla, rvokal
Flags: pahan: fedora-review+, gwync: fedora-cvs+
Doc Type: Bug Fix
Last Closed: 2015-12-04 01:27:52 UTC
Bug Depends On: 576032
Bug Blocks: 544722, 544724

Description David Nalley 2009-11-28 03:08:23 UTC
Spec URL: http://ke4qqq.fedorapeople.org/php-htmlpurifier.spec
SRPM URL: http://ke4qqq.fedorapeople.org/php-htmlpurifier-4.0.0-1.fc12.src.rpm
Description: 
Standards-compliant HTML filter library written in
PHP. HTML Purifier will not only remove all malicious code
(better known as XSS) with a thoroughly audited, secure yet permissive
whitelist, it will also make sure your documents are standards compliant,
something only achievable with a comprehensive knowledge of W3C's
specifications.


rpmlint output: 
php-htmlpurifier.src: W: summary-not-capitalized standards-compliant HTML filter library
1 packages and 0 specfiles checked; 0 errors, 1 warnings.
[ke4qqq@nalleyx60 SPECS]$ rpmlint ../RPMS/noarch/php-htmlpurifier-4.0.0-1.fc12.noarch.rpm 
php-htmlpurifier.noarch: W: summary-not-capitalized standards-compliant HTML filter library
1 packages and 0 specfiles checked; 0 errors, 1 warnings.


The warning reported by rpmlint appears to be due to the capitalization of HTML.

Comment 1 Pavel Alexeev 2010-01-03 11:22:29 UTC
> The warning reported by rpmlint appears to be due to the capitalization
> of HTML.  
No. It says that the summary starts with a lowercase letter. It wants to see it as "Standards-compliant HTML filter library".
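
The rule Pavel describes can be illustrated with a minimal sketch. This is not rpmlint's actual implementation; the function names are hypothetical and the check is simplified to "the first character must not be a lowercase letter".

```python
def summary_is_capitalized(summary: str) -> bool:
    """Return True when the summary does not start with a lowercase letter.

    Simplified stand-in for the idea behind rpmlint's
    summary-not-capitalized warning (hypothetical helper, not rpmlint code).
    """
    stripped = summary.strip()
    return bool(stripped) and not stripped[0].islower()


def capitalize_summary(summary: str) -> str:
    """Upper-case only the first character, leaving the rest untouched."""
    stripped = summary.strip()
    return stripped[:1].upper() + stripped[1:]
```

With the summary from this review, `summary_is_capitalized("standards-compliant HTML filter library")` is false, and `capitalize_summary` yields the form the reviewer asked for. The capitalization of "HTML" plays no part in the check.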

Comment 2 David Nalley 2010-01-04 00:48:15 UTC
Thanks for catching that Pavel - 

I have updated the srpm and spec: 

Spec URL: http://ke4qqq.fedorapeople.org/php-htmlpurifier.spec
SRPM URL: http://ke4qqq.fedorapeople.org/php-htmlpurifier-4.0.0-2.fc12.src.rpm

Comment 3 Pavel Alexeev 2010-01-04 09:47:20 UTC
I'll review it.

Comment 4 Pavel Alexeev 2010-01-04 12:03:33 UTC
Legend: + - Ok.
- - Error.
+/- - The item is acceptable, but I strongly recommend an enhancement.
= - N/A.
MUST Items

[+] MUST: rpmlint must be run on every package. The output should be posted in the review.
$ rpmlint *
2 packages and 1 specfiles checked; 0 errors, 0 warnings.

[+] MUST: The package must be named according to the Package Naming Guidelines.
[+] MUST: The spec file name must match the base package %{name}, in the format %{name}.spec unless your package has an exemption.
[-] MUST: The package must meet the Packaging Guidelines.
The package has a PEAR channel for installation: http://htmlpurifier.org/download#PEAR
It must be registered in the system PEAR database: https://fedoraproject.org/wiki/Packaging:PHP#PEAR_Packages_from_a_non_standard_channel.2Frepository

[+] MUST: The package must be licensed with a Fedora approved license and meet the Licensing Guidelines.
[+] MUST: The License field in the package spec file must match the actual license.
[+] MUST: If (and only if) the source package includes the text of the license(s) in its own file, then that file, containing the text of the license(s) for the package must be included in %doc.
[+] MUST: The spec file must be written in American English.
[+] MUST: The spec file for the package MUST be legible.
[+] MUST: The sources used to build the package must match the upstream source, as provided in the spec URL. Reviewers should use md5sum for this task. If no upstream URL can be specified for this package, please see the Source URL Guidelines for how to deal with this.

$ md5sum htmlpurifier-4.0.0.tar.gz htmlpurifier-4.0.0.tar.gz_reviewed
88c6107a278aeb18757a1c99b03be59a  htmlpurifier-4.0.0.tar.gz
88c6107a278aeb18757a1c99b03be59a  htmlpurifier-4.0.0.tar.gz_reviewed
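
The same comparison can be scripted. The following is a minimal sketch, assuming both tarballs are available locally; the helper names `md5_of` and `sources_match` are made up for illustration and are not part of any Fedora tooling.

```python
import hashlib


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sources_match(upstream_path, packaged_path):
    """True when the upstream tarball and the one in the SRPM are identical by MD5."""
    return md5_of(upstream_path) == md5_of(packaged_path)
```

For this review, the call would be `sources_match("htmlpurifier-4.0.0.tar.gz", "htmlpurifier-4.0.0.tar.gz_reviewed")`, which should agree with the md5sum output above.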

[+] MUST: The package MUST successfully compile and build into binary rpms on at least one primary architecture.
http://koji.fedoraproject.org/koji/taskinfo?taskID=1900534
[=] MUST: If the package does not successfully compile, build or work on an architecture, then those architectures should be listed in the spec in ExcludeArch. Each architecture listed in ExcludeArch MUST have a bug filed in bugzilla, describing the reason that the package does not compile/build/work on that architecture. The bug number MUST be placed in a comment, next to the corresponding ExcludeArch line.
[+] MUST: All build dependencies must be listed in BuildRequires, except for any that are listed in the exceptions section of the Packaging Guidelines ; inclusion of those as BuildRequires is optional. Apply common sense.
[+] MUST: The spec file MUST handle locales properly. This is done by using the %find_lang macro. Using %{_datadir}/locale/* is strictly forbidden.
[=] MUST: Every binary RPM package (or subpackage) which stores shared library files (not just symlinks) in any of the dynamic linker's default paths, must call ldconfig in %post and %postun.
[+] MUST: Packages must NOT bundle copies of system libraries.
[=] MUST: If the package is designed to be relocatable, the packager must state this fact in the request for review, along with the rationalization for relocation of that specific package. Without this, use of Prefix: /usr is considered a blocker.
[+] MUST: A package must own all directories that it creates. If it does not create a directory that it uses, then it should require a package which does create that directory.
[+] MUST: A Fedora package must not list a file more than once in the spec file's %files listings.
[+] MUST: Permissions on files must be set properly. Executables should be set with executable permissions, for example. Every %files section must include a %defattr(...) line.
[+] MUST: Each package must have a %clean section, which contains rm -rf %{buildroot} (or $RPM_BUILD_ROOT).
[+] MUST: Each package must consistently use macros.
[+] MUST: The package must contain code, or permissible content.
[-] MUST: Large documentation files must go in a -doc subpackage. (The definition of large is left up to the packager's best judgement, but is not restricted to size. Large can refer to either size or quantity).

The package has a lot of useful documentation, and it should be included in %doc.

[+] MUST: If a package includes something as %doc, it must not affect the runtime of the application. To summarize: If it is in %doc, the program must run properly if it is not present.
[=] MUST: Header files must be in a -devel package.
[=] MUST: Static libraries must be in a -static package.
[=] MUST: Packages containing pkgconfig(.pc) files must 'Requires: pkgconfig' (for directory ownership and usability).
[=] MUST: If a package contains library files with a suffix (e.g. libfoo.so.1.1), then library files that end in .so (without suffix) must go in a -devel package.
[=] MUST: In the vast majority of cases, devel packages must require the base package using a fully versioned dependency: Requires: %{name} = %{version}-%{release}
[=] MUST: Packages must NOT contain any .la libtool archives, these must be removed in the spec if they are built.
[=] MUST: Packages containing GUI applications must include a %{name}.desktop file, and that file must be properly installed with desktop-file-install in the %install section. If you feel that your packaged GUI application does not need a .desktop file, you must put a comment in the spec file with your explanation.
[+] MUST: Packages must not own files or directories already owned by other packages. The rule of thumb here is that the first package to be installed should own the files or directories that other packages may rely upon. This means, for example, that no package in Fedora should ever share ownership with any of the files or directories owned by the filesystem or man package. If you feel that you have a good reason to own a file or directory that another package owns, then please present that at package review time.
[+] MUST: At the beginning of %install, each package MUST run rm -rf %{buildroot} (or $RPM_BUILD_ROOT).
[+] MUST: All filenames in rpm packages must be valid UTF-8.

SHOULD Items:

[=] SHOULD: If the source package does not include license text(s) as a separate file from upstream, the packager SHOULD query upstream to include it.
[=] SHOULD: The description and summary sections in the package spec file should contain translations for supported Non-English languages, if available.
[+] SHOULD: The package should compile and build into binary rpms on all supported architectures.
[=] SHOULD: The reviewer should test that the package functions as described. A package should not segfault instead of running, for example.
[+] SHOULD: If scriptlets are used, those scriptlets must be sane. This is vague, and left up to the reviewer's judgement to determine sanity.
[=] SHOULD: Usually, subpackages other than devel should require the base package using a fully versioned dependency.
[=] SHOULD: The placement of pkgconfig(.pc) files depends on their usecase, and this is usually for development purposes, so should be placed in a -devel pkg. A reasonable exception is that the main pkg itself is a devel tool not installed in a user runtime, e.g. gcc or gdb.
[=] SHOULD: If the package has file dependencies outside of /etc, /bin, /sbin, /usr/bin, or /usr/sbin consider requiring the package which provides the file instead of the file itself.

Summary of review: Please include documentation and properly register in system PEAR database and I'll approve package.

Comment 5 Pavel Alexeev 2010-01-04 18:56:41 UTC
One additional note, INSTALL says:
"These optional extensions can enhance the capabilities of HTML Purifier:
    * iconv  : Converts text to and from non-UTF-8 encodings
    * bcmath : Used for unit conversion and imagecrash protection
    * tidy   : Used for pretty-printing HTML"

I think adding Requires on php-bcmath and php-tidy is a good idea, to provide the full power of the package out of the box (as long as Fedora has no soft dependencies or install suggestions).

Comment 6 David Nalley 2010-01-30 05:25:28 UTC
Sorry for letting this sit dormant for so long. This will mean that I need to package the pear channel provider. I'll try to get that up for review shortly.

Thanks

Comment 7 David Nalley 2010-03-23 02:59:04 UTC
I have the pear channel provider packaged (after far too long a delay) and it's now bug 576032

Comment 8 Pavel Alexeev 2010-03-23 09:57:50 UTC
Hm, I did not mean packaging the external repository configuration, just registering this package. But in any case it may be useful.

Comment 9 David Nalley 2010-03-25 02:34:01 UTC
How does one register it without the channel configuration?? (Honestly I don't know, I could well be doing it wrong) 

Anyway - 

SPEC: http://ke4qqq.fedorapeople.org/php-htmlpurifier-htmlpurifer.spec
SRPM: http://ke4qqq.fedorapeople.org/php-htmlpurifier-htmlpurifier-4.0.0-3.fc12.src.rpm

As an aside: file placement may well be wrong; however,
http://fedoraproject.org/wiki/Packaging/PHP#File_Placement
does not clearly deal with non-PEAR pear channels. Comments and corrections welcome.

Thanks!!

Comment 10 Pavel Alexeev 2010-03-27 21:38:32 UTC
OK, I'll review php-channel-htmlpurifier, then we'll come back here. OK?

Comment 12 Adam Williamson 2011-08-31 03:32:32 UTC
hey, david, still active? I might pick up this review as I'm thinking of packaging tt-rss, and it uses htmlpurifier. Are you aware of any changes that might be needed now, or should Pavel's comment "Please include documentation and properly register in system PEAR database and I'll approve package." still apply, and thus you'd expect this to be approved now? Thanks!

Comment 13 David Nalley 2011-08-31 13:07:14 UTC
I'll admit this has dropped down a bit lower on my todo list, but I have htmlpurifier that was missed as a bundled lib in another one of my packages (sahana) so needs to get taken care of at some point. 

The only issue I see is that 4.3.0 is the latest version. I'll see if I can get this updated in the next day or so. My expectation is that now that the channel has been approved there should be no further blockers to getting this approved.

Comment 14 Adam Williamson 2011-08-31 16:27:27 UTC
Cool - if you submit 4.3.0, I'll review it. thanks!

Comment 15 Pavel Alexeev 2011-09-03 08:03:37 UTC
Adam, I am the reviewer on this package. And please submit the new version and I'll look at it.

Comment 16 Adam Williamson 2011-09-03 08:10:01 UTC
pavel: you didn't reply to david's last update for seven months, so I kinda figured you'd given it up.

Comment 17 Pavel Alexeev 2011-09-03 16:22:19 UTC
Sorry, I missed it. Why didn't anyone ping me much earlier?

Comment 18 David Nalley 2011-09-03 20:33:30 UTC
Pavel, 

It kinda dropped off my radar - and until Adam mentioned it again it was so far down on my todo list, that I didn't think about it. 

SPEC: http://ke4qqq.fedorapeople.org/php-htmlpurifier-htmlpurifer.spec
SRPM: http://ke4qqq.fedorapeople.org/php-htmlpurifier-htmlpurifier-4.3.0-1.fc15.src.rpm

Thanks guys. 
adamw - unless you tell me not to, my plan is to add you as a comaintainer when I make the SCM request.

Comment 19 Pavel Alexeev 2011-09-04 12:25:29 UTC
1) Invalid spec name:
php-htmlpurifier-htmlpurifier.src: E: invalid-spec-name
2) php-htmlpurifier-htmlpurifier.noarch: W: spelling-error %description -l en_US whitelist -> white list, white-list, whistle
3) http://koji.fedoraproject.org/koji/getfile?taskID=3323223&name=php-htmlpurifier-htmlpurifier-doc-4.3.0-1.fc17.noarch.rpm
4) php-htmlpurifier-htmlpurifier-doc.noarch: E: htaccess-file /usr/share/doc/php-htmlpurifier-htmlpurifier-doc-4.3.0/benchmarks/.htaccess

I think it may be ignored.
5) php-htmlpurifier-htmlpurifier.noarch: W: no-documentation
As your doc sub-package is independent and does not require the main package, at least LICENSE (and maybe CREDITS) should be duplicated:
http://fedoraproject.org/wiki/Packaging:LicensingGuidelines#Subpackage_Licensing

Consider also requiring the main package and moving these docs to the main package instead of the sub-package:
CREDITS, FOCUS, LICENSE, NEWS, README, TODO, VERSION, WHATSNEW
6) The doc sub-package must go in the Documentation group
http://fedoraproject.org/wiki/Packaging/Guidelines#Documentation
7) If you define the macro %channel, use it in
Requires: php-channel(htmlpurifier)
Provides:     php-pear(htmlpurifier/htmlpurifier) = %{version}
too.
In the latter, %pear_name should also be used.
8) Why do you define %php_libname and %pear_name with the same content? Isn't %pear_name alone enough?
9) Requires:  iconv is redundant. There is no package with this name in Fedora, and the binary is provided by glibc:
$ rpm -qf `which iconv`
glibc-common-2.14-5.i686

Comment 20 Pavel Alexeev 2011-09-04 12:27:18 UTC
Sorry about the mistakenly copied 3rd item; it should be:
3) php-htmlpurifier-htmlpurifier-doc.noarch: W: spurious-executable-perm /usr/share/doc/php-htmlpurifier-htmlpurifier-doc-4.3.0/tests/HTMLPurifier/Injector/RemoveSpansWithoutAttributesTest.php

Comment 21 Adam Williamson 2011-09-06 21:53:20 UTC
david: well, not really necessary - by 'pick this up' i meant the review, not the package. i'm a provenpackager anyway.

Comment 22 Adam Williamson 2011-09-06 21:55:31 UTC
it seems like you changed the name from php-htmlpurifier to php-htmlpurifier-htmlpurifier in 2011-03, which doesn't make sense afaict. the first name was correct, it seems like.

Comment 23 David Nalley 2011-09-15 01:47:24 UTC
(In reply to comment #22)
> it seems like you changed the name from php-htmlpurifier to
> php-htmlpurifier-htmlpurifier in 2011-03, which doesn't make sense afaict. the
> first name was correct, it seems like.


From: http://fedoraproject.org/wiki/Packaging:PHP#Naming_scheme

Packages from another channel should be named php-ChannelAlias-PackageName-%{version}-%{release}.noarch.rpm.

This comes from the htmlpurifier channel (the channel stuff was packaged separately as seen above)- hence php-htmlpurifier-htmlpurifier

Comment 24 David Nalley 2011-09-15 02:20:10 UTC
Pavel, 

Thanks for the review!

SPEC: http://ke4qqq.fedorapeople.org/php-htmlpurifier-htmlpurifer.spec
SRPM:
http://ke4qqq.fedorapeople.org/php-htmlpurifier-htmlpurifier-4.3.0-2.fc15.src.rpm



(In reply to comment #19)
> 1) Invalid spec name:
> php-htmlpurifier-htmlpurifier.src: E: invalid-spec-name

What's wrong with the spec name? 

> 2) php-htmlpurifier-htmlpurifier.noarch: W: spelling-error %description -l
> en_US whitelist -> white list, white-list, whistle

fixed

> 3)
> http://koji.fedoraproject.org/koji/getfile?taskID=3323223&name=php-htmlpurifier-htmlpurifier-doc-4.3.0-1.fc17.noarch.rpm
> 4) php-htmlpurifier-htmlpurifier-doc.noarch: E: htaccess-file
> /usr/share/doc/php-htmlpurifier-htmlpurifier-doc-4.3.0/benchmarks/.htaccess
> 
> I think it may be ignored.

This is removed

> 5) php-htmlpurifier-htmlpurifier.noarch: W: no-documentation
> As your doc sub-package is independent and does not require the main package, at least
> LICENSE (and maybe CREDITS) should be duplicated:

Fixed

> http://fedoraproject.org/wiki/Packaging:LicensingGuidelines#Subpackage_Licensing
> 
> Consider also requiring the main package and moving these docs to the main package instead
> of the sub-package:
> CREDITS, FOCUS, LICENSE, NEWS, README, TODO, VERSION, WHATSNEW

Done

> 6) The doc sub-package must go in the Documentation group
> http://fedoraproject.org/wiki/Packaging/Guidelines#Documentation

Done

> 7) If you define the macro %channel, use it in
> Requires: php-channel(htmlpurifier)
> Provides:     php-pear(htmlpurifier/htmlpurifier) = %{version}
> too.
> In the latter, %pear_name should also be used.

Done

> 8) Why do you define %php_libname and %pear_name with the same content? Isn't
> %pear_name alone enough?

%pear_name was used only once, so I adopted php_libname for that. 

> 9) Requires:  iconv is redundant. There is no package with this name in Fedora,
> and the binary is provided by glibc:
> $ rpm -qf `which iconv`
> glibc-common-2.14-5.i686

fixed.

Comment 25 Pavel Alexeev 2011-09-16 14:23:08 UTC
(In reply to comment #24)
> Pavel, 
> What's wrong with the spec name? 

It must be named:
php-htmlpurifier-htmlpurifier.spec
not
php-htmlpurifier-htmlpurifer.spec
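
The invalid-spec-name error comes down to the spec file name not matching %{name}.spec (here the file name is missing an "i"). A minimal sketch of that comparison follows; it is an illustration only, the function names are hypothetical, and it does not reproduce rpmlint's own code.

```python
import os


def expected_spec_name(package_name):
    """The spec file is expected to be named <package name>.spec."""
    return f"{package_name}.spec"


def spec_name_is_valid(spec_path, package_name):
    """Compare the file name of the spec against the expected name."""
    return os.path.basename(spec_path) == expected_spec_name(package_name)
```

Checking `"php-htmlpurifier-htmlpurifer.spec"` against the package name `"php-htmlpurifier-htmlpurifier"` fails, while `"php-htmlpurifier-htmlpurifier.spec"` passes.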

Comment 27 Pavel Alexeev 2011-09-17 17:13:09 UTC
(In reply to comment #24)
> > 5) php-htmlpurifier-htmlpurifier.noarch: W: no-documentation
> > As your doc sub-package is independent and does not require the main package, at least
> > LICENSE (and maybe CREDITS) should be duplicated:
> 
> Fixed
> > http://fedoraproject.org/wiki/Packaging:LicensingGuidelines#Subpackage_Licensing
> > 
> > Consider also requiring the main package and moving these docs to the main package instead
> > of the sub-package:
> > CREDITS, FOCUS, LICENSE, NEWS, README, TODO, VERSION, WHATSNEW
> 
> Done

Your doc sub-package still doesn't require the base package (that's OK).
The guidelines (http://fedoraproject.org/wiki/Packaging:LicensingGuidelines#Subpackage_Licensing) say to duplicate only the LICENSE file in this case, not all the others marked as %doc!
I think CREDITS may also be duplicated (your choice). All others should be placed in the doc subpackage only.


Please fix it.
Otherwise it is in good shape.

Package APPROVED.

Comment 28 Pavel Alexeev 2012-03-18 08:11:49 UTC
ping?
Do you plan to import that package?

Comment 29 Gwyn Ciesla 2012-05-09 16:38:04 UTC
Ping?

Comment 30 David Nalley 2012-05-10 20:46:19 UTC
My apologies for the incredible lag. 

New Package SCM Request
=======================
Package Name: php-htmlpurifier-htmlpurifier
Short Description: standards-compliant HTML filter library
Owners: ke4qqq
Branches: f16 f17 el6
InitialCC:

Comment 31 Gwyn Ciesla 2012-05-10 22:52:41 UTC
Git done (by process-git-requests).

No worries, life happens. :)

Comment 32 Fedora Update System 2012-05-14 19:59:41 UTC
php-htmlpurifier-htmlpurifier-4.3.0-3.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/php-htmlpurifier-htmlpurifier-4.3.0-3.fc17

Comment 33 Fedora Update System 2012-05-15 02:32:21 UTC
php-htmlpurifier-htmlpurifier-4.3.0-3.fc17 has been pushed to the Fedora 17 testing repository.

Comment 34 Remi Collet 2012-05-17 14:07:37 UTC
Sorry to post late on this review, but I think this package is absolutely broken and doesn't respect the PHP Guidelines.

Also see my comment on #822486

> Name:      php-htmlpurifier-htmlpurifier
Should be php-hp-HTMLPurifier

> Requires:  php >= 5.0.5
Very bad, will pull apache.
If needed use php-common instead (but we don't have any version with older version)

> BuildRequires: php-channel(htmlpurifier)
Should be php-channel(htmlpurifier.org)

> %setup -qn %{php_libname}-%{version}
Should be %setup -qc because package.xml is outside the %{php_libname}-%{version} folder

> mkdir -p %{buildroot}%{_datadir}/php/%{php_libname}
> cp -a library/*  %{buildroot}%{_datadir}/php/%{php_libname}
PEAR packages are installed in /usr/share/pear using the pear installer
%{__pear} install --nodeps --packagingroot $RPM_BUILD_ROOT %{name}.xml

> %post
> %{__pear} install --nodeps --soft --force --register-only \
>    %{pear_xmldir}/%{name}.xml >/dev/null || :
Won't work as the %{name}.xml is not part of the XML


Please try:
yum install php-pear-PEAR-Command-Packaging
pear channel-discover htmlpurifier.org
pear download hp/HTMLPurifier
pear make-rpm-spec HTMLPurifier-4.4.0.tgz

You will get a nearly review-ready package in about a minute.

Comment 35 David Nalley 2012-05-17 15:00:07 UTC
Remi, 

Thanks for the comments, I'll rectify these issues.

Comment 36 Remi Collet 2012-05-18 05:20:55 UTC
Looking more closely, you have 2 clean solutions:

- use the "standalone" distribution and install this as a PHP library in /usr/share/php
=> should be named php-htmlpurifier, with all pear stuff cleaned out

- use the "pear" distribution

It seems you are mixing the 2 (trying to include pear stuff for the standalone version)

Comment 37 Pavel Alexeev 2012-05-19 09:46:39 UTC
(In reply to comment #34)
> Sorry to post late on this review, but I think this package is absolutely
> broken and doesn't respect the PHP Guidelines.
> 
> Also See my comment on #822486
> 
> > Name:      php-htmlpurifier-htmlpurifier
> Should be php-hp-HTMLPurifier
Why?
And in any case it should, not must? Right?

Comment 38 Remi Collet 2012-05-19 10:11:58 UTC
> Why?
See http://fedoraproject.org/wiki/Packaging/PHP#Naming_scheme

> Packages from another channel should be named 
> php-ChannelAlias-PackageName-%{version}-%{release}.noarch.rpm. 

And :
# pear list-all -c htmlpurifier.org
All packages [Channel htmlpurifier.org]:
==========================
Package         Latest Local
hp/HTMLPurifier 4.4.0        Standards-compliant HTML filter

So php-hp-HTMLPurifier

Yes, "hp" is probably an ugly alias, but it's the one chosen by upstream

> And in any case it should, not must? Right?
Yes (like upper/lowercase, like _ or -, etc)
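
The naming scheme quoted above (php-ChannelAlias-PackageName) can be sketched as a small helper that takes the "alias/Package" string printed by `pear list-all`. The function name is hypothetical and this only illustrates the rule as quoted; it is not part of the PEAR tooling.

```python
def rpm_name_for_pear_package(qualified_name):
    """Build php-ChannelAlias-PackageName from an 'alias/Package' string."""
    alias, separator, package = qualified_name.partition("/")
    if not separator or not alias or not package:
        raise ValueError(f"expected 'alias/Package', got {qualified_name!r}")
    return f"php-{alias}-{package}"
```

Applied to `"hp/HTMLPurifier"`, this gives `php-hp-HTMLPurifier`, the name suggested in this comment.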

Comment 39 Pavel Alexeev 2012-05-19 11:00:18 UTC
Thanks for the notes and clarification.

Comment 40 Igor Gnatenko 2015-05-20 13:12:16 UTC
ping?

Comment 41 James Hogarth 2015-12-04 01:27:52 UTC
This package has been in Fedora for a long while now with this review left open.

Closing the bug to clean up the queue.