Bug 227117

Summary: Review Request: tagsoup-1.0.1-1jpp - A SAX-compliant parser written in Java that parses HTML as it is found in the wild: nasty and brutish
Product: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: medium
Priority: medium
Reporter: Rafael H. Schloming <rafaels>
Assignee: Permaine Cheung <pcheung>
QA Contact: Fedora Package Reviews List <fedora-package-review>
CC: lkundrak, orion, tross
Flags: pcheung: fedora-review+, kevin: fedora-cvs+
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2007-03-12 21:47:44 UTC

Description Rafael H. Schloming 2007-02-02 17:58:36 UTC
Spec URL: http://people.redhat.com/rafaels/specs/tagsoup-1.0.1-1jpp.spec
SRPM URL: ftp://jpackage.hmdc.harvard.edu/JPackage/1.7/generic/SRPMS.free/tagsoup-1.0.1-1jpp.src.rpm
Description: TagSoup is a SAX-compliant parser written in Java that, instead of
parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty
and brutish, though quite often far from short. TagSoup is designed for people
who have to process this stuff using some semblance of a rational application
design. By providing a SAX interface, it allows standard XML tools to be
applied to even the worst HTML.

Javadoc for tagsoup.

Comment 1 Permaine Cheung 2007-02-12 17:27:24 UTC
      X MUST: rpmlint must be run on every package. The output should be posted
in the review.
rpmlint output:
E: tagsoup summary-too-long A SAX-compliant parser written in Java that parses HTML as it is found in the wild: nasty and brutish
W: tagsoup non-standard-group Text Processing/Markup/XML
W: tagsoup mixed-use-of-spaces-and-tabs (spaces: line 9, tab: line 41)
Error while reading /home/pcheung/tagsoup: error reading package header
E: tagsoup summary-too-long A SAX-compliant parser written in Java that parses HTML as it is found in the wild: nasty and brutish
W: tagsoup non-standard-group Text Processing/Markup/XML
W: tagsoup-javadoc non-standard-group Development/Documentation
 
      - MUST: The package must be named according to the Package Naming Guidelines.
      - MUST: The spec file name must match the base package %{name}, in the
format %{name}.spec
 
      X MUST: The package must meet the Packaging Guidelines.
 
Release: should be 1jpp.1%{?dist}
BuildRoot: should be
%{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
 
      - MUST: The package must be licensed with an open-source compatible
license and meet other legal requirements as defined in the legal section of
Packaging Guidelines.
      - MUST: The License field in the package spec file must match the actual
license.
      X MUST: If (and only if) the source package includes the text of the
license(s) in its own file, then that file, containing the text of the
license(s) for the package must be included in %doc.
 
LICENCE file missing in %doc
 
      X MUST: The spec file must be written in American English.
not needed:
Vendor:         JPackage Project
Distribution:   JPackage
When adding gcj bits, remove BuildArch:      noarch
The license is a disjunction of the Academic Free License,
version 3.0, and the GNU General Public License, version 2.0
Lines longer than 80 characters: lines 37 and 76
URL doesn't exist: http://mercury.ccil.org/~cowan/XML/tagsoup/
 
      - MUST: The spec file for the package MUST be legible. If the reviewer is
unable to read the spec file, it will be impossible to perform a review. Fedora
is not the place for entries into the Obfuscated Code Contest (
http://www.ioccc.org/).
      - MUST: The sources used to build the package must match the upstream
source, as provided in the spec URL. Reviewers should use md5sum for this task.
      - MUST: The package must successfully compile and build into binary rpms
on at least one supported architecture.
      - MUST: If the package does not successfully compile, build or work on an
architecture, then those architectures should be listed in the spec in
ExcludeArch. Each architecture listed in ExcludeArch needs to have a bug filed
in bugzilla, describing the reason that the package does not compile/build/work
on that architecture. The bug number should then be placed in a comment, next to
the corresponding ExcludeArch line. New packages will not have bugzilla entries
during the review process, so they should put this description in the comment
until the package is approved, then file the bugzilla entry, and replace the
long explanation with the bug number. (Extras Only) The bug should be marked as
blocking one (or more) of the following bugs to simplify tracking such issues:
FE-ExcludeArch-x86, FE-ExcludeArch-x64, FE-ExcludeArch-ppc
      - MUST: All build dependencies must be listed in BuildRequires, except for
any that are listed in the exceptions section of Packaging Guidelines; inclusion
of those as BuildRequires is optional. Apply common sense.
      - MUST: The spec file MUST handle locales properly. This is done by using
the %find_lang macro. Using %{_datadir}/locale/* is strictly forbidden.
      - MUST: Every binary RPM package which stores shared library files (not
just symlinks) in any of the dynamic linker's default paths, must call ldconfig
in %post and %postun. If the package has multiple subpackages with libraries,
each subpackage should also have a %post/%postun section that calls
/sbin/ldconfig. An example of the correct syntax for this is:
 
      - MUST: If the package is designed to be relocatable, the packager must
state this fact in the request for review, along with the rationalization for
relocation of that specific package. Without this, use of Prefix: /usr is
considered a blocker.
      - MUST: A package must own all directories that it creates. If it does not
create a directory that it uses, then it should require a package which does
create that directory. The exception to this are directories listed explicitly
in the Filesystem Hierarchy Standard (
http://www.pathname.com/fhs/pub/fhs-2.3.html), as it is safe to assume that
those directories exist.
      - MUST: A package must not contain any duplicate files in the %files listing.
      - MUST: Permissions on files must be set properly. Executables should be
set with executable permissions, for example. Every %files section must include
a %defattr(...) line.
      - MUST: Each package must have a %clean section, which contains rm -rf
%{buildroot} (or $RPM_BUILD_ROOT).
      - MUST: Each package must consistently use macros, as described in the
macros section of Packaging Guidelines.
      - MUST: The package must contain code, or permissible content. This is
described in detail in the code vs. content section of Packaging Guidelines.
      - MUST: Large documentation files should go in a -doc subpackage. (The
definition of large is left up to the packager's best judgement, but is not
restricted to size. Large can refer to either size or quantity)
      - MUST: If a package includes something as %doc, it must not affect the
runtime of the application. To summarize: If it is in %doc, the program must run
properly if it is not present.
      - MUST: Header files or static libraries must be in a -devel package.
      - MUST: Packages containing pkgconfig(.pc) files must 'Requires:
pkgconfig' (for directory ownership and usability).
      - MUST: If a package contains library files with a suffix (e.g.
libfoo.so.1.1), then library files that end in .so (without suffix) must go in a
-devel package.
      - MUST: In the vast majority of cases, devel packages must require the
base package using a fully versioned dependency: Requires: %{name} =
%{version}-%{release}
      - MUST: Packages must NOT contain any .la libtool archives, these should
be removed in the spec.
      - MUST: Packages containing GUI applications must include a
%{name}.desktop file, and that file must be properly installed with
desktop-file-install in the %install section. This is described in detail in the
desktop files section of Packaging Guidelines. If you feel that your packaged
GUI application does not need a .desktop file, you must put a comment in the
spec file with your explanation.
      - MUST: Packages must not own files or directories already owned by other
packages. The rule of thumb here is that the first package to be installed
should own the files or directories that other packages may rely upon. This
means, for example, that no package in Fedora should ever share ownership with
any of the files or directories owned by the filesystem or man package. If you
feel that you have a good reason to own a file or directory that another package
owns, then please present that at package review time.
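
For the md5sum item in the checklist above, one way a reviewer might compare the source archive unpacked from the SRPM with a fresh upstream download is sketched below. The helper name same_md5 and its two file arguments are illustrative assumptions, not part of any Fedora tooling.

```shell
# Compare the MD5 digests of two files, e.g. the source archive unpacked from
# the SRPM and a copy downloaded from the upstream URL.
# Returns 0 if the digests match, 1 if they differ, 2 if a file is unreadable.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    # Reading from stdin makes md5sum print "<digest>  -"; keep the digest only.
    sum_a=$(md5sum < "$1" | cut -d' ' -f1)
    sum_b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}
```

For example, after unpacking the SRPM with "rpm2cpio tagsoup-1.0.1-1jpp.src.rpm | cpio -idm", call same_md5 with the unpacked source archive and the upstream copy; a non-zero status means the two could not be shown to be identical.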
 


Comment 2 Vivek Lakshmanan 2007-02-13 02:12:17 UTC
(In reply to comment #1)
>       X MUST: rpmlint must be run on every package. The output should be
>       posted
> in the review.
> rpmlint output:
> E: tagsoup summary-too-long A SAX-compliant parser written in Java that
> parses
> HTML as it is found in the wild: nasty and brutish
  Fixed.
> W: tagsoup non-standard-group Text Processing/Markup/XML
  Ignoring
> W: tagsoup mixed-use-of-spaces-and-tabs (spaces: line 9, tab: line 41)
  Fixed.
> Error while reading /home/pcheung/tagsoup: error reading package header
? Don't see this anymore
>       X MUST: The package must meet the Packaging Guidelines.
>  
> Release: should be 1jpp.1%{?dist}
Fixed.

> BuildRoot: should be
> %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
Done
>       X MUST: If (and only if) the source package includes the text of the
> license(s) in its own file, then that file, containing the text of the
> license(s) for the package must be included in %doc.
>  
> LICENCE file missing in %doc
Fixed.

>  
>       X MUST: The spec file must be written in American English.
> not needed:
> Vendor:         JPackage Project
> Distribution:   JPackage
Fixed
> When adding gcj bits, remove BuildArch:      noarch
?

> The license is a  disjunction of the Academic Free License,
> version 3.0, and the GNU General Public License, version 2.0
The GPL (like) item in the License field is good enough IMO

> lines are > 80 characters: line 37, 76
Fixed but some lines are forced to overflow (especially after GCJ support)
> URL doesn't exist: http://mercury.ccil.org/~cowan/XML/tagsoup/
Changed to http://home.ccil.org/~cowan/XML/tagsoup/

Also fixed the javadoc handling following the suggestion here:
https://zarb.org/pipermail/jpackage-discuss/2007-February/011119.html

NOTE: SRPM/RPMS available at:
http://tequila-sunrise.ath.cx/rpmreviews/F7/tagsoup/tagsoup-1.0.1-1jpp.1.fc7.src.rpm
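
To recheck the line-length remark after editing the spec, a small awk wrapper along these lines could be used. This is only a sketch: long_lines is a made-up helper name, and the default limit of 80 mirrors the reviewer's note rather than any rpmlint option.

```shell
# Print "line N: M chars" for every line of the file in $1 that is longer
# than the limit in $2 (default 80). awk's length() counts characters, so a
# tab counts as one character, not as its display width.
long_lines() {
    awk -v max="${2:-80}" \
        'length($0) > max { printf "line %d: %d chars\n", FNR, length($0) }' "$1"
}
```

Running long_lines on the spec file prints nothing when every line fits within the limit.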




Comment 3 Vivek Lakshmanan 2007-02-13 02:14:59 UTC
GCJ support has also been added

Comment 4 Permaine Cheung 2007-02-13 02:52:21 UTC
I'm still getting this on the srpm:
W: tagsoup mixed-use-of-spaces-and-tabs (spaces: line 9, tab: line 54) (this is
on the gcj requires)
and this on the binary rpm:
W: tagsoup incoherent-version-in-changelog 0:1.0.1-1jpp.1.fc7 0:1.0.1-1jpp.1
                                                                                
Buildroot should be:
%{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
(-root missing)
                                                                                
Also, since gcj support is added, we need to have %define _with_gcj_support 1.
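
A quick way to confirm that a spec carries exactly the BuildRoot value requested in this review is sketched below. The names EXPECTED_BUILDROOT and buildroot_ok are hypothetical, and the check assumes a single BuildRoot: tag written on one line.

```shell
# The BuildRoot value requested in the review, single-quoted so that the
# shell leaves the rpm macros untouched.
EXPECTED_BUILDROOT='%{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)'

# Succeed only if the first BuildRoot: tag in the spec file $1 has exactly
# that value (surrounding whitespace is ignored).
buildroot_ok() {
    actual=$(sed -n -e 's/[[:space:]]*$//' \
                    -e 's/^[Bb]uild[Rr]oot:[[:space:]]*//p' "$1" | head -n 1)
    [ "$actual" = "$EXPECTED_BUILDROOT" ]
}
```

A value that lacks the -root component, which is what the remark above points out, makes buildroot_ok return a non-zero status.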


Comment 5 Vivek Lakshmanan 2007-02-13 15:49:39 UTC
(In reply to comment #4)
> I'm still getting this on the srpm:
> W: tagsoup mixed-use-of-spaces-and-tabs (spaces: line 9, tab: line 54) (this is
> on the gcj requires)
Oops, fixed...

> and this on the binary rpm:
> W: tagsoup incoherent-version-in-changelog 0:1.0.1-1jpp.1.fc7 0:1.0.1-1jpp.1
>                                                                                 
> Buildroot should be:
> %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
> (-root missing)
>                                                                                 
Thanks

> Also, since gcj support is added, we need to have %define _with_gcj_support 1.
> 
Fixed.
The updated package is available at the same URL
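
To catch a leftover tab before resubmitting, a rough local check like the following could help. It is a simplified approximation and not rpmlint's actual heuristic: it only inspects the first character after the colon on "Tag: value" lines, and mixed_indent is a hypothetical helper name.

```shell
# Report the first tag line separated by a space and the first one separated
# by a tab, in a format modelled on the rpmlint warning. Returns 1 when both
# styles are present in the spec file $1, 0 otherwise.
mixed_indent() {
    awk '
        # Only look at "Tag: value" lines such as Name: or Requires(post):.
        /^[A-Za-z][A-Za-z0-9()]*:[ \t]/ {
            sep = substr($0, index($0, ":") + 1, 1)
            if (sep == " "  && !space_line) space_line = FNR
            if (sep == "\t" && !tab_line)   tab_line   = FNR
        }
        END {
            if (space_line && tab_line) {
                printf "spaces: line %d, tab: line %d\n", space_line, tab_line
                exit 1
            }
        }
    ' "$1"
}
```

The function prints nothing and returns 0 when the tag lines use one separator style consistently.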



Comment 6 Permaine Cheung 2007-02-13 18:52:39 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > I'm still getting this on the srpm:
> > W: tagsoup mixed-use-of-spaces-and-tabs (spaces: line 9, tab: line 54) (this is
> > on the gcj requires)
> Oops, fixed...
> 
Great
> > and this on the binary rpm:
> > W: tagsoup incoherent-version-in-changelog 0:1.0.1-1jpp.1.fc7 0:1.0.1-1jpp.1
> >                                                                                 
This is fixed too

> > Buildroot should be:
> > %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
> > (-root missing)
> >                                                                                 
> Thanks
> 
Sweet
> > Also, since gcj support is added, we need to have %define _with_gcj_support 1.
> > 
> Fixed.
> The updated package is available at the same URL
> 
> 
rpmlint on the rpms built in mock:
[pcheung@to-fcjpp1 ~]$ rpmlint /var/lib/mock/fedora-development-x86_64-core-pcheung/result/tagsoup-*
W: tagsoup non-standard-group Text Processing/Markup/XML
W: tagsoup non-standard-group Text Processing/Markup/XML
W: tagsoup-javadoc non-standard-group Development/Documentation


APPROVED. Thanks!


Comment 7 Vivek Lakshmanan 2007-02-15 00:47:29 UTC
Funny... According to http://fedoraproject.org/wiki/JavaPackagingStatus#preview
you own the package, so I am assigning the bug back to you. Please build the
package on plague and let me know so I can close the bug...

Comment 8 Permaine Cheung 2007-03-05 17:02:33 UTC
New Package CVS Request
=======================
Package Name: tagsoup
Short Description: A SAX-compliant HTML parser written in Java
Owners: pcheung
Branches: 
InitialCC: 

Comment 9 Permaine Cheung 2007-03-12 21:47:44 UTC
Package built in plague. Closing as NEXTRELEASE.

Comment 10 Orion Poplawski 2010-01-08 16:34:12 UTC
This builds in EL-5, could we get a branch made for that?

Comment 11 Lubomir Rintel 2010-07-20 08:39:02 UTC
Orion: I'm requesting just an EL-6 branch here; pcheung confirmed by mail that this is OK, and I guess that applies to EL-5 as well.

Package Change Request
======================
Package Name: tagsoup
New Branches: EL-6
Owners: lkundrak

Fedora maintainer was mailed and does not wish to maintain the branch.

Comment 12 Kevin Fenzi 2010-07-21 05:04:27 UTC
CVS done (by process-cvs-requests.py).