Bug 252108 (python-html5lib)

Summary: Review Request: python-html5lib - A python based HTML5 parser/tokenizer
Product: [Fedora] Fedora Reporter: Oded Arbel <oded>
Component: Package ReviewAssignee: Rahul Sundaram <metherid>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: rawhideCC: bugzilla-redhat, davidf, fedora, fedora-package-review, iny, itamar, mattdm, metherid, notting, ruben
Target Milestone: ---Keywords: Reopened
Target Release: ---Flags: metherid: fedora‑review?
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-07-18 06:22:32 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Attachments:
Description Flags
cleanup spec
none
Updated spec for python-html5lib
none
Updated spec file according to comments
none
Updated spec file
none
Updated spec file
none
Updated spec file none

Description Oded Arbel 2007-08-13 18:17:21 EDT
Spec URL: http://rpms.geek.co.il/fc7/python-html5lib.spec
SRPM URL: http://rpms.geek.co.il/fc7/python-html5lib-0.9-2fc7.src.rpm
Description: A python library to parse HTML 5 as defined by the WHATWG HTML5 specification, see http://code.google.com/p/html5lib
This package is wanted for building gtk-doc for library.gnome.org, see http://blogs.gnome.org/ovitters/2007/08/13/wanted-html5lib-package-for-epel-fedora/
Comment 1 Oded Arbel 2007-08-13 18:20:08 EDT
rpmlint output:
W: python-html5lib uncompressed-zip /html5lib-0.9.zip

I can recompress the source zip file, but that would break the requirement for
the source file to be md5sum identical to the upstream source, here -
http://fedoraproject.org/wiki/Packaging/ReviewGuidelines
Comment 2 Ruben Kerkhof 2008-01-19 19:00:26 EST
Created attachment 292281 [details]
cleanup spec
Comment 3 Ruben Kerkhof 2008-01-19 19:01:27 EST
Hi Odel,

> W: ipython-html5lib uncompressed-zip /html5lib-0.9.zip

Please consider reporting this upstream.

I've attached a patch to get the spec up-to-date with the latest guidelines.
Comment 4 Jason Tibbitts 2008-01-27 14:58:19 EST
Oded, are you still interested in submitting this package?

BTW, you can ignore the uncompressed-zip complaint; if that's what upstream
decides to ship for whatever crazy reason, then that's what we'll take.
Comment 5 Oded Arbel 2008-01-27 16:14:53 EST
Yes, I am interested in submitting this package - how should I proceed? Also -
there's a new upstream release, so I want to try to submit a package (with all
the spec fixes) based on the latest release.

I've repackaged html5lib 0.10 with the updated spec file and its available here:
SRPM: http://rpms.geek.co.il/fc8/python-html5lib-0.10-1.fc8.src.rpm
SPEC: http://rpms.geek.co.il/fc8/python-html5lib.spec
RPM: http://rpms.geek.co.il/fc8/python-html5lib-0.10-1.fc8.noarch.rpm
Comment 6 Jason Tibbitts 2008-01-27 16:48:56 EST
I'll let Ruben have first crack at this, but first some bureaucracy.  I can't
seem to find you in the Fedora account system at all; is this your first package
for Fedora?  If so, you will need a sponsor before you can be granted access to
the internals of the Fedora build and release systems.  Please see
http://fedoraproject.org/wiki/PackageMaintainers/Join and specifically
http://fedoraproject.org/wiki/PackageMaintainers/HowToGetSponsored
Comment 7 Oded Arbel 2008-01-27 17:01:54 EST
Yes, this is my first package. I'll read up on this and see what I can do. 
Thanks.
Comment 8 Ruben Kerkhof 2008-01-28 18:18:03 EST
Hi Odel, thanks for updating the package.
I'm a bit busy atm, but I'll try to have another look later this week.

Someone else has to sponsor you however.
Comment 9 Rakesh Pandit 2008-07-05 06:38:32 EDT
I am not sponsored. So my review is just to help this package go through 
process:

You can drop the %if 0%{?fedora} >= 8, conditional stuff is no longer needed as 
new releases wouldn't trigger it -- verify from reviewer or on #fedora-devel

You should remove tests/ folder.

[X] rpmlint output:
python-html5lib.src: W: uncompressed-zip /html5lib-0.10.zip
The zip file is not compressed.
   -- ignored Check
[X] md5sum  -- matches Check
[X] version guideline naming convention -- Check
[X] spec file name -- Check
[X] description and summary -- Check
[X] build root -- Check
[X] spec license field and compatibility  -- Checked
[X] license file -- absent in source -- Check
[X] readable and American language -- Check
[X] %clean -- present -- Check
[X] compile and build using koji -- Check
[X] rpmlint is silent except one message above -- Check
[X] %docs not required, no headers, no static libraries

Optional:
you may query upstream about including license text as a separate file.

Package looks okay to me.
Comment 10 Oded Arbel 2008-07-06 03:52:49 EDT
Thanks for the feedback.

Regarding the %{?fedora} conditional - it is required if I want to support
Fedora versions older then 8 (do I?), and Fedora 9 at least seems to still
support it. I would be very surprised if it would be dropped from future Fedora
versions.

I'll contact upstream and see if I can get them to add a license file.

Regarding the rpmlint output: 
  * I didn't understand the %doc output from rpmlint - I'm using %doc because I
want to package the README and examples, I don't see how it relates to headers
or static libraries. BTW - my rpmlint doesn't report that.
  * The uncompressed zip file is an issue with upstream - I decided not to
repack upstream for obvious reasons. The new 0.11.1 version seems to fix this.

I've repacked and published a new version of the SPEC and package for a new
upstream version, and for Fedora 9 (which I'm currently running):
SRPM: http://rpms.geek.co.il/fc9/python-html5lib-0.11.1-1.fc9.src.rpm
SPEC: http://rpms.geek.co.il/fc9/python-html5lib.spec
RPM: http://rpms.geek.co.il/fc9/python-html5lib-0.11.1-1.fc9.noarch.rpm

The new SPEC file work seamlessly for Fedora 8 and an update package will be
available at http://rpms.geek.co.il/fc8 shortly.
Comment 11 Rakesh Pandit 2008-07-06 04:22:05 EDT
I think I should have been more clear about keys I am using
x = Checked, - = Problem, ? = No idea. :-)

Regarding the rpmlint output: 
  * I didn't understand the %doc output from rpmlint - I'm using %doc because I
want to package the README and examples, I don't see how it relates to headers
or static libraries. BTW - my rpmlint doesn't report that.

  rpmlint out does not mention it for me either. 

  * The uncompressed zip file is an issue with upstream - I decided not to
repack upstream for obvious reasons. The new 0.11.1 version seems to fix this
    What is fixed by 0.11.1? zip format seems to be still there.

  You forget to remove tests/ folder as it is not required.

  For trigger I suggest you to refer to reviewer or list or #fedora-devel %if 
0%{?fedora} >= 8. I will also try to verify, what is correct?

Other then these two points package looks okay to me.

Thanks. 
Comment 12 Rakesh Pandit 2008-07-06 05:09:03 EDT
Searching in 'Package source' at http://cvs.fedoraproject.org/viewcvs/rpms/
python-setuptools it looks like we don't need this trigger as python-setuptools-
devel is available for FC-7 and FC-6 since 9 months now. We don't need to 
anyway worry about FC-6 ;-)
Comment 13 Oded Arbel 2008-07-06 06:35:11 EDT
Ok, thanks for your comments and for checking up on the python-setuptools.

I've removed the Fedora 8 test, and have new files at:
SRPM: http://rpms.geek.co.il/fc9/python-html5lib-0.11.1-2.fc9.src.rpm
SPEC: http://rpms.geek.co.il/fc9/python-html5lib.spec
RPM: http://rpms.geek.co.il/fc9/python-html5lib-0.11.1-2.fc9.noarch.rpm

"Uncompressed zip": Version 0.10's zip file had all files in "stored" mode,
meaning they are uncompressed. The current 0.11 zip file have all the files
compressed properly, so the "uncompressed zip" warning is no longer relevant.

tests/ folder: I'm not sure what you refer to - the tests folder is not packaged
into the binary. It /is/ in the upstream source package, but then again I'm
providing a pristine source package for obvious reasons so I can't remove it.
Currently we're not running the tests as part of the build process because they
cause external dependencies to be downloaded automatically (and I don't know how
to turn it off), but we would probably want to do it in the future.
Comment 14 Rakesh Pandit 2008-07-06 07:46:16 EDT
Package looks ready to me.

I agree with you, about keeping pristine source,
as it is not anyway in binary rpm.

Thanks
Comment 15 Rakesh Pandit 2008-08-23 11:23:42 EDT
Why was status ASSIGNED?

@Oded
My review was unofficial as I am not a sponsor.

Are you still interested in package?

if yes, you will need to look at this then:
https://fedoraproject.org/wiki/PackageMaintainers/HowToGetSponsored
Comment 16 Ruben Kerkhof 2008-08-24 06:54:45 EDT
Hi Odel,

Now that 0.11 is distributed in a normal zip file, is it possible to just use %setup -q to extract it?
Comment 17 Oded Arbel 2008-08-24 16:00:38 EDT
Yes, thanks for catching that. 

I'll fix this as soon as I have the time. Unfortunately I'll be out of the office in the next 4 weeks, so I probably won't have time to fix this before then.
Comment 18 Ruben Kerkhof 2008-10-04 09:58:59 EDT
Hi Odel, do you have time to look at this again?
Comment 19 Ruben Kerkhof 2008-10-19 06:42:33 EDT
Ping?
Comment 20 Ruben Kerkhof 2009-04-29 09:39:55 EDT
Ping again
Comment 21 Oded Arbel 2010-12-05 07:04:17 EST
Created attachment 464828 [details]
Updated spec for python-html5lib

Sorry for taking a long while with this (3 years :-( ), but here is an updated spec for python-html5lib 0.11, with the zip issue fixed.

SRPM and precompiled binary for Fedora 14 are available at http://rpms.geek.co.il/fc14/SRPMS/python-html5lib-0.11-1.fc14.src.rpm and http://rpms.geek.co.il/fc14/x86_64/python-html5lib-0.11-1.fc14.noarch.rpm.

Thanks for Matthew Miller for nudging me.
Comment 22 Matthew Miller 2010-12-08 15:32:27 EST
Reopening. See comment #21. :)
Comment 23 Ruben Kerkhof 2010-12-12 08:29:53 EST
Hi Odel,

Some comments:

Replace %define modulename htmllib with %global modulename htmllib:
https://fedoraproject.org/wiki/Packaging:Guidelines#.25global_preferred_over_.25define

The %define python_sitelib isn't needed anymore (since F-13):
https://fedoraproject.org/wiki/Packaging:Python#Macros

The same goes for the BuildRoot tag and the %clean section:
https://fedoraproject.org/wiki/Packaging:Guidelines#BuildRoot_tag

What's with all the #---------------?
Please drop them.

%if 0%{?fedora} >= 8
BuildRequires:  python-setuptools-devel
%else
BuildRequires:  python-setuptools
%endif

We're on F-14 now, so that conditional isn't needed.

The %files section is a bit too general for my taste.
Adding files explicitly helps you by breaking the build when newer versions of htmllib decide to drop/add files. So something like:

%files
%defattr(-,root,root,0755)
%doc examples README
%{python_sitelib}/%{modulename}
%{python_sitelib}/%{modulename}-*.egg-info

In the %pre section:
rm -rf %{modulename}-%{version}
isn't needed, that's handled by the %setup macro.

And what about the tests? You should run them in a %check section, if possible.
Comment 24 Oded Arbel 2010-12-12 12:15:09 EST
Created attachment 468243 [details]
Updated spec file according to comments

Thanks for the comments :-)

(In reply to comment #23)
> Replace %define modulename htmllib with %global modulename htmllib:

Done.

> The %define python_sitelib isn't needed anymore (since F-13):

Done.

> The same goes for the BuildRoot tag and the %clean section:

Done. I should really sit down and read the packaging guidelines from top to bottom.

> What's with all the #---------------?
> Please drop them.

Its from a template I use for creating new spec files - in my company we expect developers to maintain their own spec files and I find it helps newbies. I've removed them.

> We're on F-14 now, so that conditional isn't needed.

Done.

> The %files section is a bit too general for my taste.
> Adding files explicitly helps you by breaking the build when newer versions of
> htmllib decide to drop/add files.

I've changed the %files section to specifically list all the python packages that are expected to exist (as I've seen some other python RPMs do). It's not a lot more specific then your example, I hope that's OK.

> In the %pre section:
> isn't needed, that's handled by the %setup macro.

Removed.

> And what about the tests? You should run them in a %check section, if possible.

I've added %check with the tests, but on my system almost all of them fail. I'm not an html5lib developer so I'm not sure who's at fault here, so I commented it out in the mean time. I'll check whats up with that when I have the time.

SRPM and precompiled binary for Fedora 14 are again available at
http://rpms.geek.co.il/fc14/SRPMS/python-html5lib-0.11-2.fc14.src.rpm and
http://rpms.geek.co.il/fc14/x86_64/python-html5lib-0.11-2.fc14.noarch.rpm.
Comment 25 Matthew Miller 2010-12-15 15:55:48 EST
The tests are failing because there are tons of deprecation warnings, and the tests are (justifiably) run with warnings made into errors.
Comment 26 Matthew Miller 2010-12-15 15:57:41 EST
Also — the current version is 0.90, as of January 2010. I thought that might fix up the deprecation warnings, but no such luck.

http://code.google.com/p/html5lib/downloads/detail?name=html5lib-0.90.zip
Comment 27 Rahul Sundaram 2011-04-18 12:49:08 EDT
This seems to be a dead review for a long time.  I am closing this since I am going to open a new review request for the same software (need it as a dependency of a software that I would like to see in Fedora).  

Oded Arbel,  if you are still interested in being a Fedora package maintainer, drop a me a mail and I will find a way to get you sponsored.
Comment 28 Rahul Sundaram 2011-04-18 12:56:11 EDT
On second thought, the last comment has been about four months back and does have a assigned reviewer.  I will give it say a week time before closing.
Comment 29 Oded Arbel 2011-05-04 16:38:48 EDT
Created attachment 496915 [details]
Updated spec file

Guys, sorry for the long response time - I was very busy and I actually kind of forgotten about it.

Anyway - here is an updated SPEC file, which is basically the last spec updated to the current release and with the %check section enabled. For some reason the current release does not carry a tests directory and so %check immediately succeeds.

Currently rpmlint reports only that the source file does not exist- which is weird because it does: I can copy the URL it reports as 404-erroring and run it with wget and it downloads fine.
Comment 30 Rahul Sundaram 2011-06-23 09:53:07 EDT
I am not a sponsor but I will do a final review and see if I can get someone to sponsor you.   I have some comments:

* Must not define dist in your spec file. 
* Must replace %define with %global in all applicable cases in this spec
* You are defining a macro called section and only using it once.  Seems useless
* Don't need to define python_sitelib or buildroot in the spec
* No need for %clean section anymore
* Remove all commented out lines.  They just clutter the spec file

Run rpmlint on spec file, srpm and binary rpm and fix the warnings and errors as applicable.   Also ensure that you post a link to both the spec and the srpm next time.  Verify that it works under mock.  I would also suggest doing a scratch build under koji and posting a link here. 

Do ask if anything above is not clear to you.  I am mether in #fedora-devel if you have any questions.
Comment 31 Rahul Sundaram 2011-06-23 09:54:29 EDT
Also you can remove %defattr as that is redundant now as well.
Comment 32 Rahul Sundaram 2011-06-27 11:38:44 EDT
Any update?
Comment 33 Oded Arbel 2011-06-27 12:38:40 EDT
(In reply to comment #32)
> Any update?

Thanks for taking the time to review the package.

I apologize that I have not made any response yet - I'm busy doing other urgent things, but I will get back to this tomorrow.

Thanks again.
Comment 34 Rahul Sundaram 2011-06-30 02:32:55 EDT
Ping again.  I would like to move forward here since this package is a dependency for something I am working on.
Comment 35 Rahul Sundaram 2011-07-09 04:43:57 EDT
Any update?
Comment 36 Rahul Sundaram 2011-07-12 05:06:31 EDT
I am going to give this 2 more days and close this review request and open my own to make forward progress in this since this happens to be a dependency for Askbot, which I am packaging.  I would happy to assist in Oded Arbel getting sponsored when he is more free.
Comment 37 Oded Arbel 2011-07-12 16:55:16 EDT
I'm sorry for taking a long time on this. Its been a hectic couple of weeks but I'm here now. I'll try to respond quickly from now on.

Rahul - I've reviewed your notes in comment #30, but it seems that almost none of them are relevant to the updated spec file (attachment #2 [details]). The only one that is relevant is the one about the comments, which I think are still useful but nonetheless at your request they have been removed, and the one about %defattr that I also removed.

> Run rpmlint on spec file, srpm and binary rpm and fix the warnings and errors
> as applicable.   

rpmlint complains that "tokenizer" is not spelled correctly. I'm not sure whether an official English dictionary recognizes the word "tokenizer", but its what upstream uses to describe the project so I'm leaving it in. Everything else is clean.

> I would also suggest doing a
> scratch build under koji and posting a link here. 

Unfortunately I do not have access to a working Koji setup, and last time I tried to do that I gave up two days later due to it being too difficult and not enough time.

SRPM: http://rpms.geek.co.il/fc15/SRPMS/python-html5lib-0.90-2.fc15.src.rpm
RPM: http://rpms.geek.co.il/fc15/x86_64/python-html5lib-0.90-2.fc15.noarch.rpm
The spec file will be attached to this ticket.

Thank you and please let me know if there's anything else to change.
Comment 38 Oded Arbel 2011-07-12 16:56:23 EDT
Created attachment 512515 [details]
Updated spec file
Comment 39 Oded Arbel 2011-07-12 16:57:24 EDT
For some reason I can't obsolete Ruben Kerkhof original spec. Please disregard it and review my updated spec, which is the later attachment.

TIA
Comment 40 Rahul Sundaram 2011-07-12 17:34:08 EDT
This is unusual

Provides:	%{modulename} = %{version}

It bloats the metadata.  Is it necessary?  consider removing

in %install stage, rm -rf $RPM_BUILD_ROOT is redundant.  

Add Build Requires on python-devel and RPM would pick up the requires automatically.  You can remove the explicit requires.  

%files can be reduced to

%doc examples README
%{python_sitelib}/%{modulename}

Ideally,  upstream should update README to explicitly mention the license and you must contact upstream to ask them to include a copy of the license.  Fix these and I will get you approved
Comment 41 Oded Arbel 2011-07-12 18:21:49 EDT
(In reply to comment #40)
> This is unusual> 
> Provides: %{modulename} = %{version}
> It bloats the metadata.  Is it necessary?  consider removing

I think its useful because people may want to depend on "html5lib" instead of "python-html5lib", but I see that no one else is doing this - so I removed it.

> in %install stage, rm -rf $RPM_BUILD_ROOT is redundant.  

OK.

> Add Build Requires on python-devel and RPM would pick up the requires
> automatically.  You can remove the explicit requires.

I don't agree - html5lib is a pure python implementation and does not require python-devel to build or run, so there's no need to force the user to install it. python-setuptools on the other hand is required for building the package but requiring python-devel will not cause it to be installed.

As for the explicit run-time requirement on python, RPM indeed picks that up automatically, so I removed this explicit requirement.

> %files can be reduced to
> 
> %doc examples README
> %{python_sitelib}/%{modulename}

I need to add %{python_sitelib}/%{modulename}-*.egg-info otherwise it complains that these files are not installed but not packaged. I'm not an expert on Python and I'm not sure what the egg-info files are, but I have a feeling they are important. Regarding the other lines, I'm pretty sure I had a good reason to do it but I can't figure it out now so I cleaned it up.

> Ideally,  upstream should update README to explicitly mention the license and
> you must contact upstream to ask them to include a copy of the license.

I've asked on the project's Google group a couple of times to include a license file, but with very little response. I've now opened an issue on that in their issue tracker (http://code.google.com/p/html5lib/issues/detail?id=188) but I doubt it will get any better response. If this is requirement is a must and the package cannot get into Fedora without fulfilling it, then we will just have to wait. If upstream doesn't get back on that issue within a week or so, I'll open up a personal email campaign ;-)

Updated RPM files are in the links above, updated SPEC file will be attached shortly.
Comment 42 Oded Arbel 2011-07-12 18:22:23 EDT
Created attachment 512535 [details]
Updated spec file
Comment 43 Rahul Sundaram 2011-07-13 04:09:57 EDT
Do a scratch build via koji or a mock build locally.  Run rpmlint on the spec file,  srpm and binary rpm.  In particular the changelog should be something along the lines of

* Tue Jul 12 2011 Oded Arbel <oded@geek.co.il> - 0.90-2

Dist tag shouldn't be explicitly specified in the changelog.   Meanwhile  I have filed a request to sponsor you

https://fedorahosted.org/fesco/ticket/643

I recommend you create a profile page in the Fedora wiki and drop a note in the ticket with your Fedora account system name.
Comment 44 Rahul Sundaram 2011-07-16 05:30:32 EDT
Kindly update.  I am waiting on this for my project  (https://fedoraproject.org/wiki/Askbot)
Comment 45 Rahul Sundaram 2011-07-18 06:22:32 EDT
I have gotten a new review request opened since this is 722874a blocker for what I am working on and you seem to be busy.  I would be happy to work with you and get you sponsored however when you have more time.  Do drop me a mail when you are free.  Thanks.

*** This bug has been marked as a duplicate of bug 722874 ***
Comment 46 Oded Arbel 2011-07-18 18:14:10 EDT
Thanks. It was important to get this package into Fedora and as that was done, I have no complaints :-)

I'd like to work on more Fedora packages but as you can tell I'm currently way too busy. I will drop you a note when I can do some more work, Thanks for being patient with me as much as you have been.