226079 – Merge Review: libxml2

Bug 226079 - Merge Review: libxml2

Summary: Merge Review: libxml2

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	Package Review
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Matěj Cepl
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2007-01-31 19:31 UTC by Nobody's working on this, feel free to take it
Modified:	2018-04-11 13:34 UTC (History)
CC List:	6 users (show)
Fixed In Version:	libxml2-2.6.32-3.fc10
Clone Of:
Environment:
Last Closed:	2012-10-11 17:31:17 UTC
Type:	---
Embargoed:
Dependent Products:
Flags:	mcepl: fedora-review+

Attachments	(Terms of Use)
libxml current svn ChangeLog in UTF-8 (710.01 KB, text/plain) 2008-05-31 12:26 UTC, Hans de Goede	no flags	Details
unified diff between orig Changelog and UTF-8 Changelog (18.92 KB, patch) 2008-05-31 12:28 UTC, Hans de Goede	no flags	Details \| Diff
commented output of fedpkg lint (2.70 KB, text/plain) 2010-12-17 15:15 UTC, Matěj Cepl	no flags	Details
patch file in .spec (721 bytes, patch) 2010-12-17 15:18 UTC, Matěj Cepl	no flags	Details \| Diff
View All

Description Nobody's working on this, feel free to take it 2007-01-31 19:31:27 UTC

Fedora Merge Review: libxml2

http://cvs.fedora.redhat.com/viewcvs/devel/libxml2/
Initial Owner: veillard

Comment 1 Michael Schwendt 2008-04-13 11:06:36 UTC

> Prefix: %{_prefix}

This is wrong. The packages are not really relocatable.
For example, there are hardcoded paths in xml2-config,
the pkgconfig file, the libtool archive, and xml2Conf.sh


> %{prefix}/include/*
> %{prefix}/bin/xml2-config
> %{prefix}/share/aclocal/libxml.m4

It should be  %{_bindir}, %{_includedir} and %{_datadir} everywhere.


> Docdir: %{_docdir}

Redundant.


> -rw-r--r-- root  root   1546914 /usr/lib/libxml2.a

http://fedoraproject.org/wiki/PackagingDrafts/StaticLibraryPolicy


> %post
> /sbin/ldconfig

> %postun
> /sbin/ldconfig

Prefer

%post -p /sbin/ldconfig
%postun -p /sbin/ldconfig

to execute ldconfig directly and not via /bin/sh


> Patch0: multilib.patch

It is common practise to prefix such files with the name (and
preferably also the version) of the package, e.g.
libxml2-2.6.32-multilib.patch


> make

If  make %{_smp_mflags}  causes problems, please add a comment.


> %files devel
> %doc AUTHORS ChangeLog.gz NEWS README Copyright TODO

> %files python
> %doc AUTHORS ChangeLog.gz NEWS README Copyright

Is it necessary to duplicate these files? They are already
in the main package, which is a strict requirement for these
two sub-packages.


* rpmlint gives lots of warnings when run on the binary rpms.
  Some are valid, e.g. "file-not-utf8"

Comment 2 Daniel Veillard 2008-05-30 12:35:04 UTC

I think I have done most of this. The most annoying point is the static
lib because I know some people use that to shave 2% of extra performance
on parsing speed. But by separating as a -static package that's reasonable.
For the "file-not-utf8", sorry can't fix here, report upstream if you believe
it's worth it.
Ah the patch is not dependent on the version of the libxml2 package but on
multiarch crazyness, so just renamed libxml2-multilib.patch

Considered done from my POV,

Daniel

Comment 3 Dominik 'Rathann' Mierzejewski 2008-05-30 13:28:04 UTC

(In reply to comment #2)
[...]
> For the "file-not-utf8", sorry can't fix here,

Sure you can. What's stopping you?

> report upstream if you believe it's worth it.

I believe it's the maintainer's job to do that. Upstream might not want to
convert to UTF-8, but it is a Fedora requirement for all text documents in a
package to be encoded in UTF-8, so it is your responsibility.

Comment 4 Daniel Veillard 2008-05-30 14:36:54 UTC

What's stopping me ? ... maybe realizing this exercise is just about
dogmatic rules, and as you proved, I should rather spend 30 mn of my time
as the libxml2/libxslt upstream maintainer on patch review than just
getting annoyed by people who insist on changing package content at the
distro level. I was about to conver libsxlt along those lines, but right
I'm better off checking the patches from Huawei on the xml list.

Daniel

Comment 5 Robert Scheck 2008-05-30 15:14:44 UTC

Can you please take care on all points mentioned in the review before closing 
this bug report? And please don't close this bug report until you got an approval
for your package - thank you.

Comment 6 Daniel Veillard 2008-05-30 16:25:40 UTC

I did take care/checked everything and, NO I won't change files coming from
upstream unless i have a clear valid and checked with upstream reason for 
doing so.

Daniel

Comment 7 Patrice Dumas 2008-05-30 17:01:54 UTC

Maybe you want to focus on upstream work, in that case you could consider
having somebody else follow the dogmatic rules. Those dogmatic
rules are not only dogmatic, for instance the NEWS file has strange 
characters when seen in less. 

(I would personally prefer that writer.xml is not changed since it has 
the encoding specified in the file).

Maybe you could search for a co-maintainer, I think that it is better
in most cases to separate upstream from the primary maintainer (though 
upstream as co-maintainer is very nice) such that when there is conflict
between the packaging rules and what upstream want, the maintainer can
decide what is best (which could be what upstream want, but not 
necessarily).

Comment 8 Robert Scheck 2008-05-30 17:08:21 UTC

Daniel, all packages in Fedora have to follow the MUSTS of the Fedora Packaging
Rules and Guidelines, libxml2 is there no exception. As you're upstream and also
downstream you can change it where you want; at least it has to be solved in the
end. Why do I have to explain a Red Hat guy how the Fedora Packaging works? ;-)

Comment 9 Daniel Veillard 2008-05-30 21:23:16 UTC

Why do I have to explain to Fedora guys that changing pieces of a package
from upstream is fundamentally wrong ??? Unless you have reported upstream
and check with them you should *never* do that, and having a rule which by
default forces such a behaviour is Just Plain Wrong please tell me who I 
need to complain, i.e. who made that decision, I want to be able to appeal.

I would really like to get out of your head that a packager can just change
bits which come from upstream freely. I would really like people to understand
what bias led ultimately to the OpenSSL Debian fiasco. The fact that I am
'a Red Hat guy' just means I can't let you do that and think you're right.
I need to react and not let this go, precisely.

Okay, give me any good justification for changing the encoding of an
XML file which is part of a package ?

For the ChangeLog the explanation is that the fragments have accumulated
over 10 years of developments, last I checked on this very problem is that
some contributions came from eastern europe, others from western europe
this is a mix of various ISO-8859-*, you just can't convert in one pass
iconv won't make it, it will just silently corrupt the people names. It's
way better to keep it that way, I can't expect upstream to just use UTF-8.
In general encoding is a property of one character. The fact that Fedora
assumes that all characters in a file must expressed with the same encoding
means they give the liberty to the packager to convert the way it feels best
and IMHO that's way over what a packager should be allowed to do.

  please stop assuming upstream is stupid, please stop assuming I'm just
dumb, I may be a grumpy french but I have reason to react negatively when
I see people trying to push on me thing which I think are very much broken.

 Go back to my comment #2, I actually made an effort to play by the rules.
It's when I get a dogmatic answer that I start getting nasty.

 If forced, I would just drop all concerned text files from the package
at least for Fedora, but converting files on the fly at packaging time is
NOT something I will do in any package I maintain.

Daniel

Comment 10 Robert Scheck 2008-05-30 23:04:35 UTC

Sorry, but I can't see a real problem converting non UTF-8 files from %doc into 
UTF-8; this is not a Debian-like OpenSSL issue. And yes, writer.xml is surely a 
special thing, I agree with the previous comment, just keep as it is. I'm really
sorry if I was maybe unclear, my concern is regarding %doc, not for writer.xml.

Comment 11 Jason Tibbitts 2008-05-31 02:32:07 UTC

Looks like someone needs to moderate here.

First off, I'd hope that nobody gets nasty regardless of how dogmatic people are
being.

Please keep in mind at all times that rpmlint is a tool, not any authoritative
source of packaging knowledge. It can be quite dumb and its results always need
to be filtered by a human.

Now, when dealing with rpmlint output, it's always nice to have the actual
output somewhere in the ticket so we can see just what's being complained about.
I think it's this:

libxml2.x86_64: W: file-not-utf8 /usr/share/doc/libxml2-2.6.32/ChangeLog.gz
libxml2.x86_64: W: file-not-utf8 /usr/share/doc/libxml2-2.6.32/NEWS

The issues seem to reside solely with people's names. Isn't it anyone's
priority that the contributors' names at least render properly somewhere?

Here are the other two utf8-related complaints, which I don't think are at issue
here; the C source is purposefully not utf8 anyway.
libxml2-devel.x86_64: W: file-not-utf8
/usr/share/doc/libxml2-devel-2.6.32/examples/testWriter.c
libxml2-devel.x86_64: W: file-not-utf8
/usr/share/doc/libxml2-devel-2.6.32/examples/writer.xml

In any case, hopefully some useful discussion will result from this
fedora-packaging post:
http://www.redhat.com/archives/fedora-packaging/2008-May/msg00070.html

And regardless, this ticket should certainly stay open, because nobody's done a
full review yet. I built current CVS and found a few more issues:

/usr/lib64/python2.5/site-packages/libxml2mod.a (from libxml2-python) seems to
be spurious; I don't think that's useful for anything at all.

The -devel package seems to have a dependency on /usr/bin/python solely because
of the executable documentation. rpmlint will complain about executable
documentation:
libxml2-devel.x86_64: W: spurious-executable-perm
/usr/share/doc/libxml2-devel-2.6.32/examples/index.py
libxml2-devel.x86_64: W: doc-file-dependency
/usr/share/doc/libxml2-devel-2.6.32/examples/index.py /usr/bin/python
(repeat many times)
which isn't really an issue unless it causes unwanted dependencies. I don't
know if the dependency is problematic in this case; I'm not sure if the -python
package was split up in an attempt to reduce dependencies or not, but I'm pretty
sure this is a non-issue. If the main package had grown a python dependency
then I'd see that as a problem.

Looks like there's a typo in some of the recent changelog entries:
libxml2.x86_64: W: incoherent-version-in-changelog 2.6.31-3.fc10 2.6.32-3.fc10

The only other rpmlint complaint is:
libxml2-static.x86_64: W: no-documentation
which is obviously not a problem.

Outside of rpmlint, I glanced at the specfile and noted a couple of additional
issues.

The general prohibition on the %makeinstall macro forces me to ask if a regular
"make install DESTDIR=%{buildroot}" will work for this package.
http://fedoraproject.org/wiki/Packaging/Guidelines#Why_the_.25makeinstall_macro_should_not_be_used

Finally, I know this is pretty trivial, but because it makes the specfile a bit
more readable, please choose either $RPM_BUILD_ROOT or %{buildroot} and use it
consistently. http://fedoraproject.org/wiki/Packaging/Guidelines#macros

Comment 12 Jason Tibbitts 2008-05-31 02:40:30 UTC

BTW, if you ask me to fix the trivial stuff, I'll do so and save you the hassle.

Comment 13 Hans de Goede 2008-05-31 06:25:05 UTC

About the text files encoding discussion.

Although I fully agree with Daniel that blindly converting text-ish files which
actually specify an encoding in their headers is both wrong and dangerous as
that actually breaks stuff, normal text files, esp. ones in %doc should be in
UTF-8, so that when opened they display correctly.

Indeed the changelog is a perfect example of why all plain text files must be
UTF-8, had it always been UTF-8 the problems between part being in west-european
encoding and parts in east-european encoding would not exist.

Also I think its worth noting that Fedora is not the only distro doing this,
Debian for example also tries to have all text files in the distro in UTF-8.

Now I can understand when Daniel says that he has higher priority things to fix.
So Daniel, as upstream, will you take a patch changing the encoding of all files
which end up in %doc, which changes them to UTF-8? and even fix the changelog to
be UTF-8 (split it in pieces, iconv pieces, put back together)

Comment 14 Daniel Veillard 2008-05-31 10:19:53 UTC

Okay, I will drop ChangeLog and News from the package, after reading this
and the thread it looks apparently more important to the Fedora Packaging
guys to follow blindly a "This has to be UTF-8" rule than understanding
that people may have preferences set in their viewing and editing tools.
I would think it is more important to be able to read the ChangeLog missing
some characters than not seeing anything. Not a big deal they will go look
upstream if they really need this.

For the extended review from comment #11, thanks, I will look at those, best
is probably to drop anything which gives problems. For the spec file specifics
if you have time to derive a patch I will take it. For the DESTDIR stuff
over the years libxml2 had to support it, sometime it broke, I think other
OSes rely on it (BSDs) but I don't have a 100% guaranteed answer to this.

W.r.t. comment #13 , the ChangeLog is 19000+ lines how would you make sure
you actually guessed the encoding of characters in a name correctly ? IMHO
you just can't unless chasing each people name and making sure it matches.
That's history, I prefer to keep it that way. If the history annoys Fedora
I will just drop it from the package. 
The argument for it having to be UTF-8 just doesn't hold, you're picking
one encoding while, the problem is that 1/ you can't garantee there is only
one encoding in a text file 2/ even if it's the case you have no metadata
to indicate what encoding was used for authoring. Blindly assuming this has
to be UTF-8 is a wrong and lost battle, even the Unicode consortium would never
try to suggest something that radical, and sorry if I were to write a README.cn
I would use a BIG5 encoding in it, not UTF-8, because that's not what the
concerned people would expect. Get yourself a filesystem with metadata if you
really want to solve the problem, forcing upstream into UTF-8 or converting
on the fly are both wrong IMHO.

For the NEWS file it's actually generated via XSLT when the package is released
  <xsl:output method="text" encoding="ISO-8859-1"/>
it could be switched to UTF-8 without problem but converting at rpm generation
time is just the Wrong Way of doing it.

Still w.r.t. #13 it's not so much that I think I have higher priorities to do,
it's that I see things imposed like a dogma, that they look broken to me, and
that I feel that trying to change the minds of people around about this is 
a lost battle. It's getting a religious thing, and my belief is that changing
files coming from upstream is fundamentally wrong %doc or not. Your religion
v.s. my religion, we will go nowhere, and I prefer to do something else. 

Daniel

Comment 15 Patrice Dumas 2008-05-31 10:36:30 UTC

(In reply to comment #14)
> Okay, I will drop ChangeLog and News from the package, after reading this
> and the thread it looks apparently more important to the Fedora Packaging
> guys to follow blindly a "This has to be UTF-8" rule than understanding
> that people may have preferences set in their viewing and editing tools.

It should not be done blindly. But utf8 is the default for fedora since
some time for locales, so, for example, the file will look better in less
if it is encoded in utf8. That being said I think that it is perfectly right
to use other locales, and that things should also work in these locales.
But since utf8 is the default, it is more likely to be correct for text 
files that don't carry over the encoding.

> I would think it is more important to be able to read the ChangeLog missing
> some characters than not seeing anything. Not a big deal they will go look
> upstream if they really need this.

In this case, with multiple encoding, the best may indeed be to keep it 
as is.

> If the history annoys Fedora I will just drop it from the package.

It is not the point.
 
> The argument for it having to be UTF-8 just doesn't hold, you're picking
> one encoding while, the problem is that 1/ you can't garantee there is only
> one encoding in a text file 2/ even if it's the case you have no metadata
> to indicate what encoding was used for authoring. Blindly assuming this has
> to be UTF-8 is a wrong and lost battle, even the Unicode consortium would never
> try to suggest something that radical, 

Fedora is not bound to the Unicode consortium. Default encoding is utf8
so text files are more likely to be have correct rendering of characters
if dile is utf8. 

> and sorry if I were to write a README.cn
> I would use a BIG5 encoding in it, not UTF-8, because that's not what the
> concerned people would expect. 

This could be a good reason not to change the encoding for this file.

> Get yourself a filesystem with metadata if you
> really want to solve the problem, forcing upstream into UTF-8 or converting
> on the fly are both wrong IMHO.

Doing it blindly is wrong. But doing it when it makes sense is right.

> For the NEWS file it's actually generated via XSLT when the package is released
>   <xsl:output method="text" encoding="ISO-8859-1"/>
> it could be switched to UTF-8 without problem but converting at rpm generation
> time is just the Wrong Way of doing it.

If the text file is ISO-8859-1 encoded for sure, converting it
on the fly seems acceptable and even preferrable to me.

> Still w.r.t. #13 it's not so much that I think I have higher priorities to do,
> it's that I see things imposed like a dogma, that they look broken to me, and

It is not a dogma. There are reasons to prefer utf8 for text files in 
fedora. Now if you don't like it, you are perfectly right, but you'll
have to find another packager who agree with you to review your package.

Comment 16 Hans de Goede 2008-05-31 11:40:12 UTC

(In reply to comment #14)
> Okay, I will drop ChangeLog and News from the package, after reading this
> and the thread it looks apparently more important to the Fedora Packaging
> guys to follow blindly a "This has to be UTF-8" rule than understanding
> that people may have preferences set in their viewing and editing tools.

I'm sorry, my prefered viewing and editing tools (less and joe) don't have any
encoding settings, and even if I were to use tools which do, they default to the
system locale and system locale's are UTF-8 on Fedora (and many other modern
distros). ISO-8859-X is dead, not as dead as it should be but it is dead, this
is not a dogma, we want files to show up correctly using the default locale's,
thus they should be encoded usings the default locale settings.

I'm especially disappointed in you staying to your stance that UTF-8 is wrong,
given that I have offered todo the work for you.

> I would think it is more important to be able to read the ChangeLog missing
> some characters than not seeing anything.

I agree, but it would be even better to see the full changelog including those
chars!

> W.r.t. comment #13 , the ChangeLog is 19000+ lines how would you make sure
> you actually guessed the encoding of characters in a name correctly ?

I'm european and as such know many european names / dialects, atleast enough to
have a high chance of guessing correctly, then again in some cases I might guess
wrong, but then again to quote you: "I would think it is more important to be
able to read the ChangeLog missing some characters than not seeing anything."
Well my intend is to make the number of missing / wrong characters smaller,
preferably 0 but atleast much much smaller, I might get one are two names wrong,
bad luck atleast the other names will be correct.

> IMHO
> you just can't unless chasing each people name and making sure it matches.
> That's history, I prefer to keep it that way. If the history annoys Fedora
> I will just drop it from the package. 

This is not history, this is just plain wrong, as is the file will not look
correct in any encoding. This file is the perfect example of why UTF-8 has
become the default and why UTF-8 is a blessing (allowing both special western
and east european chars in one file, and even more exotic chars in the same file).

> The argument for it having to be UTF-8 just doesn't hold, you're picking
> one encoding while,

I'm not picking one encoding, Fedora has picked one encoding, and I believe they
have made a good choice

> the problem is that 1/ you can't garantee there is only
> one encoding in a text file

If there is more then one encoding in a plain text, then the file is broken, we
have a name for things like this, we call it a bug and usually we fix those were
we can (and have the resources).

> 2/ even if it's the case you have no metadata
> to indicate what encoding was used for authoring.

Actually, if its in English, but there are some names of people in there which
are French and those names are the only one containing non ascii codes, then I
can make a pretty good bet. After that I can check if the resulting name is a
valid French name.

You know I maintain close to 200 packages in Fedora, and as such I've even
written a script to locate any non ascii chars in text files for me (used when
rpmlint complains about them not being non utf-8. Did you know that in the
libxml2 changelog there are only 48 lines which contain non ascii code, after
replacing "St\?phane Bidoul" with "Stephane Bidoul" (I will fix this to the
proper UTF-8 name later) I've only 16 non-ascii containing lines left.

Really this is fixable just _fine_, the problem is you being unwilling to accept
a fix for it.

> if I were to write a README.cn
> I would use a BIG5 encoding in it, not UTF-8, because that's not what the
> concerned people would expect

Ofcourse there will always be exceptions to the rule, but were not talking about
Chinese files here, ChangeLog is an English file, with some non English person
names in there which use chars beyond ascii

. Get yourself a filesystem with metadata if you
> really want to solve the problem

Well thats not going to help with multiple encodings in one file, like you have
with ChangeLog now is it, or are we going to put en encoding per line in the
metadata, or maybe an encoding per char?

> Still w.r.t. #13 it's not so much that I think I have higher priorities to do,
> it's that I see things imposed like a dogma, that they look broken to me, and
> that I feel that trying to change the minds of people around about this is 
> a lost battle. It's getting a religious thing, and my belief is that changing
> files coming from upstream is fundamentally wrong.

So fixing bugs, which last time I checked requires changing files, (and then
sending the fix back upstream) is fundamentally wrong?

Thats an interesting stance, weird but interesting.

The only fundamentally broken thing I see is the current libxml2 ChangeLog file,
which currently will render wrong no matter which encoding you choose in your
viewer.

Comment 17 Hans de Goede 2008-05-31 12:26:51 UTC

Created attachment 307275 [details]
libxml current svn ChangeLog in UTF-8

Show me the code ...

Can we now please get over this and have this _bug_ of a ChangeLog with
different parts in different encodings fixed please?

The only remaining problem AFAIK is NEWS then, which can be converted on the
fly during the rpmbuild just fine, so that it matches Fedora's _default_
encoding.

Comment 18 Hans de Goede 2008-05-31 12:28:15 UTC

Created attachment 307276 [details]
unified diff between orig Changelog and UTF-8 Changelog

Look as this with your browsers encoding (view menu) set to UTF-8 and notice
how _everyones_ name is spelled correct in the new version.

Comment 19 Daniel Veillard 2008-06-02 16:04:55 UTC

Incredible ... okay thanks for doing this !
I still think you're wrong in your 'ISO-8859-X is dead' stance,
maybe you think it's dead for Fedora, or maybe Linux. But please remember
that Linux/Fedora is still a relatively modest part of the computing 
ecosystem. From a libxml2 perspective, it's very probable there is more 
user on Windows than on Linux for example.
Still the offer is too good to drop and I just commited this upstream:

Mon Jun  2 17:39:42 CEST 2008 Daniel Veillard <daniel>

        * ChangeLog: patch from Hans de Goede to switch the file to UTF-8
        * doc/news.xsl: switch to generate the NEWS file in UTF-8 instead of
          ISO-8859-1

My main issue now is how to make sure Changelog will stay UTF-8 over the
edits, I also added a comment at the end
#
# vim: set enc=utf-8
#
hopefully this will help ensure the file stay UTF-8 over the edits 
by other commiters, as each time I will edit the file, vim should do the
check and hopefully report problems.

Daniel

Comment 20 Hans de Goede 2008-06-02 17:07:20 UTC

(In reply to comment #19)
> Incredible ... okay thanks for doing this !

Your welcome.

> Still the offer is too good to drop and I just commited this upstream:
> 
> Mon Jun  2 17:39:42 CEST 2008 Daniel Veillard <daniel>
> 
>         * ChangeLog: patch from Hans de Goede to switch the file to UTF-8
>         * doc/news.xsl: switch to generate the NEWS file in UTF-8 instead of
>           ISO-8859-1
> 

Thanks!!

> My main issue now is how to make sure Changelog will stay UTF-8 over the
> edits, I also added a comment at the end
> #
> # vim: set enc=utf-8
> #
> hopefully this will help ensure the file stay UTF-8 over the edits 
> by other commiters, as each time I will edit the file, vim should do the
> check and hopefully report problems.
> 

Yes, keeping it UTF-8 might be a problem, but if it becomes non UTF-8 I'm sure
it can be fixed again.

Comment 21 Daniel Veillard 2008-06-02 17:45:49 UTC

Side note, could you point Toshio Kuratomi to XML specification appendix F.
  http://www.w3.org/TR/REC-xml/#sec-guessing-with-ext-info

He seems unaware that the software can expect XML files to be in a specific
encoding - contextual metadata which can't be preserved when saved on
a POSIX filesystem - and that this external information do override
both the autodetection and the encoding declaration. So basically by 
converting a XML file blindly you can break it in context even if it looks
legit say from an out of context xmllint checking.
  i.e. his suggestion here
  https://www.redhat.com/archives/fedora-packaging/2008-May/msg00071.html
is just broken, sorry. (me being 'chilling' is probably right in perspective
but that's not a technical issue !)

  And just in case I may sound stupid, please remember that UTF-16 is
the most common encoding for XML internal processing, on Windows, on the
Web (DOM imposes UTF-16) and in Java, which represent a huge subset of
the XML computing world.

Daniel

P.S.: It's not that i agree with appendix F-2 (I tried to get rid of it, failed)
      or with the use of UTF-16 (or derivative) over UTF-8, but that's just the
      way things are defined now.

Comment 22 Robert Scheck 2008-12-20 14:45:54 UTC

Patrice, Hans, Jason - ping?

Comment 23 Matěj Cepl 2010-12-17 15:15:55 UTC

Created attachment 469394 [details]
commented output of fedpkg lint

(In reply to comment #21)
> Side note, could you point Toshio Kuratomi to XML specification appendix F.
>   http://www.w3.org/TR/REC-xml/#sec-guessing-with-ext-info

Just a side-note before doing a proper review ... the reason why you are right and Toshio is (in this question) most likely wrong is that he considers XML files as text files. They are clearly not, because one of the main important characteristics of plain text files is exactly this ... they don't have encoding metada. However, every XML has stated encoding (defaulting to UTF-8 if missing). IMHO, XML files don't need to be touched.

Anyway, now the proper review (based on clone of the Fedora git):

- MUST: rpmlint must be run on every package. The output should be posted in
the review

Complete commented output of fedpkg lint has been attached. Here is only the summary:

 * Removing executable bits from examples, tests, and chvalid.c
 * Running tests in %check
 * Not sure about /usr/lib64/python2.7/site-packages/libxml2mod.so providing libxml2mod.so()(64bit)
 * See in the patch fix for libxml2mod.a being in the non-static package.

+ MUST: package named according to the Package Naming Guidelines
+ MUST: The spec file name must match the base package %{name}
+ MUST: The package must meet the Packaging Guidelines .

http://fedoraproject.org/wiki/Packaging/Guidelines#Why_the_.25makeinstall_macro_should_not_be_used

I have on idea why %makeinstall is used, when

make install DESTDIR=%{?buildroot}

works apparently as well.

+ MUST: The package licensed with a Fedora approved license and meets the
Licensing Guidelines
+ MUST: The License field in the package spec file matches the actual
license
License: MIT
+ MUST: If (and only if) the source package includes the text of the license(s)
in its own file, then that file, containing the text of the license(s) for the
package must be included in %doc.
+ MUST: The spec file must be written in American English.
+ MUST: The spec file for the package MUST be legible.
+ MUST: The sources used to build the package must match the upstream
source, as provided in the spec URL. Reviewers should use md5sum for this task
From srpm:
8127a65e8c3b08856093099b52599c86  libxml2-2.7.8.tar.gz
8127a65e8c3b08856093099b52599c86  libxml2-2.7.8.tar.gz.new
= MATCHES
+ MUST: The package successfully compiles and builds into binary rpms on at
least one primary architecture
 - was builded many times in koji, but just to be sure my patch doesn't break anything, here is another scratch build
 http://koji.fedoraproject.org/koji/taskinfo?taskID=2673167
0 MUST: If the package does not successfully compile, build or work on an
architecture, then those architectures should be listed in the spec in
ExcludeArch
+ MUST: All build dependencies must be listed in BuildRequires, except for any
that are listed in the exceptions section of the Packaging Guidelines
Build in koji
0 MUST: The spec file handles locales properly. This is done by using the
%find_lang macro
No locales
0 MUST: Every binary RPM package (or subpackage) which stores shared library
files (not just symlinks) in any of the dynamic linker's default paths, must
call ldconfig in %post and %postun.
Not appllicable
0 MUST: Packages must NOT bundle copies of system libraries
0 MUST: If the package is designed to be relocatable, the packager must state
this fact in the request for review, along with the rationalization for
relocation of that specific package. Without this, use of Prefix: /usr is
considered a blocker
+ MUST: Package must own all directories that it creates. If it does not create
a directory that it uses, then it should require a package which does create
that directory
+ MUST: Package must not list a file more than once in the spec file's %files
listings
+ MUST: Permissions on files must be set properly. Every %files section must
include a %defattr(...) line.
+ MUST: Each package must have a %clean section, which contains rm -rf
%{buildroot} (or $RPM_BUILD_ROOT).
+ MUST: Each package must consistently use macros
+ MUST: The package must contain code, or permissable content
0 MUST: Large documentation files must go in a -doc subpackage
+ MUST: If a package includes something as %doc, it must not affect the runtime
of the application
+ MUST: Header files must be in a -devel package
+ MUST: Static libraries must be in a -static package
+ MUST: Packages containing pkgconfig(.pc) files must 'Requires: pkgconfig'
+ MUST: If a package contains library files with a suffix (e.g. libfoo.so.1.1),
then library files that end in .so (without suffix) must go in a -devel package
+ MUST: devel packages must require the base package using a fully versioned
dependency: Requires: %{name} = %{version}-%{release}
+ MUST: Packages must NOT contain any .la libtool archives, these must be
removed in the spec if they are built
0 MUST: Packages containing GUI applications must include a %{name}.desktop
file, and that file must be properly installed with desktop-file-install in the
%install section
+ MUST: Packages must not own files or directories already owned by other
packages
+ MUST: At the beginning of %install, each package MUST run rm -rf %{buildroot}
(or $RPM_BUILD_ROOT)
+ MUST: All filenames in rpm packages must be valid UTF-8

Please, fix rpmlint warnings, otherwise everything is perfect.

Comment 24 Matěj Cepl 2010-12-17 15:18:07 UTC

Created attachment 469395 [details]
patch file in .spec

Comment 25 Daniel Veillard 2012-10-11 06:43:37 UTC

I picked what was still making sense and also added a %check
based on make check, this is pushed in rawhide as libxml2-2.9.0-2.fc19

Comment 26 Matěj Cepl 2012-10-11 17:29:47 UTC

The only remaining lint problem is 

libxml2.x86_64: W: shared-lib-calls-exit /usr/lib64/libxml2.so.2.9.0 exit.5

of which the maintainer is aware and tries to eliminate them. Anyway, I would consider this more a programming issue rather than a packaging one.

APPROVED

Comment 27 Matěj Cepl 2012-10-11 17:31:17 UTC

Given this package has been long time in Fedora and there is no need for any manipulations with git, I am closing this bug.

Note You need to log in before you can comment on or make changes to this bug.