Bug 181068 - Review Request: html401-dtds - HTML 4.01 document type definitions
Summary: Review Request: html401-dtds - HTML 4.01 document type definitions
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Jason Tibbitts
QA Contact: Fedora Package Reviews List
URL:
Whiteboard:
Depends On:
Blocks: FE-ACCEPT
TreeView+ depends on / blocked
 
Reported: 2006-02-12 17:34 UTC by Ville Skyttä
Modified: 2007-11-30 22:11 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-06-21 14:40:42 UTC
Type: ---
Embargoed:
kevin: fedora-cvs+


Attachments (Terms of Use)

Description Ville Skyttä 2006-02-12 17:34:35 UTC
SRPM Name or Url: http://cachalot.mine.nu/4/SRPMS/html401-dtds-4.01-0.2.src.rpm

Note for reviewers: the packaging approach here has been discussed with various folks over past few years.  The general ideas in it roughly follow the Debian SGML packaging draft policy, http://debian-xml-sgml.alioth.debian.org/sgml-policy/

%description improvements gracefully accepted :)

Comment 1 Ville Skyttä 2006-02-12 17:36:27 UTC
Ditto opinions whether the specification subpackage should be called -spec,
-doc, or -docs.

Comment 2 Nicolas Mailhot 2006-02-12 17:52:45 UTC
What are the differences with the policies followed by docbook-dtds and
xhtml1-dtds ?

Comment 3 Ville Skyttä 2006-02-12 19:18:11 UTC
docbook-dtds and xhtml1-dtds don't really seem to constitute a common policy.  I
think the basic ideas are mostly the same in them, this package and the Debian
policy.

docbook-dtds AFAICT unnecessarily embeds full version-release strings into its
install dirs, making it hard for apps to point only to a specific version of the
dtds.  It also sets SGMLDECL in a catalog that gets included in the global
catalog which is problematic because SGMLDECL (at least as implemented in
opensp) is global and the first one encountered will be used for _all_
subsequent matches -> breakage.  Further, it registers itself to the private
catalogs of openjade -> most likely unneeded, needs changing between openjade
releases, and results in a build dependency loop.

xhtml1-dtds registers the dtds only to the XML catalogs (despite of installing
into sgml dirs) so it's somewhat different.  It also embeds a date stamp (but
not version-release) in its install dirs but that's less problematic than in
docbook-dtds because there's a xmlcatalog in an unversioned path in
/usr/share/sgml/xhtml1.

Both of the above install DTDs executable, but that's probably just a bug.

So in a nutshell, compared to those, this package uses a dir structure that is
very unlikely to need changing, installs a self-contained catalog to its own dir
below /usr/share/sgml, uses DTDDECLs instead of SGMLDECLs in order to not
interfere with (nor be confused by) other SGML dtd packages' SGML declarations,
+ bug fixes.

Yes, I intend to file some related bug reports in the future :)  And I have also
HTML 2.0, 3.2, 4.0, ISO-HTML, XHTML Basic 1.0, XHTML 1.1, SMIL 2.0, and RSS
*-dtds packages more or less ready to roll, some of which will probably be
submitted later.

Comment 4 Daniel Veillard 2006-02-12 19:35:04 UTC
you can't mix sgml and xml resources in the XML catalogs. SGML definition
will generate fatal errors when loaded by an XML parser. 

Anything reachable from /etc/xml/catalog must be XML only. So html-4... just
can share things with xhtml1. W.r.t. using /usr/share/sgml for the XML catalog
this is the unfortunate result of SGML nutheads blocking XML from the LSB
standard a few years ago, so Red Hat had to keep them there instead of a
far more logical /usr/share/xml subtree !

W.r.t. XHTML 1.1 and SMIL 2.0, they are extensible languages, i.e. the basic
language defined in the DTDs are supposed to be extended with foreign elements
in different namespaces, which is actually something which doesn't work with
DTDs, so the usefulness of shipping those 2 gets very limited, as DTD based
validation will just fail in general (but Relax-NG or XSD schemas would work
better though there isn't good ways to reference them for catalog access from
the instances).

Please be very careful when trying to handle XML resources if you're competent
in SGML but not really aware of XML, this is a very different field with 
very different rules and specifications, this has bitten us in the past hard
and I don't want this to happen again.

Daniel (libxml2 author and member of W3C XML Core Working Group)


Comment 5 Ville Skyttä 2006-02-12 20:17:09 UTC
(In reply to comment #4)
> you can't mix sgml and xml resources in the XML catalogs. SGML definition
> will generate fatal errors when loaded by an XML parser. 

Of course.  I don't know where you got the impression that someone/thing wanted
to do that.  It would be nice if xhtml1-dtds would register the DTDs to the SGML
catalogs in addition to XML catalogs though (for example for use in SGML
validation tools and things that grok SGML but not XML catalogs), but that's a
bit offtopic for this submission.

Of course, the html401-dtds package deals with SGML catalogs only.

> Please be very careful when trying to handle XML resources if you're competent
> in SGML but not really aware of XML, this is a very different field with 
> very different rules and specifications, this has bitten us in the past hard
> and I don't want this to happen again.

I don't claim to be an SGML nor XML god, but do have more than a little
experience with both.

> Daniel (libxml2 author and member of W3C XML Core Working Group)

Ville (member of the W3C QA Tools Development Team) ;)

Comment 6 Daniel Veillard 2006-02-13 13:28:30 UTC
:-)

Note that there is no garantee that SGML tools will handle XML correctly,
it will parse but for example they will probably misprocess xml:base or
xml:id , checking dependancies to those one a case by case basis before
pushing on both catalogs will be a good idea :-)

Daniel

Comment 7 Terje Bless 2006-02-13 18:22:04 UTC
The package in general looks good to me.

I'd suggest either dropping the actual text of the specification entirely or
simply include it as %doc. Having a separate sub-package for this seems
essentially overkill.

Alternately I'd package up the spec and have the DTDs tag along (as they're a
normative part of the HTML 4.01 Recommendation).


The .ent(ity) files might beneficially be shipped by sgml-common -- since
they're common for HTML 2.0, 3.2, 4.0, and 4.01 -- and shared by the relevant
packages (cf. the comment at the top of the spec file).


For the %description I might use something along the lines of: [[[
  Provides the three HTML 4.01 DTDs (strict, frameset, and transitional).
  The DTDs are required for processing HTML 4.01 document instances using
  SGML tools such as OpenSP, OpenJade, or SGMLSpm.
]]]


Comment 8 Ville Skyttä 2006-02-13 18:53:28 UTC
Well, the specification is already in the source tarball, so let's just package
it somewhere.  I don't have that strong opinions on actually _where_, but FWIW,
the DTDs are about 30kB and the documentation is 370kB.

Note that *.ent are the same "only" in 4.0, 4.01, and ISO-HTML; not 2.0 or 3.2.
 I'm not against including them in sgml-common (or let's say a hypothetical
html-common) package, but I think the approach here is good enough at least for
now, and doesn't require waiting for the FC sgml-common package to be updated,
possibly also for earlier distro versions etc.  And including them here makes it
easier to maintain a self-contained catalog containing only the HTML 4.01 stuff
in /usr/share/sgml/ which is always nice to have.

The description sounds good to me, will adopt for the next package revision, due
out when other things discussed here have been agreed on.

Comment 9 Ville Skyttä 2006-02-25 21:50:06 UTC
http://cachalot.mine.nu/5/SRPMS/html401-dtds-4.01-0.3.src.rpm

* Sat Feb 25 2006 Ville Skyttä <ville.skytta at iki.fi> - 4.01-0.3
- Improve description (#181068).
- Fold specification into main package as %doc (#181068).


Comment 10 Terje Bless 2006-02-26 12:46:53 UTC
Hmmm.

* html401-dtds-4.01 seems odd. W3C specs have a Version (4.01) and each
  edition of that version get tagged with the Date of publication (19991224).
  "html-dtd-4.01-19991224" or "dtd-html-4.01-19991224" seem like saner
  naming and versioning conventions. Dunno what that'd be in RPM spec terms
  though.
* In any case, the rec version (4.01) should in either he package name xor
  the package version; a directory named "html401-dtds-4.01" just seems odd.

Other than that this looks good; I'd say push it as is if there isn't an easy
way to fix the versioning scheme nits.

Comment 11 Ville Skyttä 2006-02-26 13:45:19 UTC
401 needs to be in the package's name in order to make rpm and friends not
consider eg. the 4.01 DTDs an upgrade over let's say 3.2 DTDs, but to keep them
cleanly parallel installable.

The package's version number is intentionally "duplicated" so that it'll be kept
unchanged between package (and possibly spec) revisions so that the
documentation dir stays the same between possible package revisions for
/usr/share/doc/html401-dtds-$version/index.html bookmarkability; I don't want to
 invent another version number.  This also follows the practice of the existing
xhtml1-dtds package.

Including the date stamp to the package's release field should be no problem,
will do before requesting the first build unless there are other things that
require changing.

-dtds vs -dtd: there are already xhtml1-dtds and docbook-dtds, I don't see a
reason to deviate from that.

Comment 12 Daniel Veillard 2006-02-27 21:03:24 UTC
w.r.t. 401, I don't think the W3C will ever release an updated version of the
SGML HTML DTDs. All new developments have been around XML anyway.
"Cast in stone"

Daniel

Comment 13 Ville Skyttä 2006-02-27 21:17:03 UTC
Fully agreed, but I'm afraid there might be some confusion here.  Daniel, could
you clarify if/what you imply in practical packaging terms with "w.r.t. 401"?

Comment 14 Daniel Veillard 2006-02-28 12:39:20 UTC
To me it reinforced the fact that it's fine to not put it as a version,
we don't expect any more versions.
I also put the number in the name for xhtml1-dtds for the reasons exposed in #11

Daniel

Comment 15 Ville Skyttä 2006-02-28 21:45:26 UTC
Ok, so... how about a formal acceptance or an explicit list of blockers so we
could proceed here?

Comment 16 Daniel Veillard 2006-03-04 18:13:03 UTC
Well I think the idea is fine, I didn't checked closely the package, I'm
not really used to that. I don't know who takes the decision, in principle
it's fine by me.

Daniel

Comment 17 Ville Skyttä 2006-06-18 15:36:12 UTC
http://cachalot.mine.nu/5/SRPMS/html401-dtds-4.01-19991224.1.src.rpm

* Sun Jun 18 2006 Ville Skyttä <ville.skytta at iki.fi> - 4.01-19991224.1
- Include specification date in release field (#181068).
- Make doc symlinks relative.


Comment 18 Jason Tibbitts 2006-06-20 00:25:54 UTC
Unfortunately I know little about SGML and can't really evaluate this package in
that context or test it.  But I can evaluate it against the general set of
packaging guidelines and take Daniel's acceptance in comment 16 that it OK from
a SGML standpoint.  Hopefully that's sufficient.

The package builds fine; rpmlint only complains about the license, which is OK.
 It installs and uninstalls fine and the catalog in /etc/sgml is updated properly.

The only major issue I see is that there don't seem to be any dependencies on
/usr/bin/install-catalog or sgml-common for the scriptlets.  Unless I'm
misunderstanding something, this is a blocker.

Other comments:
perl(File::Spec) is part of base perl, so technically don't need to BR: it
although it certainly isn't a problem to do so.  (I know you know this; I only
add it for posterity.)

This isn't really code, but I can't imagine the "code not content" rule would
apply here.

Review:
* package meets naming and packaging guidelines (given the
parallel-installability argument I see the need to put the version in the name).
* specfile is properly named, is cleanly written and uses macros consistently.
* dist tag is present.
* build root is correct.
* license field matches the actual license.
* license is open source-compatible.  License text not included upstream.
* source files match upstream:
   1ed76627ba80816079649f67023ec7ab  html40.tgz
* latest version is being packaged.
* BuildRequires are proper.
* package builds in mock (development, x86_64).
O rpmlint is silent except for invalid license warning.
* final provides and requires are sane:
   html401-dtds = 4.01-19991224.1.fc6
  =
   /bin/sh
   sgml-common
* no shared libraries are present.
* package is not relocatable.
* owns the directories it creates.
* doesn't own any directories it shouldn't.
* no duplicates in %files.
* file permissions are appropriate.
* %clean is present.
* %check is not present; test suite wouldn't make much sense.
X scriptlets present but don't seem to have appropriate dependencies.
O code, not content.
* documentation is small, so no -docs subpackage is necessary.
* %docs are not necessary for the proper functioning of the package (the links
go from the doc directory into /usr/share/sgml and not the other way around).
* no headers.
* no pkgconfig files.
* no libtool .la droppings.
* not a GUI app.

Comment 19 Ville Skyttä 2006-06-20 16:47:32 UTC
http://cachalot.mine.nu/5/SRPMS/html401-dtds-4.01-19991224.2.src.rpm

* Tue Jun 20 2006 Ville Skyttä <ville.skytta at iki.fi> - 4.01-19991224.2
- Require install-catalog at post-install and pre-uninstall time (#181068).


Comment 20 Jason Tibbitts 2006-06-21 01:21:44 UTC
Looks good to me; the only issue I had is fixed.

APPROVED

Comment 21 Ville Skyttä 2006-06-21 14:40:42 UTC
Imported, built for devel, and FC5 branch requested.  Thanks!

Comment 22 Ville Skyttä 2007-08-14 11:39:54 UTC
Package Change Request
======================
Package Name: html401-dtds
Updated Description: HTML 4.01 document type definitions

Remove reference to the spec from package description as the docs are no longer
shipped due to the W3C Documentation License being non-free:
http://fedoraproject.org/wiki/Licensing

Comment 23 Daniel Veillard 2007-08-14 12:35:21 UTC
Can you explain Comment #22, problem as well as effects on the package.
The W3C documentation is freely copiable what is the problem ?

Daniel

Comment 24 Ville Skyttä 2007-08-14 12:48:19 UTC
See link in comment 22, the W3C Documentation License does not permit
modification of content licensed under it, making it "not okay" for Fedora.  The
effect on this package is that the documentation will no longer be shipped in
the binary rpm.  This is already implemented in the html401-dtds package in F7
updates-testing.

The DTDs and related items will stay because the W3C Software (not
Documentation) License can be used for them, and that license is ok. 
http://www.w3.org/Consortium/Legal/IPR-FAQ-20000620#DTD

Comment 25 Kevin Fenzi 2007-08-14 19:55:03 UTC
cvs done.


Note You need to log in before you can comment on or make changes to this bug.