Bug 788576 - Publican generating duplicate id labels in html output
Summary: Publican generating duplicate id labels in html output
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Publican
Classification: Community
Component: publican
Version: 2.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 3.0
Assignee: Jeff Fearn 🐞
QA Contact: Ruediger Landmann
URL:
Whiteboard:
Depends On:
Blocks: 820023
Reported: 2012-02-08 14:04 UTC by William Cohen
Modified: 2012-10-31 03:11 UTC

Fixed In Version: 3.0.0
Clone Of:
Environment:
Last Closed: 2012-10-31 03:11:47 UTC
Embargoed:


Description William Cohen 2012-02-08 14:04:04 UTC
Description of problem:

The resulting publican html output for the systemtap beginners guide has duplicated id="<label>". Each label on a web page should be unique.


Version-Release number of selected component (if applicable):
publican-2.8-1.fc16.noarch
systemtap-1.6-1.fc16.x86_64


How reproducible:
always

Steps to Reproduce:
1. yumdownloader --source systemtap
2. yum-builddep ./systemtap-1.6-1.fc16.src.rpm
3. rpm -Uvh systemtap-1.6-1.fc16.src.rpm
4. cd rpmbuild/SPECS/; rpmbuild --define "with_publican 1" -ba systemtap.spec
5. cd ~/rpmbuild/BUILD/systemtap-1.6/doc/beginners/SystemTap_Beginners_Guide


  
Actual results:

Many of the .html pages have duplicated 'id="<label>"' attributes, such as the 'id="goal"' in introduction.html. This can be checked with http://validator.w3.org/check


Expected results:

No duplicated labels in the generated html.


Additional info:

Can also see the same problem on a number of the Red Hat documentation pages such as:

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.0_Release_Notes/installer.html

http://validator.w3.org/check?uri=http%3A%2F%2Fdocs.redhat.com%2Fdocs%2Fen-US%2FRed_Hat_Enterprise_Linux%2F6%2Fhtml%2F6.0_Release_Notes%2Finstaller.html&charset=%28detect+automatically%29&doctype=Inline&group=0

Comment 1 William Cohen 2012-02-08 16:34:45 UTC
The duplicates all seem to be for <section id="<label>">

<section id="cross-compiling">
  <title>Generating Instrumentation for Other Computers</title>

</section>

It also looks like it might only be the first <section id="<label>"> that gets the problem html generated.

Comment 2 Ruediger Landmann 2012-03-12 23:40:26 UTC
Thanks William; moving this upstream

Comment 3 Jeff Fearn 🐞 2012-03-13 06:50:53 UTC
Removed duplicate IDs in HTML outputs.

Pushed To ssh://git.fedorahosted.org/git/publican.git
   55c8a86..a033b42  master -> master

Comment 4 Michael Hideo 2012-06-08 01:51:14 UTC
(In reply to comment #0)
> Description of problem:
> 
> The resulting publican html output for the systemtap beginners guide has
> duplicated id="<label>". Each label on a web page should be unique.
> 
> 
> Version-Release number of selected component (if applicable):
> publican-2.8-1.fc16.noarch
> systemtap-1.6-1.fc16.x86_64
> 
> 
> How reproducible:
> always
> 
> Steps to Reproduce:
> 1. yumdownloader --source systemtap
> 2. yum-builddep ./systemtap-1.6-1.fc16.src.rpm
> 3. rpm -Uvh systemtap-1.6-1.fc16.src.rpm
> 4. cd rpmbuild/SPECS/; rpmbuild --define "with_publican 1" -ba systemtap.spec
> 5. cd ~/rpmbuild/BUILD/systemtap-1.6/doc/beginners/SystemTap_Beginners_Guide
> 
> 
>   
> Actual results:
> 
> Many of the .html pages have duplicated 'id="<label>"' such as the
> 'id="goal"' in introduction.html. This can be checked with
> http://validator.w3.org/check
> 
> 
> Expected results:
> 
> No duplicated labels in the generated html.
> 
> 
> Additional info:
> 
> Can also see the same problem on a number of the Red Hat documentation pages
> such as:
> 
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.0_Release_Notes/installer.html
> 
> http://validator.w3.org/check?uri=http%3A%2F%2Fdocs.redhat.com%2Fdocs%2Fen-US%2FRed_Hat_Enterprise_Linux%2F6%2Fhtml%2F6.0_Release_Notes%2Finstaller.html&charset=%28detect+automatically%29&doctype=Inline&group=0

Follow Will's 5 steps and verify.

Comment 5 Michael Hideo 2012-06-08 01:58:23 UTC
(In reply to comment #0)
Follow Will's 5 steps and verify.

Comment 6 Stephen Gordon 2012-06-08 18:09:36 UTC
There was actually an additional step required here, because the revision history entries of the SystemTap Beginners Guide don't match the format expected by publican 3.0 (they use 2.0 instead of 2-0).

To get around this I had to extract the tar file in ~/rpmbuild/SOURCES/, modify the Revision_History.xml in the source tree, and then re-create the tar file. These actions should not, however, affect the validity of the test results.
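
For illustration only, the Revision_History.xml change might look roughly like the sketch below. It assumes the version numbers sit in DocBook <revnumber> elements in the X.Y form; the fix_revnumbers helper and the sample file are made up for this example and are not the commands actually used.

```shell
# Hypothetical helper: print the file with <revnumber>X.Y</revnumber>
# rewritten as <revnumber>X-Y</revnumber>. Assumes one-dot revision
# numbers inside <revnumber> elements; anything else is left untouched.
fix_revnumbers() {
    sed -E 's|<revnumber>([0-9]+)\.([0-9]+)</revnumber>|<revnumber>\1-\2</revnumber>|g' "$1"
}

# Made-up sample standing in for one line of Revision_History.xml.
sample=$(mktemp)
printf '%s\n' '<revnumber>2.0</revnumber>' > "$sample"
fix_revnumbers "$sample"   # prints <revnumber>2-0</revnumber>
rm -f "$sample"
```

The helper writes to standard output rather than editing in place, so the result can be inspected before it replaces the original file.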

Once the build completed I changed into the directory containing the html and ran a check to find duplicate IDs; none were returned (the sort is required because uniq -d only reports duplicates that are on adjacent lines):

$ grep -o 'id=\"[^ ]*\"' *.html | sort | uniq -d
$
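
As a small sanity check of the same pipeline, here is a sketch run against a made-up page that does contain a repeated id. The dup_ids helper and the sample markup are hypothetical and only show what a positive result looks like.

```shell
# Hypothetical helper: list id attributes that occur more than once
# in a single HTML file (same grep | sort | uniq -d idea as above).
dup_ids() {
    grep -o 'id="[^" ]*"' "$1" | sort | uniq -d
}

# Made-up sample page in which id="goals" appears twice.
page=$(mktemp)
printf '%s\n' '<div id="goals">' '<p id="intro">' '<h2 id="goals">' > "$page"
dup_ids "$page"   # prints id="goals"
rm -f "$page"
```

Because grep is given a single file here, no file name prefix is printed, so only the repeated attribute itself shows up.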

I also did a check specifically on one of the examples cited in the bug description and found only the one instance, no duplicates:

$ grep -o 'id=\"goals\"' *.html
introduction.html:id="goals"

Based on the above, moving to VERIFIED.
