1095590 – Asciidoc import: alt text for figures is incomplete

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	PressGang CCMS
Classification:	Community
Component:	ImportTool
Sub Component:
Version:	1.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	1.6
Assignee:	Matthew Casperson
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1091570 1092170

Reported:	2014-05-08 05:29 UTC by mmurray
Modified:	2014-08-04 22:28 UTC
CC List:	4 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2014-05-28 21:56:19 UTC
Embargoed:

Attachments
zip of maven project to build docbook and html from asciidoc sources (237.64 KB, application/octet-stream) 2014-05-12 14:54 UTC, Nick Boldt
zip of maven project to build docbook and html from asciidoc sources - with commas in the alt tag (237.69 KB, application/zip) 2014-05-12 15:09 UTC, Nick Boldt
zip of maven project to build docbook and html from asciidoc sources - with commas in the alt tag, FIXED by adding quotes in the tag (237.67 KB, application/zip) 2014-05-12 15:12 UTC, Nick Boldt

Description mmurray 2014-05-08 05:29:20 UTC

Asciidoc:
.My great figure
image::images/4116.png[Web Page Open on Simulated Device.]

Resulting docbook:
	<figure>
		<title>My great figure</title>
		<mediaobject>
			<imageobject>
				<imagedata contentdepth="pen o" contentwidth="age O" fileref="images/4116.png"/>
			</imageobject>
			<textobject> 
				<phrase>Web P</phrase>
			</textobject>
		</mediaobject>
	</figure>

Issues:
* contentwidth and contentdepth are wrongly being pulled out of alt text
* <phrase> only contains part of alt text

I understand that commas inside the [] of an asciidoc image are used to separate attributes such as width and link. But I don't have any commas here, so I don't see why the text is being broken up as it is.

Comment 1 Nick Boldt 2014-05-12 14:54:41 UTC

Created attachment 894761
zip of maven project to build docbook and html from asciidoc sources

Seems to be treating the alt tag as 3 blocks of 5 chars each, and truncating the rest:

[Web Page Open on Simulated Device.] ==> Web P,age O,pen o

-----

If I create a sample .adoc file like this:

<<test.zip!/test/src/main/asciidoc/test.adoc>>

And I render it using this pom.xml and the maven asciidoc plugin

<<test.zip!/test/pom.xml>>

Then the resulting docbook looks like this:

<<test.zip!/test/target/generated-docs/test.xml>>

In that case, the docbook XML looks fine, as does the resulting HTML.

So, clearly there's a problem w/ the way asciidoctor is being used here, as the maven wrapper for asciidoctor works fine on your sample .adoc.

Comment 2 Nick Boldt 2014-05-12 15:09:31 UTC

Created attachment 894763
zip of maven project to build docbook and html from asciidoc sources - with commas in the alt tag

This version of the project (test-commas.zip), with commas in the alt tag, results in broken docbook (but does NOT truncate the content):

.My great figure
image::images/4116.png[This description, containing commas, is an alt tag.]

becomes:

<figure>
<title>My great figure</title>
<mediaobject>
<imageobject>
<imagedata fileref="./images/4116.png" contentwidth="containing commas" contentdepth="is an alt tag."/>
</imageobject>
<textobject><phrase>This description</phrase></textobject>
</mediaobject>
</figure>

Comment 3 Nick Boldt 2014-05-12 15:12:01 UTC

Created attachment 894765
zip of maven project to build docbook and html from asciidoc sources - with commas in the alt tag, FIXED by adding quotes in the tag

If we add quotes into the alt tag, the resulting docbook is once again clean:

.My great figure
image::images/4116.png["This description, containing commas, is an alt tag."]

becomes

<figure>
<title>My great figure</title>
<mediaobject>
<imageobject>
<imagedata fileref="./images/4116.png"/>
</imageobject>
<textobject><phrase>This description, containing commas, is an alt tag.</phrase></textobject>
</mediaobject>
</figure>
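
The results in Comments 2 and 3 fit the way positional attributes of the image macro are read: the first value is the alt text, the second the width and the third the height, with commas separating values unless the value is quoted. Below is a rough sketch of that reading. `splitImageAttributes` is a hypothetical helper written for illustration, not Asciidoctor's parser, and it ignores named attributes and escaped quotes.

```javascript
// Hypothetical helper, not Asciidoctor's actual parser. It splits the text
// inside [] on commas, treating a double-quoted value as a single attribute.
function splitImageAttributes(attrList) {
  const values = [];
  let current = "";
  let inQuotes = false;
  for (const ch of attrList) {
    if (ch === '"') {
      inQuotes = !inQuotes; // quotes delimit a value and are not kept
    } else if (ch === "," && !inQuotes) {
      values.push(current.trim()); // an unquoted comma ends the current value
      current = "";
    } else {
      current += ch;
    }
  }
  values.push(current.trim());
  // Positional order assumed here: alt text, width, height.
  const [alt, width, height] = values;
  return { alt, width, height };
}

// Unquoted: the commas split the sentence into alt, width and height.
console.log(splitImageAttributes("This description, containing commas, is an alt tag."));

// Quoted: the whole sentence stays in alt; width and height are undefined.
console.log(splitImageAttributes('"This description, containing commas, is an alt tag."'));
```

In the unquoted case the second and third pieces are what ended up in contentwidth and contentdepth in the docbook from Comment 2.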

Comment 4 Matthew Casperson 2014-05-18 22:21:01 UTC

I have confirmed that the Asciidoctor application (i.e. the Ruby one) will convert the asciidoc from https://bugzilla.redhat.com/show_bug.cgi?id=1095590#c1 correctly, so this must be an issue with Asciidoctor.js.

Comment 5 Matthew Casperson 2014-05-18 22:53:31 UTC

Fixed in Build 1.6-SNAPSHOT 201405190849

This was caused by changing the $ identifier in Asciidoctor.js to opal$ so it would play well with angularjs. I missed two cases where this find and replace also modified a regex in which the $ was used as the end-of-string marker. The two affected lines, shown here with the unwanted opal$ inside the regex:

opal$opal.cdecl(opal$scope, 'BoundaryRxs', opal$hash2(["\"", "'", ","], {"\"": /.*?[^\\](?=")/, "'": /.*?[^\\](?=')/, ",": /.*?(?=[ \t]*(,|opal$))/}));

opal$opal.cdecl(opal$scope, 'SkipRxs', opal$hash2(["blank", ","], {"blank": opal$scope.BlankRx, ",": /[ \t]*(,|opal$)/}));

Fixing these fixed the image importing.
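
The effect of the damaged pattern can be seen outside Asciidoctor.js. The following is a minimal sketch, not the project's code: the constant names and the `firstValue` helper are hypothetical, and only the two regex bodies come from the "," entry of BoundaryRxs above (the intact form assumes the original anchor was a bare $).

```javascript
// Hypothetical names for illustration; only the regex bodies come from the
// "," entry of BoundaryRxs quoted above.
const BROKEN_BOUNDARY = /.*?(?=[ \t]*(,|opal$))/; // "$" renamed by the find and replace
const INTACT_BOUNDARY = /.*?(?=[ \t]*(,|$))/;     // assumed original: "$" anchors the end of the string

// Hypothetical helper: returns the text up to the first attribute boundary,
// or null when the pattern finds no boundary at all.
function firstValue(boundaryRx, attrList) {
  const match = boundaryRx.exec(attrList);
  return match === null ? null : match[0];
}

const altText = "Web Page Open on Simulated Device.";

console.log(firstValue(INTACT_BOUNDARY, altText)); // the full alt text
console.log(firstValue(BROKEN_BOUNDARY, altText)); // null
```

With no comma in the text, the intact pattern reads to the end of the string. The damaged one can only stop at a comma or at a literal trailing "opal", so it finds no boundary at all. This sketch does not reproduce the five-character pieces seen in the import; it only shows why alt text without commas stopped parsing as a single value.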

Comment 6 Matthew Casperson 2014-05-18 22:55:01 UTC

The supplied test case now imports as:

<section>
	<title>Test Page</title>
	<figure>
		<title>My great figure</title>
		<mediaobject>
			<imageobject>
				<imagedata fileref="images/4116.png" />
			</imageobject>
			<textobject> 
				<phrase>Web Page Open on Simulated Device.</phrase>
			</textobject>
		</mediaobject>
	</figure>
</section>

Comment 7 Lee Newson 2014-05-21 00:15:23 UTC

Verified that alt text is now correctly populated, assuming the correct syntax is used. In the example Nick gave in Comment #2 the content was still imported with the incorrect contentdepth and contentwidth values; however, this happened with the maven build as well, so marking this as VERIFIED.
