Bug 742400

Summary: Special characters in topic name not handled gracefully.
Product: [Other] Topic Tool Reporter: Stephen Gordon <sgordon>
Component: cli-Topic_Tool Assignee: Stephen Gordon <sgordon>
Status: CLOSED WONTFIX QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 0.0.x CC: topic-tool-list
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-10-03 18:58:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Stephen Gordon 2011-09-30 00:17:37 UTC
Description of problem:

The tool will happily create a topic with, for example, a ? in the title. When the provided link text is used, however:

- The topic is not found, because the ? appears unencoded in the xi:include; it should be encoded as %3F.
- Once the above has been corrected, processing still fails because the ? is also in the ID attribute of the root node. This character is not valid in an XML ID.
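
The first point can be illustrated with a minimal sketch of percent-encoding a filename before it is written into an xi:include. The helper name `encode_href` and the sample filename are hypothetical; this is not the tool's actual code.

```python
from urllib.parse import quote


def encode_href(filename: str) -> str:
    """Percent-encode a topic filename for use in an xi:include (hypothetical helper)."""
    # With safe="", quote() leaves only letters, digits and "_.-~" untouched
    # and percent-encodes everything else, including "?".
    return quote(filename, safe="")


print(encode_href("What_is_a_topic?.xml"))  # What_is_a_topic%3F.xml
```

Encoding only addresses the include lookup; the ID attribute problem in the second point remains, since ? is not a legal character in an XML ID however the reference to the file is written.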

Version-Release number of selected component (if applicable):

0.0.8

Comment 1 Stephen Gordon 2011-12-05 22:01:22 UTC
XML IDs must start with a NameStartChar [1], followed by any number of NameChar [2] characters. I do not think it is desirable to restrict what users place in the title elements of a topic (which can be basically anything, and is not nearly as restrictive as the ID).

What needs to be done is to update the routine that generates the ID so that, in addition to the current basic sanitization checks, it also replaces any characters that are not valid in the ID attribute. The sanitized version should be used for both the ID and the filename.

This would fix the two problems:

1) The IDs generated when special characters are used in the topic name are currently not valid under the XML spec.

2) Even those IDs that are valid are escaped when saved to the filesystem. Rather than trying to code around this in a platform-dependent way, adopting the ID requirements of the XML spec gives a very restrictive naming scheme that should be valid on most, if not all, platforms.
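
A rough sketch of the kind of sanitizing routine proposed above, assuming an ASCII-only subset of the NameStartChar and NameChar productions. The function name `sanitize_id`, the choice of `_` as the replacement character, and the exclusion of the colon are all assumptions made for illustration; the tool's real routine may differ.

```python
import re

# ASCII-only subset of the XML NameStartChar / NameChar productions.
# The spec also allows ":" and many non-ASCII ranges; they are left out here
# (an assumption) to keep the result safe as a filename on common platforms.
_INVALID_NAME_CHAR = re.compile(r"[^A-Za-z0-9_.-]")
_VALID_START = re.compile(r"[A-Za-z_]")


def sanitize_id(title: str) -> str:
    """Return a string usable as both an XML ID and a filename stem."""
    cleaned = _INVALID_NAME_CHAR.sub("_", title.strip())
    # "-", "." and digits are NameChar but not NameStartChar, so prefix if needed.
    if not cleaned or not _VALID_START.match(cleaned):
        cleaned = "_" + cleaned
    return cleaned


print(sanitize_id("What is a topic?"))  # What_is_a_topic_
print(sanitize_id("3 ways to build"))   # _3_ways_to_build
```

Because several characters map to the same replacement, two different titles can produce the same ID, so a real implementation would probably also need a uniqueness check.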

[1] http://www.w3.org/TR/REC-xml/#NT-NameStartChar

[2] http://www.w3.org/TR/REC-xml/#NT-NameChar