Created attachment 976464: characters

Description of problem:
When special characters are entered into specific fields, they are not preserved. Some inputs corrupt the process completely, i.e. it can no longer be opened.

Version-Release number of selected component (if applicable):
JBDS Version: 7.1.1.GA + BPMN2 Modeler - Diagram Editor 1.0.2.201402102317 org.eclipse.bpmn2.modeler.feature.feature.group Eclipse.org

How reproducible:
always

Steps to Reproduce:
1) Enter the characters (see attached file: Characters.txt) into "Description -> Attributes -> Name" or "Process -> Attributes -> Id".
2) Save the process, close it, and open it again.

The following cases cause the issue:
Characters1) Characters1 is changed to Characters2.
Characters3) Nothing appears in the Name field. Characters3 is changed to Characters4 in the Id field.
Characters5) The BPMN file cannot be opened (see attached file: Fig1.png).

Actual results:
The process is modified or corrupted.

Expected results:
Special characters do not cause any "harm" to the process definition.

Additional info:
Looks like this is going to require special validation inside the text field editors, but I'll need some additional info.

The jBPM engine references some types of objects by their IDs. These are:

Property
DataObject
Message
Signal
Error
Escalation
GlobalType
DataInput

By definition, XML IDs must conform to the NCName production rule. But I suspect that these IDs, which are essentially treated as "variable names" in the engine, must conform to the more restrictive Java variable name syntax. In other words, NCNames allow the special characters "-" and "." (and some others), whereas Java variable names do not. And I'm assuming that the Process ID is ALSO NOT an NCName but rather conforms to the Java variable name syntax. Is this correct?
As it turns out, IDs and process variable names (and errors, escalations, messages, etc.) must conform to Java variable naming conventions. This is now enforced in the editor. When attempting to insert an invalid character into the ID or name field, an error message is displayed in the Status Bar (at the bottom of the Eclipse Workbench window) and the character is ignored.

Fixed in community build for Luna 1.1.1.201501262023.
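For illustration, here is a minimal sketch of a check against Java variable naming rules using the standard `java.lang.Character` methods. This is not the BPMN2 Modeler's actual validator; the class and method names are hypothetical, and the sketch does not reject reserved keywords such as `class`. It walks the string by code point, so supplementary characters outside the Basic Multilingual Plane are tested as whole characters rather than as two surrogate halves.

```java
final class IdentifierCheck {

    /**
     * Returns true if s is non-empty, starts with a legal Java identifier
     * start character, and contains only legal Java identifier part
     * characters. Reserved keywords are NOT rejected by this sketch.
     */
    static boolean isJavaIdentifier(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Advance by code point so surrogate pairs are checked as one character.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isJavaIdentifier("orderTotal"));   // true
        System.out.println(isJavaIdentifier("order-total"));  // false: '-' is allowed in an NCName only
        System.out.println(isJavaIdentifier("order.total"));  // false: '.' is allowed in an NCName only
        System.out.println(isJavaIdentifier("order@total"));  // false
    }
}
```

A field editor could run a check like this on each keystroke or on the complete value before accepting it.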
Still reproducible on:
jbds-8.0.2.GA_jbdsis-8.0.0.CR2 - BPMN2 Modeler 1.1.1.201501081320

Verified on:
Luna: BPMN2 Modeler 1.1.2.201502101729
Behavior of jbds-8.0.2.GA_jbdsis-8.0.0.CR2 - BPMN2 Modeler 1.1.1.201501081320:

1.) Put, for example, '@' into the process ID.
2.) The character '@' is not omitted.
3.) Save and validate the file.
4.) An error about an invalid process ID appears.

@Anton, are you satisfied with this behaviour for jbdsis-8.0.0.CR2? Can I mark this as verified?
Anton, you are right. The characters '[U+2A082]' and '[U+20B9F]' are valid in Java names. The issue you have described is still reproducible.

*** Reproducible on:
jbds-8.0.2.GA_jbdsis-8.0.0.CR3 - BPMN2 Modeler 1.1.1.201501081320

*** Way to reproduce the bug:
1.) Put Characters1 from the attachment as the process name or ID of any process.
2.) Reopen the process.
3.) Characters1 has changed to Characters2.

The same issue occurs for the pair Characters3, Characters4.
The issue with Characters5 is related to [1], in my opinion.

*** On BPMN2 Modeler 1.1.2, the change of Characters1 to Characters2 and of Characters3 to Characters4 is reproducible too. Modeler 1.1.2 cuts Characters5 to the given maximum length (see [1]), but the process cannot be reopened.

I am marking this issue as assigned again.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1179057
I just verified that this is a bug in the Sun/Oracle implementation of the Apache xerces parser found in the Java rt.jar library. These libraries are actually really old, forked versions of Apache xerces. By using the java.endorsed.dirs command-line switch (add it to your eclipse.ini file) you can specify an override for these libraries. For Windows OS it would be something like:

-Djava.endorsed.dirs="C:\Program Files (x86)\Java\xerces-2_11_0"
-Djava.endorsed.dirs="C:\Program Files (x86)\Java\xalan-j_2_7_2"

The problem with pasting a line that is too long (the Characters5 example in the attachment) and gets truncated by the text input field's length limit is that these are multi-byte Unicode characters, but the SWT Text input field widget truncates at 1-byte boundaries, not 1-character boundaries. I'm not sure I can do anything about that, as it is an SWT bug.
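To illustrate the truncation problem, here is a minimal sketch of cutting a string at a length limit without splitting a character in half. It is not SWT code and not a fix for the widget. It assumes the limit is counted in Java `char` units (UTF-16 code units), where a character such as U+20B9F occupies two units (a surrogate pair). The class and method names are hypothetical.

```java
final class SafeTruncate {

    /**
     * Truncates s to at most maxChars UTF-16 code units. If the cut would
     * fall between the two halves of a surrogate pair, the whole pair is
     * dropped so the result never ends in a dangling high surrogate.
     */
    static String truncate(String s, int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must not be negative");
        }
        if (s.length() <= maxChars) {
            return s;
        }
        int end = maxChars;
        // s.length() > maxChars here, so s.charAt(end) is in range.
        if (end > 0
                && Character.isHighSurrogate(s.charAt(end - 1))
                && Character.isLowSurrogate(s.charAt(end))) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String supplementary = new String(Character.toChars(0x20B9F));
        String input = "a" + supplementary + supplementary; // 5 code units
        // A limit of 2 would cut the first pair in half, so only "a" is kept.
        System.out.println(truncate(input, 2).length()); // 1
        System.out.println(truncate(input, 3).length()); // 3
    }
}
```

A naive `substring(0, maxChars)` on the same input would leave an unpaired surrogate at the end of the value, which an XML serializer or parser may then reject.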
Bob, thanks for investigating the problem. Adding xerces and xalan to the ini file solved the problem with the changed characters. But I am now getting this error, for example when I want to close a project:

An internal error occurred during: "Periodic workspace save.".
Provider org.apache.xalan.processor.TransformerFactoryImpl not found

Could you please tell me which version of Java you are using? I am using jdk1.8.0_31. And could you please give me the precise links you used to download xalan and xerces? Thanks.
I used xerces 2.11.0 and xalan 2.7.2 downloaded from here: http://xerces.apache.org/mirrors.cgi However, I did run into another issue using these libraries; either these are incompatible with EMF, or I need to set some parser flags to get this version of xerces to behave the same as the Sun/Oracle parser. This needs some more investigation.
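When investigating which parser is actually in effect, it may help to print the JAXP implementation classes that the running JVM resolves. The following is a minimal diagnostic sketch using only the standard `javax.xml` factory APIs; the class and method names are hypothetical, and the sketch only reports what gets loaded, it does not change any parser flags.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

final class JaxpProviders {

    /** Class name of the DOM parser factory the JVM resolves. */
    static String documentBuilderFactoryClass() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    /** Class name of the SAX parser factory the JVM resolves. */
    static String saxParserFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    /** Class name of the XSLT transformer factory the JVM resolves. */
    static String transformerFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("java.version:           " + System.getProperty("java.version"));
        System.out.println("DocumentBuilderFactory: " + documentBuilderFactoryClass());
        System.out.println("SAXParserFactory:       " + saxParserFactoryClass());
        System.out.println("TransformerFactory:     " + transformerFactoryClass());
    }
}
```

Class names beginning with `com.sun.org.apache` indicate the implementations bundled with the JDK, while names beginning with `org.apache` indicate that overriding libraries were picked up.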