Bug 779180 (SOA-1575)

Summary: Beta 1 QE Review: Smooks Guide
Product: [JBoss] JBoss Enterprise SOA Platform 5 Reporter: David Le Sage <dlesage>
Component: Documentation Assignee: David Le Sage <dlesage>
Status: CLOSED NEXTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 5.0.0 GA   
Target Milestone: ---   
Target Release: 5.0.0 GA   
Hardware: Unspecified   
OS: Unspecified   
URL: http://jira.jboss.org/jira/browse/SOA-1575
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-02-24 22:10:18 UTC Type: Task
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Attachments:
Description Flags
Smooks_User_Guide.pdf none
smooks_1_Screenshot.png none
smooks_2_Screenshot.png none
smooks_3_Screenshot.png none
smooks_4_Screenshot.png none
smooks_5_Screenshot.png none

Description David Le Sage 2009-10-30 00:32:16 UTC
Affects: Documentation (Ref Guide, User Guide, etc.)
Date of First Response: 2009-11-06 11:40:16
project_key: SOA

Len, please find attached the first draft of the Smooks Guide. At this stage, we need to know if any information is factually erroneous.  We also need to know which sections should be removed (such as the link to the FAQ and the Maven and Ant information.)  We will work on formalising the tone of the writing and correcting grammatical errors, so do not worry about these.  We just need to know the content to include.

Comment 1 David Le Sage 2009-10-30 00:33:15 UTC
Attachment: Added: Smooks_User_Guide.pdf


Comment 2 Len DiMaggio 2009-11-06 16:40:16 UTC
Smooks defines multiple terms that are unique to the project.

Can we add a Glossary?

1.  Smooks Core: Provides the basic Smooks infrastructure/framework. Smooks Core is responsible for processing a set of Smooks Resource configurations. From these configurations, Smooks Core constructs and manages "Execution Context", "Content Delivery Configuration" and "Filter" (DOM/SAX) components. Smooks Core is the subject of this section.

2. Smooks Cartridges: Provide the real functionality on top of Smooks Core e.g. Javabean/POJO population, Templating support (XSLT, FreeMarker, StringTemplate), Scripting Support (Groovy) etc. These are the resources that will be applied by Smooks Core during filtering (see below). 

3. Visitor: Visitor logic is central to how Smooks works. A Visitor is a simple piece of Java logic that can perform a specific task on the message fragment at which it is targeted, such as applying an XSLT stylesheet. You have the choice of supporting your logic through either the SAX or DOM Filters by implementing one or both of the following interfaces: org.milyn.delivery.sax.SAXElementVisitor or org.milyn.delivery.dom.DOMElementVisitor.

4. Java Binding: This is the ability to populate a Java Object Model from the source message.

5. Message Splitting and Routing: This is the ability to perform complex splitting and routing operations on the source message. It includes the ability to route to multiple destinations concurrently. Different data formats can also be routed concurrently. These formats include XML, EDI, CSV and Java.

6. Huge Message Processing: This is the ability to declaratively "consume" (that is, transform or split-and-route) very large messages without the need to write substantial amounts of high-maintenance code.

7. Cartridge - A cartridge is a Java Archive (JAR) that contains reusable Content Handlers. Smooks includes pre-built and ready to use visitor logic so that users can implement solutions with minimal java code. This visitor logic is bundled into categories. These bundles are called cartridges.

8. Content Handler - Can't find a definition in the doc for this.

Content Handlers are the cornerstone of the Smooks component model. The following Content Handler types are currently in existence:

   1. Stream Readers: Smooks supports filtering of both XML and non-XML data because it allows you to configure a "Stream Reader" for each filter process. If no reader is configured, it defaults to XML. So, the Stream Reader resource is responsible for generating a stream of SAX events from a hierarchical data stream (e.g. XML, CSV, EDI etc.). This stream of SAX events can then be processed by XML Element Visitors (via Smooks). 

   2. Element Visitors: After Smooks hooks a Stream Parser to the data Stream, it starts receiving a stream of SAX events i.e. "startElement", "endElement" etc. These events are then used by Smooks to select an "ElementVisitor" implementation, which processes the event in some way. This is the primary extension point in Smooks, as well as being the mechanism through which Smooks supports a fragment based processing model. See the list of Smooks Cartridges for examples of ElementVisitor implementations already available with Smooks. 

   3. Element Serializers: DOM based processing supports implementation of DOM Element "Serialization Units", allowing you to implement custom serialization at a fragment level.
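
To make the Stream Reader / Element Visitor relationship described above concrete, here is a minimal sketch that uses only the JDK SAX parser. The FragmentVisitor interface and the Dispatcher class are hypothetical stand-ins invented for illustration; they are not the Smooks SAXElementVisitor API, whose real method signatures should be taken from the Smooks javadocs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVisitorSketch {

    // Hypothetical visitor contract (assumption): NOT the Smooks interface.
    interface FragmentVisitor {
        void visitBefore(String elementName);
        void visitAfter(String elementName);
    }

    // Receives JDK SAX events and forwards each one to the visitor
    // that was targeted at the element name, if any.
    static class Dispatcher extends DefaultHandler {
        private final Map<String, FragmentVisitor> targets = new HashMap<>();

        void target(String elementName, FragmentVisitor visitor) {
            targets.put(elementName, visitor);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            FragmentVisitor visitor = targets.get(qName);
            if (visitor != null) {
                visitor.visitBefore(qName);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            FragmentVisitor visitor = targets.get(qName);
            if (visitor != null) {
                visitor.visitAfter(qName);
            }
        }
    }

    // Parses the XML and returns the visitor calls in the order they happened.
    static List<String> run(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.target("order-item", new FragmentVisitor() {
            public void visitBefore(String name) { log.add("before " + name); }
            public void visitAfter(String name) { log.add("after " + name); }
        });
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), dispatcher);
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<order><order-item/><order-item/></order>"));
    }
}
```

Only the fragments that a visitor is targeted at trigger any logic; the enclosing "order" element passes through untouched, which is the fragment-based model the definition refers to.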


Comment 3 Len DiMaggio 2009-11-06 16:46:31 UTC
Multiple cases of images that are too unclear to be read.

Comment 4 Len DiMaggio 2009-11-06 16:46:31 UTC
Attachment: Added: smooks_1_Screenshot.png
Attachment: Added: smooks_2_Screenshot.png
Attachment: Added: smooks_3_Screenshot.png


Comment 5 Len DiMaggio 2009-11-06 16:51:20 UTC
Multiple cases of images that are too unclear to be read.

Comment 6 Len DiMaggio 2009-11-06 16:51:20 UTC
Attachment: Added: smooks_4_Screenshot.png
Attachment: Added: smooks_5_Screenshot.png


Comment 7 Len DiMaggio 2009-11-06 20:29:16 UTC
General comments

* Some of the illustrations are impossible to read. See attached files.

-------------------------------------------------------

Section 1 - Overview

Text:

Java Binding
   This feature is used to populate a Java Object Model from a data source (such as a CSV, EDI,
   XML or Java file.) The populated object models can then be used either as transformation results
   in themselves or, alternatively, as a templating resources from which XML or other character-
   based results can be generated. This feature also supports Virtual Object Models (maps and lists
   of typed data), which can be used by both the ETL and templating functionality.

Comment:

   This is the first place in the book where the term "ETL" is used. We should define the term: Extract Transform Load (ETL) here. It's currently defined 2 pages later.

-------------------------------------------------------

Section 1 - Overview

Comment: Invalid links in PDF file:

http://www.smooks.org/mediawiki/index.php%3Ftitle=FAQ
should be: http://www.smooks.org/mediawiki/index.php?title=FAQ

http://www.smooks.org/mediawiki/index.php%3Ftitle=Maven_%2526_Ant
should be: http://www.smooks.org/mediawiki/index.php?title=Maven_%26_Ant

http://www.smooks.org/mediawiki/index.php%3Ftitle=Smooks_v1.2_Examples
should be: http://www.smooks.org/mediawiki/index.php?title=Smooks_v1.2_Examples
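
The mangled links look like ordinary URLs that have been percent-encoded ("?" turned into "%3F", and "%26" turned into "%2526"). A small sketch, assuming that is what happened, using the JDK's java.net.URLDecoder; the class name LinkDecoder is made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class LinkDecoder {

    // Undoes one level of percent-encoding.
    static String decodeOnce(String link) throws UnsupportedEncodingException {
        return URLDecoder.decode(link, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String mangled = "http://www.smooks.org/mediawiki/index.php%3Ftitle=Maven_%2526_Ant";
        String once = decodeOnce(mangled);
        // One pass restores "?" and turns "%2526" back into "%26".
        System.out.println(once);
        // A second pass would turn "%26" into a literal "&".
        System.out.println(decodeOnce(once));
    }
}
```

Decoding once yields the "should be" form listed above for the Maven and Ant link, which suggests the link target was encoded one time too many somewhere in the toolchain or the viewer.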

<<<Open issue: We really do not want to direct Platform customers to project URLs for examples, documentation, etc. In the case of Smooks, avoiding this will be difficult as the Smooks project documentation is in the form of a tree of wiki pages. Can the Smooks project provide the same information in a standalone PDF file? In the case of the example code, if we include Smooks in the Platform, we should also include the examples' code. >>>

-------------------------------------------------------

Section 2.1. Basic Processing Model

Comment: Another invalid link

http://www.smooks.org/mediawiki/index.php%3Ftitle=V1.2%3ASmooks_v1.2_Developer_Guide
should be: http://www.smooks.org/mediawiki/index.php?title=V1.2:Smooks_v1.2_Developer_Guide

<<<Another open issue: It's one thing to send Platform customers to a project URL for wiki pages. But if we are shipping a user guide that references a developer guide, then we should also ship the developer guide doc.>>>

-------------------------------------------------------

2.2. Simple Example

Comment: I know that the document is adapted from the Smooks project's online docs, but I think that it is a mistake to retain this example. The problems with the example are that the code fragment is incomplete and that the example does not generate any output. What do you think about replacing it with the example from http://www.smooks.org/mediawiki/index.php?title=V1.2:xml-to-xml ?

And - the name of the interface is: ExecutionContext not EvolutionContext   ;-)

Text:

In this example, no result is produced. There is also no interaction with the execution of the filtering
process. This is because the example didn't create an EvolutionContext and supply it to the
Smooks.filterSource method call. See http://www.milyn.org/javadoc/v1.2/smooks/org/milyn/
container/ExecutionContext.html

Comment: This is the first time that the Smooks "ExecutionContext" interface is mentioned in the document. We really should define the need for this interface, and any others that are needed in the example before we show the user the example.

-------------------------------------------------------

2.3. Smooks Cartridges

Text:

Smooks includes pre-built and ready to use visitor logic so that users can implement solutions with
minimal java code. This visitor logic is bundled into categories. These bundles are called cartridges.

Comment:

We should not use the term "bundle" as it may cause user confusion with Java bundles: http://java.sun.com/j2se/1.4.2/docs/api/java/util/ResourceBundle.html

How about: "This visitor logic is combined into groups called 'cartridges'."

Comment:

Incomplete link:  http://www.smooks.org/cartri

-------------------------------------------------------

Section: 2.4. Filtering Process Selection (Should DOM or SAX be Used?)

Comment: We should remove the "(Should DOM or SAX be Used?)" question from the section title and add it in an opening sentence for the section.

Text: This does not include non-element Visitor resources, like, for example, readers.

Comment: What is a "non-element Visitor resource?" Also, in this context, what is a "reader?"

Comment 8 Dana Mison 2009-11-09 04:59:09 UTC
* The URLs are correct, I'm guessing you are viewing the PDF using Evince on RHEL5, see https://bugzilla.redhat.com/show_bug.cgi?id=475159

* References to the Smooks developer guide should be removed; it is concerned with developing Smooks add-ons, which is out of scope for this project.

* A glossary would be nice, but not workable currently due to issues in translation.

* Images may get some work towards the end.  Issue is due to scaling in the PDF, they are more reasonable in the html/html-single versions.

* However, like the drools examples, the Smooks examples are not complete in the documentation, the documentation only includes snippets of code to highlight key points from the example.  You need to download the example maven projects to actually be able to use the examples.  3 options:
  1 - include Smooks examples with SOA Platform
  2 - leave as is
  3 - leave examples in, but remove URL references.



Comment 9 Dana Mison 2009-11-09 05:10:40 UTC
The cartridge list URL on the Smooks wiki has been fixed, but it now points to their Maven/Ant page, which does contain a cartridge list.

List as follows:

Calc: "milyn-smooks-calc"
CSV: "milyn-smooks-csv"
EDI: "milyn-smooks-edi"
Javabean: "milyn-smooks-javabean"
JSON: "milyn-smooks-json"
Routing: "milyn-smooks-routing"
Templating: "milyn-smooks-templating"
CSS: "milyn-smooks-css"
Servlet: "milyn-smooks-servlet"
Persistence: "milyn-smooks-persistence" (version 1.2 or higher)
Validation: "milyn-smooks-validation" (version 1.2 or higher)


Comment 10 Len DiMaggio 2009-11-10 02:24:33 UTC
Section 3.1. An Introduction to Smooks Resources

This section is unclear on just what a Smooks Resource is. We should add this definition:

Smooks resource - A "Smooks Resource" is anything that can be used by Smooks in the process of analyzing or transforming a data stream. They could be pieces of Java logic (DOMElementVisitor), some text or script resource, or perhaps simply a configuration parameter.

From: http://docs.codehaus.org/display/MILYN/Smooks+Resource

-------------------------------------------------------

Section 3.1. An Introduction to Smooks Resources

Text: When one compares the above examples to their equivalents in pre-Smooks v1.1 releases, it can be seen that...

Comment: We should add at least one pre-Smooks v1.1 example here to show how much easier it now is to configure a resource type.

Comment 12 Len DiMaggio 2009-11-10 18:59:55 UTC
Section: 7.1. Processing CSV

Comment - The data samples include the actual name of a Smooks developer. This is acceptable for the open source project, but for the Platforms, we don't want to imply that customers should contact developers directly for support. 

-------------------------------------------------------

Section: 7.1.2. Binding CSV Records to Java

References to the on-line user guide (e.g., http://www.smooks.org/mediawiki/index.php?title=V1.2:Smooks_v1.2_User_Guide#Virtual_Object_Models_.28Maps_.26_Lists.29) should be removed from this document. The information is in this section of the document:
4.4.2. Virtual Object Models (Maps and Lists)
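
For readers unfamiliar with the term, a "Virtual Object Model" is simply record data held in maps and lists rather than in purpose-built classes. The following is a rough plain-Java sketch of that idea for CSV input. It does not use Smooks at all; the class name, method name and field names are assumptions chosen for illustration, and the naive comma split ignores quoted fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvVirtualModelSketch {

    // Binds each CSV line to a Map keyed by the supplied field names.
    // Naive on purpose: no quoting or escaping support.
    static List<Map<String, String>> bind(String csv, String... fieldNames) {
        List<Map<String, String>> records = new ArrayList<>();
        for (String line : csv.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.split(",", -1);
            Map<String, String> record = new LinkedHashMap<>();
            for (int i = 0; i < fieldNames.length && i < values.length; i++) {
                record.put(fieldNames[i], values[i].trim());
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample data.
        String csv = "Jane,Doe,Female\nJohn,Smith,Male";
        for (Map<String, String> person : bind(csv, "firstname", "lastname", "gender")) {
            System.out.println(person);
        }
    }
}
```

The resulting list of maps can be handed to templating or further processing without defining a Person class, which is the trade-off the guide's section 4.4.2 describes.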

Comment 13 Len DiMaggio 2009-11-10 19:59:56 UTC
Section: Example 7.1. A Simple Configuration

Comment: There is a duplicate record in the output.

-------------------------------------------------------

Section: 7.1.4.1.3. Segment Matching

Comment: The reference to Java 1.5 (http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html) should be replaced with this reference to Java 1.6 (http://java.sun.com/javase/6/docs/api/java/util/regex/package-summary.html)
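
As a reminder of what the java.util.regex package linked above provides, here is a short sketch of matching a segment line against a pattern. The matching rule (the pattern must match the whole line) and the sample segment codes are assumptions for illustration only; the guide's Segment Matching section defines the actual behaviour.

```java
import java.util.regex.Pattern;

class SegmentMatchSketch {

    // Returns true when the whole segment line matches the pattern (assumed rule).
    static boolean matches(String segmentPattern, String segmentLine) {
        return Pattern.compile(segmentPattern).matcher(segmentLine).matches();
    }

    public static void main(String[] args) {
        // Hypothetical segment data.
        System.out.println(matches("ORD.*", "ORD*1*2*3"));
        System.out.println(matches("ORD.*", "HDR*1"));
    }
}
```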

-------------------------------------------------------

Section: 7.1.4.3. Type Support

Comment: Reference to this:
http://www.smooks.org/mediawiki/index.php?title=V1.2:Smooks_v1.2_User_Guide#EJC_-_Edifact_Java_Compiler  URL should be changed to reference section 7.1.4.5. EJC (Edifact Java Compiler) in this document.

-------------------------------------------------------

Section: 7.1.4.6. EJC Maven Plug-In

Comment: Reference to this: http://www.smooks.org/mediawiki/index.php?title=V1.2:Smooks_v1.2_User_Guide#EDI_Mapping_Models URL should be removed - the corresponding information should be added to this document.

-------------------------------------------------------

Section: 7.1.4.7.1. Using EJC

Text:

The easiest way in which to commence using EJC is to learn from the example at http://
www.smooks.org/mediawiki/index.php?title=V1.2:ejc.

Comment:

We should add this example to the guide. The example code is already in the examples .zip file.

-------------------------------------------------------

Section 9: Rules

Comment - We should add a note indicating that in this context, "rules" does not refer to JBoss Rules or Drools.

-------------------------------------------------------

Section: 9.2.1.1. Useful Regular Expressions

Text: A list of useful regular expressions is maintained on the Smooks Wiki at http://www.smooks.org/
mediawiki/index.php?title=Useful_Regular_Expressions.

Comment: It's actually a short list. Can we add it to the document?

-------------------------------------------------------

Section: 10.4. Example

Text:
A detailed validation example can be see at http://www.smooks.org/mediawiki/index.php?
title=V1.2:validation-basic

Comment: We should include the example in the user guide.

-------------------------------------------------------

Chapter 11: Note on 1st page of the chapter:

Text:
   Be sure to read Chapter 4, Java Binding .

Comment: Why?

-------------------------------------------------------

Chapter 12 - the entire chapter consists of:

Message Splitting and Routing
Please refer to Section 11.2, "Splitting and Routing".

Comment: We should remove this chapter.

-------------------------------------------------------

Chapter 14 - the entire chapter consists of:

Message Enrichment
When using the Persistence features of Smooks, the queried data is bound to the bean context
(ExecutionContext). You can use the bound query data to enrich your messages e.g. where you are
splitting and routing.
Refer to Chapter 13, Persistence (Database Reading and Writing) for additional details.

Comment: Why not just add this to chapter 13?

-------------------------------------------------------

Section: 15.2.1. Global Filter Setting Parameters

Text:
The following global configuration options are available for configuring Smooks filtering.
See http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/transform/stream/StreamResult.html and http://
www.milyn.org/javadoc/v1.2/smooks/org/milyn/Smooks.html#filterSource(javax.xml.transform.Source,
%20javax.xml.transform.Result...).

Comment: Replace the Java 1.5 reference with:  http://java.sun.com/javase/6/docs/api/javax/xml/transform/stream/StreamResult.html
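
Since the quoted filterSource signature takes javax.xml.transform Source and Result arguments, it may help readers to see how a StreamResult captures output. The sketch below uses the JDK's identity Transformer in place of Smooks, purely to show the Source/Result pairing; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StreamResultSketch {

    // Copies the XML from a StreamSource into a StreamResult backed by a StringWriter.
    static String copy(String xml) throws Exception {
        StringWriter out = new StringWriter();
        // newTransformer() with no stylesheet gives an identity transform.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(copy("<order><item>1</item></order>"));
    }
}
```

The same pattern applies whenever a Result is passed to a filtering call: whatever is serialized ends up in the writer or stream that the StreamResult wraps.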

-------------------------------------------------------

Chapter 18:

ESB Integration
Smooks plugins are available for a number of ESBs:
JBoss ESB
   http://wiki.jboss.org/wiki/Wiki.jsp?page=MessageTransformation
Mule
   http://www.mulesource.org/display/SMOOKS/Home
Apache Synapse/WS02
   http://esbsite.org/resources.jsp?path=/mediators/upul/Smooks%20Transform%20Mediator

Comment: We should remove this section and replace it with a section that covers the JBossESB integration in the SOA Platform.  (I'll even help you write it!  ;-)


Comment 14 Len DiMaggio 2009-11-10 20:02:14 UTC
Assigning back to David - I completed a first pass on the doc - what I have not yet done is to check that all the example code is correct and working. I'd like to look at the other books before coming back to do this.

Some global comments on the Smooks User Guide:

* References to Smooks project assets (javadocs). For references to pages in Smooks javadocs, we should be able to reference files installed as part of the SOA-P 5.0 distribution, as the javadocs should be included in the distribution. We'll need to add a separate JIRA/task to add the javadocs to the distribution. They are not there in ER2.

* References to Smooks project assets (other information). I haven't kept a count, but there are many instances where the user guide directs the user to a Smooks wiki page for additional information. It's disruptive to have so many of these, but we may not have a choice, given the large number of wiki pages used by Smooks.

* Addition of a glossary - What is the technical challenge that blocks the addition of a glossary? Can we simply create a glossary out of the most important 10-20 terms?

* Best practices information for selecting transformation methods - We need to fulfill the SOA-P 5.0 PRD requirement to provide users with guidance in selecting transformation methods.

* Example code from projects that we incorporate into platforms - In BRMS 5.0.1, Drools 5.0 added an expanded set of examples. The examples were referenced in the Drools user guide, but the examples.zip file was not shipped as part of the BRMS distribution. There is a similar situation with Smooks 1.2 as the user guide references example programs that are not included in the SOA-P 5.0 distribution. Recommendation: Add the Smooks examples to the SOA-P 5.0 distribution. http://downloads.sourceforge.net/smooks/smooks-examples-1.2.2.zip   We will have to verify that the examples can be built and run.

* Removing references to project team members - Some of the code examples in the book include the actual name of a Smooks developer. This is acceptable for the open source project, but for the Platforms, we don't want to imply that customers should contact developers directly for support. 

* References to the "user guide" that are also present in the on-line documentation should be removed from the user guide document. (The document IS the user guide as far as customers should be concerned.)

* Graphic images are fuzzy/unclear.

Comment 15 David Le Sage 2009-11-19 06:19:37 UTC
Done:
1.  Content Handler definition added.

2. Definition of acronym "ETL" moved to first occurrence.

3. "Simple Example" replaced with that for XML-to-XML transformations.

4. Definition of ExecutionContext added.

5. Reworded section to avoid use of word "bundle."

6. Question "(Should DOM or SAX be Used?)"  removed from header.

7. Broken URL removed and cartridge list added to book.

8.  Smooks Resource definition added.

9. Added mention of new XSL Transformation function in SOA-P 5.

10. Removed name of Smooks developer from example code and replaced with made-up names.

11.  Clarified that "rules" does not refer to JBoss Rules.

12.  References to Online User Guide removed.  Added link to section in our own document.

13.  Updated reference to Java 1.6

14. Link to section on EJC added.

15.  EJC example added to document.

16.  Regular expression list added to document.

17. Validation example added.

18.  Java Binding chapter reference removed.  

19.  Splitting and Routing chapter removed.  

20. Message Enrichment chapter merged.  (May be broken out again in the future if upstream expands this section.)

21. Reference to Java 1.5 replaced with reference to Java 1.6.

Not done:
1.  Glossary cannot be added due to technical limitations/specific business process rule.

2.  Images have not been altered, as this is a PDF rendering issue only.

3.  Addition of a pre-Smooks 1.1 example.  (Please point me in the direction of an appropriate one.)

4. Len to rewrite ESB integration chapter, as mentioned above.



Comment 16 David Le Sage 2009-11-19 06:20:49 UTC
Len, please see comments above. The latest version of the Smooks document can be obtained from:


http://documentation-stage.bne.redhat.com/docs/en-US/index.html

Comment 17 Len DiMaggio 2009-11-30 18:26:15 UTC
Will get back to this doc for a 2nd review after a first round of the other docs

Comment 18 Dana Mison 2009-12-07 05:22:52 UTC
I reworded the section about the resource configuration features so that it no longer refers to pre-v1.1 examples, thus negating that issue.

Comment 20 Len DiMaggio 2009-12-07 15:30:27 UTC
Link: Added: This issue related SOA-1675


Comment 22 David Le Sage 2009-12-07 22:27:23 UTC
Done.

Comment 23 Len DiMaggio 2009-12-10 19:23:53 UTC
3. "Simple Example" replaced with that for XML-to-XML transformations. 
------> The only problem is that the example uses FreeMarker - which we don't support in SOA-P 5 - it's uncommitted in the ERD - fix this before GA.

That completes the QE pre-beta doc review - assigning back to David.

Comment 24 Dana Mison 2009-12-14 01:05:11 UTC
I think there are examples using FreeMarker in the ESB docs as well.

Comment 25 Tom Fennelly 2009-12-14 09:37:35 UTC
Hey Len... when you say "... the example uses FreeMarker - which we don't support in SOA-P 5...", what do we mean?  Only XSLT is supported?  If so, there are issues with this because XSLT is only supported via the Smooks DOM Filter.  If this is what we mean, then I don't think it's a good idea.

Comment 26 Kevin Conner 2009-12-14 09:53:23 UTC
We do not support FreeMarker via any ESB integration, which is what he is referring to.  If this is standalone Smooks documentation then the users can choose to use it if necessary, but not in a supported manner.

Comment 27 Kevin Conner 2009-12-14 09:54:05 UTC
Also, if there are any examples of using FreeMarker in our docs then we need to remove them until we actually support that integration.

Comment 28 Dana Mison 2009-12-17 08:18:05 UTC
5.0.0 Beta1 doc is available on http://redhat.com/docs/JBoss_SOA_Platform/

Comment 30 David Le Sage 2010-02-05 05:33:52 UTC
Added a note which clearly states that FreeMarker is not supported.

Updated examples reference to point to the Zip file address that Len supplied above.




NOTE:  We are not accepting any further QE feedback, unless they are mission-critical issues that would result in data loss.  We are closing this stage of the process for this release.  Thanks.

Comment 31 Dana Mison 2010-02-05 06:54:41 UTC
Reverted the last update so that the content points to the examples page as it did before.