Bug 839129 - Duplicate target IDs causes incorrect indentation error
Summary: Duplicate target IDs causes incorrect indentation error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: PressGang CCMS
Classification: Community
Component: CSProcessor
Version: 1.x
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: Lee Newson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-07-11 04:56 UTC by Joshua Wulf
Modified: 2014-10-19 23:01 UTC (History)
3 users (show)

Fixed In Version: 0.25.2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-06-07 01:29:12 UTC
Embargoed:


Attachments (Terms of Use)
Test case that demonstrates the behaviour (10.88 KB, application/octet-stream)
2012-07-11 05:00 UTC, Joshua Wulf
no flags Details

Description Joshua Wulf 2012-07-11 04:56:20 UTC
CSProcessor client version: 0.25.1
Loading configuration from /home/jwulf/.config/csprocessor.ini
Loading project configuration from csprocessor.cfg
Connecting to Skynet server: http://skynet.usersys.redhat.com:8080/TopicIndex/

Starting to parse...
Attempting to download all the latest topics...
Starting to validate...
ERROR: Target ID is duplicated. Target ID's must be unique.
       -> Line 108:  Chapter: Queues [T1]
       -> Line 316:  Appendix: Exchange and Queue Declaration Arguments [T1]
ERROR: Line 317: Invalid Content Specification! Indentation is invalid.
       -> Exchange and Queue Argument Reference [8380]
ERROR: The Content Specification is not valid.
ERROR: The Content Specification is not valid.



The duplicate IDs error is correct. The indentation error is incorrect, and disappears when the duplicate IDs are fixed. No big deal.

==============================
Generated Details:

Package: cspclient-0.25.1-1.noarch

OS: Fedora release 16 (Verne)

JAVA: java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) Client VM (build 21.0-b17, mixed mode, sharing)


CSProcessor client version: 0.25.1
Loading configuration from /home/jwulf/.config/csprocessor.ini
Loading project configuration from csprocessor.cfg
Connecting to Skynet server: http://skynet.usersys.redhat.com:8080/TopicIndex/

Content Specification ID: 8025
Content Specification Revision: 158040
Content Specification Title: Messaging Programming Reference

Starting to calculate the statistics...
Total Number of Topics: 229
Number of Topics with XML: 48
Percentage Complete: 20.96%

Comment 1 Joshua Wulf 2012-07-11 04:59:47 UTC
Generating a test case spec to attach.

Also: it throws this error after downloading all the topics. Could this validation be moved to before downloading the topics?

Comment 2 Joshua Wulf 2012-07-11 05:00:39 UTC
Created attachment 597454 [details]
Test case that demonstrates the behaviour

Comment 3 Lee Newson 2012-07-11 05:16:12 UTC
I suspect that the error is actually coming from the line above as that's the case in most Indentation errors, I'll have a proper look later.

As for the error message it is picked up before downloading all the topics but validating, validates the entire spec first and then reports errors. We could add a "--syntax" option that would only valid the syntax of the content spec rather then the syntax and content.

Comment 4 Joshua Wulf 2012-07-11 06:10:47 UTC
If Line X throws an error "Duplicate ID", then line X+1 throws error "incorrect indentation", even when indentation is correct.

Making the --syntax option the default would be good. Otherwise atm it goes:

"Please wait, downloading heaps of data from a distant server over your dial-up connection.... Please wait..... Please wait.... please wait.... OK, I have them now.

Oh, LOL - I didn't need them anyway: you made a typing error. Please correct it and try that again. Hahaha."

Essentially what I do is create new topics through the content spec, rather than the web interface, because it's less steps than creating, saving, getting the title and ID and pasting into the spec. With the CSP I write the title and topic type, then push, then rebuild in the web interface, and use my web editor.

I would rather have it quit as soon as possible with a useful error message if it's not going to succeed. My workflow involves rebuilding or pushing every ten minutes or so to be able to read the book I'm working on, or to create a new topic, so it adds up.

Comment 5 Lee Newson 2012-07-11 06:37:19 UTC
Lol we originally had it that way (basically) and you were the one who asked for it to be changed.

Anyways I have no issue with changing it so that if a validation error is found during parsing then it'll stop before downloading the topics. However I see no point in having it only check the syntax and then having to always specify a option to validate further. Instead it should do this sequence:

Parse
  Syntax Valid?
    No: 
      Stop and print errors
    Yes:
      Download Topics
      Content Valid?
        Yes: 
          Print valid message
        No:
          Stop and print errors

Comment 6 Joshua Wulf 2012-07-12 00:00:33 UTC
That pattern sounds perfect.

If I remember correctly (and I can't find the bug that documents the decision); what we did previously is fix the situation where it tells you:

"I found one error, please fix and try again"
...
"I found one error, please fix and try again"
...
"I found one error, please fix and try again"

To say: "I found these three errors, please fix and try again".

That fixes the problem that you can't find out all the errors and fix them all at once.

The situation now is subtly different. We're drawing a line between errors that require a potentially expensive network operation and errors that can be detected without accessing the network.

Maybe there is some further performance optimization that could be achieved with caching?

Comment 7 Lee Newson 2012-07-12 00:23:49 UTC
Yeah that was before we started using bugzilla but I think you might be right.

As for the "performance optimization achieved with caching", what exactly do you mean by that? The only thing that could be done atm is to store the data across calls. To do that however the JVM would have to continue running and not stop after each call. That however brings up many more issues about how to tell when the data has changed on the server side. As such at this point in time that option isn't feasible.

Comment 8 Joshua Wulf 2012-07-12 00:42:34 UTC
File reads are less expensive than network reads. So you could write the topic titles and ID to disk, for example. They could go into a .cache file in the current directory, with a time stamp.

The question about what information might change on the server and what makes sense to read from the cache, and how long the cache should be valid is the crux of the matter. 

In this case it may or may not make sense, but it's something to keep in mind. Caching information that can be cached (with a concept of cache lifetime validity) will make the app faster and more responsive.

The key question to determine if caching could help is: "If this data were available on the local file system (with an age of X seconds), could i use it?"

In situations where the answer to that question is "yes", then caching will speed up the operation.

Remember that when a user runs an operation once an hour, they don't notice the network overhead so much. It's when they run the same operation three times one immediately after the other that they go: "omg the fifteen seconds of network access for that piece is killing me!" - because with three operations one after the other it's 45 seconds right when they are paying attention and need the operation to complete.

Generally when doing this the user is trying to write a book, and the time taken to rebuild / push / whatever, is an interruption to the writing flow.

So you might find some area where you can use local caching to speed things up.

The cases where the cache is not going to be useful is when the user changes something on the server. 

Here's a suggestion:

SCENARIO A:
1. The user runs the operation.
2. csprocessor downloads the topic information
3. The operation fails because the topic titles don't match
4. The user re-runs the operation with -p within 3 seconds
5. Use the cache

SCENARIO B:
1. The user runs the operation
2. csprocessor downloads the topic information
3. The operation fails because the topic titles don't match
4. The user edits the spec and re-runs within 10 seconds (cache the spec timestamp)
5. Compare the spec with the cached topic titles

In Scenario B, in ten seconds I might have edited the topic title in Skynet, so it may not make sense, or there may be some way to detect this.

I'm sure there are some combinations of operations and lifetimes that caching on the local filesystem could speed up.

OK, this one would be further in the future because it relies on server-side functionality, but keep the idea in mind as a design pattern:

When the local agent requests information from the server it does it as a single transaction, rather than as individual requests. The local agent caches the result, and the server caches the request with an ID.

When the local agent runs again, it asks the server if anything has changed since the last request.
 If the server says no, then it uses the local cache. 
 If the server says "yes, these have", then it gets only those. 
 If some local elements have changed, then the agent requests only the parts that it needs, and uses the local cache for the parts that are invariant.

Comment 9 Lee Newson 2012-07-12 00:58:13 UTC
Sorry but I've looked at this before and there is no sensible time frame except for a few seconds to cache requests for. Ie you build something then notice a spelling error immediately, then you've got to wait for the cache to expire before you can see the changes. This just isn't a viable option. 

What is needed is a caching system on the server. This is normally done using ETag's (see http://en.wikipedia.org/wiki/HTTP_ETag). Using that we could then check if the data had changed and you could cache the data successfully. However storing to disk is a bad idea, as then you have no way to ensure that the data hasn't been changed by the user. There are other minor issues I can see with storing on the local machine but the previous point is enough to discount that possibility.

Comment 10 Lee Newson 2012-07-12 01:31:33 UTC
Fixed the initial component of this bug in 0.25.2 and also made the validator behave in the manor described above.

Cause:

When an error occurred the current indentation counter wasn't being incremented.

Consequence:

The Parser would throw an indentation error as it thought the that the line should of had one less set of indentation.

Fix:

Increment the counter when errors are thrown.

Comment 11 Lee Newson 2012-07-12 01:37:10 UTC
As for the caching that needs to be made into an RFE. Really it needs two RFE's one for this queue and one for the Skynet queue.

As such I've done just that:

Skynet Queue: Bug #839438
CSP Queue: Bug #839441

Comment 12 Lee Newson 2012-07-12 04:14:23 UTC
Note: I added the needs_documentation flag incase the new validation sequence from comment 5 needs documenting.

Comment 13 Lee Newson 2013-06-07 01:29:12 UTC
Closing and setting as current release as no QA was performed by the original reporter. If there is still an issue with this bug still than please re-open it.


Note You need to log in before you can comment on or make changes to this bug.