Bug 1017256 - Incorrectly calculated transaction statistics when recovery proceeds
Status: CLOSED NOTABUG
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Transaction Manager
Version: 6.2.0
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: EAP 6.4.0
Assigned To: Michael
QA Contact: Ondrej Chaloupka
Docs Contact: Russell Dickenson
Depends On:
Blocks:
Reported: 2013-10-09 09:51 EDT by Ondrej Chaloupka
Modified: 2015-01-26 20:34 EST

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
This release of JBoss EAP 6 contains a bug that causes incorrect transaction statistics to be shown when recovery is used to process in-doubt prepared transactions. The total count of processed transactions is incorrectly increased both prior to a crash of the server and when recovery resolves the in-doubt state after the server is restarted. In these cases, a transaction can be counted twice. This issue is under investigation and is expected to be resolved in a future release of the product.
Last Closed: 2014-09-05 03:55:37 EDT
Type: Bug
smumford: needinfo-


Attachments
Screenshot from the transactions statistic (58.44 KB, image/jpeg)
2013-10-09 09:51 EDT, Ondrej Chaloupka


External Trackers
Tracker ID Priority Status Summary Last Updated
JBoss Issue Tracker HAL-437 Major Resolved Incorrectly calculated transaction statistics when recovery proceeds 2015-10-30 04:54 EDT

Description Ondrej Chaloupka 2013-10-09 09:51:43 EDT
Created attachment 809970
Screenshot from the transactions statistic

When a transaction is completed by recovery, the transaction statistics are calculated incorrectly. The percentage of committed transactions can exceed 100%.

I think this is because recovered transactions are not added to the total number of transactions, but the commit counter is still increased when they are committed.

See the attachment for what the statistics show in my case.
Comment 2 JBoss JIRA Server 2014-07-02 10:16:59 EDT
Heiko Braun <ike.braun@googlemail.com> updated the status of jira HAL-437 to Coding In Progress
Comment 4 Heiko Braun 2014-07-03 04:53:54 EDT
The problem is actually not related to the GUI. Here's an explanation of why we get to see bogus numbers:


hbraun: jhalliday: it seems QE tests recovery scenarios. Does that ring a bell?
[10:50am] jhalliday: hmm. if you're measuring the stats post crash only then you can't expect them to be equal. the sum of the stats from the container runs pre and post crash should be though. essentially the crash resets the counters since it's just in-memory.
[10:50am] hbraun: jhalliday: ah, makes sense
[10:51am] hbraun: jhalliday: you mean post crash only the committed value will be increased?
[10:52am] jhalliday: in the pre crash segment of the test the total count is increased, since that is done at tx begin. in the post crash segment the commit count is increased by recovery. overall they stay equal, but if you look at only one or the other you'll get weirdness.
[10:52am] hbraun: ok, i think that explains it
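
The explanation in the chat above can be illustrated with a small simulation. This is only a sketch: the `TxStats` class and its method names are hypothetical and are not Narayana's statistics API. It assumes only what jhalliday describes, namely that the total is incremented at transaction begin, that the commit count is incremented when a commit completes (including one driven by recovery), and that both counters live in memory and are lost in a crash.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TxStats:
    """Hypothetical in-memory counters; not Narayana's real API."""
    total: int = 0      # incremented at transaction begin
    committed: int = 0  # incremented when a commit completes

    def begin(self) -> None:
        self.total += 1

    def commit(self) -> None:
        self.committed += 1


def simulate_crash_and_recovery() -> Tuple[TxStats, TxStats]:
    # Pre-crash run: the transaction begins and is prepared, then the server dies
    # before the commit completes, so only the total was incremented.
    pre_crash = TxStats()
    pre_crash.begin()

    # Restart: the counters are in-memory only, so the new run starts from zero.
    post_crash = TxStats()
    # Recovery commits the in-doubt transaction; this run never saw its begin.
    post_crash.commit()
    return pre_crash, post_crash


pre, post = simulate_crash_and_recovery()
print("pre-crash :", pre)
print("post-crash:", post)
print("summed    :", pre.total + post.total, "started,",
      pre.committed + post.committed, "committed")
```

Looking at either run alone gives an unbalanced pair of counters; only the sum across both runs is consistent, which is the point made in the chat.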
Comment 5 Heiko Braun 2014-07-03 04:55:33 EDT
IMO this is not a bug, but a limitation of the management layer in general. However, I am assigning it to Brian so he can comment on it.
Comment 6 JBoss JIRA Server 2014-07-03 04:57:37 EDT
Heiko Braun <ike.braun@googlemail.com> updated the status of jira HAL-437 to Resolved
Comment 7 Brian Stansberry 2014-07-07 09:14:31 EDT
Reassigning to the subsystem component as this is purely a subsystem issue, either in what data the underlying component provides or in what the subsystem exposes based on that.
Comment 8 Ondrej Chaloupka 2014-07-07 09:57:00 EDT
I set the requires_doc_text flag to ? because I think this should be documented in the known issues. I put some *draft text* in the Doc Text field. Would you be so kind as to revise it?
Comment 11 Michael 2014-09-01 15:27:34 EDT
Hi Ondrej,

As Heiko and Jonathan point out, we initialize statistics during boot so recovering transactions that were started by previous runs will not have a corresponding begin. The complexity involved in tracking transactions created by prior runs (we would have to modify the structure of our transaction log records) does not warrant changing the semantics of "number of transactions" versus "number of commits". As long as it is clear what these two statistics mean then there is no issue and we can simply mark it as a "feature" of how transaction statistics are reported and reject the bug accordingly.
Comment 12 Ondrej Chaloupka 2014-09-02 03:57:27 EDT
Hi Mike,

I fully understand your point, but I have to say that I would rather have this fixed than leave it as a "feature".
If there is no way to fix it, then the best option would probably be to create a JIRA feature request and document this for EAP.

But before that I have a question (maybe a silly one, as I do not understand the internals).

Would it help to increase the number of committed transactions at the end of the commit phase of 2PC? It seems to me that it's increased at the beginning of the commit phase of 2PC and it causes the trouble here.

If that is not a good idea, would it at least be possible to add a workaround so that the web console statistics do not show numbers over 100%?

Ondra
Comment 13 Michael 2014-09-02 05:17:22 EDT
This is not a bug in the code but rather it is people misinterpreting the meaning of the statistic (in which case, as you suggest, we need a doc JIRA for it).

If we didn't reset the counters during boot, they would eventually wrap, which is not helpful when most app monitors expect the counters to apply to the current application server instance. It would also cause discrepancies when we have proxy recovery (where transactions are recovered somewhere other than where they were created).

If we artificially capped how the statistic is reported in the admin console, we would lose valid information.

I did not understand your point about "It seems to me that it's increased at the beginning of the commit phase of 2PC and it causes the trouble here". I checked the code and we only update the stat after the transaction has successfully ended.
Comment 14 Ondrej Chaloupka 2014-09-02 07:28:59 EDT
Yep, it was just my assumption from the test point of view.

The test runs like this:
1. prepare phase
1a. prepare first XA resource
1b. prepare second XA resource
2. commit phase
2a. crash of server
3. recover phase
3a. commit first XA resource
3b. commit second XA resource

The console shows 200% as the share of committed transactions, which should mean that the total number of transactions is counted as 1 and the number of committed transactions is counted as 2.
My point was about why the number of committed transactions is increased. I could be wrong, but I assumed that it happens sometime during 2PC and not at the very end.
The question is: why is the commit of the transaction counted twice?

I tested it with 6.3.1.CP.CR1 (Narayana 4.17.22.Final) and I get 118% when the test finishes. At that point recovery has fully finished, so there should be no reason to get such numbers. I think that #c4 does not explain my problem.

Ondra
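
The 200% figure in this comment follows from a plain ratio of the two counters. The sketch below assumes the console computes the percentage as committed divided by total; that formula is inferred from the reporter's reading (total 1, committed 2) and is not confirmed by the console's source. The function name is hypothetical.

```python
from typing import Optional


def committed_percentage(total: int, committed: int) -> Optional[float]:
    """Committed transactions as a percentage of started transactions.

    Returns None when no transaction has been counted as started,
    because the ratio is undefined in that case.
    """
    if total == 0:
        return None
    return 100.0 * committed / total


# Counters as the reporter reads them from the console: 1 started, 2 committed.
print(committed_percentage(total=1, committed=2))  # 200.0

# A freshly restarted server that has only recovered a transaction.
print(committed_percentage(total=0, committed=1))  # None
```

Any state in which the commit counter exceeds the total yields a value above 100%, and a run that has only performed recovery has no meaningful percentage at all.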
Comment 15 tom.jenkinson 2014-09-04 10:02:20 EDT
Hi,

Mike and I chatted about this on Tuesday and we can look to add something into Narayana like this: https://github.com/jbosstm/narayana/pull/719

That being said, it's quite artificial. All it does is increment the total number of transactions started if we determine this is a recovery scenario.

Ondra, maybe you could take a look and let me know if this is what you are looking for?

Thanks,
Tom
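
The idea described in this comment can be sketched as follows. This is not the code from the linked pull request, which is not reproduced here; the class, the `recovered` flag, and the method names are all hypothetical. It only shows the stated behaviour: when a commit is known to come from recovery, the total is incremented as well.

```python
class RecoveryAwareStats:
    """Hypothetical counters illustrating the idea only; not the pull request's code."""

    def __init__(self) -> None:
        self.total = 0
        self.committed = 0

    def begin(self) -> None:
        self.total += 1

    def commit(self, recovered: bool = False) -> None:
        if recovered:
            # The transaction was started by a previous run, so this run never
            # counted its begin. Count it now so committed cannot exceed total.
            self.total += 1
        self.committed += 1


stats = RecoveryAwareStats()
stats.begin()
stats.commit()                 # normal transaction in this run
stats.commit(recovered=True)   # in-doubt transaction finished by recovery
print(stats.total, stats.committed)  # 2 2
```

The adjustment keeps the ratio at or below 100%, at the cost of "total" no longer meaning "transactions begun in this run", which is the artificiality noted in the comment.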
Comment 16 Ondrej Chaloupka 2014-09-04 12:13:15 EDT
Hi Tom, Hi Mike,

that is what I was talking about, yes. But I have been rethinking my point of view. In particular, the word 'artificial' started to scare me :) I agree that such a fix would change the way Narayana has worked with statistics for years.

I hope that I wasn't too prim.

What I was fighting against is the fact that I can get a number over 100%.

If you are not against it, I would suggest moving this issue back to 'WebConsole' and asking for a change in the format in which transaction statistics are shown. Percentages are not a good way to display transaction statistics. I would then also ask for a user hint that explains why the statistics could be a bit "strange" after a server crash.

Would that be a better solution for you, or would you prefer another one?

Thanks
Ondra
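
One possible shape for the display change proposed in this comment is to show the raw counters and add a hint only when they look inconsistent. The sketch below is an assumption about what such a format could look like, not the web console's implementation; the function name, labels, and hint text are hypothetical.

```python
def format_tx_stats(total: int, committed: int) -> str:
    """Render raw counters instead of a percentage, with a hint when needed."""
    lines = [
        f"Started:   {total}",
        f"Committed: {committed}",
    ]
    if committed > total:
        # More commits than begins can only come from transactions that were
        # started before the counters were reset at server start.
        lines.append(
            "Note: counters are reset at server start; some commits belong to "
            "transactions started in a previous run and completed by recovery."
        )
    return "\n".join(lines)


print(format_tx_stats(total=1, committed=2))
```

Raw counts stay valid after a crash, whereas a percentage implies that both counters describe the same set of transactions.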
Comment 17 tom.jenkinson 2014-09-04 16:03:55 EDT
I agree with your recommendation, thanks for pursuing this Ondra.
Comment 18 Ondrej Chaloupka 2014-09-05 03:55:37 EDT
Per the discussion above, closing this bz as not a bug and creating new bz#1138561 to change how the statistics are displayed in the web console.
