Bug 1321325 - stompTests.StompTests test_echo(4096, False) ERROR
Summary: stompTests.StompTests test_echo(4096, False) ERROR
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: ---
Hardware: Unspecified
OS: Unspecified
Target Milestone: ovirt-4.0.0-beta
Target Release: 4.17.999
Assignee: Piotr Kliczewski
QA Contact: Pavel Stehlik
Depends On:
Reported: 2016-03-25 13:40 UTC by Sandro Bonazzola
Modified: 2017-05-11 09:23 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2016-07-26 11:07:11 UTC
oVirt Team: Infra
mperina: ovirt-4.0.0?
rule-engine: planning_ack+
mperina: devel_ack+
rule-engine: testing_ack?

Attachments
logs of the failing job (91.90 KB, application/x-gzip)
2016-03-25 13:40 UTC, Sandro Bonazzola

System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 55872 | 0 | master | ABANDONED | jsonrpc: make sure not to block when processing i/o | 2016-05-04 07:39:45 UTC
oVirt gerrit | 56996 | 0 | master | MERGED | stomp: dispatcher can return empty string | 2016-05-07 15:25:12 UTC
oVirt gerrit | 56997 | 0 | master | MERGED | stomp: make sure to handle eagain | 2016-05-07 15:25:30 UTC

Description Sandro Bonazzola 2016-03-25 13:40:49 UTC
Created attachment 1140313
logs of the failing job


Failed on stompTests.StompTests test_echo(4096, False) ERROR

Attached logs for further investigation.

Comment 1 Sandro Bonazzola 2016-04-29 07:44:35 UTC
Raising severity according to mailing list discussion:

I'd suggest investigating this a bit more since it can hide a serious issue.
I'm moving hosted-engine-setup from XMLRPC to JsonRPC and I'm facing
exactly this kind of issue: it seems that some requests get lost and I
just receive a JsonRpcNoResponseError after a long time.
The real issue is that my request never reached VDSM; it got lost
somehow in the queuing mechanism.

The issue you are seeing is very interesting: a message we add to a deque disappears by the next time we check, and according to the log you provided there is no code accessing the deque. It happens only for one specific message; all the other messages work fine. Can you please gather logs so we can see what is really happening with it?

Adding needinfo on Simone.

Comment 2 Piotr Kliczewski 2016-04-29 07:58:50 UTC
According to the conversation I had with Simone, it seems that the message was not sent because vdsm was being restarted and, as a result, the connection was lost. I doubt that the two issues are connected.

Comment 3 Simone Tiraboschi 2016-04-29 08:04:27 UTC
Yes, the issue I was talking about was due to sending a message on an unconnected client.
By the way, I think that in that case quickly throwing an explicit exception, instead of relying on the response timeout, could really help in detecting it.

Comment 4 Piotr Kliczewski 2016-04-29 08:07:54 UTC
I agree, we are missing that. I will add this behavior soon.
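
For illustration only, the fail-fast behavior discussed in comments 3 and 4 could look roughly like the sketch below. It is a minimal sketch, not vdsm's actual code: the names SketchClient and ClientNotConnectedError are hypothetical, and the real JSON-RPC client has a different structure. The point is just that send() checks the connection state up front and raises immediately, rather than queuing a request that can never be delivered and leaving the caller to wait for a response timeout.

```python
class ClientNotConnectedError(Exception):
    """Hypothetical error: a request was sent on an unconnected client."""


class SketchClient(object):
    """Illustrative stand-in for a JSON-RPC client (not vdsm's real class)."""

    def __init__(self):
        self._connected = False
        self._outbox = []  # requests waiting to be written to the socket

    def connect(self):
        self._connected = True

    def close(self):
        self._connected = False

    def send(self, request):
        # Fail fast: refuse to queue a request that cannot be delivered,
        # so the caller sees the problem now instead of after a timeout.
        if not self._connected:
            raise ClientNotConnectedError(
                "cannot send %r: client is not connected" % (request,))
        self._outbox.append(request)
        return len(self._outbox)
```

A caller such as hosted-engine-setup could then catch the explicit error and reconnect or retry, instead of inferring a lost connection from a late JsonRpcNoResponseError.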

Comment 5 Sandro Bonazzola 2016-05-02 09:48:33 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 6 Lukas Svaty 2016-07-26 11:07:11 UTC
This bug was fixed and is slated to be in the upcoming version. As we
are focusing our testing at this phase on severe bugs, this bug was
closed without going through its verification step. If you think this
bug should be verified by QE, please set its severity to high and move
it back to ON_QA.
