Bug 1395700

Summary: qdrouterd 0.4-19 segfault when qpidd down for longer time and goferd restarted
Product: Red Hat Satellite
Reporter: Bryan Kearney <bkearney>
Component: katello-agent
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: Jitendra Yejare <jyejare>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.2.4
CC: aperotti, bbuckingham, jcallaha, jentrena, jhutar, jyejare, ktordeur, oshtaier, paul.seymour, pdwyer, pmoravec, swadeley
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: x86_64
OS: Linux
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1367735
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Under certain conditions, build 19 of the dispatch router can terminate unexpectedly with a segmentation fault. The memory management has been improved to prevent this happening.
Story Points: ---
Clone Of: 1393128
Environment:
Last Closed: 2018-02-21 12:57:01 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1393128, 1396568
Bug Blocks: 1399395

Comment 2 Jitendra Yejare 2017-10-04 15:12:07 UTC
Verifying this bug using steps mentioned in comment of similar bug: 
https://bugzilla.redhat.com/show_bug.cgi?id=1393128#c11

Steps Followed:

1. Had all Satellite services running.

2. On satellite6, froze the qpidd process for at least 11 seconds and then unfroze it:
# kill -SIGSTOP $(pgrep qpidd); sleep 11; kill -SIGCONT $(pgrep qpidd)

3. As soon as the expect script was running, restarted goferd on a single content host (did not try with multiple):
# service goferd restart; sleep 3; service goferd restart

4. Checked the qdrouterd service status.
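
Step 4 gives no command, so here is a minimal sketch of how the check could be scripted on the Satellite. It rests on assumptions that are not in this report: that a crash leaves the usual x86_64 kernel log line of the form "qdrouterd[PID]: segfault at ...", and that dmesg is readable by the user running it. The function names has_qdrouterd_segfault and check_qdrouterd are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Satellite.

# Return 0 if the given log text contains a qdrouterd segfault line.
# Assumes the usual kernel format: "qdrouterd[PID]: segfault at ...".
has_qdrouterd_segfault() {
    printf '%s\n' "$1" | grep -Eqi 'qdrouterd.*segfault'
}

# Report the service state, then scan the kernel log for a crash.
check_qdrouterd() {
    service qdrouterd status >/dev/null 2>&1 \
        || echo "WARN: qdrouterd service is not reported as running"
    if has_qdrouterd_segfault "$(dmesg)"; then
        echo "FAIL: qdrouterd segfault found in kernel log"
        return 1
    fi
    echo "OK: no qdrouterd segfault in kernel log"
}
```

Calling check_qdrouterd after the two goferd restarts gives a single pass/fail line. Note that the kernel log may still hold segfault lines from earlier runs, so a FAIL should be checked against its timestamp.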


Behavior:

I don't see any segfault in qdrouterd status.

@pavel, is it OK if I mark this bug as verified?

Comment 3 Pavel Moravec 2017-10-05 11:14:39 UTC
(In reply to Jitendra Yejare from comment #2)
> Verifying this bug using steps mentioned in comment of similar bug: 
> https://bugzilla.redhat.com/show_bug.cgi?id=1393128#c11
> 

Yes, that is the most reliable reproducer of the race condition, although it relies on artificial steps that no regular activity would trigger; they merely mimic some unknown real activity. One Content Host (with two goferd restarts in sequence) is sufficient.

Sat 6.3 uses a considerably newer version of qdrouterd, so this segfault has probably never been present in that rebased version.

Comment 4 Jitendra Yejare 2017-10-05 12:22:15 UTC
As per comments 2 and 3, changing the bug status to Verified.

Comment 7 errata-xmlrpc 2018-02-21 12:57:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0338