Bug 1572075
| Summary: | glusterfsd crashing because of RHGS WA? | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Daniel Horák <dahorak> |
| Component: | core | Assignee: | hari gowtham <hgowtham> |
| Status: | CLOSED ERRATA | QA Contact: | Rajesh Madaka <rmadaka> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.4 | CC: | amukherj, dahorak, mbukatov, nthomas, rhinduja, rhs-bugs, rmadaka, sankarshan, sheggodu, storage-qa-internal, vbellur |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.12.2-10 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| Cloned As: | 1575864 (view as bug list) | Environment: | |
| Last Closed: | 2018-09-04 06:47:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1503137 | | |
| Attachments: | | | |
Comment 3
Daniel Horák
2018-04-26 10:03:23 UTC
Created attachment 1427120 [details]
Full backtrace
Looking at the available data, the crash happened in server_priv_to_dict, where xprt->xl_private is NULL, which means no client is linked to that transport (xprt). The backtrace shows the brick crashed while "gluster get-state detail" was being handled. I tried the same command, but I'm not able to crash it: xprt->xl_private is populated every time and things work fine. I can also see that the same get-state command had been executed a number of times earlier without crashing.

Looking further, I suspected it could be a race, so I checked which commands were executed around the same time: gluster profile commands, gluster pool list, and gluster get-state volumeoptions, issued in a different order before each crash. I tried executing these commands from a number of machines, a number of times, and still couldn't crash the brick. To debug, I tried attaching gdb to the server, but by then xprt itself was NULL and the whole check was being skipped without crashing. It is not clear why xprt was NULL even though we received the server xlator as "this" here.

Followed the steps mentioned in the description above. After the Gluster import into Web Admin, I didn't find any crashes of glusterd and no bricks went offline. Verified with glusterfs-3.12.2-11.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607