Bug 1734692

Summary: brick process coredump while running bug-1432542-mpx-restart-crash.t in a virtual machine
Product: [Community] GlusterFS Reporter: ziyi cheng <chengziyi>
Component: core Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 6 CC: bugs
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-02-24 06:13:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
backtrace of the core none

Description ziyi cheng 2019-07-31 08:30:56 UTC
Created attachment 1594973
backtrace of the core

+++ This bug was initially created as a clone of Bug #1459400 +++

Description of problem:
The brick process dumps core in the posix-acl xlator while running the test
bug-1432542-mpx-restart-crash.t in a virtual machine.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
Run the test file bug-1432542-mpx-restart-crash.t five times in a loop:
for i in {1..5};do prove -vf tests/bugs/core/bug-1432542-mpx-restart-crash.t;done

Actual results:
The brick process crashes.

Expected results:
The brick process should not crash.

Additional info:
VM Configuration:
cpu:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
Stepping:              4
CPU MHz:               2100.000
BogoMIPS:              4200.00
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0
memory:
              total        used        free      shared  buff/cache   available
Mem:        1884512      121020     1401352       87528      362140     1015920
Swap:       1679356       88424     1590932

Comment 1 ziyi cheng 2019-07-31 09:15:43 UTC
From the backtrace of the core file, we can see that in function r00t(),
after conf = THIS->private, conf is NULL, so the process crashed at return conf->super_uid.
My guess at the reason:
an iot_worker thread calls call_resume for GF_FOP_GETXATTR, which winds to posix_acl_getxattr.
In the meantime, fini() has been executed, so this->private is NULL and conf has been freed.
In this situation, I think we should check whether conf is NULL before returning conf->super_uid in r00t().
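
The check proposed above could look roughly like the sketch below. This is not the GlusterFS source: struct xlator_sketch and r00t_checked() are hypothetical stand-ins, only the super_uid field is modelled, and the fallback value of 0 when conf is NULL is an assumption. A NULL check on its own also does not fully close the race, since fini() could still free conf between the check and the dereference.

```c
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for the posix-acl private configuration; only the field
 * named in the report (super_uid) is modelled here. */
struct posix_acl_conf {
    uid_t super_uid;
};

/* Hypothetical stand-in for an xlator object; the real structure has
 * many more fields. */
struct xlator_sketch {
    void *private;
};

/* Hypothetical guarded variant of r00t(): returns super_uid when the
 * private configuration is still present, and falls back to uid 0
 * (an assumption) when it has already been cleared by fini(). */
static uid_t
r00t_checked(const struct xlator_sketch *this)
{
    const struct posix_acl_conf *conf = NULL;

    if (this != NULL)
        conf = this->private;

    if (conf == NULL)
        return 0; /* assumed fallback; the right value is a design decision */

    return conf->super_uid;
}
```

With a populated conf the function returns its super_uid; once private has been set to NULL it returns the fallback instead of dereferencing a NULL pointer.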

Comment 2 Mohit Agrawal 2020-02-24 06:13:30 UTC
Hi Ziyi,

 I tried to reproduce this by running the test case on the latest master in a CentOS VM:
 for i in {1..10};do prove -vf tests/bugs/core/bug-1432542-mpx-restart-crash.t;done
 
 I was not able to reproduce it, so I am closing the bug.
 Please reopen it if you are able to reproduce it on the latest release.


Thanks,
Mohit Agrawal