Bug 1744881

Summary: transport end point error seen when deleting files
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: fuse
Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED CURRENTRELEASE
QA Contact: Rahul Hinduja <rhinduja>
Severity: high
Docs Contact:
Priority: unspecified
Version: unspecified
CC: amukherj, moagrawa, rhinduja, rhs-bugs, sabose, saraut, sheggodu, storage-qa-internal, ubansal, vdas
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-6.0-18
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-12-18 16:43:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Nag Pavan Chilakam 2019-08-23 05:55:57 UTC
Description of problem:
======================
I was working on the needinfo for BZ# 1400071 - OOM kill of glusterfs fuse mount process seen on client, where I was doing deletes.

While the rm -rf was running, one of the clients reported 'Transport endpoint is not connected' errors.

I have seen this a few times in this release, but previously I was not sure whether I had run brick-down scenarios or other downtime operations at the time.
This time, no such action was performed.

[root@rhs-gp-srv7 rhs-gp-srv7.lab.eng.blr.redhat.com]# time rm -rf dir.{1..49}
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/hisilicon’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/mediatek’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/st’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/allwinner’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/amlogic’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/broadcom’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/cadence’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/phy/rockchip’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/bcm’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/cirrus’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/intel’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/meson’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/mvebu’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/sh-pfc’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/tegra’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/ti’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/spear’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/uniphier’: Transport endpoint is not connected
rm: cannot remove ‘dir.34/linux-5.2.9/drivers/pinctrl/mediatek’: Directory not empty
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/gma500’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/lima’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/mga’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/qxl’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/rockchip’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/scheduler’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/sti’: Transport endpoint is not connected
rm: cannot remove ‘dir.36/linux-5.2.9/drivers/gpu/drm/tilcdc’: Transport endpoint is not connected
rm: cannot remove ‘dir.37/linux-5.2.9/security/keys/encrypted-keys’: Transport endpoint is not connected
rm: cannot remove ‘dir.37/linux-5.2.9/security/smack’: Transport endpoint is not connected
rm: cannot remove ‘dir.37/linux-5.2.9/security/apparmor/include’: Transport endpoint is not connected
rm: cannot remove ‘dir.38/linux-5.2.9’: Directory not empty
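
To see how the failures are distributed, the rm output above can be tallied per top-level directory and error message. The following is a minimal sketch, assuming the output was saved to a file (rm_errors.log is a hypothetical name) and that awk and sort are available on the client; summarize_rm_errors is an illustrative helper, not part of any Gluster tooling.

```shell
# Tally "rm: cannot remove" lines read from stdin.
# Output: <top-level dir> TAB <error message> TAB <count>, sorted bytewise.
summarize_rm_errors() {
    LC_ALL=C awk -F': ' '
        /^rm: cannot remove / {
            path = $2
            # Drop the "cannot remove " prefix and the opening quote
            # (ASCII or UTF-8; in the C locale the quote is plain bytes).
            sub(/^cannot remove [^A-Za-z0-9._-]*/, "", path)
            # Keep only the first path component, e.g. dir.34.
            sub(/\/.*$/, "", path)
            # Drop a trailing quote when the path had no slash.
            sub(/[^A-Za-z0-9._-]*$/, "", path)
            # The last field is the error text reported by rm.
            count[path "\t" $NF]++
        }
        END {
            for (key in count) printf "%s\t%d\n", key, count[key]
        }
    ' | LC_ALL=C sort
}

# Example (hypothetical file name):
#   summarize_rm_errors < rm_errors.log
```

Grouping by directory makes it easier to check whether the ENOTCONN failures cluster in a few dir.N trees, which may help when correlating with the client and brick logs.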


I will attach the statedumps that were being collected for the OOM-kill reproducer.



Version-Release number of selected component (if applicable):
======================
6.0.12



Steps to Reproduce:
======================
1) 3-node cluster, 10x3 volume (see vol info).
2) Triggered the following I/O on each of 4 clients:
   a) Linux kernel untar, about 50 times from each client
   b) top output captured to a file every 2 minutes
   c) lookups using find *|xargs stat, run continuously
3) After about a day (i.e. once the 50 untars had completed), started the following I/O as well:
   a) Linux kernel untar, now into {51..100}
   b) top and lookups still running from the previous step
   c) rm -rf of all the directories holding the untarred trees, i.e. rm -rf dir.{1..50}
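
The per-client workload in the steps above can be sketched roughly as follows. This is an approximation under stated assumptions, not the exact script used for this report: the mount point /mnt/glustervol and the tarball path are hypothetical, the dir.N layout is taken from the rm output, and the helper function names are illustrative.

```shell
# Hypothetical defaults; point these at the real FUSE mount and kernel tarball.
MOUNT=${MOUNT:-/mnt/glustervol}
TARBALL=${TARBALL:-/root/linux-5.2.9.tar.xz}

# untar_range FIRST LAST: extract the tarball into dir.FIRST .. dir.LAST.
untar_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        mkdir -p "$MOUNT/dir.$i" && tar -xf "$TARBALL" -C "$MOUNT/dir.$i"
        i=$((i + 1))
    done
}

# capture_top OUTFILE [SECONDS]: append one batch-mode top snapshot per interval.
capture_top() {
    while :; do
        top -b -n 1 >> "$1"
        sleep "${2:-120}"
    done
}

# lookup_loop: stat every entry under the mount, over and over.
lookup_loop() {
    while :; do
        (cd "$MOUNT" && find . | xargs stat > /dev/null 2>&1)
    done
}

# delete_range FIRST LAST: rm -rf dir.FIRST .. dir.LAST; rm errors go to stderr.
delete_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        rm -rf "$MOUNT/dir.$i"
        i=$((i + 1))
    done
}

# Step 3 on one client could then look like this (run by hand, not on load):
#   capture_top /root/top.out & lookup_loop &
#   untar_range 51 100 &
#   delete_range 1 50 2> /root/rm_errors.log
```

The functions only run when called, so the file can be sourced safely before starting the long-running loops in the background.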

Actual results:
=================
Saw the above transport endpoint errors on one client after about half a day.


Logs will be attached.

Expected results:


Additional info: