Before you record your issue, ensure you are using the latest version of Gluster.

Version-Release number of selected component (if applicable):
---------------------------------------------------------------------
# rpm -qa | grep glusterfs
glusterfs-libs-6.0-62.el7rhgs.x86_64
glusterfs-6.0-62.el7rhgs.x86_64
glusterfs-fuse-6.0-62.el7rhgs.x86_64
glusterfs-cli-6.0-62.el7rhgs.x86_64
glusterfs-geo-replication-6.0-62.el7rhgs.x86_64
glusterfs-client-xlators-6.0-62.el7rhgs.x86_64
glusterfs-api-6.0-62.el7rhgs.x86_64
glusterfs-server-6.0-62.el7rhgs.x86_64

Have you searched the Bugzilla archives for the same/similar issues?
Did you run an SoS report with the Insights tool?
Have you discovered any workarounds? If not, read the troubleshooting documentation to help solve your issue:
https://mojo.redhat.com/groups/gss-gluster (Gluster features and their troubleshooting)
https://access.redhat.com/articles/1365073 (specific debug data that needs to be collected for GlusterFS to help troubleshooting)

Please provide the below Mandatory Information:
-----------------------------------------------

1 - gluster v <volname> info

# gluster v info

Volume Name: nas
Type: Distributed-Replicate
Volume ID: 524e80d9-a063-46ed-9446-56b1e47356c3
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: birdman.lab.eng.blr.redhat.com:/brick/brick1/nas-b1
Brick2: tettnang.lab.eng.blr.redhat.com:/brick/brick1/nas-b2
Brick3: transformers.lab.eng.blr.redhat.com:/brick/brick1/nas-b3 (arbiter)
Brick4: birdman.lab.eng.blr.redhat.com:/brick/brick2/nas-b4
Brick5: tettnang.lab.eng.blr.redhat.com:/brick/brick2/nas-b5
Brick6: transformers.lab.eng.blr.redhat.com:/brick/brick2/nas-b6 (arbiter)
Brick7: birdman.lab.eng.blr.redhat.com:/brick/brick3/nas-b7
Brick8: tettnang.lab.eng.blr.redhat.com:/brick/brick3/nas-b8
Brick9: transformers.lab.eng.blr.redhat.com:/brick/brick3/nas-b9 (arbiter)
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

2 - gluster v <volname> heal info

# gluster v heal nas info
Brick birdman.lab.eng.blr.redhat.com:/brick/brick1/nas-b1
Status: Connected
Number of entries: 0

Brick tettnang.lab.eng.blr.redhat.com:/brick/brick1/nas-b2
<gfid:808911c0-e1fa-43c0-a566-1171c49b715b>
<gfid:129bc49f-752a-4969-ad28-7724ba00f243>
<gfid:941cdcce-9d78-4813-910e-56925b6141e0>
<gfid:7c3e11d5-b6ea-445d-8a39-6d83bb78b26a>
<gfid:f240221e-e296-4354-9973-5b9b7de1b400>
<gfid:135e498c-73df-417c-8345-0adc8dc02dc9>
Status: Connected
Number of entries: 6

Brick transformers.lab.eng.blr.redhat.com:/brick/brick1/nas-b3
<gfid:129bc49f-752a-4969-ad28-7724ba00f243>
<gfid:808911c0-e1fa-43c0-a566-1171c49b715b>
<gfid:941cdcce-9d78-4813-910e-56925b6141e0>
<gfid:7c3e11d5-b6ea-445d-8a39-6d83bb78b26a>
<gfid:f240221e-e296-4354-9973-5b9b7de1b400>
<gfid:135e498c-73df-417c-8345-0adc8dc02dc9>
Status: Connected
Number of entries: 6

Brick birdman.lab.eng.blr.redhat.com:/brick/brick2/nas-b4
Status: Connected
Number of entries: 0

Brick tettnang.lab.eng.blr.redhat.com:/brick/brick2/nas-b5
Status: Connected
Number of entries: 0

Brick transformers.lab.eng.blr.redhat.com:/brick/brick2/nas-b6
Status: Connected
Number of entries: 0

Brick birdman.lab.eng.blr.redhat.com:/brick/brick3/nas-b7
Status: Connected
Number of entries: 0

Brick tettnang.lab.eng.blr.redhat.com:/brick/brick3/nas-b8
<gfid:37e0451d-6e94-46df-92f7-744e18051891>
<gfid:1a974215-7e35-4703-b020-8df042989274>
<gfid:6435c64f-8d87-4217-ac4a-c5bb33f72b33>
<gfid:35d96a31-1bbc-42e0-9990-c516e9fe3e97>
<gfid:cdfaef44-9468-4243-80b1-415f2b8ef9b2>
<gfid:17ca6e92-9f94-42c9-9da7-d6862fe72f30>
<gfid:f34c5377-b340-44e8-865a-6ad56eff1e23>
<gfid:135e498c-73df-417c-8345-0adc8dc02dc9>
Status: Connected
Number of entries: 8

Brick transformers.lab.eng.blr.redhat.com:/brick/brick3/nas-b9
<gfid:37e0451d-6e94-46df-92f7-744e18051891>
<gfid:1a974215-7e35-4703-b020-8df042989274>
<gfid:6435c64f-8d87-4217-ac4a-c5bb33f72b33>
<gfid:35d96a31-1bbc-42e0-9990-c516e9fe3e97>
<gfid:cdfaef44-9468-4243-80b1-415f2b8ef9b2>
<gfid:17ca6e92-9f94-42c9-9da7-d6862fe72f30>
<gfid:f34c5377-b340-44e8-865a-6ad56eff1e23>
<gfid:135e498c-73df-417c-8345-0adc8dc02dc9>
Status: Connected
Number of entries: 8

3 - gluster v <volname> status

# gluster v status nas
Status of volume: nas
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick birdman.lab.eng.blr.redhat.com:/brick
/brick1/nas-b1                              49152     0          Y       11911
Brick tettnang.lab.eng.blr.redhat.com:/bric
k/brick1/nas-b2                             49152     0          Y       31259
Brick transformers.lab.eng.blr.redhat.com:/
brick/brick1/nas-b3                         49152     0          Y       41997
Brick birdman.lab.eng.blr.redhat.com:/brick
/brick2/nas-b4                              49153     0          Y       11912
Brick tettnang.lab.eng.blr.redhat.com:/bric
k/brick2/nas-b5                             49153     0          Y       31274
Brick transformers.lab.eng.blr.redhat.com:/
brick/brick2/nas-b6                         49153     0          Y       42012
Brick birdman.lab.eng.blr.redhat.com:/brick
/brick3/nas-b7                              49154     0          Y       11923
Brick tettnang.lab.eng.blr.redhat.com:/bric
k/brick3/nas-b8                             49154     0          Y       31289
Brick transformers.lab.eng.blr.redhat.com:/
brick/brick3/nas-b9                         49154     0          Y       42027
Self-heal Daemon on localhost               N/A       N/A        Y       11937
Self-heal Daemon on transformers.lab.eng.bl
r.redhat.com                                N/A       N/A        Y       42043
Self-heal Daemon on tettnang.lab.eng.blr.re
dhat.com                                    N/A       N/A        Y       3816

Task Status of Volume nas
------------------------------------------------------------------------------
There are no active volume tasks

4 - Fuse Mount

# df -hT
Filesystem                           Type            Size  Used  Avail  Use%  Mounted on
devtmpfs                             devtmpfs        7.8G     0   7.8G    0%  /dev
tmpfs                                tmpfs           7.8G     0   7.8G    0%  /dev/shm
tmpfs                                tmpfs           7.8G  9.6M   7.8G    1%  /run
tmpfs                                tmpfs           7.8G     0   7.8G    0%  /sys/fs/cgroup
/dev/mapper/rhel_rhs--client21-root  xfs              50G  6.1G    44G   13%  /
/dev/sda1                            xfs            1014M  240M   775M   24%  /boot
/dev/mapper/rhel_rhs--client21-home  xfs             1.8T   33M   1.8T    1%  /home
tmpfs                                tmpfs           1.6G     0   1.6G    0%  /run/user/0
birdman.lab.eng.blr.redhat.com:/nas  fuse.glusterfs  1.2T  485G   709G   41%  /mnt/nas

Describe the issue (please be as detailed as possible and provide log snippets)
[Provide the timestamp when the issue is seen]
-------------------
While performing a node reboot scenario (with IO running) on the above-mentioned volume, there are multiple files pending heal. The gfid2file.sh script is also not returning any output for the above files and appears to be hung.
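While the gfid2file.sh hang is investigated, below is a minimal sketch of resolving one of the listed GFIDs to its on-brick path by hand. It assumes the usual <brick>/.glusterfs/<xx>/<yy>/<gfid> hardlink layout; the brick path and GFID arguments are placeholders, and this is not the gfid2file.sh/gfid2path.sh script referenced in this report.

#!/bin/bash
# Minimal sketch (not the gfid2file.sh/gfid2path.sh used in this report):
# resolve a GFID to its on-brick path via the hardlink kept under
# <brick>/.glusterfs/. Arguments are placeholders, e.g.:
#   ./resolve-gfid.sh /brick/brick1/nas-b2 808911c0-e1fa-43c0-a566-1171c49b715b
BRICK=${1:?usage: $0 <brick-path> <gfid>}
GFID=${2:?usage: $0 <brick-path> <gfid>}

LINK="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"

if [ -d "$LINK" ]; then
    # Directory GFIDs are stored as symlinks; resolve the chain to the real path.
    echo "directory: $(readlink -f "$LINK")"
else
    # Regular files are hardlinks; look for the sibling link outside .glusterfs.
    find "$BRICK" -samefile "$LINK" -not -path "*/.glusterfs/*" 2>/dev/null
    # If the gfid2path xattr is present (as in the good-file example below),
    # the parent GFID and basename can also be read directly:
    getfattr -d -m 'trusted.gfid2path.*' -e text "$LINK" 2>/dev/null
fi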
Tried the gfid2file script for a good file and got the following output:

# cat newgfid.txt | ./gfid2path.sh /brick/brick2/nas-b6/
0688e672-be12-48c3-aa54-b094214d7e9a /dir.1/linux-5.4.180/kernel/relay.c
  File: ‘/brick/brick2/nas-b6///dir.1/linux-5.4.180/kernel/relay.c’
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: fd2ah/64810d    Inode: 537295365   Links: 2
Access: (0664/-rw-rw-r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:glusterd_brick_t:s0
Access: 2022-04-07 07:17:55.744280014 +0530
Modify: 2022-02-16 17:22:54.000000000 +0530
Change: 2022-04-07 07:17:54.461976963 +0530
 Birth: -
getfattr: Removing leading '/' from absolute path names
# file: brick/brick2/nas-b6///dir.1/linux-5.4.180/kernel/relay.c
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x0688e672be1248c3aa54b094214d7e9a
trusted.gfid2path.63c34589695e7154=0x64613134613161382d643736322d343565312d623762322d6662336133353630313131352f72656c61792e63
trusted.glusterfs.dht=0x00000000000000000000000055555554
-rw-rw-r--. 2 root root 0 Feb 16 17:22 /brick/brick2/nas-b6///dir.1/linux-5.4.180/kernel/relay.c

Is this issue reproducible? If yes, share more details:
1/1

Steps to Reproduce:
-------------------
1. Create a 5-node cluster with RHEL 7.9 + RHGS 3.5.7.
2. Create a 3 x (2 + 1) arbiter volume.
3. Mount the volume on 2 clients using node1 and start the IO (kernel untar, dd, rm, ls -lRt, renames); a rough sketch of this IO load follows at the end of this report.
4. Perform a node reboot for node1 and trigger heal.
5. Check heal info.

Actual results:
---------------
Heal info shows pending heals as listed above.

Expected results:
-----------------
Heal info must show 0 files pending heal.

Any Additional info:
--------------------
# cat /mnt/nas/.meta/graphs/active/nas-client-*/private | egrep -i 'connected'
connected = 1
connected = 1
connected = 1
connected = 1
connected = 1
connected = 1
connected = 1
connected = 1
connected = 1
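For reference, a rough, hedged sketch of the client IO load from step 3 and the heal trigger/monitoring from steps 4-5 is below. The mount point, tarball location, file counts, and iteration counts are illustrative assumptions, not the exact test script.

#!/bin/bash
# Illustrative sketch only: approximate client IO load (step 3) plus heal
# trigger/monitoring (steps 4-5). Paths, counts, and the tarball location
# are assumptions. Run the IO part on the clients; run the gluster commands
# on one of the server nodes.
MOUNT=/mnt/nas
TARBALL=/root/linux-5.4.180.tar.xz   # assumed location of the kernel tarball

run_io() {
    local dir="$MOUNT/dir.$1"
    mkdir -p "$dir"
    tar -xf "$TARBALL" -C "$dir"                        # kernel untar
    for j in $(seq 1 10); do
        dd if=/dev/urandom of="$dir/ddfile.$j" bs=1M count=100 conv=fsync
    done
    ls -lRt "$dir" > /dev/null                          # recursive listing
    for j in $(seq 1 10); do
        mv "$dir/ddfile.$j" "$dir/ddfile.$j.renamed"    # renames
    done
    rm -rf "$dir/linux-5.4.180"                         # removals
}

# Run a few IO streams in parallel while node1 is rebooted (step 4).
for i in 1 2 3; do
    run_io "$i" &
done
wait

# Once the rebooted node is back up, trigger index heal and watch the
# pending-heal counts (step 5).
gluster volume heal nas
watch -n 30 'gluster volume heal nas info summary'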
The bug will remain open until https://bugzilla.redhat.com/show_bug.cgi?id=2015551 has been discussed.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days