Bug 1648783
Summary: | client mount point is hung on gluster-NFS volume | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Vijay Avuthu <vavuthu> |
Component: | gluster-nfs | Assignee: | Jiffin <jthottan> |
Status: | CLOSED WONTFIX | QA Contact: | Jilju Joy <jijoy> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | rhgs-3.4 | CC: | amukherj, apaladug, dang, grajoria, jiyin, jthottan, kkeithle, mbenjamin, mchangir, rcyriac, rhinduja, rhs-bugs, sanandpa, sankarshan, skoduri, storage-qa-internal, ubansal, vavuthu, ykaul |
Target Milestone: | --- | Keywords: | AutomationBlocker, AutomationTriaged, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | 3.5-qe-proposed | ||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-07-23 04:57:03 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1655129 | ||
Bug Blocks: |
Description
Vijay Avuthu
2018-11-12 06:45:48 UTC
Update:
==========

From the nfs.log-20181111, the test case started on 2018-11-09 12:32:43:

Starting Test : functional.bvt.test_cvt.TestGlusterExpandVolumeSanity_cplex_dispersed_nfs.test_expanding_volume_when_io_in_progress : 06_25_09_11_2018

[2018-11-09 12:32:43.517958] I [MSGID: 100030] [glusterfsd.c:2504:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/run/gluster/nfs/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/427e2a195b8f1bc9.socket)

From the glusto logs (glusto logs are in the EST time zone), the NFS volume was mounted at 2018-11-09 12:33:16 UTC and writes started at 2018-11-09 12:33:17 UTC.

The hang was observed at 2018-11-09 12:36:29 UTC.

> From the above timeline, the hang happened between 2018-11-09 12:33:16 UTC and 2018-11-09 12:36:29 UTC.

> I am able to mount the same NFS volume on a different client:

[root@dhcp47-46 ~]# mount -t nfs -o vers=3 rhsauto052.lab.eng.blr.redhat.com:/testvol_dispersed /mnt/nfs_hung
[root@dhcp47-46 ~]#
[root@dhcp47-46 ~]# df -h | grep -i nfs
rhsauto052.lab.eng.blr.redhat.com:/testvol_dispersed 398G 4.8G 394G 2% /mnt/nfs_hung
[root@dhcp47-46 ~]#

Glusto logs: http://jenkins-rhs.lab.eng.blr.redhat.com:8080/view/Auto%20RHEL%207.6/job/auto-RHGS_Downstream_BVT_RHEL_7_6_RHGS_3_4_2_brew/ws/glusto_2.log

In the steps to reproduce, I mentioned creating a Distributed-Disperse 2 x (4 + 2) volume, but it is a Disperse 1 x (4 + 2) volume; all other steps remain the same.

Update:
========

> Reproduced the issue on another setup with DEBUG log level enabled for brick-log-level and client-log-level on the server side.
> Enabled "rpcdebug -m nfs -s all" on both clients.
> Started capturing packets before adding bricks to the volume and stopped after the client hung.
> tcpdumps are uploaded to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/vavuthu/nfs_hung_on_new-setup/
> Systems are kept in the same hung state.
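A hung NFS mount point like the one described above can block any command that touches it, so a check with a timeout is handy when confirming the state from automation. The following is only a minimal sketch and not part of the glusto test suite: the function names `probe_command` and `probe_mount` and the default timeout are assumptions made for illustration. It relies on `stat -f` from GNU coreutils being available on the client.

```python
import subprocess
import sys


def probe_command(cmd, timeout_s):
    """Run cmd and return 'ok', 'failed', or 'timeout' without waiting forever."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        rc = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort: a process blocked on a hung mount may not exit promptly,
        # so only wait briefly for it after sending the kill signal.
        proc.kill()
        try:
            proc.wait(timeout=1)
        except subprocess.TimeoutExpired:
            pass
        return "timeout"
    return "ok" if rc == 0 else "failed"


def probe_mount(mount_point, timeout_s=10.0):
    """Hypothetical helper: query file system status of a mount point via stat -f."""
    return probe_command(["stat", "-f", mount_point], timeout_s)
```

Calling `probe_mount("/mnt/nfs_hung")` on a client would be expected to return `"timeout"` if the mount is hung and `"ok"` if it responds; whether the blocked process can actually be killed depends on the client's NFS mount options.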
[root@rhsauto030 ~]# time df -h
^C
real 13m9.348s
user 0m0.000s
sys 0m0.005s

[root@rhsauto027 ~]# gluster vol info
Volume Name: testvol_dispersed
Type: Distributed-Disperse
Volume ID: 46280d4d-a2cd-4886-a07e-5075c59deb2d
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (4 + 2) = 12
Transport-type: tcp
Bricks:
Brick1: rhsauto027.lab.eng.blr.redhat.com:/bricks/brick0/testvol_dispersed_brick0
Brick2: rhsauto025.lab.eng.blr.redhat.com:/bricks/brick0/testvol_dispersed_brick1
Brick3: rhsauto021.lab.eng.blr.redhat.com:/bricks/brick0/testvol_dispersed_brick2
Brick4: rhsauto022.lab.eng.blr.redhat.com:/bricks/brick0/testvol_dispersed_brick3
Brick5: rhsauto024.lab.eng.blr.redhat.com:/bricks/brick0/testvol_dispersed_brick4
Brick6: rhsauto029.lab.eng.blr.redhat.com:/bricks/brick0/testvol_dispersed_brick5
Brick7: rhsauto027.lab.eng.blr.redhat.com:/bricks/brick1/testvol_dispersed_brick6
Brick8: rhsauto025.lab.eng.blr.redhat.com:/bricks/brick1/testvol_dispersed_brick7
Brick9: rhsauto021.lab.eng.blr.redhat.com:/bricks/brick1/testvol_dispersed_brick8
Brick10: rhsauto022.lab.eng.blr.redhat.com:/bricks/brick1/testvol_dispersed_brick9
Brick11: rhsauto024.lab.eng.blr.redhat.com:/bricks/brick1/testvol_dispersed_brick10
Brick12: rhsauto029.lab.eng.blr.redhat.com:/bricks/brick1/testvol_dispersed_brick11
Options Reconfigured:
diagnostics.client-log-level: DEBUG
diagnostics.brick-log-level: DEBUG
transport.address-family: inet
nfs.disable: off
[root@rhsauto027 ~]#

Jiffin - Could you please take a look at this and see if this is indeed a regression or not?

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
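The brick count in the `gluster vol info` output above follows from the `2 x (4 + 2)` layout: two disperse subvolumes, each with 4 data bricks and 2 redundancy bricks. As a rough illustration of that arithmetic, and of the change from the original 1 x (4 + 2) volume to the expanded one, here is a small sketch; the function name and dictionary keys are hypothetical and not part of any Gluster tooling.

```python
def disperse_layout(subvolumes, data_bricks, redundancy_bricks):
    """Summarise a Distributed-Disperse layout written as N x (data + redundancy)."""
    bricks_per_subvol = data_bricks + redundancy_bricks
    return {
        # Total bricks across all disperse subvolumes.
        "total_bricks": subvolumes * bricks_per_subvol,
        # Share of raw capacity that holds data rather than redundancy.
        "usable_fraction": data_bricks / bricks_per_subvol,
        # Brick failures each subvolume can tolerate.
        "tolerated_failures_per_subvol": redundancy_bricks,
    }


before = disperse_layout(1, 4, 2)  # volume as created: 1 x (4 + 2)
after = disperse_layout(2, 4, 2)   # volume after add-brick: 2 x (4 + 2)
print(before["total_bricks"], after["total_bricks"])  # 6 12
```

The expansion adds one full (4 + 2) subvolume, which matches the six `brick1` entries (Brick7 to Brick12) listed above.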