Bug 1378978 - [RFE] Glusterd seems to be ignoring that the underlying filesystem went missing
Summary: [RFE] Glusterd seems to be ignoring that the underlying filesystem went missing
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: posix
Version: 3.7.13
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-09-23 18:38 UTC by Luca Gervasi
Modified: 2023-09-14 03:31 UTC (History)
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-08 10:53:43 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Luca Gervasi 2016-09-23 18:38:31 UTC
Description of problem:
glusterd continues writing (and creates all the missing directories) when the brick device gets unmounted (due to high latency and mdraid, in my scenario). Here is a full description:

19:56:50: disk is detached from my system. This disk is actually the brick of the volume V.
19:56:50: LVM sees the disk as unreachable and starts its maintenance procedures
19:56:50: LVM umounts my thin provisioned volumes
19:57:02: Health check on specific bricks fails thus moving the brick to a down state
19:57:32: XFS filesystem umounts

At this point, the brick filesystem is no longer mounted. The underlying filesystem is empty (it is missing the brick directory too). My assumption was that Gluster would stop itself under such conditions: it does not.
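
For reference, the "health-check failed, going down" messages in the logs below come from the posix health-check thread, which periodically probes the brick backend and takes the brick process down when the probe fails. The probe interval should be tunable per volume; a minimal example, assuming the option is named storage.health-check-interval as in the 3.7 documentation:

# show the current (or default) health-check interval for the volume
gluster volume get vol-homes storage.health-check-interval

# probe the brick backend more aggressively, e.g. every 10 seconds
gluster volume set vol-homes storage.health-check-interval 10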


MD (yes, I use md to aggregate 4 disks into a single 4 TB volume):
/dev/md128:
        Version : 1.2
  Creation Time : Mon Aug 29 18:10:45 2016
     Raid Level : raid0
     Array Size : 4290248704 (4091.50 GiB 4393.21 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Mon Aug 29 18:10:45 2016
          State : clean 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : 128
           UUID : d5c51214:43e48da9:49086616:c1371514
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       80        0      active sync   /dev/sdf
       1       8       96        1      active sync   /dev/sdg
       2       8      112        2      active sync   /dev/sdh
       3       8      128        3      active sync   /dev/sdi

PV, VG, LV status
  PV         VG      Fmt  Attr PSize PFree DevSize PV UUID                               
  /dev/md127 VGdata  lvm2 a--  2.00t 2.00t   2.00t Kxb6C0-FLIH-4rB1-DKyf-IQuR-bbPE-jm2mu0
  /dev/md128 gluster lvm2 a--  4.00t 1.07t   4.00t lDazuw-zBPf-Duis-ZDg1-3zfg-53Ba-2ZF34m
 
 VG      Attr   Ext   #PV #LV #SN VSize VFree VG UUID                                VProfile
  VGdata  wz--n- 4.00m   1   0   0 2.00t 2.00t XI2V2X-hdxU-0Jrn-TN7f-GSEk-7aNs-GCdTtn         
  gluster wz--n- 4.00m   1   6   0 4.00t 1.07t ztxX4f-vTgN-IKop-XePU-OwqW-T9k6-A6uDk0  

 LV                  VG      #Seg Attr       LSize   Maj Min KMaj KMin Pool     Origin Data%  Meta%  Move Cpy%Sync Log Convert LV UUID                                LProfile
  apps-data           gluster    1 Vwi-aotz--  50.00g  -1  -1  253   12 thinpool        0.08                                    znUMbm-ax1N-R7aj-dxLc-gtif-WOvk-9QC8tq         
  feed                gluster    1 Vwi-aotz-- 100.00g  -1  -1  253   14 thinpool        0.08                                    hZ4Isk-dELG-lgFs-2hJ6-aYid-8VKg-3jJko9         
  homes               gluster    1 Vwi-aotz--   1.46t  -1  -1  253   11 thinpool        58.58                                   salIPF-XvsA-kMnm-etjf-Uaqy-2vA9-9WHPkH         
  search-data         gluster    1 Vwi-aotz-- 100.00g  -1  -1  253   13 thinpool        16.41                                   Z5hoa3-yI8D-dk5Q-2jWH-N5R2-ge09-RSjPpQ         
  thinpool            gluster    1 twi-aotz--   2.93t  -1  -1  253    9                 29.85  60.00                            oHTbgW-tiPh-yDfj-dNOm-vqsF-fBNH-o1izx2         
  video-asset-manager gluster    1 Vwi-aotz-- 100.00g  -1  -1  253   15 thinpool        0.07                                    4dOXga-96Wa-u3mh-HMmE-iX1I-o7ov-dtJ8lZ 

Gluster volume configuration (all volumes use the same configuration, so listing them all would be redundant):
Volume Name: vol-homes
Type: Replicate
Volume ID: 0c8fa62e-dd7e-429c-a19a-479404b5e9c6
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: glu01.prd.azr:/bricks/vol-homes/brick1
Brick2: glu02.prd.azr:/bricks/vol-homes/brick1
Brick3: glu03.prd.azr:/bricks/vol-homes/brick1
Options Reconfigured:
performance.readdir-ahead: on
cluster.server-quorum-type: server
nfs.disable: disable
cluster.lookup-unhashed: auto
performance.nfs.quick-read: on
performance.nfs.read-ahead: on
performance.cache-size: 4096MB
cluster.self-heal-daemon: enable
diagnostics.brick-log-level: ERROR
diagnostics.client-log-level: ERROR
nfs.rpc-auth-unix: off
nfs.acl: off
performance.nfs.io-cache: on
performance.client-io-threads: on
performance.nfs.stat-prefetch: on
performance.nfs.io-threads: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
performance.md-cache-timeout: 1
performance.cache-refresh-timeout: 1
performance.io-thread-count: 16
performance.high-prio-threads: 16
performance.normal-prio-threads: 16
performance.low-prio-threads: 16
performance.least-prio-threads: 1
cluster.server-quorum-ratio: 60

fstab:
/dev/gluster/homes                              /bricks/vol-homes                   xfs defaults,noatime,nobarrier,nofail 0 2
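
Not part of the original setup, but a workaround sketch of the kind of check that would help here, assuming the paths from the fstab above; the glusterfsd process pattern is an assumption and would need to match the local brick command line:

#!/bin/bash
# Watchdog sketch: if the brick filesystem is no longer a mount point,
# stop the corresponding brick process instead of letting it recreate
# directories on the root filesystem underneath.
BRICK_MOUNT=/bricks/vol-homes        # mount point from the fstab above
BRICK_PATH=$BRICK_MOUNT/brick1       # brick directory inside that mount

if ! mountpoint -q "$BRICK_MOUNT"; then
    echo "$BRICK_MOUNT is not mounted, stopping its brick process" >&2
    # process pattern is an assumption; adjust to the local glusterfsd arguments
    pkill -f "glusterfsd.*${BRICK_PATH}"
fi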

Logs:

Sep 22 19:56:50 glu03 lvm[868]: WARNING: Device for PV lDazuw-zBPf-Duis-ZDg1-3zfg-53Ba-2ZF34m not found or rejected by a filter.
Sep 22 19:56:50 glu03 lvm[868]: Cannot change VG gluster while PVs are missing.
Sep 22 19:56:50 glu03 lvm[868]: Consider vgreduce --removemissing.
Sep 22 19:56:50 glu03 lvm[868]: Failed to extend thin metadata gluster-thinpool-tpool.
Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-homes.
Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-search-data.
Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-apps-data.
Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-video-asset-manager.
Sep 22 19:57:02 glu03 bricks-vol-video-asset-manager-brick1[45162]: [2016-09-22 17:57:02.713428] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-vol-video-asset-manager-posix: health-check failed, going down
Sep 22 19:57:05 glu03 bricks-vol-apps-data-brick1[44536]: [2016-09-22 17:57:05.186146] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-vol-apps-data-posix: health-check failed, going down
Sep 22 19:57:18 glu03 bricks-vol-search-data-brick1[40928]: [2016-09-22 17:57:18.674279] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-vol-search-data-posix: health-check failed, going down
Sep 22 19:57:32 glu03 bricks-vol-video-asset-manager-brick1[45162]: [2016-09-22 17:57:32.714461] M [MSGID: 113075] [posix-helpers.c:1850:posix_health_check_thread_proc] 0-vol-video-asset-manager-posix: still alive! -> SIGTERM
Sep 22 19:57:32 glu03 kernel: XFS (dm-15): Unmounting Filesystem
Sep 22 19:57:35 glu03 bricks-vol-apps-data-brick1[44536]: [2016-09-22 17:57:35.186352] M [MSGID: 113075] [posix-helpers.c:1850:posix_health_check_thread_proc] 0-vol-apps-data-posix: still alive! -> SIGTERM
Sep 22 19:57:35 glu03 kernel: XFS (dm-12): Unmounting Filesystem
Sep 22 19:57:48 glu03 bricks-vol-search-data-brick1[40928]: [2016-09-22 17:57:48.674444] M [MSGID: 113075] [posix-helpers.c:1850:posix_health_check_thread_proc] 0-vol-search-data-posix: still alive! -> SIGTERM
Sep 22 19:57:48 glu03 kernel: XFS (dm-13): Unmounting Filesystem


Version-Release number of selected component (if applicable):
CentOS Linux release 7.1.1503 (Core) 
glusterfs-api-3.7.13-1.el7.x86_64
glusterfs-libs-3.7.13-1.el7.x86_64
glusterfs-3.7.13-1.el7.x86_64
glusterfs-fuse-3.7.13-1.el7.x86_64
glusterfs-server-3.7.13-1.el7.x86_64
glusterfs-client-xlators-3.7.13-1.el7.x86_64
glusterfs-cli-3.7.13-1.el7.x86_64

How reproducible:


Steps to Reproduce:
1. Create a filesystem on a network-attached disk (iSCSI works fine)
2. Sever the link to the target
3. Write data to the volume so that it gets replicated onto the severed disk
4. Observe Gluster writing to the local filesystem (see the sketch below)
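
A rough local approximation of steps 2-4, without an actual iSCSI target (the lazy unmount and the client mount point are assumptions, not the exact failure from the logs):

# on one replica server (e.g. glu03): detach the brick filesystem,
# roughly what the LVM monitor did in the logs above
umount -l /bricks/vol-homes

# on a client with the volume FUSE-mounted (mount point is an assumption)
mkdir -p /mnt/vol-homes/testdir
dd if=/dev/zero of=/mnt/vol-homes/testdir/file bs=1M count=10

# back on the server: the directory structure reappears on the root filesystem
ls -lR /bricks/vol-homes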

Actual results:
Glusterd continues writing to the local disk, creating the full directory structure.

Expected results:
Glusterd refuses to write to a filesystem that is missing the root brick structure.

Additional info:
A new flag could be useful that either prevents glusterfs from writing, or forces it to bring the bricks down, when the underlying filesystem goes missing.
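
For what it's worth, the brick root normally carries a trusted.glusterfs.volume-id extended attribute, so such a check could probably key on its absence; a hedged sketch using getfattr, with the brick path taken from the volume info above:

# on a healthy brick the volume-id xattr is present on the brick root
getfattr -n trusted.glusterfs.volume-id -e hex /bricks/vol-homes/brick1

# if the backing filesystem disappears and the directory gets recreated,
# the xattr is gone, which could be used to refuse further writes
if ! getfattr -n trusted.glusterfs.volume-id /bricks/vol-homes/brick1 >/dev/null 2>&1; then
    echo "brick root has no volume-id xattr; backing filesystem is probably gone" >&2
fi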

Comment 1 Niels de Vos 2016-09-27 12:20:37 UTC
Part of the email thread is here:

   http://www.gluster.org/pipermail/gluster-users/2016-September/028445.html

Please reply to the questions that were posted (how were the brick processes restarted, and is there anything in the brick logs?). It would be best to reply to the email and post the response here as well.

Thanks!

Comment 2 Kaushal 2017-03-08 10:53:43 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

Comment 3 Red Hat Bugzilla 2023-09-14 03:31:21 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

