Bug 1342830
| Summary: | [Tiering]: when hot tier's subvol is brought down and up later, files from those subvols aren't listed in mountpoint | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | krishnaram Karthick <kramdoss> | |
| Component: | tier | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED WONTFIX | QA Contact: | krishnaram Karthick <kramdoss> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | rhgs-3.1 | CC: | nbalacha, rhs-bugs | |
| Target Milestone: | --- | Keywords: | ZStream | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1343002 | Environment: | ||
| Last Closed: | 2018-02-06 17:52:32 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1343002 | |||
Description
krishnaram Karthick
2016-06-05 16:51:49 UTC
The linkto files on the corresponding cold tier seem to be missing.
======= from mountpoint =========
before subvol was brought down
-rw-r--r--. 1 root root 10485760 Jun 6 10:39 file-1
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-10
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-2
-rw-r--r--. 1 root root 10485760 Jun 6 10:39 file-3
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-4
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-5
-rw-r--r--. 1 root root 10485760 Jun 6 10:39 file-6
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-7
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-8
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-9
after subvol was brought down and restored
ll
total 30720
-rw-r--r--. 1 root root 10485760 Jun 6 10:39 file-1
-rw-r--r--. 1 root root 10485760 Jun 6 10:39 file-3
-rw-r--r--. 1 root root 10485760 Jun 6 10:39 file-6
====== backend bricks =====
[root@dhcp47-28 files]# ll /bricks/brick{0,1}/abcd/files/
/bricks/brick0/abcd/files/:
total 0
---------T. 2 root root 0 Jun 6 10:39 file-1
---------T. 2 root root 0 Jun 6 10:39 file-3
---------T. 2 root root 0 Jun 6 10:39 file-6
/bricks/brick1/abcd/files/:
total 30720
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-1
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-3
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-6
[root@dhcp46-142 ~]# ll /bricks/brick{0,1}/abcd/files/
/bricks/brick0/abcd/files/:
total 0
---------T. 2 root root 0 Jun 6 10:39 file-1
---------T. 2 root root 0 Jun 6 10:39 file-3
---------T. 2 root root 0 Jun 6 10:39 file-6
/bricks/brick1/abcd/files/:
total 71680
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-10
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-2
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-4
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-5
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-7
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-8
-rw-r--r--. 2 root root 10485760 Jun 6 10:39 file-9
[root@dhcp46-44 ~]# ll /bricks/brick0/abcd/files/
total 0
[root@dhcp46-44 ~]#
[root@dhcp46-58 ~]# ll /bricks/brick0/abcd/files/
total 0
====== volume configuration =======
gluster v status
Status of volume: sd-down
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick 10.70.46.142:/bricks/brick1/abcd 49410 0 Y 22308
Brick 10.70.47.28:/bricks/brick1/abcd 49410 0 Y 2768
Cold Bricks:
Brick 10.70.47.28:/bricks/brick0/abcd 49409 0 Y 2689
Brick 10.70.46.142:/bricks/brick0/abcd 49409 0 Y 22070
Brick 10.70.46.44:/bricks/brick0/abcd 49400 0 Y 2390
Brick 10.70.46.58:/bricks/brick0/abcd 49401 0 Y 3809
NFS Server on localhost 2049 0 Y 2996
Self-heal Daemon on localhost N/A N/A Y 3004
NFS Server on 10.70.46.44 2049 0 Y 2571
Self-heal Daemon on 10.70.46.44 N/A N/A Y 2579
NFS Server on 10.70.46.142 2049 0 Y 22328
Self-heal Daemon on 10.70.46.142 N/A N/A Y 22336
NFS Server on 10.70.46.58 2049 0 Y 3991
Self-heal Daemon on 10.70.46.58 N/A N/A Y 3999
Task Status of Volume sd-down
------------------------------------------------------------------------------
Task : Tier migration
ID : 120068cd-a8fb-4dc7-a9f0-957c52d2015d
Status : in progress
[root@dhcp47-28 ~]# gluster v info
Volume Name: sd-down
Type: Tier
Volume ID: ae7562dc-c199-4f1d-8866-a2b5a183a7be
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: 10.70.46.142:/bricks/brick1/abcd
Brick2: 10.70.47.28:/bricks/brick1/abcd
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: 10.70.47.28:/bricks/brick0/abcd
Brick4: 10.70.46.142:/bricks/brick0/abcd
Brick5: 10.70.46.44:/bricks/brick0/abcd
Brick6: 10.70.46.58:/bricks/brick0/abcd
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
RCA: The hot tier is a pure distribute volume, so an entire DHT subvol becomes unavailable when a single brick is brought down. tier_readdirp lists files by reading only the cold tier and then doing lookups on any linkto files found. The lookup on a linkto file whose data file is on the brick that is down fails with ENOENT (the ENOTCONN op_errno is overwritten by the ENOENT from the brick that is up), so the linkto file is considered stale and deleted. Because the linkto files are no longer present, the files are not listed from the mount point even after the brick is brought back up.
Workaround: Once all bricks are up, running ls <filename> will recreate the linkto file.
This should not happen with a dist-rep hot tier.
Thank you for your bug report. We are not root causing this bug any further, so it is being closed as WONTFIX. Please reopen if the problem is still observed after upgrading to the latest version.
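The failure sequence in the RCA can be illustrated with a small toy model. This is only a sketch in Python, not GlusterFS code: the names `HOT_SUBVOLS`, `hot_tier_lookup`, `tier_readdirp_model` and the `keep_enotconn` switch are hypothetical, and the reply ordering (the disconnected subvol failing first) is an assumption made so that the ENOENT reply arrives last. The `keep_enotconn=True` branch shows one conceivable alternative merge rule for comparison; it is not a fix taken from this report.

```python
import errno

# Hypothetical model of the two hot-tier DHT subvols from the report:
# hot-0 ~ 10.70.47.28:/bricks/brick1, hot-1 ~ 10.70.46.142:/bricks/brick1.
HOT_SUBVOLS = {
    "hot-0": {"file-1", "file-3", "file-6"},
    "hot-1": {"file-2", "file-4", "file-5", "file-7",
              "file-8", "file-9", "file-10"},
}


def hot_tier_lookup(name, down, keep_enotconn):
    """Return 0 if the data file is found, else the merged errno."""
    replies = []
    for subvol, files in HOT_SUBVOLS.items():
        if subvol in down:
            replies.append(errno.ENOTCONN)
        elif name in files:
            replies.append(0)
        else:
            replies.append(errno.ENOENT)
    if 0 in replies:
        return 0
    # Assumption: the disconnected subvol fails first, the live one replies last.
    replies.sort(key=lambda err: err != errno.ENOTCONN)
    merged = 0
    for err in replies:
        if keep_enotconn and merged == errno.ENOTCONN:
            continue  # alternative rule: never overwrite ENOTCONN
        merged = err  # modelled bug: the last reply overwrites earlier ones
    return merged


def tier_readdirp_model(linktos, down, keep_enotconn):
    """List entries by walking cold-tier linkto files and looking up the hot tier."""
    listed = []
    for name in sorted(linktos):  # sorted() copies, so the set can be mutated
        err = hot_tier_lookup(name, down, keep_enotconn)
        if err == 0:
            listed.append(name)
        elif err == errno.ENOENT:
            linktos.discard(name)  # linkto treated as stale and deleted
        # ENOTCONN: linkto is kept, entry is just not listed for now
    return listed


all_files = set().union(*HOT_SUBVOLS.values())

# Modelled bug: list while hot-1 is down, then again after it is restored.
buggy_linktos = set(all_files)
tier_readdirp_model(buggy_linktos, down={"hot-1"}, keep_enotconn=False)
after_restore_buggy = tier_readdirp_model(buggy_linktos, down=set(),
                                          keep_enotconn=False)

# Same sequence with ENOTCONN preserved: the linkto files survive the outage.
safe_linktos = set(all_files)
tier_readdirp_model(safe_linktos, down={"hot-1"}, keep_enotconn=True)
after_restore_safe = tier_readdirp_model(safe_linktos, down=set(),
                                         keep_enotconn=True)

print(after_restore_buggy)
print(len(after_restore_safe))
```

In this model the first run ends with only file-1, file-3 and file-6 listed after the restore, which matches the mount point listing in the description. The second run keeps all ten linkto entries because a lookup that ends in ENOTCONN never marks a linkto file as stale.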