Description of problem:
=======================
On a replicate volume with snapshots taken, when one of the storage nodes is rebooted, the volume-id extended attribute of the brick on that node changes. The changed volume-id on the brick matches the volume-id of the first snapshot.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-server-3.6.0.5-1.el6rhs.x86_64

root@fan [May-22-2014-14:56:17] >gluster --version
glusterfs 3.6.0.5 built on May 20 2014 10:52:06

How reproducible:
=================
Tried once

Steps to Reproduce:
===================
1. Create a 1 x 2 replicate volume and start it.
2. Create a snapshot of the volume: snap_0.
3. Create a fuse mount and start "dd" on a 20GB file.
4. While dd is in progress and the file has reached 1GB, create snapshot snap_1.
5. Continue dd from the mount. While dd is in progress and the file has reached 3.5GB, create snapshot snap_2.
6. While dd is still in progress, reboot node1.
7. From node2, set self-heal-daemon to off.
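As a concrete sketch of the steps above (hostnames fan/mia and brick paths taken from this report; the mount point and dd parameters are assumptions, and the snapshot timing depends on watching the file size by hand):

# on fan (node1)
gluster volume create vol_rep replica 2 fan:/rhs/bricks/b1 mia:/rhs/bricks/b2
gluster volume start vol_rep
gluster snapshot create snap_0 vol_rep
mount -t glusterfs fan:/vol_rep /mnt/vol_rep        # fuse mount (assumed mount point)
dd if=/dev/urandom of=/mnt/vol_rep/testfile bs=1M count=20480 &
gluster snapshot create snap_1 vol_rep              # when testfile reaches ~1GB
gluster snapshot create snap_2 vol_rep              # when testfile reaches ~3.5GB
reboot                                              # reboot node1 while dd is still running
# then on mia (node2)
gluster volume set vol_rep cluster.self-heal-daemon off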
Actual results:
===============
After the reboot, the volume-id of brick1 has changed, and hence the brick process of the volume is not started.

root@fan [May-22-2014-12:38:06] >gluster v info

Volume Name: vol_rep
Type: Replicate
Volume ID: 68a62e9c-073b-4343-b10b-a5d934aac6f9
Status: Created
Snap Volume: no
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: fan:/rhs/bricks/b1
Brick2: mia:/rhs/bricks/b2
root@fan [May-22-2014-12:38:10] >

getfattr on the brick before reboot:
====================================
root@fan [May-22-2014-12:40:53] >getfattr -d -e hex -m . /rhs/bricks/b1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1/
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x68a62e9c073b4343b10ba5d934aac6f9

getfattr on the brick after reboot:
===================================
root@fan [May-22-2014-13:15:57] >getfattr -d -e hex -m . /rhs/bricks/b1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1/
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x129fb89816a74aa78a036282c771216d

volume info after reboot:
=========================
root@fan [May-22-2014-14:53:21] >gluster v info

Volume Name: vol_rep
Type: Replicate
Volume ID: 68a62e9c-073b-4343-b10b-a5d934aac6f9
Status: Started
Snap Volume: no
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: fan:/rhs/bricks/b1
Brick2: mia:/rhs/bricks/b2
Options Reconfigured:
features.barrier: disable
cluster.self-heal-daemon: off
root@fan [May-22-2014-14:53:23] >

volume status after reboot:
===========================
root@fan [May-22-2014-14:53:58] >gluster v status
Status of volume: vol_rep
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick fan:/rhs/bricks/b1                        N/A     N       N/A
Brick mia:/rhs/bricks/b2                        49152   Y       2698
NFS Server on localhost                         2049    Y       1625
NFS Server on mia                               2049    Y       2710

Task Status of Volume vol_rep
------------------------------------------------------------------------------
There are no active volume tasks

root@fan [May-22-2014-14:54:00] >

getfattr on the brick which was always online:
==============================================
root@mia [May-22-2014-13:09:42] >getfattr -d -e hex -m . /rhs/bricks/b2/
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b2/
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x68a62e9c073b4343b10ba5d934aac6f9

Expected results:
=================
The volume-ids of the bricks should not change.
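The brick process verifies the trusted.glusterfs.volume-id xattr against the volume's recorded ID at startup and refuses to run on a mismatch, which is why brick1 stays offline above. A minimal sketch for spotting this condition directly, using the paths from this report:

# the hex xattr on the brick root (dashes stripped) should equal the recorded volume-id
getfattr -n trusted.glusterfs.volume-id -e hex /rhs/bricks/b1
grep ^volume-id /var/lib/glusterd/vols/vol_rep/info
# here the brick reports 129fb898... (snap_0's volume-id) instead of 68a62e9c...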
Additional info:
================
root@fan [May-22-2014-13:14:10] >gluster snapshot status volume vol_rep

Snap Name : snap_0
Snap UUID : 64539bb5-e501-4701-aa5c-0239a7728735

    Brick Path      : fan:/var/run/gluster/snaps/129fb89816a74aa78a036282c771216d/brick1/b1
    Volume Group    : RHS_vg1
    Brick Running   : Yes
    Brick PID       : 1563
    Data Percentage : 1.26
    LV Size         : 59.00g

    Brick Path      : mia:/var/run/gluster/snaps/129fb89816a74aa78a036282c771216d/brick2/b2
    Volume Group    : RHS_vg1
    Brick Running   : Yes
    Brick PID       : 2796
    Data Percentage : 1.26
    LV Size         : 59.00g

Snap Name : snap_1
Snap UUID : d48bcae3-3a05-4c80-b2c9-57693fd7f295

    Brick Path      : fan:/var/run/gluster/snaps/8807fbdf289a439d80e0156b37905a7d/brick1/b1
    Volume Group    : RHS_vg1
    Brick Running   : Yes
    Brick PID       : 1568
    Data Percentage : 3.52
    LV Size         : 59.00g

    Brick Path      : mia:/var/run/gluster/snaps/8807fbdf289a439d80e0156b37905a7d/brick2/b2
    Volume Group    : RHS_vg1
    Brick Running   : Yes
    Brick PID       : 2920
    Data Percentage : 3.60
    LV Size         : 59.00g

Snap Name : snap_2
Snap UUID : 2c974724-7fc0-4589-bed9-ad3696f5d2d9

    Brick Path      : fan:/var/run/gluster/snaps/2c3f716e67a947b4afed5a9df9a80ef5/brick1/b1
    Volume Group    : RHS_vg1
    Brick Running   : Yes
    Brick PID       : 1575
    Data Percentage : 6.71
    LV Size         : 59.00g

    Brick Path      : mia:/var/run/gluster/snaps/2c3f716e67a947b4afed5a9df9a80ef5/brick2/b2
    Volume Group    : RHS_vg1
    Brick Running   : Yes
    Brick PID       : 3018
    Data Percentage : 6.76
    LV Size         : 59.00g

root@fan [May-22-2014-13:14:20] >

root@fan [May-22-2014-13:12:39] >cat /var/lib/glusterd/vols/vol_rep/info
type=2
count=2
status=1
sub_count=2
stripe_count=1
replica_count=2
version=9
transport-type=0
parent_volname=N/A
volume-id=68a62e9c-073b-4343-b10b-a5d934aac6f9
username=c4031efe-d623-44d9-9e21-6c27d06181cf
password=5fd73e10-76f2-450c-aa13-75d1b4849ce2
op-version=2
client-op-version=2
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
features.barrier=disable
cluster.self-heal-daemon=off
brick-0=fan:-rhs-bricks-b1
brick-1=mia:-rhs-bricks-b2
root@fan [May-22-2014-13:12:59] >

root@fan [May-22-2014-14:56:20] >cat /var/lib/glusterd/snaps/snap_0/129fb89816a74aa78a036282c771216d/info
type=2
count=2
status=1
sub_count=2
stripe_count=1
replica_count=2
version=2
transport-type=0
parent_volname=vol_rep
volume-id=129fb898-16a7-4aa7-8a03-6282c771216d
username=c15648d1-5b1b-4591-8e23-cfcede13ea62
password=ba6efb04-3c46-47d4-bb6d-fbfe43bd3ea0
op-version=4
client-op-version=2
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
features.barrier=enable
brick-0=fan:-var-run-gluster-snaps-129fb89816a74aa78a036282c771216d-brick1-b1
brick-1=mia:-var-run-gluster-snaps-129fb89816a74aa78a036282c771216d-brick2-b2
root@fan [May-22-2014-15:02:13] >
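Note that the brick's post-reboot volume-id (129fb898-16a7-4aa7-8a03-6282c771216d) is exactly snap_0's volume-id from the info file above, as if the snapshot LV rather than the origin LV ended up behind the brick path. A minimal sketch for checking which device actually backs the brick mount point after the reboot (the VG name is from the report; LV layout is an assumption):

# which device is mounted at the brick path?
findmnt /rhs/bricks/b1
# list LVs and their snapshot origins in the brick VG
lvs -o lv_name,origin,lv_size RHS_vg1
# filesystem UUIDs; an LVM snapshot initially shares its origin's fs UUID
blkid | grep RHS_vg1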
SOS Reports : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1100211/
NOTE: It is not only the volume-id of the brick that changed; the contents of the brick are also lost.
The logs do not have entries from the time the problem occurred; they only contain the latest messages, where you are already seeing the symptom. Is it possible for you to reproduce it again? It is not reproducible on our local setup. From the mount logs it seems that the main volume's brick path is mounted with the snapshot brick. Did you perform any snapshot restore operation, mount the volume brick explicitly, or do any other activity that could interfere with the brick mount point?
I didn't perform anything other than what is mentioned in the steps to reproduce. I will try to recreate the issue.
I am able to recreate this issue on build "glusterfs 3.6.0.12 built on Jun 3 2014 11:03:28". This time, in addition to the volume-id change on the rebooted node's brick, the brick which was always online got killed. Because of this, I/O on the mount failed with Input/Output Error.

Log messages from the brick which was always online and got shut down:
=======================================================================
[2014-06-04 08:17:10.700408] E [posix.c:4274:_posix_handle_xattr_keyvalue_pair] 0-vol_rep-posix: fgetxattr failed on fd=18 while doing xattrop: Key:trusted.afr.vol_rep-client-1 (Input/output error)
[2014-06-04 08:17:10.700450] I [server-rpc-fops.c:1867:server_fxattrop_cbk] 0-vol_rep-server: 65458: FXATTROP 0 (37267384-bf53-4d3a-8114-581d64090819) ==> (Success)
[2014-06-04 08:17:24.543339] W [posix-helpers.c:1409:posix_health_check_thread_proc] 0-vol_rep-posix: stat() on /rhs/bricks/b2 returned: Input/output error
[2014-06-04 08:17:24.543399] M [posix-helpers.c:1429:posix_health_check_thread_proc] 0-vol_rep-posix: health-check failed, going down
[2014-06-04 08:17:34.555909] I [client_t.c:184:gf_client_get] 0-vol_rep-server: client_uid=fan.lab.eng.blr.redhat.com-1716-2014/06/04-08:17:33:650199-vol_rep-client-1-0-0
[2014-06-04 08:17:34.556005] I [server-handshake.c:578:server_setvolume] 0-vol_rep-server: accepted client from fan.lab.eng.blr.redhat.com-1716-2014/06/04-08:17:33:650199-vol_rep-client-1-0-0 (version: 3.6.0.12)
[2014-06-04 08:17:34.556416] I [client_t.c:184:gf_client_get] 0-vol_rep-server: client_uid=fan.lab.eng.blr.redhat.com-1716-2014/06/04-08:17:33:650199-vol_rep-client-1-0-0
[2014-06-04 08:17:34.558531] W [posix-helpers.c:538:posix_pstat] 0-vol_rep-posix: lstat failed on /rhs/bricks/b2/ (Input/output error)
[2014-06-04 08:17:34.685769] I [client_t.c:184:gf_client_get] 0-vol_rep-server: client_uid=fan.lab.eng.blr.redhat.com-1691-2014/06/04-08:17:32:646939-vol_rep-client-1-0-0
[2014-06-04 08:17:34.685854] I [server-handshake.c:578:server_setvolume] 0-vol_rep-server: accepted client from fan.lab.eng.blr.redhat.com-1691-2014/06/04-08:17:32:646939-vol_rep-client-1-0-0 (version: 3.6.0.12)
[2014-06-04 08:17:34.686235] I [client_t.c:184:gf_client_get] 0-vol_rep-server: client_uid=fan.lab.eng.blr.redhat.com-1691-2014/06/04-08:17:32:646939-vol_rep-client-1-0-0
[2014-06-04 08:17:34.686433] W [posix-helpers.c:538:posix_pstat] 0-vol_rep-posix: lstat failed on /rhs/bricks/b2/ (Input/output error)
[2014-06-04 08:17:34.686799] W [posix-helpers.c:538:posix_pstat] 0-vol_rep-posix: lstat failed on /rhs/bricks/b2/ (Input/output error)
[2014-06-04 08:17:34.686826] E [posix.c:148:posix_lookup] 0-vol_rep-posix: lstat on /rhs/bricks/b2/ failed: Input/output error
[2014-06-04 08:17:34.686941] W [posix-helpers.c:538:posix_pstat] 0-vol_rep-posix: lstat failed on /rhs/bricks/b2/ (Input/output error)
[2014-06-04 08:17:34.686972] E [posix.c:148:posix_lookup] 0-vol_rep-posix: lstat on /rhs/bricks/b2/ failed: Input/output error
[2014-06-04 08:17:34.687012] E [server-rpc-fops.c:190:server_lookup_cbk] 0-vol_rep-server: 8: LOOKUP / (00000000-0000-0000-0000-000000000001) ==> (Input/output error)
[2014-06-04 08:17:54.543680] M [posix-helpers.c:1434:posix_health_check_thread_proc] 0-vol_rep-posix: still alive! -> SIGTERM
[2014-06-04 08:17:54.544080] W [glusterfsd.c:1182:cleanup_and_exit] (--> 0-: received signum (15), shutting down
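The stat()/lstat() Input/output errors suggest the backing filesystem itself went bad underneath the brick process before the posix health-check killed it. A minimal triage sketch for that situation (brick path from the report, everything else generic):

# run on the affected node; look for filesystem shutdown / I/O errors in the kernel log
dmesg | grep -iE 'xfs|i/o error'
stat /rhs/bricks/b2          # reproduces the EIO the health-check saw
findmnt /rhs/bricks/b2       # confirm which device is (still) mounted there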
This issue is seen because the file-system UUIDs of the origin volume and the snapshot volume are the same: when we take an LVM snapshot, the file-system UUID is also replicated. There are file-system-specific tools available to fix this, but AFAIK no file-system-agnostic solution is available as of now. Will be sending a patch soon, after some more investigation.
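To illustrate (the LV names below are hypothetical, not from this report): an LVM snapshot is a block-level copy, so the snapshot LV carries the origin's filesystem UUID, and anything that identifies or mounts filesystems by UUID can pick the wrong LV. For XFS, two per-filesystem workarounds exist:

# after a snapshot, both LVs report the same filesystem UUID
lvcreate -s -L 1G -n brick_snap /dev/RHS_vg1/brick_lv
blkid /dev/RHS_vg1/brick_lv /dev/RHS_vg1/brick_snap
# XFS option 1: mount the snapshot while ignoring the duplicate UUID
mount -o nouuid /dev/RHS_vg1/brick_snap /mnt/snap
# XFS option 2: give the (unmounted) snapshot a fresh UUID
xfs_admin -U generate /dev/RHS_vg1/brick_snap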
Review posted in downstream https://code.engineering.redhat.com/gerrit/#/c/26739/
Verified the fix on the build "glusterfs 3.6.0.17 built on Jun 13 2014 11:01:21" using the steps as mentioned in the bug description. Bug is fixed. Moving the bug to Verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html