Bug 1384297 - glusterfs can't self heal character dev file for invalid dev_t parameters
Summary: glusterfs can't self heal character dev file for invalid dev_t parameters
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: selfheal
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1388734 1388912 1388948 1388949
 
Reported: 2016-10-13 03:27 UTC by xiaopwu
Modified: 2017-03-06 17:29 UTC (History)

Fixed In Version: glusterfs-3.10.0
Clone Of:
Clones: 1388734 1388912
Environment:
Last Closed: 2017-03-06 17:29:11 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description xiaopwu 2016-10-13 03:27:15 UTC
Description of problem:
On a replicate volume, if a character device file exists on only one brick, it cannot be healed to the other brick. The following errors are logged.

1. glusterfs server side log:
bricks/mnt-bricks-export-brick.log:[2016-08-29 06:25:55.380204] E [posix.c:1145:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero failed: Invalid argument
bricks/mnt-bricks-export-brick.log:[2016-08-29 06:25:55.380223] I [server-rpc-fops.c:522:server_mknod_cbk] 0-export-server: 27: MKNOD /myzero (00000000-0000-0000-0000-000000000001/myzero) ==> (Invalid argument)

2. glusterfs client side log:
glusterfs/glustershd.log:[2016-08-29 06:25:56.481530] W [client-rpc-fops.c:240:client3_3_mknod_cbk] 0-export-client-1: remote operation failed: Invalid argument. Path: (null)


Version-Release number of selected component (if applicable):
3.6.9


How reproducible:
On a replicate volume:
1. Shut down one brick of the volume.
2. Create a character device file in the volume:
mknod myzero c 1 5
3. Start the volume again so that the brick comes back online.
4. Check whether the character device file has been healed.


Additional info:
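I printed the parameters passed to mknod; the device number is incorrect (the last log line below shows major 16777216 for a file created with major 1, minor 5).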

[2016-08-29 08:44:48.015571] E [posix.c:1150:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero failed: Invalid argument
[2016-08-29 08:44:48.015589] I [server-rpc-fops.c:522:server_mknod_cbk] 0-export-server: 2950: MKNOD /myzero (00000000-0000-0000-0000-000000000001/myzero) ==> (Invalid argument)
[2016-08-29 08:45:33.330540] E [posix.c:1129:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero , mode: 0x21a4, dev: 0x5, major 16777216, minor 5

Comment 1 xiaopwu 2016-10-13 03:31:23 UTC
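The root cause is shown in the patch below: self-heal passes iatt->ia_rdev to syncop_mknod unconverted, instead of rebuilding the device number with makedev from its major and minor parts: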
--- old/afr-self-heal-entry.c
+++ new/afr-self-heal-entry.c
@@ -142,8 +142,10 @@
                ret = dict_set_int32 (xdata, GLUSTERFS_INTERNAL_FOP_KEY, 1);
                if (ret)
                        goto out;
+
                ret = syncop_mknod (priv->children[dst], &loc, mode,
-                                   iatt->ia_rdev, xdata, &newent);
+                   makedev (ia_major (iatt->ia_rdev), ia_minor (iatt->ia_rdev)), xdata, &newent);
+
                if (ret == 0 && newent.ia_nlink == 1) {
                        /* New entry created. Mark @dst pending on all sources */
                         newentry[dst] = 1;
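
To illustrate why the conversion in the patch above matters, here is a minimal sketch. It assumes that the iatt device field packs the major number into the upper 32 bits and the minor number into the lower 32 bits; that layout is an assumption made for illustration, and the sketch_* helpers are hypothetical stand-ins, not the real ia_major/ia_minor from GlusterFS. It also assumes Linux with glibc, where makedev(), major() and minor() come from <sys/sysmacros.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/sysmacros.h>   /* makedev(), major(), minor() on Linux/glibc */

/* Hypothetical helpers: ASSUMED 32/32 packing of major and minor.
 * They only model the idea behind ia_major()/ia_minor(). */
static inline uint64_t sketch_pack(uint32_t maj, uint32_t min)
{
    return ((uint64_t)maj << 32) | min;
}

static inline uint32_t sketch_major(uint64_t rdev)
{
    return (uint32_t)(rdev >> 32);
}

static inline uint32_t sketch_minor(uint64_t rdev)
{
    return (uint32_t)(rdev & 0xffffffffu);
}

/* Rebuild a native dev_t from the packed value, mirroring what the
 * patch does before calling syncop_mknod(). */
static inline dev_t sketch_to_native(uint64_t rdev)
{
    return makedev(sketch_major(rdev), sketch_minor(rdev));
}

/* Returns 1 when the native dev_t decodes back to the original pair. */
static inline int sketch_roundtrip_ok(uint32_t maj, uint32_t min)
{
    dev_t native = sketch_to_native(sketch_pack(maj, min));

    assert(sizeof(native) >= 4); /* sanity check only */
    return major(native) == maj && minor(native) == min;
}
```

For the device from the reproduction steps (mknod myzero c 1 5), sketch_roundtrip_ok(1, 5) holds under these assumptions. The packed value is not itself a valid native dev_t, so handing it to mknod unconverted would be consistent with the "Invalid argument" errors in the brick log.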

Comment 2 Pranith Kumar K 2016-10-25 12:48:23 UTC
This is a very good catch. We have the same bug in EC too. I will send out the patches. Thanks a lot!

Comment 3 xiaopwu 2016-10-26 01:15:06 UTC
Could you merge the patch into glusterfs 3.6.9?

Comment 4 Pranith Kumar K 2016-10-26 01:20:31 UTC
Hi,
  3.6.x is nearing EOL. I will make sure the patch reaches 3.9.x, 3.8.x and 3.7.x.

Pranith

Comment 5 xiaopwu 2016-10-26 01:23:02 UTC
ok, thanks.

Comment 6 Worker Ant 2016-10-26 01:49:40 UTC
REVIEW: http://review.gluster.org/15728 (afr,ec: Heal device files with correct major, minor numbers) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 7 Worker Ant 2016-10-26 02:47:39 UTC
REVIEW: http://review.gluster.org/15728 (afr,ec: Heal device files with correct major, minor numbers) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 8 Worker Ant 2016-10-26 12:22:54 UTC
COMMIT: http://review.gluster.org/15728 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 3a540cc12f171393751467e2de436311bdf9be6d
Author: Pranith Kumar K <pkarampu>
Date:   Wed Oct 26 06:51:18 2016 +0530

    afr,ec: Heal device files with correct major, minor numbers
    
    Thanks a lot to xiaoping.wu from Nokia for the bug and the
    fix.
    
    BUG: 1384297
    Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/15728
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Xavier Hernandez <xhernandez>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 9 Shyamsundar 2017-03-06 17:29:11 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

