Bug 1388949 - glusterfs can't self heal character dev file for invalid dev_t parameters
Summary: glusterfs can't self heal character dev file for invalid dev_t parameters
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.7.15
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1384297 1388912 1388948
Blocks: 1388734
Reported: 2016-10-26 14:22 UTC by Pranith Kumar K
Modified: 2016-11-16 10:51 UTC

Fixed In Version: glusterfs-3.7.17
Clone Of: 1388948
Environment:
Last Closed: 2016-11-16 10:51:48 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Pranith Kumar K 2016-10-26 14:22:23 UTC
+++ This bug was initially created as a clone of Bug #1388948 +++

+++ This bug was initially created as a clone of Bug #1388912 +++

+++ This bug was initially created as a clone of Bug #1384297 +++

Description of problem:
On a replicate volume, if a character device file exists on only one brick, it cannot be healed to the other brick. The following errors are logged:

1. glusterfs server side log:
bricks/mnt-bricks-export-brick.log:[2016-08-29 06:25:55.380204] E [posix.c:1145:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero failed: Invalid argument
bricks/mnt-bricks-export-brick.log:[2016-08-29 06:25:55.380223] I [server-rpc-fops.c:522:server_mknod_cbk] 0-export-server: 27: MKNOD /myzero (00000000-0000-0000-0000-000000000001/myzero) ==> (Invalid argument)

2. glusterfs client side log:
glusterfs/glustershd.log:[2016-08-29 06:25:56.481530] W [client-rpc-fops.c:240:client3_3_mknod_cbk] 0-export-client-1: remote operation failed: Invalid argument. Path: (null)


Version-Release number of selected component (if applicable):
3.6.9


How reproducible:
On a replicate volume:
1. Shut down one brick of the volume.
2. Create a character device file in the volume:
mknod myzero c 1 5
3. Start the volume again.
4. Check whether the character device file has been healed.


Additional info:
I printed the parameters passed to mknod; they are not correct.

[2016-08-29 08:44:48.015571] E [posix.c:1150:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero failed: Invalid argument
[2016-08-29 08:44:48.015589] I [server-rpc-fops.c:522:server_mknod_cbk] 0-export-server: 2950: MKNOD /myzero (00000000-0000-0000-0000-000000000001/myzero) ==> (Invalid argument)
[2016-08-29 08:45:33.330540] E [posix.c:1129:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero , mode: 0x21a4, dev: 0x5, major 16777216, minor 5
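
One possible reading of the numbers in the last log line, offered as a sketch rather than a confirmed diagnosis: if the device number arrives in GlusterFS's internal 64-bit form (assumed here to hold the major in the upper 32 bits and the minor in the lower 32 bits) and is then split with a traditional 8-bit shift, "mknod myzero c 1 5" yields exactly major 16777216 and minor 5. The helper names below are hypothetical and are not taken from the GlusterFS sources; both the packing and the way the debug line computed its values are assumptions.

```c
#include <stdint.h>

/* Hypothetical mirror of the internal packing (an assumption):
 * major in the upper 32 bits, minor in the lower 32 bits. */
static inline uint64_t demo_ia_makedev(uint32_t maj, uint32_t min)
{
    return ((uint64_t)maj << 32) | min;
}

/* Traditional 8-bit split, assumed to be how the debug line
 * derived the "major" and "minor" values it printed. */
static inline uint64_t demo_naive_major(uint64_t dev)
{
    return dev >> 8;
}

static inline uint64_t demo_naive_minor(uint64_t dev)
{
    return dev & 0xff;
}
```

Under these assumptions, demo_ia_makedev(1, 5) is 0x100000005, whose low 32 bits are 0x5 (matching "dev: 0x5"), and the naive split gives 0x100000005 >> 8 = 16777216 and 0x100000005 & 0xff = 5. The mode 0x21a4 is octal 020644, that is S_IFCHR with permissions 0644, so the mode itself looks correct.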

--- Additional comment from xiaopwu on 2016-10-12 23:31:23 EDT ---

The root cause of the issue is shown in the patch below: iatt->ia_rdev is passed to syncop_mknod without being converted with makedev.
--- old/afr-self-heal-entry.c
+++ new/afr-self-heal-entry.c
@@ -142,8 +142,10 @@
                ret = dict_set_int32 (xdata, GLUSTERFS_INTERNAL_FOP_KEY, 1);
                if (ret)
                        goto out;
+
                ret = syncop_mknod (priv->children[dst], &loc, mode,
-                                   iatt->ia_rdev, xdata, &newent);
+                   makedev (ia_major (iatt->ia_rdev), ia_minor (iatt->ia_rdev)), xdata, &newent);
+
                if (ret == 0 && newent.ia_nlink == 1) {
                        /* New entry created. Mark @dst pending on all sources */
                         newentry[dst] = 1;
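
The conversion in the patch above can be illustrated in isolation. This is a minimal sketch, assuming the internal rdev keeps the major in the upper 32 bits and the minor in the lower 32 bits; demo_ia_major, demo_ia_minor and demo_rdev_to_host are hypothetical stand-ins written for this example, not the real ia_major/ia_minor definitions. makedev, major and minor are the standard macros, found in <sys/sysmacros.h> on Linux/glibc (other platforms may provide them through <sys/types.h>).

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/sysmacros.h> /* makedev, major, minor on Linux/glibc */

/* Hypothetical stand-ins for ia_major/ia_minor (assumed layout:
 * major in the upper 32 bits, minor in the lower 32 bits). */
static inline uint32_t demo_ia_major(uint64_t ia_dev)
{
    return (uint32_t)(ia_dev >> 32);
}

static inline uint32_t demo_ia_minor(uint64_t ia_dev)
{
    return (uint32_t)(ia_dev & 0xffffffffu);
}

/* Re-encode the internal value as a host dev_t, which is the form
 * mknod(2) expects; this is the same step the patch adds. */
static inline dev_t demo_rdev_to_host(uint64_t ia_rdev)
{
    return makedev(demo_ia_major(ia_rdev), demo_ia_minor(ia_rdev));
}
```

With an internal value of ((uint64_t)1 << 32) | 5, the result decodes with the host macros to major 1 and minor 5, which is what "mknod myzero c 1 5" created on the source brick.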

--- Additional comment from Pranith Kumar K on 2016-10-25 08:48:23 EDT ---

This is a very good catch. We have the same bug in EC too. I will send out the patches. Thanks a lot!!

--- Additional comment from xiaopwu on 2016-10-25 21:15:06 EDT ---

Could you merge the patch to glusterfs 3.6.9?

--- Additional comment from Pranith Kumar K on 2016-10-25 21:20:31 EDT ---

hi,
  3.6.x is nearing EOL. I will make sure the patch reaches 3.9.x, 3.8.x and 3.7.x.

Pranith

--- Additional comment from xiaopwu on 2016-10-25 21:23:02 EDT ---

ok, thanks.

--- Additional comment from Worker Ant on 2016-10-25 21:49:40 EDT ---

REVIEW: http://review.gluster.org/15728 (afr,ec: Heal device files with correct major, minor numbers) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Worker Ant on 2016-10-25 22:47:39 EDT ---

REVIEW: http://review.gluster.org/15728 (afr,ec: Heal device files with correct major, minor numbers) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Worker Ant on 2016-10-26 08:22:54 EDT ---

COMMIT: http://review.gluster.org/15728 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 3a540cc12f171393751467e2de436311bdf9be6d
Author: Pranith Kumar K <pkarampu>
Date:   Wed Oct 26 06:51:18 2016 +0530

    afr,ec: Heal device files with correct major, minor numbers
    
    Thanks a lot to xiaoping.wu from Nokia for the bug and the
    fix.
    
    BUG: 1384297
    Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/15728
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Xavier Hernandez <xhernandez>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 1 Worker Ant 2016-10-26 14:49:18 UTC
REVIEW: http://review.gluster.org/15736 (afr,ec: Heal device files with correct major, minor numbers) posted (#1) for review on release-3.7 by Pranith Kumar Karampuri (pkarampu)

Comment 2 Worker Ant 2016-10-27 06:23:23 UTC
COMMIT: http://review.gluster.org/15736 committed in release-3.7 by Xavier Hernandez (xhernandez) 
------
commit 24adb0607683cfec784933f332252a4cd53b8cb7
Author: Pranith Kumar K <pkarampu>
Date:   Wed Oct 26 06:51:18 2016 +0530

    afr,ec: Heal device files with correct major, minor numbers
    
    Thanks a lot to xiaoping.wu from Nokia for the bug and the
    fix.
    
     >BUG: 1384297
     >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
     >Signed-off-by: Pranith Kumar K <pkarampu>
     >Reviewed-on: http://review.gluster.org/15728
     >Smoke: Gluster Build System <jenkins.org>
     >NetBSD-regression: NetBSD Build System <jenkins.org>
     >Reviewed-by: Xavier Hernandez <xhernandez>
     >CentOS-regression: Gluster Build System <jenkins.org>
    
    Change-Id: I28636a741592335cebcaa1abc2af8460ebc740e1
    BUG: 1388949
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/15736
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Xavier Hernandez <xhernandez>

Comment 3 Samikshan Bairagya 2016-11-16 10:51:48 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.17, please open a new bug report.

glusterfs-3.7.17 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-devel/2016-November/051414.html
[2] https://www.gluster.org/pipermail/gluster-users/

