Bug 1388948 - glusterfs can't self heal character dev file for invalid dev_t parameters
Summary: glusterfs can't self heal character dev file for invalid dev_t parameters
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1384297 1388912
Blocks: 1388734 1388949
Reported: 2016-10-26 14:21 UTC by Pranith Kumar K
Modified: 2016-11-29 09:36 UTC (History)

Fixed In Version: glusterfs-3.8.6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1388912
: 1388949 (view as bug list)
Environment:
Last Closed: 2016-11-29 09:36:46 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Pranith Kumar K 2016-10-26 14:21:39 UTC
+++ This bug was initially created as a clone of Bug #1388912 +++

+++ This bug was initially created as a clone of Bug #1384297 +++

Description of problem:
For a replicate volume, if a character device file exists on only one brick, it cannot be healed to the other brick. The following errors are logged.

1. glusterfs server side log:
bricks/mnt-bricks-export-brick.log:[2016-08-29 06:25:55.380204] E [posix.c:1145:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero failed: Invalid argument
bricks/mnt-bricks-export-brick.log:[2016-08-29 06:25:55.380223] I [server-rpc-fops.c:522:server_mknod_cbk] 0-export-server: 27: MKNOD /myzero (00000000-0000-0000-0000-000000000001/myzero) ==> (Invalid argument)

2. glusterfs client side log:
glusterfs/glustershd.log:[2016-08-29 06:25:56.481530] W [client-rpc-fops.c:240:client3_3_mknod_cbk] 0-export-client-1: remote operation failed: Invalid argument. Path: (null)


Version-Release number of selected component (if applicable):
3.6.9


How reproducible:
On a replicate volume:
1. Shut down one brick of the volume.
2. Create a character device file in the volume:
mknod myzero c 1 5
3. Start the volume again.
4. Check whether the character device file has been healed.
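Step 4 can be checked by inspecting the file directly on each brick's backend path (for example /mnt/bricks/export/brick/myzero, as seen in the logs above). The following is a minimal sketch of such a check, assuming Linux with glibc; `check_chardev` is a hypothetical helper written for this illustration and is not part of GlusterFS.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 0 if `path` is a character device with the
 * expected major/minor numbers, 1 if it exists but does not match, and -1
 * if stat() fails (for example because the file was never healed).
 */
static int check_chardev(const char *path, unsigned int want_major,
                         unsigned int want_minor)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;                      /* missing or inaccessible */
    if (!S_ISCHR(st.st_mode))
        return 1;                       /* exists, but not a char device */
    if (major(st.st_rdev) != want_major || minor(st.st_rdev) != want_minor)
        return 1;                       /* wrong device numbers */
    return 0;
}
```

For the reproducer above, a healed brick would be expected to satisfy `check_chardev("/mnt/bricks/export/brick/myzero", 1, 5) == 0`.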


Additional info:
I printed the parameters passed to mknod; they are not correct.

[2016-08-29 08:44:48.015571] E [posix.c:1150:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero failed: Invalid argument
[2016-08-29 08:44:48.015589] I [server-rpc-fops.c:522:server_mknod_cbk] 0-export-server: 2950: MKNOD /myzero (00000000-0000-0000-0000-000000000001/myzero) ==> (Invalid argument)
[2016-08-29 08:45:33.330540] E [posix.c:1129:posix_mknod] 0-export-posix: mknod on /mnt/bricks/export/brick/myzero , mode: 0x21a4, dev: 0x5, major 16777216, minor 5
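The numbers in the last log line are consistent with a packed device value reaching mknod unchanged. The sketch below walks through the arithmetic under two assumptions that this report does not confirm: (a) the stored rdev keeps the major number in the upper 32 bits and the minor number in the lower 32 bits, and (b) the debug print derived the major as `dev >> 8` and the minor as `dev & 0xff`. The names `pack_rdev`, `legacy_major`, and `legacy_minor` are hypothetical.

```c
#include <stdint.h>

/* Assumed packing: major in the upper 32 bits, minor in the lower 32 bits. */
static uint64_t pack_rdev(uint32_t maj, uint32_t min)
{
    return ((uint64_t)maj << 32) | min;
}

/* Assumed decoding used by the debug print: classic 8-bit-minor layout. */
static uint64_t legacy_major(uint64_t dev)
{
    return dev >> 8;
}

static uint64_t legacy_minor(uint64_t dev)
{
    return dev & 0xff;
}
```

Under these assumptions, `mknod myzero c 1 5` is stored as 0x100000005. Its low 32 bits are 0x5, `legacy_major` gives 0x1000000 = 16777216, and `legacy_minor` gives 5, which matches the logged "dev: 0x5, major 16777216, minor 5". The logged mode 0x21a4 is S_IFCHR (0x2000) combined with permission bits 0644 (0x1a4), so the mode itself looks correct.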

--- Additional comment from xiaopwu on 2016-10-12 23:31:23 EDT ---

The root cause of the issue is shown below:
--- old/afr-self-heal-entry.c
+++ new/afr-self-heal-entry.c
@@ -142,8 +142,10 @@
                ret = dict_set_int32 (xdata, GLUSTERFS_INTERNAL_FOP_KEY, 1);
                if (ret)
                        goto out;
+
                ret = syncop_mknod (priv->children[dst], &loc, mode,
-                                   iatt->ia_rdev, xdata, &newent);
+                   makedev (ia_major (iatt->ia_rdev), ia_minor (iatt->ia_rdev)), xdata, &newent);
+
                if (ret == 0 && newent.ia_nlink == 1) {
                        /* New entry created. Mark @dst pending on all sources */
                         newentry[dst] = 1;
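The diff above converts the stored rdev into a native dev_t before calling mknod. A standalone sketch of that conversion follows, assuming Linux with glibc and assuming that `ia_major` and `ia_minor` read the upper and lower 32 bits of the stored value (their definitions are not shown in this report). `packed_major`, `packed_minor`, and `native_dev_from_packed` are hypothetical stand-ins, not GlusterFS functions.

```c
#include <stdint.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Stand-ins for ia_major()/ia_minor(), under the assumed 32/32-bit packing. */
static uint32_t packed_major(uint64_t rdev)
{
    return (uint32_t)(rdev >> 32);
}

static uint32_t packed_minor(uint64_t rdev)
{
    return (uint32_t)(rdev & 0xffffffffu);
}

/* Re-encode the packed value in the platform's own dev_t layout. */
static dev_t native_dev_from_packed(uint64_t rdev)
{
    return makedev(packed_major(rdev), packed_minor(rdev));
}
```

The point of going through `makedev` is that the dev_t bit layout is platform specific, so a value packed in a different layout cannot be passed to mknod as-is, even though it carries the same major and minor numbers.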

--- Additional comment from Pranith Kumar K on 2016-10-25 08:48:23 EDT ---

This is a very good catch. We have the same bug in EC too. I will send out the patches. Thanks a lot!

--- Additional comment from xiaopwu on 2016-10-25 21:15:06 EDT ---

Could you merge the patch to glusterfs 3.6.9?

--- Additional comment from Pranith Kumar K on 2016-10-25 21:20:31 EDT ---

Hi,
  3.6.x is nearing EOL. I will make sure the patch reaches 3.9.x, 3.8.x and 3.7.x.

Pranith

--- Additional comment from xiaopwu on 2016-10-25 21:23:02 EDT ---

ok, thanks.

--- Additional comment from Worker Ant on 2016-10-25 21:49:40 EDT ---

REVIEW: http://review.gluster.org/15728 (afr,ec: Heal device files with correct major, minor numbers) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Worker Ant on 2016-10-25 22:47:39 EDT ---

REVIEW: http://review.gluster.org/15728 (afr,ec: Heal device files with correct major, minor numbers) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Worker Ant on 2016-10-26 08:22:54 EDT ---

COMMIT: http://review.gluster.org/15728 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 3a540cc12f171393751467e2de436311bdf9be6d
Author: Pranith Kumar K <pkarampu>
Date:   Wed Oct 26 06:51:18 2016 +0530

    afr,ec: Heal device files with correct major, minor numbers
    
    Thanks a lot to xiaoping.wu from Nokia for the bug and the
    fix.
    
    BUG: 1384297
    Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/15728
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Xavier Hernandez <xhernandez>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 1 Worker Ant 2016-10-26 14:48:10 UTC
REVIEW: http://review.gluster.org/15735 (afr,ec: Heal device files with correct major, minor numbers) posted (#1) for review on release-3.8 by Pranith Kumar Karampuri (pkarampu)

Comment 2 Worker Ant 2016-10-27 06:23:49 UTC
COMMIT: http://review.gluster.org/15735 committed in release-3.8 by Xavier Hernandez (xhernandez) 
------
commit 6e18d90b218dfa3d6ecdea8cb4f8a7ce56bde74a
Author: Pranith Kumar K <pkarampu>
Date:   Wed Oct 26 06:51:18 2016 +0530

    afr,ec: Heal device files with correct major, minor numbers
    
    Thanks a lot to xiaoping.wu from Nokia for the bug and the
    fix.
    
     >BUG: 1384297
     >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
     >Signed-off-by: Pranith Kumar K <pkarampu>
     >Reviewed-on: http://review.gluster.org/15728
     >Smoke: Gluster Build System <jenkins.org>
     >NetBSD-regression: NetBSD Build System <jenkins.org>
     >Reviewed-by: Xavier Hernandez <xhernandez>
     >CentOS-regression: Gluster Build System <jenkins.org>
    
    Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a
    BUG: 1388948
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/15735
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Xavier Hernandez <xhernandez>

Comment 3 Niels de Vos 2016-11-29 09:36:46 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.6, please open a new bug report.

glusterfs-3.8.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2016-November/000217.html
[2] https://www.gluster.org/pipermail/gluster-users/

