Bug 1418650

Summary: Samba crash when mounting a distributed dispersed volume over CIFS
Product: [Community] GlusterFS
Reporter: Xavi Hernandez <jahernan>
Component: disperse
Assignee: Xavi Hernandez <jahernan>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.10
CC: aspandey, bugs, jahernan, nbalacha, nigelb, pkarampu
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1402661
Environment:
Last Closed: 2017-03-06 17:44:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1402661
Bug Blocks:

Description Xavi Hernandez 2017-02-02 12:23:24 UTC
+++ This bug was initially created as a clone of Bug #1402661 +++

I noticed this when running glusto. Here's the output of the mount command:

COMMAND: mount -t cifs -o username=root,password=foobar \\\\172.19.2.47\\gluster-testvol_distributed-dispersed /mnt/testvol_distributed-dispersed_cifs
RETCODE: 32
STDERR:
mount error(11): Resource temporarily unavailable
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

I've attached log.smbd here. I'll try to attach the core file or add a link to cores.

--- Additional comment from Nigel Babu on 2016-12-08 05:58:25 CET ---

The cores are here: http://slave1.cloud.gluster.org/smb-dispersed-cores/

Please copy them to another more permanent location as soon as possible. This will be autodeleted in 15 days.

--- Additional comment from Nigel Babu on 2016-12-08 06:09:26 CET ---



--- Additional comment from Nigel Babu on 2016-12-09 07:49:00 CET ---

I have a feeling this happens over NFS as well. See https://ci.centos.org/view/Gluster/job/gluster_glusto/67/console

Ashish, let me know what more logs you need for NFS and I can get them.

--- Additional comment from Nigel Babu on 2016-12-19 16:40:52 CET ---

Hey, any idea of what's going on and when we can fix it? This is currently causing the Glusto tests to fail on master.

--- Additional comment from Ashish Pandey on 2016-12-20 05:41:27 CET ---

Hi,

I tried to look into the cores but could not find the binaries that were in use when this issue occurred.

I cannot debug it without the respective binaries.

Could you please provide a complete sosreport from all the servers?

Please also include the configuration of the volumes and all the steps required to reproduce this issue.

Ashish

--- Additional comment from Nigel Babu on 2016-12-20 07:19:37 CET ---

Were you able to reproduce the issue manually though? I can consistently reproduce the issue when trying to mount a disperse volume over CIFS on release-3.9 and master. You can either try it on your machine or I can redo the test and get you the binaries + cores. Let me know which works best.

--- Additional comment from Nigel Babu on 2017-01-02 15:12:13 CET ---

Fresh set of cores: http://slave25.cloud.gluster.org/logs/samba.tar.gz

Path to corresponding RPM: http://artifacts.ci.centos.org/gluster/nightly/master/7/x86_64/glusterfs-3.10dev-0.283.git0805642.el7.centos.x86_64.rpm

All gluster-related RPMs should be in the same folder and should have the same version string - "283.git0805642"

Link to SOS report from the server: http://slave25.cloud.gluster.org/logs/sosreport-n33.gusty.ci.centos.org.1402661-20170102140614.tar.xz

Samba version: samba-4.4.4-9.el7.x86_64

The tests are in https://github.com/gluster/glusto-tests. Please try to reproduce the bug. Let me know if you can't, or if you need clearer steps.

--- Additional comment from Anoop C S on 2017-01-13 06:53:08 CET ---

Following is the backtrace seen in Samba logs:

[2017/01/02 13:40:19.965429,  0] ../source3/lib/util.c:902(log_stack_trace)
  BACKTRACE: 21 stack frames:
   #0 /lib64/libsmbconf.so.0(log_stack_trace+0x1a) [0x7f926b7a5efa]
   #1 /lib64/libsmbconf.so.0(smb_panic_s3+0x20) [0x7f926b7a5fd0]
   #2 /lib64/libsamba-util.so.0(smb_panic+0x2f) [0x7f926dcd259f]
   #3 /lib64/libsamba-util.so.0(+0x247b6) [0x7f926dcd27b6]
   #4 /lib64/libpthread.so.0(+0xf370) [0x7f926df35370]
   #5 /usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x34c61) [0x7f924cd7ac61]
   #6 /usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x33804) [0x7f924cd79804]
   #7 /usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(init+0x1f4) [0x7f924cd53154]
   #8 /lib64/libglusterfs.so.0(xlator_init+0x4b) [0x7f92542b550b]
   #9 /lib64/libglusterfs.so.0(glusterfs_graph_init+0x29) [0x7f92542ecb29]
   #10 /lib64/libglusterfs.so.0(glusterfs_graph_activate+0x3b) [0x7f92542ed44b]
   #11 /lib64/libgfapi.so.0(+0x97cd) [0x7f92547af7cd]
   #12 /lib64/libgfapi.so.0(+0x9986) [0x7f92547af986]
   #13 /lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90) [0x7f9254596720]
   #14 /lib64/libgfrpc.so.0(rpc_clnt_notify+0x1df) [0x7f92545969ff]
   #15 /lib64/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7f92545928e3]
   #16 /usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x72f4) [0x7f924d22d2f4]
   #17 /usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x9795) [0x7f924d22f795]
   #18 /lib64/libglusterfs.so.0(+0x84590) [0x7f9254312590]
   #19 /lib64/libpthread.so.0(+0x7dc5) [0x7f926df2ddc5]
   #20 /lib64/libc.so.6(clone+0x6d) [0x7f9269eed73d]

While looking through various logs, I could see the same crash reported in glustershd.log as follows:

[2017-01-02 13:39:41.399861] I [MSGID: 100030] [glusterfsd.c:2455:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10dev (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/8998cabdccb9dec791fa49c3fd0ca055.socket --xlator-option *replicate*.node-uuid=4d66676a-7e25-49c5-8c18-2d29db0d8d9a)
[2017-01-02 13:39:41.439745] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-01-02 13:39:41.454243] I [MSGID: 122067] [ec-code.c:896:ec_code_detect] 0-testvol_dispersed-disperse-0: Using 'sse' CPU extensions
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2017-01-02 13:39:41
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.10dev
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7fb1a0cd7da0]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fb1a0ce16a4]
/lib64/libc.so.6(+0x35250)[0x7fb19f393250]
/usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x34c61)[0x7fb192fe2c61]
/usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(+0x33804)[0x7fb192fe1804]
/usr/lib64/glusterfs/3.10dev/xlator/cluster/disperse.so(init+0x1f4)[0x7fb192fbb154]
/lib64/libglusterfs.so.0(xlator_init+0x4b)[0x7fb1a0cd550b]
/lib64/libglusterfs.so.0(glusterfs_graph_init+0x29)[0x7fb1a0d0cb29]
/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x3b)[0x7fb1a0d0d44b]
/usr/sbin/glusterfs(glusterfs_process_volfp+0x12d)[0x7fb1a11d858d]
/usr/sbin/glusterfs(mgmt_getspec_cbk+0x3c1)[0x7fb1a11ddf51]
/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fb1a0a9e720]
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1df)[0x7fb1a0a9e9ff]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fb1a0a9a8e3]
/usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x72f4)[0x7fb19555d2f4]
/usr/lib64/glusterfs/3.10dev/rpc-transport/socket.so(+0x9795)[0x7fb19555f795]
/lib64/libglusterfs.so.0(+0x84590)[0x7fb1a0d32590]
/lib64/libpthread.so.0(+0x7dc5)[0x7fb19fb1ddc5]
/lib64/libc.so.6(clone+0x6d)[0x7fb19f45573d]
---------

Moreover, I can't think of anything from Samba's perspective that could lead to this crash, so this particular crash is more likely an issue with the EC translator.

I was able to reproduce this crash within self-heal daemon manually in a local setup with CentOS 7. I can share the same for further debugging.

@Nigel,
Since the cores provided in the links cannot be analysed without the exact debuginfo packages and binaries, it's better if you can run the following basic commands after attaching the core files to gdb.

$ gdb smbd <path-to-coredump-file>

In case gdb complains about missing debuginfo packages, run the suggested commands to install all dependent debuginfo packages (glusterfs-debuginfo and samba-debuginfo are required at the very minimum).

While you are inside gdb, save the output of the following gdb commands:
(gdb) bt
. . .
(gdb) thread apply all bt
. . .

--- Additional comment from Ashish Pandey on 2017-01-13 09:04:53 CET ---

As Anoop mentioned, the crash can be seen in the self-heal daemon while trying to start the volume.

Following is the backtrace and the possible issue:


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/gl'.
Program terminated with signal 11, Segmentation fault.
#0  list_add_tail (head=<optimized out>, new=<optimized out>) at ../../../../libglusterfs/src/list.h:40
40              new->next = head;
(gdb) bt
#0  list_add_tail (head=<optimized out>, new=<optimized out>) at ../../../../libglusterfs/src/list.h:40
#1  ec_code_space_alloc (size=400, code=0x7fb44802a7f0) at ec-code.c:428
#2  ec_code_alloc (size=<optimized out>, code=0x7fb44802a7f0) at ec-code.c:448
#3  ec_code_compile (builder=0x7fb44802a890) at ec-code.c:522
#4  ec_code_build (code=<optimized out>, width=width@entry=64, values=<optimized out>, count=<optimized out>, linear=linear@entry=_gf_true) at ec-code.c:631
#5  0x00007fb44d860f5b in ec_code_build_linear (code=<optimized out>, width=width@entry=64, values=<optimized out>, count=<optimized out>) at ec-code.c:638
#6  0x00007fb44d85f804 in ec_method_matrix_init (inverse=_gf_false, rows=0x7fb44e5038c0, mask=0, matrix=0x7fb44802a510, list=0x7fb448022228) at ec-method.c:106
#7  ec_method_setup (gen=<optimized out>, list=0x7fb448022228, xl=0x7fb44e503930) at ec-method.c:299
#8  ec_method_init (xl=xl@entry=0x7fb448017450, list=list@entry=0x7fb448022228, columns=<optimized out>, rows=<optimized out>, max=<optimized out>, gen=<optimized out>) at ec-method.c:343
#9  0x00007fb44d839154 in init (this=0x7fb448017450) at ec.c:635
#10 0x00007fb45b4f650b in __xlator_init (xl=0x7fb448017450) at xlator.c:403
#11 xlator_init (xl=xl@entry=0x7fb448017450) at xlator.c:428
#12 0x00007fb45b52db29 in glusterfs_graph_init (graph=graph@entry=0x7fb448000af0) at graph.c:320
#13 0x00007fb45b52e44b in glusterfs_graph_activate (graph=graph@entry=0x7fb448000af0, ctx=ctx@entry=0x7fb45bc5c010) at graph.c:670
#14 0x00007fb45b9f158d in glusterfs_process_volfp (ctx=ctx@entry=0x7fb45bc5c010, fp=fp@entry=0x7fb448002d30) at glusterfsd.c:2325
#15 0x00007fb45b9f6f51 in mgmt_getspec_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7fb458fd306c) at glusterfsd-mgmt.c:1675


ec_code_space_alloc:

    space = mmap(NULL, map_size, PROT_EXEC | PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (space == NULL) {
        return NULL;
    }
    /* It's not important to check the return value of mlock(). If it fails
     * everything will continue to work normally. */
    mlock(space, map_size);

    space->code = code;
    space->size = map_size;
    list_add_tail(&space->list, &code->spaces);      <<<<<<<<<
    INIT_LIST_HEAD(&space->chunks);

Do we need to actually check the mlock() return value?
I think the mmap() was successful, but something has messed up the memory.

--- Additional comment from Xavier Hernandez on 2017-01-13 12:51:31 CET ---

I've tried to reproduce this issue but I've been unable to see the crash.

I've created a dispersed 2+1 volume and started it. Self heal daemon has started successfully without crashing.

Is there anything else I need to do to see the self-heal daemon issue?

--- Additional comment from Anoop C S on 2017-01-13 13:06:00 CET ---

Hi Xavi,

I created a CentOS local VM and installed the exact version of glusterfs mentioned in comment #7. After starting a distributed-dispersed volume 2x(4+2), I noticed that the self-heal daemon was not online and found the core.

Please note that I was unable to reproduce the issue with corresponding source install on another VM.

--- Additional comment from Ashish Pandey on 2017-01-13 13:07:24 CET ---


Xavi,

We were also not able to reproduce this issue using the latest master source code.
We tried the following RPMs to reproduce it:

[root@centos /]# rpm -qa | grep gluster
glusterfs-client-xlators-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-server-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-geo-replication-3.10dev-0.283.git0805642.el7.centos.x86_64
python-gluster-3.10dev-0.283.git0805642.el7.centos.noarch
glusterfs-fuse-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-devel-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-rdma-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-libs-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-api-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-extra-xlators-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-api-devel-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-debuginfo-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-cli-3.10dev-0.283.git0805642.el7.centos.x86_64
glusterfs-events-3.10dev-0.283.git0805642.el7.centos.x86_64
samba-vfs-glusterfs-4.4.4-9.el7.x86_64

--- Additional comment from Xavier Hernandez on 2017-01-13 13:36:10 CET ---

I've just tried this exact version on CentOS 7 and created a 2 x (4 + 2) distributed-dispersed volume. The start worked fine and the self-heal daemon is running normally.

I'll try to analyze the sos report. Maybe I'll see something there...

--- Additional comment from Xavier Hernandez on 2017-01-13 13:52:31 CET ---

I've found something...

Analyzing the cores from smbd I've seen this:

   0x00007f924cd7ac33 <+531>:   callq  0x7f924cd4f8a0 <mmap64@plt>
   0x00007f924cd7ac38 <+536>:   test   %rax,%rax
   0x00007f924cd7ac3b <+539>:   mov    %rax,%r15
   0x00007f924cd7ac3e <+542>:   je     0x7f924cd7af45 <ec_code_build+1317>
   0x00007f924cd7ac44 <+548>:   mov    0x8(%rsp),%r10
   0x00007f924cd7ac49 <+553>:   mov    %rax,%rdi
   0x00007f924cd7ac4c <+556>:   mov    %r10,%rsi
   0x00007f924cd7ac4f <+559>:   callq  0x7f924cd4fdd0 <mlock@plt>
   0x00007f924cd7ac54 <+564>:   mov    0x30(%r13),%rax
   0x00007f924cd7ac58 <+568>:   mov    0x8(%rsp),%r10
   0x00007f924cd7ac5d <+573>:   lea    0x30(%r15),%rdx
=> 0x00007f924cd7ac61 <+577>:   mov    %r14,(%r15)

r15 = 0xffffffffffffffff

So mmap() has returned -1. Looking at the man page I've seen that if mmap() fails, it returns MAP_FAILED (-1) and not NULL as the code expects. This is a bug in ec. I'll send a patch.
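
For illustration, a minimal sketch of what the corrected check could look like (a hypothetical helper, not the actual patch):

#include <stddef.h>
#include <sys/mman.h>

/* mmap() reports failure with MAP_FAILED (-1), not NULL, so a NULL check
 * lets the bogus pointer flow into list_add_tail() and crash. */
static void *alloc_exec_region(size_t map_size)
{
    void *space = mmap(NULL, map_size, PROT_EXEC | PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (space == MAP_FAILED) {   /* not "space == NULL" */
        return NULL;             /* caller treats NULL as allocation failure */
    }
    /* mlock() failure is non-fatal; the mapping still works unpinned. */
    mlock(space, map_size);
    return space;
}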

--- Additional comment from Worker Ant on 2017-01-13 13:59:54 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check for mmap()) posted (#1) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Anoop C S on 2017-01-13 16:13:45 CET ---

Hi Xavi,

Thanks for the quick analysis and the resulting patch. The same is the reason for the crash from the self-heal daemon. See below for the core analysis from the locally reproduced crash with the self-heal daemon. Even though I noticed a -1 for the (ec_code_space_t *) space pointer, I never thought beyond it.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/gl'.
Program terminated with signal 11, Segmentation fault.
#0  list_add_tail (head=<optimized out>, new=<optimized out>) at ../../../../libglusterfs/src/list.h:40
40		new->next = head;
(gdb) f 1
#1  ec_code_space_alloc (size=400, code=0x7fc99402a7f0) at ec-code.c:428
428	    list_add_tail(&space->list, &code->spaces);
(gdb) l 420
415	        map_size = size;
416	    }
417	    space = mmap(NULL, map_size, PROT_EXEC | PROT_READ | PROT_WRITE,
418	                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
419	    if (space == NULL) {
420	        return NULL;
421	    }
422	    /* It's not important to check the return value of mlock(). If it fails
423	     * everything will continue to work normally. */
424	    mlock(space, map_size);
(gdb) l 430
425	
426	    space->code = code;
427	    space->size = map_size;
428	    list_add_tail(&space->list, &code->spaces);
429	    INIT_LIST_HEAD(&space->chunks);
430	
431	    chunk = ec_code_chunk_from_space(space);
432	    chunk->size = EC_CODE_SIZE - ec_code_space_size() - ec_code_chunk_size();
433	    list_add(&chunk->list, &space->chunks);
434	
(gdb) p space
$1 = (ec_code_space_t *) 0xffffffffffffffff
(gdb) p (int)space
$2 = -1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above confirms your findings.

@Xavi,
Which tool was used to analyze the core? I hope it's dbx, judging from the backtrace you provided in the previous comment.

--- Additional comment from Worker Ant on 2017-01-16 11:33:11 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check for mmap()) posted (#2) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Worker Ant on 2017-01-16 11:38:24 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check for mmap()) posted (#3) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Nigel Babu on 2017-01-17 09:22:50 CET ---

The mmap() call was being blocked by SELinux. I gathered the logs for Anoop today so he could figure out what's wrong.

We found 2 blocked SYSCALL entries:
type=SYSCALL msg=audit(1484635264.766:2718): arch=c000003e syscall=49 success=no exit=-13 a0=15 a1=7ff6f81ff400 a2=10 a3=7e items=0 ppid=1 pid=25743 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="glusterd" exe="/usr/sbin/glusterfsd" subj=system_u:system_r:glusterd_t:s0 key=(null)

type=SYSCALL msg=audit(1484635712.229:3062): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=10000 a2=7 a3=22 items=0 ppid=27586 pid=27592 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="glusterfs" exe="/usr/sbin/glusterfsd" subj=system_u:system_r:glusterd_t:s0 key=(null)

ausyscall says 49 is for bind and 9 for mmap. Anoop suspected that the second syscall might be the problem at hand. As per his suggestion, I ran the tests (just CIFS) with selinux in permissive mode and all tests passed. I'm running the full test suite with selinux in permissive mode to confirm all is well.

Now we'll have to figure out how to make sure we can apply a specific selinux policy for this particular access.

--- Additional comment from Xavier Hernandez on 2017-01-17 10:07:17 CET ---

I think the problem could be that the allocated memory will be used to store code, so the PROT_EXEC flag is passed to mmap. I think this is the only difference between this particular mmap() call and the other mmap() calls present in gluster code.

This is probably why selinux makes mmap() fail.

Does "exit=-13" mean that the errno returned by mmap() is 13 (EACCES)? In that case I could add a specific error message in the patch to clearly show that selinux could be the cause.

--- Additional comment from Anoop C S on 2017-01-17 10:34:20 CET ---

Here is the AVC in question:
type=AVC msg=audit(1484635756.506:3152): avc:  denied  { execmem } for  pid=27918 comm="smbd" scontext=system_u:system_r:smbd_t:s0 tcontext=system_u:system_r:smbd_t:s0 tclass=process

(In reply to Xavier Hernandez from comment #20)
> I think the problem could be that the allocated memory will be used to store
> code, so the PROT_EXEC flag is passed to mmap. I think this is the only
> difference between this particular mmap() call and the other mmap() calls
> present in gluster code.
> 
> Probably this will be the cause that selinux makes mmap() to fail.
> 

So this assumption is correct, as per the following one-line explanation given for the 'allow_execmem' selinux boolean on https://wiki.centos.org/TipsAndTricks/SelinuxBooleans:

. . .
allow_execmem (Memory Protection)
    Allow unconfined executables to map a memory region as both executable and writable, this is dangerous and the executable should be reported in bugzilla
. . .

So selinux prevents this memory mapping by default, because this particular call from EC specifies both PROT_EXEC and PROT_WRITE.

> Does "exit=-13" mean that the errno returned by mmap() is 13 (EACCES) ? In
> that case I could add a specific error message in the patch to clearly show
> that selinux could be the cause.

Yes.. you are right. See this:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/asm-generic/errno-base.h
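
For reference, a trivial check (assuming a Linux box) confirming that value:

#include <errno.h>
#include <stdio.h>

int main(void)
{
    /* Audit records report "exit=-13" because failing syscalls return
     * -errno; EACCES is 13 on Linux. */
    printf("EACCES = %d\n", EACCES);
    return 0;
}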

--- Additional comment from Xavier Hernandez on 2017-01-17 10:45:47 CET ---

If enabling only PROT_EXEC without PROT_WRITE does not trigger the selinux check, maybe we could change the way it's done. For example we could mmap only with PROT_WRITE, generate the code and then change the protection to only PROT_EXEC.

That would require considerable changes since the current implementation uses the same allocated memory to create multiple dynamic fragments of code as they are needed. We would need to have a single mmap() for each fragment of code.

What do you think?
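
A minimal sketch of that alternative (map writable first, then mprotect() to execute-only); whether selinux actually permits this is exactly the open question, and everything below is illustrative only:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t size = 4096;

    /* Map writable only; PROT_EXEC is not requested yet. */
    unsigned char *region = mmap(NULL, size, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return 1;

    region[0] = 0xc3;                     /* x86-64 'ret', standing in for generated code */

    /* Switch the region to read+execute once the code has been written,
     * so it is never writable and executable at the same time. */
    if (mprotect(region, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(region, size);
        return 1;
    }

    ((void (*)(void))region)();           /* run the freshly generated stub */
    printf("generated code executed\n");

    munmap(region, size);
    return 0;
}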

--- Additional comment from Anoop C S on 2017-01-17 13:18:46 CET ---

(In reply to Xavier Hernandez from comment #22)
> If enabling only PROT_EXEC without PROT_WRITE does not trigger the selinux
> check, maybe we could change the way it's done.

I am pretty sure that this is the case. But I currently do not have an option to test this out and confirm my analysis.

> For example we could mmap
> only with PROT_WRITE, generate the code and then change the protection to
> only PROT_EXEC.
>
> That would require considerable changes since the current implementation
> uses the same allocated memory to create multiple dynamic fragments of code
> as they are needed. We would need to have a single mmap() for each fragment
> of code.
> 
> What do you think ?

If we can make such a change safely without affecting the overall functionality in EC, then I would say we try once. :)

I admit that it would be a time-consuming task, hence the following question:
Is it possible to make a small change (probably a hack) in this particular area so as to try it out and confirm? Then we can go for the bigger one.

So my request would be to go for it as and when you find time to do so. Until then, we will use the custom selinux policy to get rid of the AVCs.

--- Additional comment from Nigel Babu on 2017-01-17 14:17:30 CET ---

If there's a patch that needs testing, push it onto review.gluster.org and I'm happy to test it out (I need rpms).

--- Additional comment from Xavier Hernandez on 2017-01-18 13:56:56 CET ---

I'm trying to reproduce the problem to see if the issue can be avoided by playing with the mmap() protection flags. However, I'm unable to get the error.

I've used CentOS 7.3.1611 with the latest patches and default configuration, but it doesn't fail (selinux is enabled by default). Have you used any custom setup?

I use this small program to try to reproduce the issue:

#include <stdio.h>
#include <sys/mman.h>
#include <errno.h>

#define MMAP_SIZE 4096

int main(void)
{
        void *ptr;

        ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (ptr == MAP_FAILED) {
                printf("mmap() error: %d\n", errno);
                return 1;
        }

        printf("mmap succeeded\n");

        munmap(ptr, MMAP_SIZE);

        return 0;
}

--- Additional comment from Xavier Hernandez on 2017-01-19 13:16:48 CET ---

I don't know selinux well enough to configure it correctly, so I cannot test it, but I've modified the patch to do the mmap() as recommended for selinux. It works fine

--- Additional comment from Xavier Hernandez on 2017-01-19 13:17:54 CET ---

...It works fine on my test machine, without selinux.

--- Additional comment from Worker Ant on 2017-01-19 13:18:49 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check for mmap()) posted (#4) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Worker Ant on 2017-01-19 13:27:57 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check for mmap()) posted (#5) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Nigel Babu on 2017-01-19 13:34:47 CET ---

I'm a little pre-occupied today. But I'll run a test first thing tomorrow.

--- Additional comment from Anoop C S on 2017-01-19 13:39:00 CET ---

(In reply to Xavier Hernandez from comment #25)
> I'm trying to reproduce the problem to see if the issue can be avoided
> playing with the mmap() protection flags. However I'm unable to get the
> error.
> 
> I've used a CentOS 7.3.1611 with latest patches and default configuration,
> but it doesn't fail (selinux is enabled by default). Have you used any
> custom setup ?
> 
> I use this small program to try to reproduce the issue:
> 
> #include <stdio.h>
> #include <sys/mman.h>
> #include <errno.h>
> 
> #define MMAP_SIZE 4096
> 
> int main(void)
> {
>         void *ptr;
> 
>         ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
>                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>         if (ptr == MAP_FAILED) {
>                 printf("mmap() error: %d\n", errno);
>                 return 1;
>         }
> 
>         printf("mmap succeeded\n");
> 
>         munmap(ptr, MMAP_SIZE);
> 
>         return 0;

To reproduce the AVC, please run the above program as below:

# > /var/log/audit/audit.log
# gcc mmap-selinux-test.c
# chcon -t glusterd_exec_t a.out
# runcon "system_u:system_r:glusterd_t:s0" ./a.out
# cat /var/log/audit/audit.log
type=AVC msg=audit(1484828797.810:996): avc:  denied  { execmem } for  pid=26592 comm="a.out" scontext=system_u:system_r:glusterd_t:s0 tcontext=system_u:system_r:glusterd_t:s0 tclass=process
type=SYSCALL msg=audit(1484828797.810:996): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=1000 a2=7 a3=22 items=0 ppid=26437 pid=26592 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=16 comm="a.out" exe="/root/a.out" subj=system_u:system_r:glusterd_t:s0 key=(null)

Now remove either PROT_EXEC or PROT_WRITE from the mmap() call and repeat the above steps. The AVCs should no longer be present.

Why do we need to do all this?
===========================
Because gluster binaries are run with the following selinux context:
# ps auxZ | grep /usr/sbin/glusterd | grep -v grep
system_u:system_r:glusterd_t:s0 root      7216  0.0  1.8 602096 18748 ?        Ssl  12:55   0:01 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO

So we need to test our sample programs in the same selinux context too, so that we can be sure about it.

Why couldn't you reproduce it?
==============================
Run `ps auxZ | grep /usr/sbin/glusterd | grep -v grep` and check under which context it is running. The behaviour changes based on the context under which the gluster daemon is running.

--- Additional comment from Xavier Hernandez on 2017-01-19 13:41:51 CET ---

Thank you very much, Anoop :) I'll run some tests with this.

--- Additional comment from Worker Ant on 2017-01-20 11:08:01 CET ---

REVIEW: http://review.gluster.org/16405 (cluster/ec: fix invalid error check for mmap()) posted (#6) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Xavier Hernandez on 2017-01-20 11:49:03 CET ---

I've been playing with selinux and I've been able to make it work. However, to create two mmaps of the same contents, a file needs to be created. In this case, mmap() fails if the file is not created in an executable directory. I've tried /tmp, /var/tmp, /var/run and /root, but all failed. If I use /sbin, it works.

The reason seems to be that files created in all those directories do not have the selinux bin_t type. If I manually set this type on the file, it works wherever the file is created.

Where's the best place to put that file without needing to manually set the selinux type from code?
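
For reference, a minimal sketch of the dual-mapping approach itself (the path below is just a placeholder; picking the right directory is exactly the open question here):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t size = 4096;

    /* Back the dynamic code with a file... */
    int fd = open("/tmp/ec-code-demo", O_RDWR | O_CREAT, 0700);
    if (fd < 0 || ftruncate(fd, size) != 0)
        return 1;

    /* ...and map it twice: one writable view to emit code into, one
     * executable view to run it from. */
    unsigned char *wr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    unsigned char *ex = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    if (wr == MAP_FAILED || ex == MAP_FAILED)
        return 1;

    wr[0] = 0xc3;                         /* x86-64 'ret' written through the writable view */
    ((void (*)(void))ex)();               /* executed through the exec-only view */
    printf("code written through one mapping ran through the other\n");

    munmap(wr, size);
    munmap(ex, size);
    close(fd);
    unlink("/tmp/ec-code-demo");
    return 0;
}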

--- Additional comment from Anoop C S on 2017-01-21 10:10:01 CET ---

Hi Xavi,

I ran a search through the current policies as follows in order to see the SELinux allow rules for glusterd_t and highlighted those in which execute permission is granted for class 'file':
# sesearch --allow | grep -E 'allow glusterd_t [a-z|_]* : file { ' | grep execute

From the output I think it is safe to create the file under /usr/libexec/glusterfs/ based on the following allow rule:
allow glusterd_t glusterd_exec_t : file { ioctl read getattr lock execute execute_no_trans entrypoint open } ;

By default, files under /usr/libexec/glusterfs will have system_u:object_r:bin_t as the SELinux context. I confirmed the same by modifying your sample C program. I don't know whether we already have /usr/libexec pre-defined in the glusterfs source, but I guess it's not a big deal.

Is this solution of creating the file under /usr/libexec/glusterfs/ for mmap() acceptable to you?

--- Additional comment from Worker Ant on 2017-01-25 13:59:07 CET ---

REVIEW: https://review.gluster.org/16405 (cluster/ec: fix selinux issues with mmap()) posted (#8) for review on master by Xavier Hernandez (xhernandez)

--- Additional comment from Xavier Hernandez on 2017-01-25 14:02:56 CET ---

The latest change should satisfy all selinux restrictions, so it should work without special rules.

Thanks Anoop for all the information you provided.

--- Additional comment from Worker Ant on 2017-02-02 13:02:32 CET ---

COMMIT: https://review.gluster.org/16405 committed in master by Jeff Darcy (jdarcy) 
------
commit db80efc8d5cc24597de636d8df2e5a9ce81d670d
Author: Xavier Hernandez <xhernandez>
Date:   Fri Jan 13 13:54:35 2017 +0100

    cluster/ec: fix selinux issues with mmap()
    
    EC uses mmap() to create a memory area for the dynamic code. Since
    the code is created on the fly and executed when needed, this region
    of memory needs to have write and execution privileges.
    
    This combination is not allowed by default by selinux. To solve the
    problem a file is used as a backend storage for the dynamic code and
    it's mapped into two distinct memory regions, one with write access
    and the other one with execution access. This approach is the
    recommended way to create dynamic code by a program in a more secure
    way, and selinux allows it.
    
    Additionally selinux requires that the backend file be stored in a
    directory marked with type bin_t to be able to map it in an executable
    area. To satisfy this condition, GLUSTERFS_LIBEXECDIR has been used.
    
    This fix also changes the error check for mmap(), that was done
    incorrectly (it checked against NULL instead of MAP_FAILED), and it
    also correctly propagates the error codes and makes sure they aren't
    silently ignored.
    
    Change-Id: I71c2f88be4e4d795b6cfff96ab3799c362c54291
    BUG: 1402661
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: https://review.gluster.org/16405
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Jeff Darcy <jdarcy>

Comment 1 Worker Ant 2017-02-02 12:32:10 UTC
REVIEW: https://review.gluster.org/16522 (cluster/ec: fix selinux issues with mmap()) posted (#1) for review on release-3.10 by Xavier Hernandez (xhernandez)

Comment 2 Worker Ant 2017-02-06 15:26:59 UTC
COMMIT: https://review.gluster.org/16522 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit cc0f3623c037529a3c3f3d1c81f2c8d281d64dba
Author: Xavier Hernandez <xhernandez>
Date:   Fri Jan 13 13:54:35 2017 +0100

    cluster/ec: fix selinux issues with mmap()
    
    EC uses mmap() to create a memory area for the dynamic code. Since
    the code is created on the fly and executed when needed, this region
    of memory needs to have write and execution privileges.
    
    This combination is not allowed by default by selinux. To solve the
    problem a file is used as a backend storage for the dynamic code and
    it's mapped into two distinct memory regions, one with write access
    and the other one with execution access. This approach is the
    recommended way to create dynamic code by a program in a more secure
    way, and selinux allows it.
    
    Additionally selinux requires that the backend file be stored in a
    directory marked with type bin_t to be able to map it in an executable
    area. To satisfy this condition, GLUSTERFS_LIBEXECDIR has been used.
    
    This fix also changes the error check for mmap(), that was done
    incorrectly (it checked against NULL instead of MAP_FAILED), and it
    also correctly propagates the error codes and makes sure they aren't
    silently ignored.
    
    > Change-Id: I71c2f88be4e4d795b6cfff96ab3799c362c54291
    > BUG: 1402661
    > Signed-off-by: Xavier Hernandez <xhernandez>
    > Reviewed-on: https://review.gluster.org/16405
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Jeff Darcy <jdarcy>
    
    Change-Id: I5c2dd51b1161505316c8f78b73e9a585d0c115d0
    BUG: 1418650
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: https://review.gluster.org/16522
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>

Comment 3 Shyamsundar 2017-03-06 17:44:58 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/