Bug 1073314

Summary:	Reshape was stuck
Product:	Red Hat Enterprise Linux 7	Reporter:	XiaoNi <xni>
Component:	selinux-policy	Assignee:	Lukas Vrabec <lvrabec>
Status:	CLOSED ERRATA	QA Contact:	Zhang Yi <yizhan>
Severity:	unspecified	Docs Contact:
Priority:	high
Version:	7.0	CC:	dhowells, dledford, eguan, eparis, Jes.Sorensen, lvrabec, mgrepl, mmalik, plautrba, pvrabec, ssekidde, xni, xzhou
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	selinux-policy-3.13.1-50.el7	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	1246035		Environment:
Last Closed:	2015-11-19 10:22:04 UTC	Type:	Bug
Bug Blocks:	1246035

Description XiaoNi 2014-03-06 08:02:25 UTC

Description of problem:


Version-Release number of selected component (if applicable):
The kernel is 3.10.0-101.el7

How reproducible:

100% on ppc64 and s390x platforms. It is not easy to reproduce on the x86_64 platform.

Steps to Reproduce:


[root@ibm-p730-03-lp2 ~]# mdadm -CR /dev/md0 -l5 -n3 /dev/loop[0-2] --assume-clean
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Mar  4 04:00:35 2014
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Mar  4 04:00:35 2014
mdadm: /dev/loop2 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Mar  4 04:00:35 2014
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@ibm-p730-03-lp2 ~]# mdadm /dev/md0  -a /dev/loop3
mdadm: added /dev/loop3
[root@ibm-p730-03-lp2 ~]# mdadm --grow --raid-devices 4 /dev/md0
mdadm: Need to backup 3072K of critical section..
[root@ibm-p730-03-lp2 ~]# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 loop3[3] loop2[2] loop1[1] loop0[0]
      1022976 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  0.3% (1536/511488) finish=10092.8min speed=0K/sec
      
unused devices: <none>

   The reshape speed is zero! It works well on the x86_64 platform, though. The dmesg output follows:


[root@ibm-p730-03-lp2 ~]# dmesg
[ 8937.798375] md: bind<loop0>
[ 8937.798432] md: bind<loop1>
[ 8937.798475] md: bind<loop2>
[ 8937.801173] md/raid:md0: device loop2 operational as raid disk 2
[ 8937.801181] md/raid:md0: device loop1 operational as raid disk 1
[ 8937.801184] md/raid:md0: device loop0 operational as raid disk 0
[ 8937.801736] md/raid:md0: allocated 49362kB
[ 8937.801797] md/raid:md0: raid level 5 active with 3 out of 3 devices, algorithm 2
[ 8937.801802] RAID conf printout:
[ 8937.801804]  --- level:5 rd:3 wd:3
[ 8937.801806]  disk 0, o:1, dev:loop0
[ 8937.801807]  disk 1, o:1, dev:loop1
[ 8937.801808]  disk 2, o:1, dev:loop2
[ 8937.801829] md0: detected capacity change from 0 to 1047527424
[ 8937.805712]  md0: unknown partition table
[ 8944.306297] md: bind<loop3>
[ 8944.326330] RAID conf printout:
[ 8944.326334]  --- level:5 rd:3 wd:3
[ 8944.326336]  disk 0, o:1, dev:loop0
[ 8944.326338]  disk 1, o:1, dev:loop1
[ 8944.326340]  disk 2, o:1, dev:loop2
[ 8948.374433] RAID conf printout:
[ 8948.374436]  --- level:5 rd:4 wd:4
[ 8948.374439]  disk 0, o:1, dev:loop0
[ 8948.374440]  disk 1, o:1, dev:loop1
[ 8948.374441]  disk 2, o:1, dev:loop2
[ 8948.374443]  disk 3, o:1, dev:loop3
[ 8948.374501] md: reshape of RAID array md0
[ 8948.374514] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[ 8948.374521] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[ 8948.374584] md: using 2048k window, over a total of 511488k.
[ 8948.618472] md: md_do_sync() got signal ... exiting
[ 8948.635154] md: reshape of RAID array md0
[ 8948.635160] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[ 8948.635166] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[ 8948.635236] md: using 2048k window, over a total of 511488k.
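
The stall above shows up as speed=0K/sec on the reshape line of /proc/mdstat. A minimal sketch of how that check could be scripted, assuming the mdstat layout shown in this report; the function names reshape_speed and reshape_is_stalled and the 1 K/sec default threshold are hypothetical, not part of mdadm:

```shell
# Print the reshape speed (in K/sec) from mdstat-style text on stdin.
# Only the first reshape line is reported. Returns 1 if none is found.
reshape_speed() {
    awk '
        /reshape =/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^speed=[0-9]+K\/sec$/) {
                    sub(/^speed=/, "", $i)
                    sub(/K\/sec$/, "", $i)
                    print $i
                    found = 1
                    exit
                }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Succeed when the reported speed is at or below a threshold
# (assumed default: 1 K/sec). Returns 2 when no reshape is running.
reshape_is_stalled() {
    threshold=${1:-1}
    speed=$(reshape_speed) || return 2
    [ "$speed" -le "$threshold" ]
}
```

One possible use is `reshape_is_stalled < /proc/mdstat && echo "reshape looks stuck"`. A single sample is weak evidence, so sampling several times a few seconds apart is probably more reliable.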

Actual results:


Expected results:


Additional info:

Comment 1 Eryu Guan 2014-12-18 03:03:32 UTC

We can hit this on an x86_64 host too while testing the RHEL7.1 Alpha and Beta composes, using mdadm-3.3.2-1.el7 with an upstream 3.18+ kernel. The 7.1 Beta kernel (-210) reproduces it too.

[root@dhcp-66-86-11 ~]# cat /proc/mdstat 
Personalities : [faulty] [raid6] [raid5] [raid4] 
md127 : active raid5 loop3[4] loop2[3] loop1[1] loop0[0]
      63488 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  0.0% (0/31744) finish=395.2min speed=1K/sec
      
unused devices: <none>

dmesg

[61241.044600] md: bind<loop0>
[61241.044830] md: bind<loop1>
[61241.044962] md: bind<loop2>
[61241.072327] async_tx: api initialized (async)
[61241.210359] md: raid6 personality registered for level 6
[61241.210361] md: raid5 personality registered for level 5
[61241.210362] md: raid4 personality registered for level 4
[61241.211482] md/raid:md127: device loop1 operational as raid disk 1
[61241.211487] md/raid:md127: device loop0 operational as raid disk 0
[61241.212215] md/raid:md127: allocated 0kB
[61241.212335] md/raid:md127: raid level 5 active with 2 out of 3 devices, algorithm 2
[61241.213984] RAID conf printout:
[61241.213989]  --- level:5 rd:3 wd:2
[61241.214004]  disk 0, o:1, dev:loop0
[61241.214018]  disk 1, o:1, dev:loop1
[61241.214029] md/raid456: discard support disabled due to uncertainty.
[61241.214030] Set raid456.devices_handle_discard_safely=Y to override.
[61241.214057] md127: detected capacity change from 0 to 65011712
[61241.214297] RAID conf printout:
[61241.214299]  --- level:5 rd:3 wd:2
[61241.214301]  disk 0, o:1, dev:loop0
[61241.214302]  disk 1, o:1, dev:loop1
[61241.214303]  disk 2, o:1, dev:loop2
[61241.215931] md: recovery of RAID array md127
[61241.215939] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[61241.215942] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[61241.215948] md: using 128k window, over a total of 31744k.
[61241.230132]  md127: unknown partition table
[61241.471855] md: md127: recovery done.
[61241.481646] RAID conf printout:
[61241.481657]  --- level:5 rd:3 wd:3
[61241.481661]  disk 0, o:1, dev:loop0
[61241.481664]  disk 1, o:1, dev:loop1
[61241.481666]  disk 2, o:1, dev:loop2
[61251.409507] EXT4-fs (md127): mounting ext3 file system using the ext4 subsystem
[61251.412189] EXT4-fs (md127): mounted filesystem with writeback data mode. Opts: data=writeback
[61251.412197] SELinux: initialized (dev md127, type ext3), uses xattr
[61251.466479] md: bind<loop3>
[61251.473324] RAID conf printout:
[61251.473331]  --- level:5 rd:3 wd:3
[61251.473334]  disk 0, o:1, dev:loop0
[61251.473337]  disk 1, o:1, dev:loop1
[61251.473339]  disk 2, o:1, dev:loop2
[61251.490192] RAID conf printout:
[61251.490195]  --- level:5 rd:4 wd:4
[61251.490196]  disk 0, o:1, dev:loop0
[61251.490197]  disk 1, o:1, dev:loop1
[61251.490198]  disk 2, o:1, dev:loop2
[61251.490199]  disk 3, o:1, dev:loop3
[61251.490667] md: reshape of RAID array md127
[61251.490670] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[61251.490672] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[61251.490675] md: using 128k window, over a total of 31744k.

Comment 2 Jes Sorensen 2014-12-18 20:30:52 UTC

Hi,

Just to be clear I understand you correctly, you were able to reproduce this
with 3.18+ and mdadm-3.3.2?

If this is the case, we should report the problem upstream.

Thanks,
Jes

Comment 3 Eryu Guan 2014-12-19 03:42:51 UTC

(In reply to Jes Sorensen from comment #2)
> Hi,
> 
> Just to be clear I understand you correctly, you were able to reproduce this
> with 3.18+ and mdadm-3.3.2?

Yes, I tested on RHEL7 Beta with upstream 3.18+ kernel, which means except the kernel, everything else is from RHEL7 Beta compose.

> 
> If this is the case, we should report the problem upstream.

I'm not sure how mdadm upstream works, report to some bugzilla or just to mail list (and which mail list)?

Thanks,
Eryu

Comment 4 XiaoNi 2014-12-19 05:19:22 UTC

(In reply to Eryu Guan from comment #3)
> (In reply to Jes Sorensen from comment #2)
> > Hi,
> > 
> > Just to be clear I understand you correctly, you were able to reproduce this
> > with 3.18+ and mdadm-3.3.2?
> 
> Yes, I tested on RHEL7 Beta with upstream 3.18+ kernel, which means except
> the kernel, everything else is from RHEL7 Beta compose.

Hi Eryu
   
   Can you reproduce this every time? By the way, can we update upstream kernel with RHEL7 kernel directly? I remember I tried to do it, but it failed.

> 
> > 
> > If this is the case, we should report the problem upstream.
> 
> I'm not sure how mdadm upstream works, report to some bugzilla or just to
> mail list (and which mail list)?

Yes, there is a mailing list, linux-raid.org. I'm testing this again and will send mail to the list soon.
> 
> Thanks,
> Eryu

Comment 5 Eryu Guan 2014-12-19 06:30:59 UTC

(In reply to XiaoNi from comment #4)
> 
> Hi Eryu
>    
>    Can you reproduce this every time? By the way, can we update upstream
> kernel with RHEL7 kernel directly? I remember I tried to do it, but it failed.

I tried once so far and hit the hang, but Xiong Zhou (xzhou@) hit this quite often when running beaker tasks.

I compiled the upstream kernel manually with the config file from RHEL7 (with some additional configurations).

> 
> Yes, there is a mailing list, linux-raid.org. I'm testing this again and
> will send mail to the list soon.

Please cc me too, thanks!

Eryu

Comment 6 XiaoNi 2014-12-19 09:48:25 UTC

Hi all

   The problem reproduces 100% of the time, and I have already sent the mail upstream.

Thanks
Xiao

Comment 20 Milos Malik 2015-08-05 10:22:04 UTC

Are there any files in /dev/mqueue directory during the reshaping in permissive mode?

# ls -Z /dev/mqueue

Comment 21 Miroslav Grepl 2015-08-07 08:51:51 UTC

I added some additional fixes in -40.el7. Please test with this release.
If it does not work, switch to permissive mode and see if it works in permissive mode. If so, please attach AVC msgs from permissive mode.

Thanks.

Comment 29 Lukas Vrabec 2015-08-18 10:41:59 UTC

[root@dhcp-10-40-2-201 ~]# audit2allow -i avc 


#============= mdadm_t ==============

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t apm_bios_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t autofs_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t cpu_device_t:chr_file getattr;

#!!!! This avc can be allowed using the boolean 'daemons_use_tty'
allow mdadm_t devpts_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t dri_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t event_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t fuse_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t initctl_t:fifo_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t kmsg_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t loop_control_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t lvm_control_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t mouse_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t netcontrol_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t ppp_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t ptmx_t:chr_file getattr;

#!!!! This avc is allowed in the current policy
allow mdadm_t random_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t sound_device_t:chr_file getattr;

#!!!! This avc is allowed in the current policy
allow mdadm_t tmpfs_t:dir read;

#!!!! This avc can be allowed using the boolean 'daemons_use_tty'
allow mdadm_t tty_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t tun_tap_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t uhid_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t usbmon_device_t:chr_file getattr;

#!!!! This avc can be allowed using the boolean 'daemons_use_tty'
allow mdadm_t user_devpts_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t vfio_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t vhost_device_t:chr_file getattr;

#!!!! This avc has a dontaudit rule in the current policy
allow mdadm_t xserver_misc_device_t:chr_file getattr;
[root@dhcp-10-40-2-201 ~]# rpm -q selinux-policy
selinux-policy-3.13.1-42.el7.noarch
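
Most rules in the listing above are covered by dontaudit rules, while a few are annotated as controllable by a boolean. A small sketch for picking out only the boolean-controlled rules, assuming the single-boolean annotation format shown above; the function name rules_fixable_by_boolean is hypothetical and not part of audit2allow:

```shell
# Read audit2allow output on stdin and print "<boolean>: <allow rule>"
# for every rule whose preceding annotation names a single boolean.
rules_fixable_by_boolean() {
    awk '
        /^#!!!! This avc can be allowed using the boolean/ {
            # The boolean name sits between the single quotes (\047).
            n = split($0, parts, "\047")
            pending = (n >= 2) ? parts[2] : ""
            next
        }
        /^allow / {
            if (pending != "") print pending ": " $0
            pending = ""
            next
        }
        # Any other annotation clears the remembered boolean.
        /^#!!!!/ { pending = "" }
    '
}
```

For example, `audit2allow -i avc | rules_fixable_by_boolean` would, under these assumptions, list the three rules annotated with daemons_use_tty in the output above.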

Comment 36 Lukas Vrabec 2015-08-25 10:49:21 UTC

Hi, 

Please, use:
 
# sesearch -D -s mdadm_t -t device_t -c chr_file

# rpm -qa | grep selinux-policy
selinux-policy-3.13.1-45.el7.noarch
selinux-policy-devel-3.13.1-37.el7.noarch
selinux-policy-targeted-3.13.1-45.el7.noarch

This is weird. On rhel7.2, I see dontaudit rules for these AVCs. Could you set up a beaker machine with this issue?

Comment 37 Milos Malik 2015-08-25 14:49:36 UTC

Almost all AVCs listed in the last attachment are related to a chr_file class, except for this one:
----
type=SYSCALL msg=audit(08/25/2015 05:13:51.695:109) : arch=x86_64 syscall=stat success=yes exit=0 a0=0x21c9920 a1=0x7fffb5572e20 a2=0x7fffb5572e20 a3=0x100 items=0 ppid=30243 pid=30250 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=mdadm exe=/usr/sbin/mdadm subj=system_u:system_r:mdadm_t:s0-s0:c0.c1023 key=(null) 
type=AVC msg=audit(08/25/2015 05:13:51.695:109) : avc:  denied  { getattr } for  pid=30250 comm=mdadm path=/run/systemd/initctl/fifo dev="tmpfs" ino=12713 scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:initctl_t:s0 tclass=fifo_file 
----

Comment 43 Lukas Vrabec 2015-09-09 08:46:31 UTC

XiaoNi, I cannot connect to server via ssh.

Comment 49 Milos Malik 2015-09-11 12:21:56 UTC

What about the AVCs which appear when you remove dontaudit rules? The following AVCs seem suspicious:
----
type=SYSCALL msg=audit(09/11/2015 08:11:05.391:574) : arch=x86_64 syscall=stat success=no exit=-13(Permission denied) a0=0x2433920 a1=0x7ffdc1f893e0 a2=0x7ffdc1f893e0 a3=0x100 items=0 ppid=32744 pid=32751 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=mdadm exe=/usr/sbin/mdadm subj=system_u:system_r:mdadm_t:s0-s0:c0.c1023 key=(null) 
type=AVC msg=audit(09/11/2015 08:11:05.391:574) : avc:  denied  { getattr } for  pid=32751 comm=mdadm path=/dev/loop-control dev="devtmpfs" ino=7718 scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:loop_control_device_t:s0 tclass=chr_file 
----
type=SYSCALL msg=audit(09/11/2015 08:11:05.398:690) : arch=x86_64 syscall=newfstatat success=no exit=-13(Permission denied) a0=0x5 a1=0x2443d43 a2=0x7ffdc1f894a0 a3=0x100 items=0 ppid=32744 pid=32751 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=mdadm exe=/usr/sbin/mdadm subj=system_u:system_r:mdadm_t:s0-s0:c0.c1023 key=(null) 
type=AVC msg=audit(09/11/2015 08:11:05.398:690) : avc:  denied  { getattr } for  pid=32751 comm=mdadm path=/dev/mapper/control dev="devtmpfs" ino=7720 scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:lvm_control_t:s0 tclass=chr_file 
----
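
When many such records pile up, it can help to reduce each AVC line to the permission, target type, class and path. A rough sketch, assuming the interpreted record layout shown above (unquoted path= values without spaces); the function name summarize_avc_denials is hypothetical:

```shell
# Read interpreted AVC records on stdin and print
# "<permission(s)> <target type> <class> <path>" for each denial.
# Lines that are not AVC denials (e.g. SYSCALL records) print nothing.
summarize_avc_denials() {
    sed -n 's/.*avc:  *denied  *{ \([^}]*\) } for .* path=\([^ ]*\) .* tcontext=[^:]*:[^:]*:\([^:]*\):[^ ]* tclass=\([^ ]*\).*/\1 \3 \4 \2/p'
}
```

Applied to the two records above, this should reduce them to one line for loop_control_device_t (/dev/loop-control) and one for lvm_control_t (/dev/mapper/control). Records carrying several permissions inside the braces would add extra fields to the front of the line.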

Comment 50 Milos Malik 2015-09-11 12:40:10 UTC

# getenforce 
Enforcing
# setsebool daemons_use_tty on
# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid5 loop3[3] loop2[2] loop1[1] loop0[0]
      4093952 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  3.5% (73216/2046976) finish=3.1min speed=10459K/sec
      
unused devices: <none>
# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid5 loop3[3] loop2[2] loop1[1] loop0[0]
      4093952 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [=>...................]  reshape =  7.9% (162816/2046976) finish=2.8min speed=10854K/sec
      
unused devices: <none>
# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid5 loop3[3] loop2[2] loop1[1] loop0[0]
      4093952 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [===>.................]  reshape = 15.5% (319248/2046976) finish=2.8min speed=9976K/sec
      
unused devices: <none>
#

Comment 51 Milos Malik 2015-09-11 12:45:35 UTC

No special policy module is needed, but the daemons_use_tty boolean must be enabled.

# ps -efZ | grep mdadm
system_u:system_r:mdadm_t:s0    root       668     1  0 08:43 ?        00:00:00 /usr/sbin/mdadm --grow --continue /dev/md0
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 690 319  0 08:43 pts/2 00:00:00 grep --color=auto mdadm
# ls -l /proc/668/fd
total 0
lr-x------. 1 root root 64 Sep 11 08:43 0 -> /dev/null
l-wx------. 1 root root 64 Sep 11 08:43 1 -> /dev/null
l-wx------. 1 root root 64 Sep 11 08:43 2 -> /dev/null
lrwx------. 1 root root 64 Sep 11 08:43 3 -> /sys/devices/virtual/block/md0/md/sync_action
lr-x------. 1 root root 64 Sep 11 08:43 4 -> /dev/loop0
lr-x------. 1 root root 64 Sep 11 08:43 5 -> /dev/loop1
lr-x------. 1 root root 64 Sep 11 08:43 6 -> /dev/loop2
lrwx------. 1 root root 64 Sep 11 08:43 7 -> /dev/loop3
#
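
Since the reshape only proceeds here with daemons_use_tty enabled, a test script may want to verify the boolean before starting the grow. A minimal sketch, assuming the usual "name --> state" output of getsebool; the function name boolean_is_on is hypothetical:

```shell
# Succeed if getsebool-style output on stdin ("name --> on|off")
# reports the boolean given as $1 to be on.
boolean_is_on() {
    name=$1
    awk -v name="$name" '
        $1 == name && $2 == "-->" { state = $3 }
        END { exit ((state == "on") ? 0 : 1) }
    '
}
```

It could be used as `getsebool daemons_use_tty | boolean_is_on daemons_use_tty || setsebool daemons_use_tty on`. Adding -P to setsebool makes the setting persist across reboots.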

Comment 59 Lukas Vrabec 2015-09-14 12:50:30 UTC

commit 14a8d542325607b15689f919ddc91903f7664ee3
Author: Lukas Vrabec <lvrabec>
Date:   Mon Sep 14 14:43:12 2015 +0200

    Allow mdadm_t domain read/write to general ptys and unallocated ttys.
    Resolves: #1073314

Comment 68 errata-xmlrpc 2015-11-19 10:22:04 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2300.html