Bug 783823 - raid-check broken / sending ioctl 1261 to a partition
Summary: raid-check broken / sending ioctl 1261 to a partition
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Jes Sorensen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-01-22 16:04 UTC by Harald Reindl
Modified: 2012-01-23 10:52 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-01-23 10:52:46 UTC
Type: ---
Embargoed:



Description Harald Reindl 2012-01-22 16:04:03 UTC
mdadm-3.2.3-3.fc15.x86_64
2.6.41.10-2.fc15.x86_64

Every call to "/sbin/mdadm --detail /dev/mdX" produces "sending ioctl 1261 to a partition" in "dmesg", and running "/usr/sbin/raid-check" ends with "Unit mdmonitor.service entered failed state".

[root@srv-rhsoft:~]$ cat /proc/mdstat 
Personalities : [raid1] [raid10] 
md2 : active raid10 sdc3[0] sdd3[3] sda3[4] sdb3[5]
      3875222528 blocks super 1.1 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 10/29 pages [40KB], 65536KB chunk

md1 : active raid10 sdc2[0] sdd2[3] sda2[4] sdb2[5]
      30716928 blocks super 1.1 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active raid1 sdc1[0] sdd1[3] sda1[4] sdb1[5]
      511988 blocks super 1.0 [4/4] [UUUU]
      
unused devices: <none>
_______________________________________________

Yes, I am using the F16 systemd services, but that should not be a problem.

[root@srv-rhsoft:~]$ ls -l /etc/systemd/system/ | grep mdmonitor
-rw-r--r--  1 root root  330 2012-01-21 03:52 mdmonitor.service
-rw-r--r--  1 root root  255 2012-01-21 03:52 mdmonitor-takeover.service
_______________________________________________

Jan 22 14:13:58 srv-rhsoft kernel: md: md2: data-check done.
Jan 22 14:13:58 srv-rhsoft kernel: md: delaying data-check of md0 until md1 has finished (they share one or more physical units)
Jan 22 14:13:58 srv-rhsoft kernel: md: data-check of RAID array md1
Jan 22 14:13:58 srv-rhsoft kernel: md: minimum _guaranteed_  speed: 50000 KB/sec/disk.
Jan 22 14:13:58 srv-rhsoft kernel: md: using maximum available idle IO bandwidth (but not more than 500000 KB/sec) for data-check.
Jan 22 14:13:58 srv-rhsoft kernel: md: using 128k window, over a total of 30716928k.
Jan 22 14:14:00 srv-rhsoft systemd[1]: PID 20560 read from file /var/run/mdadm/mdadm.pid does not exist. Your service or init script might be broken.
Jan 22 14:14:00 srv-rhsoft systemd[1]: mdmonitor.service: main process exited, code=killed, status=6
Jan 22 14:14:00 srv-rhsoft systemd[1]: Unit mdmonitor.service entered failed state.
Jan 22 14:16:20 srv-rhsoft kernel: md: md1: data-check done.
Jan 22 14:16:20 srv-rhsoft kernel: md: data-check of RAID array md0
Jan 22 14:16:20 srv-rhsoft kernel: md: minimum _guaranteed_  speed: 50000 KB/sec/disk.
Jan 22 14:16:20 srv-rhsoft kernel: md: using maximum available idle IO bandwidth (but not more than 500000 KB/sec) for data-check.
Jan 22 14:16:20 srv-rhsoft kernel: md: using 128k window, over a total of 511988k.
Jan 22 14:16:25 srv-rhsoft kernel: md: md0: data-check done.

Comment 1 Jes Sorensen 2012-01-23 09:49:55 UTC
Do the raids get assembled without problems otherwise?

Which kernel are you running?

Does this happen if you use the normal Fedora 15 scripts?

Thanks,
Jes

Comment 2 Harald Reindl 2012-01-23 09:55:06 UTC
As said: kernel 2.6.41.10-2.fc15.x86_64.
Yes, the arrays assemble and work without any problems.

[root@rh:~]$ dmesg -c
[root@rh:~]$ /sbin/mdadm --detail /dev/md2 
/dev/md2:
        Version : 1.1
  Creation Time : Wed Jun  8 13:10:56 2011
     Raid Level : raid10
     Array Size : 3875222528 (3695.70 GiB 3968.23 GB)
  Used Dev Size : 1937611264 (1847.85 GiB 1984.11 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Jan 23 10:54:36 2012
          State : active 
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : localhost.localdomain:2
           UUID : ea253255:cb915401:f32794ad:ce0fe396
         Events : 51010

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       35        1      active sync   /dev/sdc3
       2       8       51        2      active sync   /dev/sdd3
       3       8       19        3      active sync   /dev/sdb3

[root@rh:~]$ dmesg -c
scsi_verify_blk_ioctl: 306 callbacks suppressed
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 1261 to a partition!
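
(For reference: the kernel prints the ioctl number in hex, and 0x1261 is BLKFLSBUF, the block-layer "flush buffer cache" ioctl, which mdadm evidently issues against the member partitions here. A minimal sketch to confirm the value, assuming a Linux system with kernel headers installed:)

#include <stdio.h>
#include <linux/fs.h>   /* defines BLKFLSBUF as _IO(0x12, 97) */

int main(void)
{
    /* _IO(0x12, 97) == (0x12 << 8) | 97 == 0x1261, which matches the
       "sending ioctl 1261 to a partition!" message (printed with %x) */
    printf("BLKFLSBUF = 0x%x\n", (unsigned int)BLKFLSBUF);
    return 0;
}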

Comment 3 Jes Sorensen 2012-01-23 10:07:13 UTC
I have tried reproducing the problem here on a Fedora 15 system with
three RAID devices. This is vanilla Fedora 15 though, with no F16 scripts
pulled in.

Please try with the non-test version of the kernel:
kernel-2.6.41.9-1.fc15.x86_64

This is what I get:

[root@monkeybay ~]# cat /proc/mdstat 
Personalities : [raid1] [raid10] 
md42 : active raid1 sde3[0] sdf3[1]
      19529656 blocks super 1.2 [2/2] [UU]
      
md126 : active raid10 sde1[0] sdf1[1] sdg1[2] sdh1[3]
      39053184 blocks 64K chunks 2 near-copies [4/4] [UUUU]
      
md125 : active raid10 sde2[0] sdh2[3] sdg2[1] sdf2[2]
      39063424 blocks 64K chunks 2 near-copies [4/4] [UUUU]
      
unused devices: <none>
[root@monkeybay ~]# raid-check 
[root@monkeybay ~]# 
[root@monkeybay ~]# dmesg|grep 1261
[root@monkeybay ~]# rpm -q kernel mdadm 
kernel-2.6.41.9-1.fc15.x86_64
mdadm-3.2.3-3.fc15.x86_64

Sounds more like a bug in ioctl processing in the test kernel.
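
(For context: the warning itself comes from the partition-ioctl restriction that the stable kernel series picked up as the CVE-2011-4127 hardening. Below is a condensed sketch of that kernel-side check, assuming it follows scsi_verify_blk_ioctl() in block/scsi_ioctl.c; this is illustrative, not the exact upstream source:)

/* Condensed sketch of the check that emits the warning; a whitelist of
 * harmless commands allowed on partitions is elided here. */
int scsi_verify_blk_ioctl(struct block_device *bd, unsigned int cmd)
{
        /* ioctls sent to a whole device are always allowed */
        if (bd && bd == bd->bd_contains)
                return 0;

        /* CAP_SYS_RAWIO bypasses the restriction entirely */
        if (capable(CAP_SYS_RAWIO))
                return 0;

        /* everything else is refused with a rate-limited warning --
         * the "sending ioctl %x to a partition!" line seen in dmesg */
        printk_ratelimited(KERN_WARNING
                "%s: sending ioctl %x to a partition!\n",
                current->comm, cmd);
        return -ENOIOCTLCMD;
}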

Comment 4 Harald Reindl 2012-01-23 10:50:38 UTC
Confirmed, see my kernel bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=783955

Thank you for your feedback!

Comment 5 Jes Sorensen 2012-01-23 10:52:46 UTC
Glad we're getting closer to the issue. Thanks for testing.

I am going to close this one since it's not an mdadm bug, but I'll
keep an eye on the kernel bug as well.

Cheers,
Jes

