947539 – kernel-3.8.5-201.fc18 degraded raid1 / array on reboot with mdadm mpt2sas

Bug 947539 - kernel-3.8.5-201.fc18 degraded raid1 / array on reboot with mdadm mpt2sas

Summary: kernel-3.8.5-201.fc18 degraded raid1 / array on reboot with mdadm mpt2sas

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	18
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2013-04-02 17:01 UTC by Anthony Messina
Modified:	2013-05-04 18:57 UTC (History)
CC List:	9 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-04-19 04:57:44 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
dmesg output from failed mdadm/mpt2sas rai1 array on / (/dev/md1) (104.77 KB, text/plain) 2013-04-02 17:01 UTC, Anthony Messina	no flags	Details
dmesg output after upgrade and reboot to kernel-3.8.6-203.fc18.x86_64 (117.89 KB, text/plain) 2013-04-12 19:37 UTC, Anthony Messina	no flags	Details
View All

Description Anthony Messina 2013-04-02 17:01:38 UTC

Created attachment 730867 [details]
dmesg output from failed mdadm/mpt2sas rai1 array on / (/dev/md1)

After rebooting an x86_64 system to the upgraded kernel-3.8.5-201.fc18, the / filesystem is degraded.  I am not able to re-add the degraded drive, until I revert to kernel-3.8.4-202.fc18.x86_64, where the machine will come up degraded initially, but I am able to re-add the drive, and reboot again into kernel-3.8.4-202.fc18.x86_64 without a degraded array.

Of the arrays shown below with shared drived, it is always the /dev/md1 array that fails, which holds /.  The /dev/md0 array holds /boot.

The problem is persistent in that even after rebuilding the array under the 3.8.4 kernel, when rebooting to 3.8.5, the problem recurrs.

I've attached my whole dmesg output from one of the failed boots, but some relevant parts are here:

[   14.586268] md: bind<sdi3>
[   14.588147] md: bind<sdj3>
[   14.588787] md: kicking non-fresh sdi3 from array!
[   14.588790] md: unbind<sdi3>
[   14.595707] md: export_rdev(sdi3)
[   14.597539] md: raid1 personality registered for level 1
[   14.597691] md/raid1:md1: active with 1 out of 2 mirrors
[   14.597808] created bitmap (5 pages) for device md1
[   14.598012] md1: bitmap initialized from disk: read 1 pages, set 58 of 8935 bits
[   14.602297] md1: detected capacity change from 0 to 599598686208

...

[   54.777317] mpt2sas0: log_info(0x31120320): originator(PL), code(0x12), sub_code(0x0320)
[   54.787636] mpt2sas0: log_info(0x31120320): originator(PL), code(0x12), sub_code(0x0320)
[   54.795029] mpt2sas0: log_info(0x31120320): originator(PL), code(0x12), sub_code(0x0320)
[   54.804021] mpt2sas0: log_info(0x31120320): originator(PL), code(0x12), sub_code(0x0320)
[   54.810805] mpt2sas0: log_info(0x31120320): originator(PL), code(0x12), sub_code(0x0320)
[   54.816919] mpt2sas0: log_info(0x31120320): originator(PL), code(0x12), sub_code(0x0320)
[   54.816935] sd 6:0:7:0: [sdj] Unhandled error code
[   54.816938] sd 6:0:7:0: [sdj]  
[   54.816940] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
[   54.816943] sd 6:0:7:0: [sdj] CDB: 
[   54.816944] Write(10): 2a 00 1b 1b c0 08 00 00 08 00
[   54.816954] end_request: I/O error, dev sdj, sector 454803464
[   54.858175] md1: WRITE SAME failed. Manually zeroing.


The offending menuentry from /etc/grub2.cfg:
menuentry 'Fedora (3.8.5-201.fc18.x86_64)' --class fedora --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-40fc2a7c-c034-4f1e-af4a-a7d8b0a53469' {
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_msdos
        insmod part_msdos
        insmod diskfilter
        insmod mdraid1x
        insmod ext2
        set root='mduuid/dc32b61d06a80bf56f42f3bc53228fb2'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='mduuid/dc32b61d06a80bf56f42f3bc53228fb2'  5cc0bbad-84b2-42b9-a2bf-af992d62df0e
        else
          search --no-floppy --fs-uuid --set=root 5cc0bbad-84b2-42b9-a2bf-af992d62df0e
        fi
        echo 'Loading Fedora (3.8.5-201.fc18.x86_64)'
        linux   /vmlinuz-3.8.5-201.fc18.x86_64 root=UUID=40fc2a7c-c034-4f1e-af4a-a7d8b0a53469 ro rd.dm=0 rd.lvm=0 rd.luks=0 rd.md.uuid=52e4971c:dfe03410:fcef0ac7:847cd2d4 rd.md.uuid=dc32b61d:06a80bf5:6f42f3bc:53228fb2  vconsole.keymap=us quiet LANG=en_US.UTF-8
        echo 'Loading initial ramdisk ...'
        initrd /initramfs-3.8.5-201.fc18.x86_64.img
}


[root@ds ~]# mdadm --detail /dev/md[01]
/dev/md0:
        Version : 1.0
  Creation Time : Thu May 31 05:44:54 2012
     Raid Level : raid1
     Array Size : 511988 (500.07 MiB 524.28 MB)
  Used Dev Size : 511988 (500.07 MiB 524.28 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Tue Apr  2 11:51:29 2013
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : ds.example.com:0  (local to host ds.example.com)
           UUID : dc32b61d:06a80bf5:6f42f3bc:53228fb2
         Events : 330

    Number   Major   Minor   RaidDevice State
       3       8      130        0      active sync   /dev/sdi2
       2       8      146        1      active sync   /dev/sdj2
/dev/md1:
        Version : 1.2
  Creation Time : Thu May 31 05:44:50 2012
     Raid Level : raid1
     Array Size : 585545592 (558.42 GiB 599.60 GB)
  Used Dev Size : 585545592 (558.42 GiB 599.60 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Apr  2 11:52:22 2013
          State : active 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : ds.example.com:1  (local to host ds.example.com)
           UUID : 52e4971c:dfe03410:fcef0ac7:847cd2d4
         Events : 56786

    Number   Major   Minor   RaidDevice State
       0       8      131        0      active sync   /dev/sdi3
       1       8      147        1      active sync   /dev/sdj3


[root@ds ~]# cat /etc/fstab
UUID=40fc2a7c-c034-4f1e-af4a-a7d8b0a53469 /      ext4    defaults        1 1
UUID=5cc0bbad-84b2-42b9-a2bf-af992d62df0e /boot  ext4    defaults        1 2
UUID=a91fdd74-77ba-4e83-a921-99e11d1898d1 /home  ext4    defaults        1 2
UUID=99f58539-3634-44f8-86a7-305432d26202 /srv   ext4    defaults        1 2
UUID=af1e6068-9eeb-43e3-b4b8-67ce05a095e7 swap   swap    defaults        0 0
UUID=99422c71-edc6-4827-b4d7-379dc875ae82 swap   swap    defaults        0 0
#
# NFSv4 bind mounts
/home           /export/home            none    bind    0 0
/srv/media      /export/media           none    bind    0 0
/srv/software   /export/software        none    bind    0 0

Comment 1 Anthony Messina 2013-04-02 17:04:16 UTC

This issue appears to be similar to http://ubuntuforums.org/showthread.php?t=2108856, but I can't confirm if it's the same.

Comment 2 Josh Boyer 2013-04-02 17:16:16 UTC

There were a few md related changes from 3.8.4->3.8.5, but nothing is really jumping out at me.  There were no changes to mpt2sas or the scsi stack in general.

Was anything else updated between the time you installed 3.8.4 and 3.8.5?  I'm curious if you have e.g. a newer version of mdadm in the initramfs for 3.8.5 or something similar.

Comment 3 Anthony Messina 2013-04-02 17:30:40 UTC

From /var/log/yum.log:

Jan 06 13:46:27 Updated: mdadm-3.2.6-7.fc17.x86_64
Mar 22 09:32:40 Updated: mdadm-3.2.6-8.fc17.x86_64
Mar 28 17:21:09 Updated: mdadm-3.2.6-14.fc18.x86_64
Mar 28 19:43:36 Installed: mdadm-3.2.6-14.fc18.x86_64

This machine was upgraded from F17 last week. Should I remove the F17 kernels any potential bits they may have laying around aren't included in the initramfs (If that sort of thing happens)?

I also took a quick peek at the diff between v3.8.4 and v3.8.5 but didn't find anything that seemed related.

Comment 4 Jes Sorensen 2013-04-03 09:55:18 UTC

Your log is spewing out disk write errors for 'sdj', it is most likely this
causing the problem, and the difference between the kernel versions is what
state the drive is left in on the follow-on boot.

I cannot say whether this is a bad cable or a bad drive, but I would start
by looking at the drive if it was me.

It's possible the F18 kernel is less forgiving on write errors.

Jes

Comment 5 Anthony Messina 2013-04-03 10:09:45 UTC

(In reply to comment #4)
> Your log is spewing out disk write errors for 'sdj', it is most likely this
> causing the problem, and the difference between the kernel versions is what
> state the drive is left in on the follow-on boot.
> 
> I cannot say whether this is a bad cable or a bad drive, but I would start
> by looking at the drive if it was me.
> 
> It's possible the F18 kernel is less forgiving on write errors.
> 
> Jes

Thanks, Jes.  I will look at the drive and cabling itself.  However the thing that confuses me is that the error is specifically occurring between kernel-3.8.4-202.fc18.x86_64 and kernel-3.8.5-201.fc18.x86_64 -- both F18 kernels.  The write errors I see on /dev/sdj seem to my untrained eye to be that it's partner in the raid1 array, /dev/sdi, is unavailable.  I am probably quite wrong, but this seems like a symptom more than a source.

It's always /dev/sdi that doesn't com up with kernel-3.8.5-201.fc18.x86_64.  /dev/sdj comes up every time.

Comment 6 Jes Sorensen 2013-04-03 10:13:39 UTC

Hi Anthony,

It is strange this happens between the two kernel versions. What you see in the
log is a write error which looks to come from a write from the md stack. Why
this goes wrong between the two kernel versions is odd - no changes to the
mpt2sas driver in this kernel update?

Cheers,
Jes

Comment 7 Anthony Messina 2013-04-03 10:20:59 UTC

I looked at the git-diff between v3.8.4 and v3.8.5 and didn't see anything for mpt2sas.  There were a few changed to md, however, the seemed unrelated according to Josh in Comment #2.

Is there any notion that having that old F17 kernel laying around might be causing some strange bits to be added to grub, systemd or the initramfs?  If so, would 

yum remove kernel-3.8.3-103.fc17.x86_64
yum reinstall kernel-3.8.5-201.fc18.x86_64

make any difference?

Comment 8 Jes Sorensen 2013-04-03 12:10:28 UTC

Those old kernels should cause no problems - dracut will generate an initrd
for the specific kernel and only that will be used during the boot process.

Did anything chance in the block layer between the two kernels?

Comment 9 Anthony Messina 2013-04-03 12:20:32 UTC

(In reply to comment #8)
> Those old kernels should cause no problems - dracut will generate an initrd
> for the specific kernel and only that will be used during the boot process.
> 
> Did anything chance in the block layer between the two kernels?

I'm not exactly sure what you mean, but upstream

[linux-stable.git]$ git diff v3.8.4 v3.8.5 -- block

reveals nothing.

On my user end, this was a simple yum upgrade...  Followed by a reboot within 2-5 minutes or so. Something I performed for all my machines.

Comment 10 Jes Sorensen 2013-04-03 13:02:23 UTC

Ok if you do a yum update/downgrade of mdadm, you need to run 'dracut -f'
afterwards to ensure the initramfs is regenerated to pull in the mdadm
changes.

Cheers,
Jes

Comment 11 Anthony Messina 2013-04-03 13:18:48 UTC

Ok, does the dracut happen normally after a kernel install?

From my yum.log:
Mar 28 19:43:36 Installed: mdadm-3.2.6-14.fc18.x86_64
Apr 02 10:36:50 Installed: kernel-3.8.5-201.fc18.x86_64

I would have thought that the kernel install on Apr 02, would have pulled in the bits from the previous mdadm install on Mar 28.  Of course, I could always try again.

Comment 12 Jes Sorensen 2013-04-03 13:43:38 UTC

A kernel upgrade should do it, it should only be an issue if you up/down-grade
mdadm manually

Comment 13 Anthony Messina 2013-04-12 19:36:30 UTC

This same issue occurs with kernel-3.8.6-203.fc18.x86_64 even after replacing drive cabling.  Reverting to kernel-3.8.4-202.fc18.x86_64 makes the issue disappear.

I am trying to schedule some time next week to replace the drive itself to see if that may be related, but it still seems odd to me that things work without issue in kernel-3.8.4-202.fc18.x86_64, but not in kernel-3.8.5-201.fc18.x86_64 or kernel-3.8.6-203.fc18.x86_64.

I've attached the dmesg report after the reboot to kernel-3.8.6-203.fc18.x86_64 in case there is something to spot there.

Comment 14 Anthony Messina 2013-04-12 19:37:12 UTC

Created attachment 734936 [details]
dmesg output after upgrade and reboot to kernel-3.8.6-203.fc18.x86_64

Comment 15 Tomas Henzl 2013-04-15 12:41:21 UTC

> There were no changes to mpt2sas or the scsi stack in general.

And a test i/o test without active md, just to test the mpt2sas driver/disks, shows problems?
When you re-create the md array on a running system on new kernel do you see problems? Maybe we see some timing problems, the md arrays starts degraded and some disks are recognized later or need more time. I know nothing about md, but is it possible to start it later, just to see if it helps?

Comment 16 Anthony Messina 2013-04-17 16:57:00 UTC

After some extensive Googling, it appears as though another person may be having a similar issue: http://www.spinics.net/lists/linux-scsi/msg65403.html

I'm still running 3.8.4-202.fc18.x86_64 where I see no issues whereas kernel-3.8.5-201.fc18.x86_64 and kernel-3.8.6-203.fc18.x86_64 *do* have this problem.

According to Matthias Prager in the spinics thread, https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=c8dc9c6 fixes the issue.

Unfortunately for me, this looks to be in 3.9-rc*, but not 3.8 kernels.

Comment 17 Josh Boyer 2013-04-17 17:41:37 UTC

http://koji.fedoraproject.org/koji/taskinfo?taskID=5266642 is a scratch build with that commit applied to 3.8.8.  When it completes, please test and let us know how it works.

Comment 18 Anthony Messina 2013-04-17 20:06:45 UTC

(In reply to comment #17)
> http://koji.fedoraproject.org/koji/taskinfo?taskID=5266642 is a scratch
> build with that commit applied to 3.8.8.  When it completes, please test and
> let us know how it works.

I can confirm that the above Koji scratch-build fixes this issue on x86_64.  Thank you.

Comment 19 Josh Boyer 2013-04-17 20:12:27 UTC

Added in Fedora git.  Thanks again.

Comment 20 Fedora Update System 2013-04-18 01:05:51 UTC

kernel-3.8.8-202.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.8.8-202.fc18

Comment 21 Fedora Update System 2013-04-19 04:57:47 UTC

kernel-3.8.8-202.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 22 Fedora Update System 2013-04-29 17:12:15 UTC

kernel-3.8.10-100.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.8.10-100.fc17

Comment 23 Fedora Update System 2013-05-04 00:07:08 UTC

kernel-3.8.11-100.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 24 Trevor Cordes 2013-05-04 18:57:41 UTC

For anyone else who might stumble upon this, it may have been a more generic issue.  I don't have a mpt2sas on my system but I had a very similar bug and not many good hits regarding it on bugzilla or google, except this bug.

I have a scsi card using aic7xxx and 2 SCSI drives in md RAID1.  Under kernel-PAE-3.8.8-100.fc17.i686 I get similar SCSI errors and RAID drives dropping out as failed (after 10 mins of moderate random I/O).  Under kernel-PAE-3.7.9-104.fc17.i686 I get no errors (as far as I can tell after testing for hours).

I saw that kernel-PAE-3.8.11-100.fc17.i686 was just released so I am now running that.  So far no errors but it has only been a few mins.  I will report back if 3.8.11-100 gives me more errors.

If this bug fixed my problem then 3.8.11 should not have the problems.  But it would show this isn't a mpt2sas problem or even an aic7xxx problem, but something common to both scsi systems.  If anyone finds the bug report from the kernel guys, please post it here.

My errors looked like:

May  4 11:55:52 pog kernel: [  295.675753] (scsi8:A:4:0): data overrun detected in Data-out phase.  Tag == 0x7.
May  4 11:55:52 pog kernel: [  295.675757] (scsi8:A:4:0): Have seen Data Phase.  Length = 512.  NumSGs = 1.
May  4 11:55:52 pog kernel: [  295.675762] sg[0] - Addr 0x0d0a000 : Length 512
May  4 11:55:52 pog kernel: [  295.675794] sd 8:0:4:0: [sdb] Unhandled error code
May  4 11:55:52 pog kernel: [  295.675801] sd 8:0:4:0: [sdb]  
May  4 11:55:52 pog kernel: [  295.675804] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
May  4 11:55:52 pog kernel: [  295.675807] sd 8:0:4:0: [sdb] CDB: 
May  4 11:55:52 pog kernel: [  295.675822] end_request: I/O error, dev sdb, sector 19847408
May  4 11:55:52 pog kernel: [  295.675876] md127: WRITE SAME failed. Manually zeroing.

Note You need to log in before you can comment on or make changes to this bug.