Bug 442524

Summary:	kobject_issues_in_SAN_environment
Product:	Red Hat Enterprise Linux 5	Reporter:	Jan Lieskovsky <jlieskov>
Component:	kernel	Assignee:	Jes Sorensen <Jes.Sorensen>
Status:	CLOSED DUPLICATE	QA Contact:	Red Hat Kernel QE team <kernel-qe>
Severity:	low	Docs Contact:
Priority:	low
Version:	5.2	CC:	arozansk, coughlan, lwang, mchristi, peterm, rproffit
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-05-04 14:56:03 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	430984
Bug Blocks:

Description Jan Lieskovsky 2008-04-15 11:10:26 UTC

Description of problem:

The SAN initiators use RAID1 mirrors to access SAN targets in two
geographically isolated data-centers. We've been having issues with
SAN targets periodically failing which present as I/O's never
completing to the RAID1 driver. As has been discussed earlier on
linux-raid this is an error scenario which the RAID1 driver cannot
address short of a timer based solution which arguably involves a
layering violation.

As processes on the initiators attempt to do I/O they go into D state
which eventually produces very high load levels. Our response to this
is to restart the hung target. This generates a LIP in the affected
zone which the HBA's can of course detect which in turn results in the
underlying device being failed from the mirror.

Version-Release number of selected component (if applicable):
2.6.18-53.1.13.el5

How reproducible:
Always

Actual results:

The RAID1 driver detects the anomaly, kicks the device and indicates its
continueing on one device but then throws an I/O error up through LVM
and into the overlying filesystem. This causes the filesystem to go
read-only but in some cases not before generating filesystem
corruption which needs to be corrected by a full ext3 filesystem
check.

Expected results:

The 2.6.22.x/2.6.23.x kernels see the same I/O stalls when the targets
hang but they do handle the failure scenario correctly. The MD device
picks up on the target reboot induced LIP, fails out the device and
the machines go forward as they should.

Additional info:

URL of the whole post:

http://www.issociate.de/board/post/478736/MD_sysfs/kobject_issues_in_SAN_environment..html

Comment 1 Jan Lieskovsky 2008-04-15 11:12:29 UTC

More details from GregKH:

Here is an exerpt from logging at 'info' and 'notice' priority levels
for the event:

------------------------------------------------------------ ---------------
Feb 22 01:36:54 rsg1 kernel: qla2xxx 0000:03:09.0: scsi(3:0:1): Abort command
issued -- 1 3818fa 2002.
Feb 22 01:37:20 rsg1 hotplug: Event 871 requested remove for device: scsi_disk
Feb 22 01:37:20 rsg1 hotplug: Event 870 requested remove for device: scsi_device
Feb 22 01:37:20 rsg1 hotplug: Event 872 requested remove for device: block
Feb 22 01:37:20 rsg1 hotplug: Event 873 requested remove for device: block
Feb 22 01:37:20 rsg1 hotplug: Event 874 requested remove for device: scsi
Feb 22 01:37:24 rsg1 kernel: qla2xxx 0000:03:09.0: scsi(3:0:1): Abort command
issued -- 1 3818fb 2002.
Feb 22 01:37:24 rsg1 kernel: scsi 3:0:0:1: scsi: Device offlined - not ready
after error recovery
Feb 22 01:37:24 rsg1 kernel: scsi 3:0:0:1: [sdd] Result: hostbyte=DID_NO_CONNECT
driverbyte=DRIVER_OK,SUGGEST_OK
Feb 22 01:39:24 rsg1 kernel: scsi scan: INQUIRY result too short (5), using 36
Feb 22 01:39:24 rsg1 kernel: scsi 3:0:0:1: Direct-Access FCTARGET lvm(58,6)
0.9 PQ: 0 ANSI: 3
Feb 22 01:39:24 rsg1 kernel: sd 3:0:0:1: [sde] 418775040 512-byte hardware
sectors (214413 MB)
Feb 22 01:39:24 rsg1 kernel: sd 3:0:0:1: [sde] Write Protect is off
Feb 22 01:39:24 rsg1 kernel: sd 3:0:0:1: [sde] Write cache: disabled, read
cache: enabled, doesn't support DPO or FUA
Feb 22 01:39:24 rsg1 hotplug: Event 876 requested add for device: scsi_disk
Feb 22 01:39:24 rsg1 hotplug: Event 875 requested add for device: scsi
Feb 22 01:39:24 rsg1 kernel: sde: sde1
Feb 22 01:39:24 rsg1 hotplug: Event 877 requested add for device: block
Feb 22 01:39:24 rsg1 hotplug: Event 878 requested add for device: block
Feb 22 01:39:24 rsg1 kernel: sd 3:0:0:1: [sde] Attached SCSI disk
------------------------------------------------------------ ---------------

Which corresponds to the following messages at 'warn' priority:

------------------------------------------------------------ ---------------
Feb 22 01:37:20 rsg1 kernel: rport-3:0-1: blocked FC remote port time out:
removing target and saving binding
Feb 22 01:37:24 rsg1 kernel: end_request: I/O error, dev sdd, sector 418774920
Feb 22 01:37:24 rsg1 kernel: md: super_written gets error=-5, uptodate=0
Feb 22 01:37:24 rsg1 kernel: raid1: Disk failure on sdd1, disabling device.
Feb 22 01:37:24 rsg1 kernel: ^IOperation continuing on 1 devices
Feb 22 01:37:24 rsg1 kernel: scsi 3:0:0:1: rejecting I/O to dead device
Feb 22 01:37:24 rsg1 kernel: md: super_written gets error=-5, uptodate=0
Feb 22 01:37:24 rsg1 kernel: scsi 3:0:0:1: rejecting I/O to dead device
Feb 22 01:37:24 rsg1 kernel: md: super_written gets error=-5, uptodate=0
Feb 22 01:37:24 rsg1 kernel: RAID1 conf printout:
Feb 22 01:37:24 rsg1 kernel: --- wd:1 rd:2
Feb 22 01:37:24 rsg1 kernel: disk 0, wo:0, o:1, dev:sdc1
Feb 22 01:37:24 rsg1 kernel: disk 1, wo:1, o:0, dev:sdd1
Feb 22 01:37:24 rsg1 kernel: RAID1 conf printout:
Feb 22 01:37:24 rsg1 kernel: --- wd:1 rd:2
Feb 22 01:37:24 rsg1 kernel: disk 0, wo:0, o:1, dev:sdc1
Feb 22 01:39:26 rsg1 kernel: kobject_add failed for 3:0:0:1 with -EEXIST, don't
try to register things with the same name in the same directory.
Feb 22 01:39:26 rsg1 kernel: [<c01bddde>] kobject_shadow_add+0x101/0x10a
Feb 22 01:39:26 rsg1 kernel: [<c01edc5d>] device_add+0x7c/0x374
Feb 22 01:39:26 rsg1 kernel: [<c020cb47>] scsi_sysfs_add_sdev+0x27/0x148
Feb 22 01:39:26 rsg1 kernel: [<c020b09f>] scsi_add_lun+0x2b7/0x2cc
Feb 22 01:39:26 rsg1 kernel: [<c020b228>] scsi_probe_and_add_lun+0x174/0x206
Feb 22 01:39:26 rsg1 kernel: [<c020b37d>] scsi_sequential_lun_scan+0xc3/0xda
Feb 22 01:39:26 rsg1 kernel: [<c020b98e>] __scsi_scan_target+0xd1/0xe8
Feb 22 01:39:26 rsg1 kernel: [<c020ba4d>] scsi_scan_target+0xa8/0xc2
Feb 22 01:39:26 rsg1 kernel: [<c0213529>] fc_scsi_scan_rport+0x0/0x73
Feb 22 01:39:26 rsg1 kernel: [<c021357e>] fc_scsi_scan_rport+0x55/0x73
Feb 22 01:39:26 rsg1 kernel: [<c01227f9>] run_workqueue+0x77/0xf3
Feb 22 01:39:26 rsg1 kernel: [<c0122875>] worker_thread+0x0/0xb1
Feb 22 01:39:26 rsg1 kernel: [<c012291c>] worker_thread+0xa7/0xb1
Feb 22 01:39:26 rsg1 kernel: [<c01256e9>] autoremove_wake_function+0x0/0x33
Feb 22 01:39:26 rsg1 kernel: [<c01256e9>] autoremove_wake_function+0x0/0x33
Feb 22 01:39:26 rsg1 kernel: [<c0122875>] worker_thread+0x0/0xb1
Feb 22 01:39:26 rsg1 kernel: [<c0125307>] kthread+0x34/0x55
Feb 22 01:39:26 rsg1 kernel: [<c01252d3>] kthread+0x0/0x55
Feb 22 01:39:26 rsg1 kernel: [<c01030c7>] kernel_thread_helper+0x7/0x10
Feb 22 01:39:26 rsg1 kernel: =======================
------------------------------------------------------------ ---------------

As can be seen from the first set of logs the /dev/sdd device is what
gets kicked out of the RAID1 device. The SAN target coming back
on-line triggers an event which results in the same device being
re-discovered as /dev/sde.

My assumption is device discovery is forced as the 'next' SCSI device
due to the fact the RAID1 MD device is still holding a reference to
the failed /dev/sdd device, thus preventing its re-use. Thats at
least how I interpret the:

Feb 22 01:39:26 rsg1 kernel: kobject_add failed for 3:0:0:1 with -EEXIST, don't
try to register things with the same name in the same directory.

And the resultant stack trace.

The interaction of this with udev tends to complicate the situation
from a systems administration point of view. The loss of the /dev/sdd
device is picked up by udevd which causes node entries for the block
device and its associated partitions to be removed.

This tends to violate the premise of 'least surprise' when the systems
staff logs in to recover the situation and can't remove the failed
device from the RAID1 device and are potentially forced to add a
completely different device back into the MD device to recover from
the event.

Comment 2 Jan Lieskovsky 2008-04-15 11:16:47 UTC

Contacted Alan and Aristeu Rozanski due this one --
if this issue is the same as BZ#239604 [RHEL5] console: kobject_add failed,
which is already tracked in BZ. Aristeu's reaction:


Hello Jan,

These issues aren't related. this message is displayed when any part of
the kernel try to register an object with an existing name and usually
means that there's a bug on the caller. On the vcs case, there's a race
on the virtual console code while opening a not totally closed virtual
console.
If there isn't a BZ# about this RAID bug, please create one and contact
the RAID maintainer.

-- 
Aristeu

Comment 3 Doug Ledford 2008-06-14 17:54:28 UTC

There are two distinctly different bugs in this issue.  The first is the kicking
the error up to the filesystem and causing the ext3 filesystem to go readonly. 
That particular bug is a duplicate of bug 430984, so I'm going to entirely
ignore that part of this bug report.  The second issue is the reregistration of
the fc disk under a different sd name due to the kobject still existing when the
device is rescanned.  I'm going to defer my decision on whether or not that can
be fixed in this release cycle until after I've completed 430984 and can see if
it impacts this at all.

Comment 5 RHEL Program Management 2008-07-25 17:04:37 UTC

This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 6 Ludek Smid 2008-07-25 21:53:55 UTC

Unfortunately the previous automated notification about the
non-inclusion of this request in Red Hat Enterprise Linux 5.3 used
the wrong text template. It should have read: this request has been
reviewed by Product Management and is not planned for inclusion
in the current minor release of Red Hat Enterprise Linux.

If you would like this request to be reviewed for the next minor
release, ask your support representative to set the next rhel-x.y
flag to "?" or raise an exception.

Comment 9 Jes Sorensen 2012-05-04 14:56:03 UTC

(In reply to comment #3)
> There are two distinctly different bugs in this issue.  The first is the kicking
> the error up to the filesystem and causing the ext3 filesystem to go readonly. 
> That particular bug is a duplicate of bug 430984, so I'm going to entirely
> ignore that part of this bug report.  The second issue is the reregistration of
> the fc disk under a different sd name due to the kobject still existing when the
> device is rescanned.  I'm going to defer my decision on whether or not that can
> be fixed in this release cycle until after I've completed 430984 and can see if
> it impacts this at all.

Looks like the second issue is a duplicate of
https://bugzilla.redhat.com/show_bug.cgi?id=668905 even though it was reported
there much later.

There's a reference to an upstream fix, so I will dupe this bug to 668905 
and work it there.

Jes

*** This bug has been marked as a duplicate of bug 668905 ***