linux-imx/drivers/md
NeilBrown 04bf190d93 md: avoid endless recovery loop when waiting for fail device to complete.
commit 4274215d24633df7302069e51426659d4759c5ed upstream.

If a device fails in a way that causes pending request to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
It will then incorrectly look like a spare device and md will try to
recover it even though it is failed.
This leads to a recovery process starting and instantly aborting over
and over again.

We should check if the device is faulty before considering it to be a
spare.  This will avoid trying to start a recovery that cannot
proceed.

This bug was introduced in 2.6.26 so that patch is suitable for any
kernel since then.

Reported-by: Jim Paradis <james.paradis@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-08-01 13:54:58 -07:00
..
raid6test md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
.gitignore gitignore: misc files 2006-01-01 22:21:50 +01:00
bitmap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-05-21 19:37:45 -07:00
bitmap.h md/raid1: delay reads that could overtake behind-writes. 2010-05-18 15:27:57 +10:00
dm-bio-record.h dm: preserve bi_io_vec when resubmitting bios 2009-04-02 19:55:23 +01:00
dm-crypt.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-delay.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-exception-store.c dm snapshot: test chunk size against both origin and snapshot 2010-08-26 16:45:55 -07:00
dm-exception-store.h dm snapshot: test chunk size against both origin and snapshot 2010-08-26 16:45:55 -07:00
dm-io.c dm io: handle empty barriers 2009-12-10 23:52:22 +00:00
dm-ioctl.c dm: separate device deletion from dm_put 2010-08-26 16:46:18 -07:00
dm-kcopyd.c dm kcopyd: accept zero size jobs 2009-12-10 23:52:13 +00:00
dm-linear.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-log-userspace-base.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-log-userspace-transfer.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-log-userspace-transfer.h dm log: userspace add luid to distinguish between concurrent log instances 2009-09-04 20:40:34 +01:00
dm-log.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-mpath.c dm mpath: disable blk_abort_queue 2011-03-31 11:57:55 -07:00
dm-mpath.h dm mpath: remove is_active from struct dm_path 2008-10-10 13:36:58 +01:00
dm-path-selector.c dm: path selector use module refcount directly 2009-04-02 19:55:27 +01:00
dm-path-selector.h dm mpath: add start_io and nr_bytes to path selectors 2009-06-22 10:12:27 +01:00
dm-queue-length.c dm mpath: add queue length load balancer 2009-06-22 10:12:27 +01:00
dm-raid1.c dm raid1: fix deadlock when suspending failed device 2010-03-06 02:32:35 +00:00
dm-region-hash.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-round-robin.c dm mpath: add start_io and nr_bytes to path selectors 2009-06-22 10:12:27 +01:00
dm-service-time.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-snap-persistent.c dm snapshot: persistent annotate work_queue as on stack 2010-02-16 18:42:51 +00:00
dm-snap-transient.c dm snapshot: move cow ref from exception store to snap core 2009-12-10 23:52:12 +00:00
dm-snap.c dm snapshot: test chunk size against both origin and snapshot 2010-08-26 16:45:55 -07:00
dm-stripe.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-sysfs.c Driver core: Constify struct sysfs_ops in struct kobj_type 2010-03-07 17:04:49 -08:00
dm-table.c dm table: reject devices without request fns 2011-08-01 13:54:53 -07:00
dm-target.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-uevent.c dm table: remove dm_get from dm_table_get_md 2010-03-06 02:29:52 +00:00
dm-uevent.h dm: uevent generate events 2007-10-20 02:01:26 +01:00
dm-zero.c dm: consolidate target deregistration error handling 2009-01-06 03:04:58 +00:00
dm.c dm: dont take i_mutex to change device size 2011-03-31 11:57:55 -07:00
dm.h dm: separate device deletion from dm_put 2010-08-26 16:46:18 -07:00
faulty.c Merge commit '3ff195b011d7decf501a4d55aeed312731094796' into for-linus 2010-05-22 08:31:36 +10:00
Kconfig md: remove EXPERIMENTAL designation from RAID10 2010-05-18 15:27:58 +10:00
linear.c Merge commit '3ff195b011d7decf501a4d55aeed312731094796' into for-linus 2010-05-22 08:31:36 +10:00
linear.h md/linear: use call_rcu to free obsolete 'conf' structures. 2009-06-18 08:49:42 +10:00
Makefile md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
md.c md: avoid endless recovery loop when waiting for fail device to complete. 2011-08-01 13:54:58 -07:00
md.h md: Fix - again - partition detection when array becomes active 2011-03-31 11:58:54 -07:00
mktables.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
multipath.c Merge commit '3ff195b011d7decf501a4d55aeed312731094796' into for-linus 2010-05-22 08:31:36 +10:00
multipath.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
raid1.c md/raid1: really fix recovery looping when single good device fails. 2010-12-14 23:40:07 +01:00
raid1.h md/raid1: add takeover support for raid5->raid1 2009-12-14 12:51:41 +11:00
raid5.c md/raid5: fix raid5_set_bi_hw_segments 2011-08-01 13:54:56 -07:00
raid5.h percpu: add __percpu sparse annotations to what's left 2010-02-17 11:17:38 +09:00
raid6algos.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
raid6altivec.uc md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
raid6int.uc md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
raid6mmx.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6recov.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6sse1.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6sse2.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6x86.h md: fix typo in FSF address 2009-03-31 14:57:37 +11:00
raid10.c md: protect against NULL reference when waiting to start a raid10. 2011-02-06 11:03:42 -08:00
raid10.h md: fix handling of array level takeover that re-arranges devices. 2010-06-24 13:33:24 +10:00
raid0.c md: enable raid4->raid0 takeover 2010-06-24 13:34:57 +10:00
raid0.h md: fix handling of array level takeover that re-arranges devices. 2010-06-24 13:33:24 +10:00
unroll.awk md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00