md: avoid endless recovery loop when waiting for fail device to complete.
commit 1ca39696ba621b0737c78af2c104939c60b29ce4
author NeilBrown <neilb@suse.de>
Tue, 28 Jun 2011 06:59:42 +0000 (28 16:59 +1000)
committer Greg Kroah-Hartman <gregkh@suse.de>
Wed, 13 Jul 2011 03:29:25 +0000 (13 05:29 +0200)
tree ba3fee215495632bc74e5066727ac46fa99c45c9
parent 48984ada7416c0f533bc2a0f886ccb6c3646fe9a
md: avoid endless recovery loop when waiting for fail device to complete.

commit 4274215d24633df7302069e51426659d4759c5ed upstream.

If a device fails in a way that causes pending requests to take a
while to complete, md will not be able to immediately remove it from
the array in remove_and_add_spares.
It will then incorrectly look like a spare device, and md will try to
recover it even though it has failed.
This leads to a recovery process starting and instantly aborting over
and over again.

We should check if the device is faulty before considering it to be a
spare.  This will avoid trying to start a recovery that cannot
proceed.
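
As a rough illustration of the check described above, the following is a
minimal userspace sketch, not the actual change to drivers/md/md.c.  The
struct member_dev type, its fields, and both helper functions are
hypothetical; the sketch assumes that a device "looks like a spare" when
it still holds a slot in the array but is not in sync.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of an array member.
 * This is NOT the kernel's own device structure.
 */
struct member_dev {
	int  raid_disk;	/* slot in the array, or -1 if unassigned */
	bool in_sync;	/* device holds up-to-date data */
	bool faulty;	/* device has failed */
};

/*
 * A device counts as a recoverable spare only if it holds a slot,
 * is not yet in sync, and has not failed.  Without the faulty test,
 * a failed device still waiting for pending requests would pass.
 */
static bool is_recoverable_spare(const struct member_dev *dev)
{
	return dev->raid_disk >= 0 && !dev->in_sync && !dev->faulty;
}

/* Count the devices for which starting a recovery makes sense. */
static size_t count_spares(const struct member_dev *devs, size_t n)
{
	size_t spares = 0;

	for (size_t i = 0; i < n; i++)
		if (is_recoverable_spare(&devs[i]))
			spares++;
	return spares;
}
```

In this model, a failed device that has not yet been removed keeps its
slot and is not in sync, so only the faulty test keeps it out of the
count and prevents a recovery from being started on its behalf.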

This bug was introduced in 2.6.26, so this patch is suitable for any
kernel since then.

Reported-by: Jim Paradis <james.paradis@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
drivers/md/md.c