author     madduck <madduck@3cfab66f-1918-0410-86b3-c06b76f9a464>   2006-10-26 19:20:12 +0000
committer  madduck <madduck@3cfab66f-1918-0410-86b3-c06b76f9a464>   2006-10-26 19:20:12 +0000
commit     6266e55ae68d503bf4297170f5304b59009ba77f (patch)
tree       fd3d1b68182d567e387cd07ccfa952e72e45f004 /debian/FAQ
parent     577ef7c9ac951f20f6697a52ee25fc61f38df0c0 (diff)
new qs 18-20
Diffstat (limited to 'debian/FAQ')
-rw-r--r--   debian/FAQ   81
1 file changed, 81 insertions, 0 deletions
@@ -144,6 +144,8 @@ Also see /usr/share/doc/mdadm/README.recipes.gz
    use RAID6, or store more than two copies of each block (see the --layout
    option in the mdadm(8) manpage).
 
+   See also question 18 further down.
+
 0. it's actually 1/(n-1), where n is the number of disks. I am not
    a mathematician, see http://aput.net/~jheiss/raid10/
@@ -402,6 +404,85 @@ Also see /usr/share/doc/mdadm/README.recipes.gz
    overridden with the --force and --assume-clean options, but it is not
    recommended. Read the manpage.
 
+18. How many failed disks can a RAID10 handle?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   (see also question 4b)
+
+   The following table shows how many disks you can lose and still have an
+   operational array. In some cases, you *can* lose more than the given
+   number of disks, but there is no guarantee that the array survives. Each
+   entry thus gives the guaranteed number of failed disks a RAID10 array
+   survives, followed by the maximum number of failed disks the array can
+   (but is not guaranteed to) handle, given the number of disks used and the
+   number of data block copies. Note that 2 copies means original + 1 copy.
+   Thus, if you only have one copy, you cannot handle any failures.
+
+         1     2     3     4    (# of copies)
+   1    0/0   0/0   0/0   0/0
+   2    0/0   1/1   1/1   1/1
+   3    0/0   1/1   2/2   2/2
+   4    0/0   1/2   2/2   3/3
+   5    0/0   1/2   2/2   3/3
+   6    0/0   1/3   2/3   3/3
+   7    0/0   1/3   2/3   3/3
+   8    0/0   1/4   2/3   3/4
+   (# of disks)
+
+   Note: I have not really verified the above information. Please don't
+   count on it. If a disk fails, replace it as soon as possible. Corrections
+   welcome.
+
+19. What should I do if a disk fails?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   Replace it as soon as possible:
+
+     mdadm --remove /dev/md0 /dev/sda1
+     halt
+     <replace disk and start the machine>
+     mdadm --add /dev/md0 /dev/sda1
+
+20. So how do I find out which other disk(s) can fail without killing the
+    array?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   Did you read the previous question and its answer?
+
+   For cases when you have two copies of each block, the question is easily
+   answered by looking at the output of /proc/mdstat. For instance, on a
+   four-disk array:
+
+     md3 : active raid10 sdg7[3] sde7[0] sdh7[2] sdf7[1]
+
+   you know that sde7/sdf7 form one pair and sdg7/sdh7 the other.
+
+   If sdh now fails, this will become
+
+     md3 : active raid10 sdg7[3] sde7[0] sdh7[4](F) sdf7[1]
+
+   So now the second pair is broken; the array could take another failure in
+   the first pair, but if sdg now also fails, you're history.
+
+   Now go and read question 19.
+
+   For cases with more copies per block, it becomes more complicated. Let's
+   think of a seven-disk array with three copies:
+
+     md5 : active raid10 sdg7[6] sde7[4] sdb7[5] sdf7[2] sda7[3] sdc7[1] sdd7[0]
+
+   Each mirror group now has 7/3 = 2.33 disks to it, so in order to
+   determine groups, you need to round up. Note how the disks are arranged
+   in increasing order of their indices (the number in brackets in
+   /proc/mdstat):
+
+     disk:  -sdd7- -sdc7- -sdf7- -sda7- -sde7- -sdb7- -sdg7-
+     group: [      one     ][      two     ][     three    ]
+
+   Basically this means that after two disks have failed, you need to make
+   sure that a third failed disk does not destroy all copies of any given
+   block. And that's not always easy, as it depends on the layout chosen:
+   whether the blocks are near (same offset within each group), far (spread
+   apart in a way to maximise the mean distance), or offset (offset by
+   size/n within each block).
+
+   I'll leave it up to you to figure things out. Now go read question 19.
+
+ -- martin f. krafft <madduck@debian.org>  Thu, 26 Oct 2006 11:05:05 +0200
+
+$Id$
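
As a side note, not part of the patch above: the grouping argument in question 20, and the
guaranteed/maximum pairs in question 18's table, can be checked mechanically for the default
"near" layout. The Python sketch below assumes that chunk k is replicated on the n consecutive
member devices starting at index (k*n) mod d; the function names (copy_sets, survives,
guaranteed_and_max) are made up for illustration, the far and offset layouts distribute copies
differently, and the FAQ marks its own table as unverified, so treat the output as a
plausibility check rather than a reference.

    from itertools import combinations
    from math import gcd

    def copy_sets(d, n):
        """Distinct sets of member indices holding the n copies of some chunk,
        assuming the near layout: chunk k lives on devices (k*n) mod d .. +n-1."""
        period = d // gcd(d, n)  # chunks before the pattern of device sets repeats
        return {frozenset((k * n + j) % d for j in range(n))
                for k in range(period)}

    def survives(d, n, failed):
        """True if no chunk has lost all of its copies."""
        failed = set(failed)
        return all(not s <= failed for s in copy_sets(d, n))

    def guaranteed_and_max(d, n):
        """One entry of the table in question 18: (guaranteed, maximum)
        number of simultaneous disk failures the array survives."""
        ok_all = [f for f in range(1, d)
                  if all(survives(d, n, c) for c in combinations(range(d), f))]
        ok_any = [f for f in range(1, d)
                  if any(survives(d, n, c) for c in combinations(range(d), f))]
        return (max(ok_all, default=0), max(ok_any, default=0))

    if __name__ == "__main__":
        # The four-disk, two-copy array from question 20:
        # indices 0..3 correspond to sde7[0] sdf7[1] sdh7[2] sdg7[3].
        print(copy_sets(4, 2))           # the two mirror pairs: {0,1} and {2,3}
        print(guaranteed_and_max(4, 2))  # (1, 2), the "1/2" entry for 4 disks / 2 copies
        # With sdh (index 2) already failed, which single extra failure is fatal?
        print([i for i in range(4) if i != 2 and not survives(4, 2, {2, i})])
        # -> [3]: only sdg, its mirror partner, must not fail next.

For the four-disk example in question 20 this reports the two mirror pairs, reproduces the
1/2 entry from the table, and confirms that once sdh (index 2) has failed, only its partner
sdg (index 3) must not be the next disk to fail.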