Convert Root System to Bootable Software RAID1 (Debian)

How to convert a Debian system to bootable Software RAID 1 with a second hard drive, 'mdadm' and a few standard UNIX tools

Version 0.97 (2004-06-03) Lucas Albers -- admin At cs DOT montana dot edu and Roger Chrisman
Home of most recent version: http://alioth.debian.org/projects/rootraiddoc
Thanks to: Alvin Olga, Era Eriksson, Yazz D. Atlas, James Bromberger, Timothy F Nagy, and alioth.debian.org

WARNING: No warranty of any kind. Proceed at your own risk. A typo, especially in lilo.conf, can leave your system unbootable. Back up your data and make a boot floppy before starting this procedure.

Table of Contents

Summary

Procedure

  1. Install Debian
    on your Primary Master disk -- hda. Or if you already have Debian installed, go to step 2.
  2. Upgrade to RAID savvy Kernel
    and install 'mdadm'.
  3. Setup RAID 1
    declaring disk-one 'missing' and disk-two hdc.
  4. Copy your Debian system
    from hda to /dev/md0 ('missing' + 'hdc').
  5. Reboot to RAID device.
  6. Reformat hda as 'fd' and declare it as disk-one of your RAID,
    and watch the booted RAID system automatically mirror itself onto the new drive. Done.

Alternate grub/initrd procedure

  1. Part II. RAID using initrd and grub

Appendix

  1. RAID Introduction
  2. Drive designators (hda, hdb, hdc, hdd), jumpers and cables
  3. Setting up software RAID for multiple partitions
  4. Lilo
  5. Copying Data
  6. Rebooting
  7. Initrd
  8. Verify that system will boot even with one disk off-line
  9. Setting up a RAID 1 Swap device
  10. Performance Optimizations
  11. Disaster Recovery
  12. Quick Reference
  13. Troubleshooting
  14. Raid Disk Maintenance

References

Summary

We begin with Debian installed on the Primary Master drive, hda (step 1). We need RAID support in our Kernel (step 2). We add another disk as Secondary Master, hdc, set it up for RAID (step 3), and copy Debian to it (step 4). Now we can reboot to the RAID device (step 5) and declare hda part of the RAID and it automatically syncs with hdc to complete our RAID 1 device (step 6).

If all goes well

Use this HowTo at your own risk. We are not responsible for what happens!

First things first

Whenever you change your partitions, you need to reboot! (If you know what you are doing, ignore this advice.)

We assume you will slip up on some step, so wherever possible we include a verification.

I use 'mdadm' because it is easier than 'raidtools' or 'raidtools2'.


We now have both grub and lilo directions. The grub directions are still in beta form.
Please read the grub directions and comment on them.

Procedure

1. Install Debian

Do a fresh install the normal way on your first drive, hda (the Primary Master drive in your computer). Or, if you already have a running Debian system that you want to use on hda, skip ahead to step 2. If you need Debian installation instructions, see:

Debian Installation HowTo » http://www.debian.org/releases/stable/installmanual

Sarge Debian Installation HowTo » http://d-i.alioth.debian.org/manual/

2. Upgrade to a RAID savvy Kernel

2.1 Compile and install a RAID savvy Kernel.

RAID must be compiled into the Kernel, not added as a module, for you to boot from the RAID device (unless you use a RAID savvy initrd kernel or boot from a non-RAID boot drive; the initrd methods are now covered in Part II and in the Appendix). You need RAID 1, but I usually include RAID 5, too. For step by step Kernel compile and install instructions, see:

Creating custom Kernels with Debian's kernel-package system » http://newbiedoc.sourceforge.net/system/kernel-pkg.html


2.2 Verify your RAID savvy Kernel.

cat /proc/mdstat

(You should see the RAID "personalities" your Kernel supports.)

Something like this:

Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md4 : active raid5 hdh4[3] hdg4[2] hdf4[1] hde4[0]
      356958720 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>

YOU MUST VERIFY that you have RAID support via /proc/mdstat. This is the most important item to verify before going any further: either the Kernel has to support RAID directly or you have to load the RAID modules from an initrd.
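
The check above can be scripted. The following is a minimal sketch, not part of the original procedure: the function name has_personality is made up for illustration, and it only looks for the personality name on the "Personalities" line of /proc/mdstat (or of a saved copy given as the second argument).

```shell
# has_personality NAME [FILE]
# Succeeds if the "Personalities" line of FILE (default /proc/mdstat)
# lists [NAME], for example [raid1]. Hypothetical helper name.
has_personality() {
    name=$1
    file=${2:-/proc/mdstat}
    grep '^Personalities' "$file" 2>/dev/null | grep -qF "[$name]"
}
```

For example: has_personality raid1 && echo "RAID 1 support is present". If the function fails, do not go on to step 3.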

(This will show you whether RAID is compiled into the Kernel or loaded as a module from an initrd.) /etc/modules will not list RAID if the Kernel has RAID compiled in instead of loaded as modules.
Use lsmod to list the currently loaded modules; this will show any RAID modules that are loaded.

reiserfs
raid1
ext2
ide-disk
raid5
ext3

cat /etc/modules

(IF YOU SEE ANY RAID LISTED IN /etc/modules, then you probably have your Kernel loading RAID via modules. That will prevent you from booting from your RAID device, unless you use initrd. To boot from your RAID device, unless you use a RAID savvy initrd, you need RAID compiled into Kernel, not added as a module.)

2.3 Install 'mdadm':

apt-get install mdadm

2.4 List what IDE devices you have:

ls /proc/ide

3. Setup RAID 1

Setup RAID 1 and declare disk-one of your RAID to be 'missing' and disk-two of your RAID to be 'hdc'.

3.1 Create RAID (fd) partition on hdc

Warning: ALWAYS give the disk device when editing with cfdisk. By default cfdisk will select the first disk in the system. I once accidentally wiped the wrong partition with cfdisk.

Do A or B, either way will work:

A. Create partitions on new disk.

cfdisk /dev/hdc

or

B. copy existing partitions to new disk with sfdisk.

sfdisk -d /dev/hda | sfdisk /dev/hdc


NOTE: On some disks you cannot copy the partitions over correctly using this method.
It will detect the new partition as 0 size or a strange size.
In that case you will need to create the partitions manually with cfdisk, making them the same size.

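3.2 Create correct partition type signatures on new partition.

Set the type of each partition that will join the RAID to 'fd' (Linux raid autodetect), then reboot:

cfdisk /dev/hdc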

reboot

(To verify that everything is working ok.)

3.3 Create RAID device

that has two members, one of which does not exist yet. md0 is the RAID device we are creating, and /dev/hdc1 is its initial partition. We will be adding /dev/hda1 back into the /dev/md0 RAID set after we boot into /dev/md0.

mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/hdc1

If this gives errors then you need to zero the super block, see useful mdadm commands.

3.4 Format RAID device

You can use reiserfs or ext3 for this; both work. I use reiserfs for larger devices. Go with what you trust.

mkfs.ext3 /dev/md0

or

mkfs -t reiserfs /dev/md0

4. Copy your Debian system

Copy your Debian system from hda to /dev/md0 ('missing' + 'hdc'). Then, check to make sure that the new RAID device is still setup right and can be mounted correctly. We do this with an entry in hda's /etc/fstab and a reboot. Note that by editing hda's /etc/fstab after the copy, instead of before, we leave the copy on md0 unaltered and only are editing hda's /etc/fstab.

NB: THIS IS A BRANCH IN OUR SYSTEM CONFIGURATION (i.e. temporary!), but it will be overwritten later by the md0 version of /etc/fstab during the sync in step 6.

4.1 Create a mount point.

mkdir /mnt/md0

4.2 Mount your RAID device.

mount /dev/md0 /mnt/md0

4.3 Copy your Debian system to RAID device.

cp -axu / /mnt/md0

Please refer to the Copying Data section of the Appendix to verify that you copied the data correctly.

You don't need the -u switch; it just tells cp not to copy the files again if they exist. If you are running the command a second time it will run faster with the -u switch.

4.4 Edit /etc/fstab so that you mount your new RAID partition on boot up.

This verifies that you have the correct partition signatures on the partition and that your partition is correct. Sample Line in /etc/fstab:

/dev/md0 /mnt/md0 ext3 defaults 0 0

Then

reboot

And see if the RAID partition comes up.

mount

Should show /dev/md0 mounted on /mnt/md0.
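
If you prefer a yes/no answer over reading the 'mount' output by eye, something like the following sketch may help. The function name mounted_on is invented here, and it assumes a file in /proc/mounts format (device, mount point, type, options); /etc/mtab has the same layout.

```shell
# mounted_on DEVICE MOUNTPOINT [FILE]
# Succeeds if FILE (default /proc/mounts) has a line whose first field
# is DEVICE and whose second field is MOUNTPOINT. Hypothetical helper.
mounted_on() {
    awk -v dev="$1" -v mp="$2" \
        '$1 == dev && $2 == mp { found = 1 } END { exit !found }' \
        "${3:-/proc/mounts}"
}
```

For example: mounted_on /dev/md0 /mnt/md0 && echo "RAID device is mounted". For the root file system some kernels report /dev/root in /proc/mounts, so for the later root checks it may be more telling to pass /etc/mtab as the third argument.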

5. Reboot to RAID device

For the step 5 reboot, we will tell Lilo the following.

We will, as before, be using hda's MBR (Master Boot Record is the first 512 bytes on a disk and is what the BIOS reads first in determining how to boot up a system) and hda's /boot dir (the kernel-image and some other stuff live here), but instead of mounting root (/) from hda, we will mount md0's root (/) (the root of our RAID device, currently running off of only hdc because we declared the first disk 'missing').

5.1 Configure Lilo to boot to the RAID device

(Later we will configure Lilo to write the boot sector to the RAID boot device also, so we can still boot even if either disk fails.)

Add a stanza labeled 'RAID' to /etc/lilo.conf on hda1 so that we can boot with /dev/md0, our RAID device, as root (/):

#the same boot drive as before.
boot=/dev/hda
image=/vmlinuz
label=RAID
read-only
#our new root partition.
root=/dev/md0

That makes an entry labeled 'RAID' specific to the RAID device, so you can still boot to /dev/hda if /dev/md0 does not work.

sample complete lilo.conf file:

#sample working lilo.conf for raid.
#hda1,hdc1 are boot, hda2,hdc2 are swap
#hda3,hdc3 are the partition used by array
#root partition is /dev/md3 on / type reiserfs (rw)
#I named the raid volumes the same as the partition numbers
#this is the final lilo.conf file of a system completely finished,
#and booted into raid.


lba32
boot=/dev/md1
root=/dev/hda3
install=/boot/boot-menu.b
map=/boot/map
prompt
delay=50
timeout=50
vga=normal
raid-extra-boot=/dev/hda,/dev/hdc
default=RAID
image=/boot/vmlinuz-RAID
label=RAID
read-only
root=/dev/md3
alias=1
image=/vmlinuz
label=Linux
read-only
alias=2

image=/vmlinuz.old
label=LinuxOLD
read-only
optional

5.2 Test our new lilo.conf

lilo -t -v

(With a RAID installation, always run lilo -t first just to have Lilo tell you what it is about to do; use the -v flag, too, for verbose output.)

5.3 Run Lilo

Configure a one-time Lilo boot with the -R flag, so that if the new entry fails (for example with a Kernel panic), the next reboot returns to the old default.

The -R <boot-parameters-here> tells Lilo to only use the specified image for the next boot. So once you reboot it will revert to your old Kernel.

From 'man lilo':
-R command line
This option sets the default command for the boot loader the next time it executes. The boot loader will then erase this line: this is a once-only command. It is typically used in reboot scripts, just before calling `shutdown -r'. Used without any arguments, it will cancel a lock-ed or fallback command line.

Before you can do the 'lilo -v -R RAID' command, you must first do a 'lilo' command to update the Lilo boot record with the contents of your new lilo.conf. Otherwise Lilo does not know what you mean by 'RAID' and you just get a 'Fatal: No image "RAID" is defined' error message when you do 'lilo -v -R RAID'. So,

lilo
lilo -v -R RAID
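
The order of the three Lilo runs matters (test, write, then set the one-time default), so it can be convenient to wrap them. This is only a sketch: the function name one_time_boot and the LILO variable are made up for illustration, and the variable exists so that the sequence can be dry-run with a stand-in command.

```shell
# Command to run; override LILO with a stand-in for a dry run.
LILO=${LILO:-lilo}

# one_time_boot LABEL
# Runs the test pass, then writes the boot record, then marks LABEL as
# the image for the next boot only. Stops at the first failure.
one_time_boot() {
    label=$1
    "$LILO" -t -v || return 1      # show what Lilo would do
    "$LILO" || return 1            # write the new boot record
    "$LILO" -v -R "$label"         # one-time default for the next boot
}
```

For example: one_time_boot RAID, then reboot. If the test pass reports an error, nothing is written.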

5.4 Edit /mnt/md0/etc/fstab and reboot

to have /dev/md0 mount as root (/), when Lilo boots from our RAID device, /dev/md0.

Previous root (/) in fstab was:

/dev/hda1 / reiserfs defaults 0 0

Edit it to (use the file system type you formatted /dev/md0 with in step 3.4):

/dev/md0 / ext3 defaults 0 0

Note: edit /mnt/md0/etc/fstab, not /etc/fstab, because at the moment we are booted with hda1 as root (/) but we want to change the /etc/fstab that we currently have mounted on /mnt/md0/etc/fstab, our RAID device.
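
If you would rather not hand-edit, the root entry can be rewritten with awk. The sketch below is an illustration only; the function name set_root_entry is hypothetical, and it prints the modified file to standard output so that you can review it before replacing anything.

```shell
# set_root_entry FSTAB DEVICE FSTYPE
# Prints FSTAB with the device and file system type of the "/" entry
# replaced. The input file itself is not modified.
set_root_entry() {
    awk -v dev="$2" -v fstype="$3" '
        /^[ \t]*#/ { print; next }            # keep comment lines as they are
        $2 == "/"  { $1 = dev; $3 = fstype }  # rewrite the root entry
        { print }
    ' "$1"
}
```

For example: set_root_entry /mnt/md0/etc/fstab /dev/md0 ext3 > /tmp/fstab.new, then compare /tmp/fstab.new with the original and copy it into place once it looks right. Note that awk rejoins the fields of the rewritten line with single spaces.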

Reboot to check if system boots our RAID device, /dev/md0, as root (/). If it does not, just reboot again and you will come up with your previous boot partition courtesy of the -R flag in step 5.3 above.

reboot

Verify /dev/md0 is mounted as root (/)

mount

should show:

/dev/md0 on / type reiserfs (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)

'type reiserfs' is just my example; you will see whatever your file system type is.

Now we are booted into the new RAID device -- md0 as root (/). Our RAID device only has one disk in it at the moment because we earlier declared the other disk as 'missing'. That was because we needed that other disk, hda, to install Debian on or because it was our pre-existing Debian system.

6. Reformat hda as 'fd' and declare it as disk-one of your RAID

For the step 6 reboots, we tell Lilo the following.

Here we not only use md0's root (/) as in step 5, but also md0's /boot (it contains an identical kernel-image to the one on hda because we copied it here from hda in step 4, but we will be overwriting everything on hda in step 6 and can't continue relying on the stuff on hda) and MBR from either hda or hdc, whichever the BIOS can find (they will be identical MBRs and the BIOS will still find hda's MBR but in case the hda disk were to fail down the road we would want the BIOS to look on hdc as a fail over so that it could still boot up the system).

6.1 Change the signature on /dev/hda to software RAID

cfdisk /dev/hda

My two hard disks are from different manufacturers and, as it happens, while both are roughly 40G, they differ in sector layout and precise size. So cfdisk was unable to make the partitions precisely the same size, and I had hda1 at 29,997.60MB and hdc1 at 30,000MB. This didn't work when I got to the 'mdadm --add /dev/md0 /dev/hda1' step: I got a "failed: no space left on device!" error. So I ran cfdisk again and made hda1 slightly larger than hdc1, since I could not make them both exactly the same size. Now hda1 is 30,005.83MB and the 'mdadm --add /dev/md0 /dev/hda1' step works :-). (The remaining 10,000MB on each disk I am using for other purposes, including an md1 of 1,000MB composed of hda2 and hdc2.)
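
To catch a size mismatch like this before running 'mdadm --add', you can compare the block counts the Kernel reports in /proc/partitions. The following is a rough sketch with made-up function names (part_blocks, can_join). It compares whole partition sizes only, which is an approximation of what mdadm actually requires.

```shell
# part_blocks NAME [FILE]
# Prints the #blocks column for partition NAME (e.g. hda1) from FILE,
# which defaults to /proc/partitions.
part_blocks() {
    awk -v name="$1" '$4 == name { print $3 }' "${2:-/proc/partitions}"
}

# can_join NEW EXISTING [FILE]
# Succeeds if partition NEW is at least as large as partition EXISTING.
can_join() {
    new=$(part_blocks "$1" "$3")
    old=$(part_blocks "$2" "$3")
    [ -n "$new" ] && [ -n "$old" ] && [ "$new" -ge "$old" ]
}
```

For example: can_join hda1 hdc1 || echo "hda1 is smaller than hdc1, enlarge it with cfdisk first".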

6.2 Add the first-disk to our existing RAID device

And watch the booted RAID system automatically mirror itself onto the new drive. We are currently booted from MBR and /boot device on /dev/hdc1, with /dev/md0 as root (/).

mdadm --add /dev/md0 /dev/hda1

Note: We are adding /dev/hda1 into our existing RAID device. See if it is syncing.

cat /proc/mdstat

should show that it is syncing.
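
Since later steps should not be run until the mirror has finished syncing, a small polling loop can save you from checking by hand. This is a sketch under the assumption that /proc/mdstat shows a line containing "resync" or "recovery" while a rebuild is running; the function names are invented for illustration.

```shell
# sync_in_progress [FILE]
# Succeeds while FILE (default /proc/mdstat) reports a resync or recovery.
sync_in_progress() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# wait_for_sync [FILE] [SECONDS]
# Sleeps in SECONDS-long steps (default 60) until no rebuild is reported.
wait_for_sync() {
    while sync_in_progress "$1"; do
        sleep "${2:-60}"
    done
}
```

For example: wait_for_sync && lilo -t -v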

6.3 Write new /etc/lilo.conf settings

these are from when we are booted onto RAID.

boot=/dev/md0
root=/dev/md0
#this writes the boot signatures to either disk.
raid-extra-boot=/dev/hda,/dev/hdc
image=/vmlinuz
label=RAID
read-only

YOU NEED THE raid-extra-boot to have it write the boot loader to all the disks.

YOU ARE OVERWRITING THE BOOT LOADER ON BOTH /dev/hda and /dev/hdc.

You can keep your old boot option to boot /dev/hda so you can boot RAID and /dev/hda.

But remember that you don't want to boot a RAID member disk in non-RAID mode: if you make changes on one disk and not the other, it will hurt the synchronization.

6.4 Run Lilo with -R option and reboot

(we are currently booted into RAID)

lilo -t -v

lilo -R RAID

The -R option tells Lilo to use the new Lilo setting only for the next reboot, and then revert to the previous setting.

Note 1: Step 6.4 returned an error, "Fatal: Trying to map files from unnamed device 0x0000 (NFS/RAID mirror down ?)."

So I waited for the synchronization, started in Step 6.2, to finish (checking it with 'cat /proc/mdstat'). Once it was done, I ran 'lilo -t -v' again, and this time there was no "Fatal" error.

Note 1a: The synchronization however took two hours! I checked with 'hdparm' and it seems I have DMA turned off. Perhaps the synchronization would go faster with DMA turned on. Some examination of my system revealed that I did not have my computer's PCI chipset support compiled into my custom kernel. I recompiled the kernel (kernel 2.6.4) and selected the correct PCI chipset support for my computer and now DMA works correctly :-) and by default. For DMA to be default is also configurable in the PCI area of 'make menuconfig' during kernel compile configuration, and I chose it.

So I can now do Lilo with '-R ' switch and reboot.

Note 2: another error, "Fatal: No image "RAID" is defined."

As in Step 5.3 above, I need to do 'lilo' first so that Lilo reads my new /etc/lilo.conf, otherwise Lilo does not know about my stanza labeled "RAID", which is new in my lilo.conf. (Yes, I told Lilo about it on hda1 in step 5.3, but that was after I had copied the hda1 root (/) system to here, md0, which branched my system into two separate system configurations. So it needs to be done here, too.) Then I can do 'lilo -R RAID'.

Note 2a: However, the '-R' switch is pointless here unless the lilo.conf stanza labeled "RAID" is *not* the first kernel-image stanza in my lilo.conf. Because if it *is* the first stanza, then it is the default stanza anyway, with or without the '-R'.

Then

reboot

and check

cat /proc/mdstat

and check

mount

to be sure all is as expected.

6.5 Now run Lilo normally (without -R) and reboot

See what Lilo will do.

lilo -t -v

If it looks okay, do it:

lilo

reboot

and check

cat /proc/mdstat

and check

mount

as a final system check.

Done.

Part II. RAID using initrd and grub

- Ferdy Nagy

I used the following procedure with stock Debian 2.6.5, which has an initrd with all the modules ready to boot into RAID. The procedure also covers using grub as the boot loader. I built this from a bare install of Sarge using the new installer with grub as the boot loader, but most of this document is distro independent. My file system throughout is ext3 and it shouldn't take too much to use reiserfs.

These steps reference back to the procedure sections outlined above and indicate where things differ due to initrd or grub, so you will have to read/do/be familiar with the above steps. Also, make sure you currently use grub as your boot loader. If you are using LILO, install grub and make sure it works before proceeding!

Section - 2. Upgrade to a RAID savvy kernel

When using initrd, the kernel does not need to have RAID compiled in; the RAID drivers will be loaded as modules. Make sure the kernel loads the RAID modules.

Edit /etc/modules and add

md
raid1
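
If you script this edit, it is worth making it safe to run more than once. A minimal sketch follows; add_modules is a hypothetical helper name, and it appends a module name only if the file does not already contain it on a line of its own.

```shell
# add_modules FILE MODULE...
# Appends each MODULE to FILE unless a line with exactly that name exists.
add_modules() {
    file=$1
    shift
    for mod in "$@"; do
        grep -qxF -e "$mod" "$file" 2>/dev/null || echo "$mod" >> "$file"
    done
}
```

For example: add_modules /etc/modules md raid1. The same call works for /etc/mkinitrd/modules.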

Section - 3. Setup RAID 1

Follow section 3 to setup the RAID 1.

Section - 4. Copy your Debian system

Follow section 4 to copy the debian system.

Section - 5. Reboot to RAID device

Instead of LILO as in section 5, grub is used as the boot loader and an initrd is used to start the array at boot. We create a new kernel entry in the grub menu that refers to a newly built initrd, which starts the md [raid] device. The original kernel entry remains and can be reverted to if something goes wrong, until RAID is running. This still uses the grub boot loader installed in the /dev/hda MBR.

5.1 Build a new RAID initrd

A) Make sure the initrd has the modules it needs, by editing /etc/mkinitrd/modules. Add the following [you can see what modules are available by mounting the initrd and looking in the lib/modules - see section 8.]:

md
raid1

B) Update the initrd so that the root device loaded is the raid device, not probed. Edit the /etc/mkinitrd/mkinitrd.conf, and update the ROOT line
ROOT=/dev/md0

C) Create the new initrd. The file name must match the initrd line of the grub menu entry in step 5.2.

mkinitrd -o /boot/initrd.img-2.6.5-1-686-raid

5.2 Update the grub boot menu

edit /boot/grub/menu.lst

1. Add the following entry

title           Debian GNU/Linux, kernel 2.6.5-1-686 RAID
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.5-1-686 root=/dev/md0 ro
initrd          /boot/initrd.img-2.6.5-1-686-raid
savedefault
boot

2. Update the following kernel root option in the file. Note: because of the grub known issues (see section 8 below), this option will not be used anyway.

# kopt=root=/dev/md0 ro
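
A mistyped initrd file name in menu.lst leaves the new entry unbootable, so it may be worth checking that every initrd named in the menu really exists before you reboot. The sketch below uses an invented function name (missing_initrds) and assumes, as in this setup, that the grub paths start at the root of the file system that holds /boot.

```shell
# missing_initrds MENU [ROOTDIR]
# Prints every initrd path named in MENU that does not exist under
# ROOTDIR (default: the real root). Commented-out lines are ignored.
missing_initrds() {
    menu=$1
    rootdir=${2:-}
    awk '$1 == "initrd" { print $2 }' "$menu" |
    while read -r path; do
        [ -e "$rootdir$path" ] || echo "$path"
    done
}
```

For example: missing_initrds /boot/grub/menu.lst should print nothing. Any path it does print needs fixing, either in menu.lst or by creating the initrd under that name.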

5.3 Edit /mnt/md0/etc/fstab and reboot

[Copied from Part I, step 5.4 above]

to have /dev/md0 mount as root (/), when grub boots from our RAID device, /dev/md0:

Previous root (/) in fstab was:

/dev/hda1 / ext3 defaults 0 0

Edit it to:

/dev/md0 / ext3 defaults 0 0

Note: edit /mnt/md0/etc/fstab, not /etc/fstab, because at the moment we are booted with hda1 as root (/) but we want to change the /etc/fstab that we currently have mounted on /mnt/md0/etc/fstab, our RAID device.

Reboot and choose the RAID kernel to check if the system boots our RAID device, /dev/md0, as root (/). If it does not, just reboot again and choose the original pre-RAID kernel image.

reboot

Verify /dev/md0 is mounted as root (/)

mount

should show something similar to:

/dev/md0 on / type ext3 (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)

Now we are booted into the new RAID device -- md0 as root (/). Our RAID device only has one disk in it at the moment because we earlier declared the other disk as 'missing'. That was because we needed that other disk, hda, to install Debian on or because it was our pre-existing Debian system.

cat /proc/mdstat shows that the [degraded] array is up and running. Note the [_U]: the first disk is missing and the second disk is up.
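
The [_U] / [UU] marker can also be pulled out of /proc/mdstat by a script. This is only a sketch: array_status is a made-up name, and it assumes the usual layout in which the "blocks" line follows the array's own line and ends with the status marker.

```shell
# array_status MD [FILE]
# Prints the status marker (for example [_U] or [UU]) of array MD,
# read from FILE (default /proc/mdstat).
array_status() {
    awk -v md="$1" '
        $1 == md            { inarray = 1; next }
        inarray && /blocks/ { print $NF; exit }
    ' "${2:-/proc/mdstat}"
}
```

For example: case $(array_status md0) in *_*) echo "md0 is degraded" ;; esac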

Section - 6. Reformat hda as fd and declare it as disk-one of your raid

6.1/2 Setup hda and add to array

Follow steps 6.1, and 6.2. Wait and make sure the drives are fully synced before proceeding.

6.3 Re-run mkinitrd and reboot.

This is needed to make sure that mkinitrd starts the newly built array with all drives. mkinitrd uses mdadm -D to discover what drives to assemble in the array during startup, this is contained in a script in the initrd image. If this step is not done the next time you reboot the array will be degraded.

Do the following

mkinitrd -o /boot/initrd.img-2.6.5-1-686-raid

reboot

and check the array is fully up, look for the [UU]

cat /proc/mdstat

and check /dev/md0 is mounted

mount

7. Put grub into the MBR of the second disk

grub refers to the boot(ed) device as hd0, so if the primary hard drive (/dev/hda) fails, the system will look for the next bootable device (/dev/hdc) and load its MBR, which grub will still refer to as hd0. So the grub configuration can still use hd0 even when the primary device fails.

7.1 Put grub into the MBR

These steps temporarily tell grub that the second device is hd0 and then install grub into its MBR.

start the grub command line, then run the load commands. Note: grub partition references are offset by 1, so in the following with a partition of /dev/hdc1, the root is (hd0,0) [previous line tells grub to set hdc as hd0]. If the partition was /dev/hdc2, the root would be (hd0,1)!

grub
grub> device (hd0) /dev/hdc
grub> root (hd0,0)
grub> setup (hd0)
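
The off-by-one between Linux partition numbers and the grub numbering described above is easy to get wrong, so here is a small conversion sketch. The function name grub_partition is made up for illustration; it simply subtracts one from the trailing partition number.

```shell
# grub_partition LINUX_PARTITION [GRUB_DISK]
# Prints the grub name for a Linux partition, e.g. /dev/hdc2 -> (hd0,1).
# GRUB_DISK defaults to hd0. Fails if no partition number is given.
grub_partition() {
    num=$(printf '%s\n' "$1" | sed 's/.*[^0-9]\([0-9][0-9]*\)$/\1/')
    case $num in
        ''|*[!0-9]*) return 1 ;;    # no trailing partition number
    esac
    echo "(${2:-hd0},$((num - 1)))"
}
```

For example: grub_partition /dev/hdc1 gives the value to use in the "root" command after "device (hd0) /dev/hdc".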

7.2 Testing

Reboot and verify that the /proc/mdstat devices always start. Follow section VIII of the Appendix and verify the system boots with one disk off line.

8. Known Issues

grub

grub will already be installed on hda, and you will manually force grub to be installed on hdc so the MBRs are ok; however, grub-install and update-grub will fail because grub does not understand the md0 device. This is not a problem with grub-install, as it will not be executed again after grub has been installed, but update-grub is executed after an updated kernel is apt'd, causing an error to be reported by apt. The update-grub error is ok: the kernel gets installed and the initrd is created with all the md array information, provided the array was not degraded during the kernel upgrade. But you will have to manually update the grub menu.lst and add the new kernel information before you reboot, or the new kernel will not appear in the grub menu.

mkinitrd

When using mdadm, mkinitrd will only detect disks in the array that are running at the time of execution. You should not install a new kernel while the array is degraded, otherwise, even if you do an mdadm --add, the next reboot will still be degraded! The array is started at boot time by script. You can see what is in the script of the initrd by mounting it, e.g.

mount /boot/initrd.img-X.X.X /mnt -o loop
cat /mnt/script

And look for the array start line similar to

mdadm -A /devfs/md/0 -R -u 23d8dd00:bc834589:0dab55b1:7bfcc1ec /dev/hda1 /dev/hdc1
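
To see at a glance which member devices an initrd will assemble, the start line can be picked apart with awk. This sketch assumes the line has the shape shown above (mdadm -A, the array device, then -u followed by the UUID and the member devices); array_members is an invented name.

```shell
# array_members SCRIPT
# For each "mdadm -A" line in SCRIPT, prints the array device followed
# by the member devices listed after the UUID.
array_members() {
    awk '$1 ~ /mdadm$/ && $2 == "-A" {
        start = NF + 1                  # no members unless -u is found
        for (i = 1; i <= NF; i++)
            if ($i == "-u") { start = i + 2; break }
        members = ""
        for (i = start; i <= NF; i++)
            members = members (members == "" ? "" : " ") $i
        print $3 ": " members
    }' "$1"
}
```

For example: array_members /mnt/script. If only one member device is listed, the initrd was built while the array was degraded and should be rebuilt after the sync has finished.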

Appendix

I. RAID 1 Introduction

Redundant Array of Inexpensive Disks (RAID) refers to putting more than one hard disk to work together in various advantageous ways. Hardware RAID relies on special hardware controllers to do this and is not covered in this HowTo. Software RAID, the subject of this HowTo, uses software plus the ordinary controllers on your computer's motherboard and works excellently.

RAID 1 is where you use two hard drives as if they were one by mirroring them onto each other. Advantages of RAID 1 are (a) faster data reads because one part of the data can be read from one of the disks while simultaneously another part of the data is read from the other disk, and (b) a measure of fail over stability -- if one of the disks in the RAID 1 fails, the system will usually stay online using the remaining drive while you find time to replace the failed drive.

To achieve the speed gain, the two disks that comprise your RAID 1 device must be on separate controllers (in other words, on separate drive cables). The first part of the data is read from one disk while simultaneously the second part of data is read from the other disk. Writing data to a RAID 1 device takes twice as long apparently. However, under most system use data is more often read from disk than written to disk. So RAID 1 almost doubles the effective speed of your drives. Nice.

RAID is not a substitute for regular data back ups. Many things can happen that destroy both your drives at the same time.

II. Drive designators (hda, hdb, hdc, hdd), jumpers and cables

Drive designators.

Drives on IDE 1 -- Primary Controller

  hda -- Primary Master
  hdb -- Primary Slave

Drives on IDE 2 -- Secondary Controller

  hdc -- Secondary Master
  hdd -- Secondary Slave

Jumpers. When moving drives around in your computer, be sure to set the jumpers on your drives correctly. They are the little clips that connect two of various pins on your drive to set it to Cable Select, Master, or Slave. IDE drives usually have a diagram right on their case that shows where to set the clip for what setting. Different brands sometimes use different pin configurations.

Cables. Use 80 wire 40 pin IDE drive cables, not 40 wire 40 pin, or you will slow down your hard drive access. For best results, cables should be no longer than the standard 18". If your cable has a blue end, that's the end to attach to the motherboard (I don't know why). I don't think it matters which of the two drive connectors on the cable you plug your drive into, the middle or end one, unless you use Cable Select, in which case I believe the cable's end plug is Master and its middle plug is Slave.

III. Setting up software RAID for multiple partitions.

You can have a multi-partition RAID system if you prefer. You just need to create multiple RAID devices.

I have found it useful when setting software RAID on multiple partitions to set the RAID device to the same name as the disk partition.

Suppose you have 3 partitions on /dev/hda and want to add /dev/hdc for software RAID, then boot from /dev/hdc and add /dev/hda back into the devices. This is exactly what we did earlier, but with 3 partitions, which are: hda1=/boot, hda2=/, hda3=/var

sfdisk -d /dev/hda | sfdisk /dev/hdc;
reboot
mdadm --zero-superblock /dev/hdc1
mdadm --zero-superblock /dev/hdc2
mdadm --zero-superblock /dev/hdc3
mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/hdc1
mdadm --create /dev/md2 --level=1 --raid-disks=2 missing /dev/hdc2
mdadm --create /dev/md3 --level=1 --raid-disks=2 missing /dev/hdc3
mkfs.reiserfs /dev/md1;mkfs.reiserfs /dev/md2; mkfs /dev/md3;
mkdir /mnt/md1 /mnt/md2 /mnt/md3;
mount /dev/md1 /mnt/md1;mount /dev/md2 /mnt/md2; mount /dev/md3 /mnt/md3;
cp -ax /boot/* /mnt/md1;cp -ax / /mnt/md2; cp -ax /var/* /mnt/md3;

add entry in current fstab for all 3 and REBOOT.

Sync data again, only copying changed stuff.

cp -aux /boot/* /mnt/md1;cp -aux / /mnt/md2; cp -aux /var/* /mnt/md3;

edit lilo.conf entry in this case:

boot=/dev/md1
root=/dev/md2

Edit /mnt/md2/etc/fstab to have / set to /dev/md2.

REBOOT into RAID.

Add devices in:

mdadm --add /dev/md1 /dev/hda1
mdadm --add /dev/md2 /dev/hda2
mdadm --add /dev/md3 /dev/hda3

Wait for sync, write Lilo permanently, and REBOOT into your setup.

It is not harder to include more devices in a software RAID device.
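
With several partitions, the 'mdadm --create' lines are repetitive and easy to mistype. As a sketch only, the loop below prints the commands for review rather than running them. The function name degraded_mirror_cmds is made up, and it follows the convention used above of naming each RAID device after its partition number.

```shell
# degraded_mirror_cmds DISK PARTNUM...
# Prints (does not run) one "mdadm --create" command per partition
# number, each creating a RAID 1 device with the first member missing.
degraded_mirror_cmds() {
    disk=$1
    shift
    for n in "$@"; do
        echo "mdadm --create /dev/md$n --level=1 --raid-disks=2 missing $disk$n"
    done
}
```

For example: degraded_mirror_cmds /dev/hdc 1 2 3 prints the three create commands used in this section. Once they look right, run them by hand or pipe the output to sh.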

IV. Lilo

You need special entries to use Lilo as your boot loader. I couldn't get grub to work myself, but nothing prevents you from using grub (see Part II). Standard Lilo/grub entries alone WILL NOT WORK FOR RAID.

Entries in /etc/lilo.conf:

raid-extra-boot=<option>

That option only has meaning for RAID 1 installations. The <option> may be specified as none, auto, mbr-only, or a comma-separated list of devices; e.g., "/dev/hda,/dev/hdc6".

panic='' line in lilo.conf tells Lilo to automatically boot back to the old install if something goes wrong with the new Kernel.

V. Copying data

Use "cp -aux" to copy only updated items. If you are copying a partition that is not root, you need to copy the contents of the mount point and not the mount point itself; otherwise cp will just create the directory as a subdirectory of the target. To copy /boot, which is a separately mounted partition, to /mnt/md1, which is our new software RAID partition, we copy like this: "cp -aux /boot/* /mnt/md1". NOTE THE DIFFERENCE when copying mount points and not just /. If you just do "cp -aux /boot /mnt/md1", it will copy boot over as a subdirectory of /mnt/md1.

Or, alternatively, you could copy the root system with 'find' piped to 'cpio', like this:

cd /
find . -xdev -print | cpio -dvpm /mnt/md0
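
The difference between copying a mount point and copying its contents can be tried out safely in a scratch directory. The demonstration below is a sketch that uses temporary stand-in directories only (nothing under /boot or /mnt is touched), and copy_demo is a made-up name. It uses the "source/." form, a variation on the "/boot/*" form above that also picks up files whose names start with a dot, which the shell's * does not match.

```shell
# copy_demo
# Builds a stand-in "boot" directory, copies it both ways, lists the
# results, and removes the scratch directory again.
copy_demo() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/boot" "$work/nested" "$work/flat"
    echo kernel > "$work/boot/vmlinuz"
    echo hidden > "$work/boot/.config"
    cp -a "$work/boot" "$work/nested"     # directory itself: nested/boot/...
    cp -a "$work/boot/." "$work/flat"     # contents only:    flat/...
    (cd "$work" && find nested flat | LC_ALL=C sort)
    rm -rf "$work"
}

copy_demo
```

In the listing, the files under "nested" sit one level too deep (nested/boot/vmlinuz), while the files under "flat" sit at the top level, which is what you want on the new RAID partition.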

VI. Rebooting

You should always reboot if you have changed your partitions, otherwise the Kernel will not see the new partitions correctly. I have changed partitions and not rebooted, and it caused problems. I would rather take the simpler, longer, less potentially troublesome approach. Just because it appears to work does not mean it does work. You really only need to reboot if you are CHANGING partitions or booting a new Lilo configuration. Don't email me if you hose yourself because you did not feel the urge to reboot. Trust me.

VII. initrd

initrd: Use RAID as initrd modules.

The Kernel that is installed when you first build a system does not use an initrd.img. However, the default kernel uses an initrd, so you can use a stock kernel with software RAID.

The new Kernel by default won't contain the right modules for creating a RAID savvy initrd, but they can be added.

 

(Per James Bromberger)

Now we need to prepare for running a RAID setup. Our packages need an update. Use apt, because it rocks, and install the following:

DevFSd
kernel-image-2.4.x (whatever suits you)
reiserfsprogs
less
screen
vim

...Anything else you need and can't live without for the next 10 minutes

You might already have some of these modules in the kernel, eg ext2. Edit /etc/modules and add the following modules:

reiserfs
md
raid1
ext2
ide-disk (might not need this one.)
raid5
ext3
ide-probe-mod (might not need this one.)
ide-mod (might not need this one.)


Edit /etc/mkinitrd/modules, and add the same modules to this list. Your initrd image needs to be able to read and write to your RAID array before your filesystem is mounted. Initrd is the trick here. You probably also want to see if you need to edit /etc/mkinitrd/mkinitrd.conf and set the variable ROOT=probe to be ROOT=/dev/md0, or possibly, if using DevFS, ROOT=/dev/md/0.

Regenerate your initrd image for your new kernel with

mkinitrd -o /tmp/initrd-new /lib/modules/2.4.x-... .

If all is good, move this to /boot/initrd-2.4.x-... and edit your /etc/lilo.conf to add initrd=/boot/initrd against the "Linux" kernel entry. Run lilo, and you should see an asterisk next to the boot image "Linux".

With those modules you should be able to install the new kernel-image package. The install will add those modules to the initrd.img that it builds. Now you can do, for example (I actually only tested with kernel-image-2.4.24-1-686-smp on a machine using testing and unstable listed in /etc/apt/sources.list):

apt-get install kernel-image-2.4.24-1-686-smp

You will need to modify /etc/lilo.conf to include the right stuff. Otherwise the post install scripts for the package will likely fail.

image=/vmlinuz
label=Linux
initrd=/initrd.img

(The above is all one line)

Run Lilo and REBOOT.

You should now have the modules loaded. Check with: cat /proc/mdstat

VIII. Verify that system will boot even with one disk off-line

Roger did it this way.

  1. Shutdown and power-off your computer.
  2. Open up computer and unplug the power to Primary Master disk (/dev/hda).
  3. Start up your computer. It should boot up from the other disk.
  4. Now look at
    cat /proc/mdstat
    you should see that one of the disks in your md0 has "failed".
  5. Shutdown and then unplug the power to your computer, again.
  6. Reconnect the power to Primary Master disk.
  7. Start up your computer, again. It should boot up from the other disk still. It won't try to access the disk that it now has on record as "failed" until you re-add it to your RAID. Look again at
    cat /proc/mdstat
    you should still see one of the disks in your md0 listed as "failed". If this were not a simulation it probably would be failed and you would want to replace it with a new one. But for the simulation we just un-plug and later re-plug the power connector to the disk.
  8. Now that you have re-connected the power to the disk (or replaced it with a new one, if it really was a failed disk), bring it back online with mdadm,
    mdadm --add /dev/md0 /dev/hda1
    and check its status with
    cat /proc/mdstat
    you should see that it is being synchronized with the other disk in your RAID 1.
  9. WAIT until the synchronization has completed. Then you can try the above again, this time unplugging the other disk in your RAID 1. WARNING: if you do not wait for synchronization to fully complete (check with 'cat /proc/mdstat') you will have a real problem, because your system is only partially rebuilt on the "new" disk until synchronization has finished.
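The waiting in step 9 can be scripted instead of watched. This is a minimal sketch, assuming that /proc/mdstat shows the word "resync" or "recovery" for as long as a rebuild is running. The function names md_sync_in_progress and wait_for_md_sync are made up for this example.

```shell
#!/bin/sh
# Return 0 (true) while any md array is still resyncing or recovering.
# The optional argument is the status file to read; default /proc/mdstat.
md_sync_in_progress() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# Poll every 30 seconds until no resync/recovery line remains, then report.
wait_for_md_sync() {
    while md_sync_in_progress "$@"; do
        sleep 30
    done
    echo "no resync or recovery in progress"
}
```

Run wait_for_md_sync after the 'mdadm --add' in step 8 and only pull the other disk once it returns. It is still worth a final look at 'cat /proc/mdstat' yourself.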

NB: I (Roger) had to disconnect power to my CD-ROM drive (because my CD-ROM was on /dev/hdd -- Secondary Slave) in order to boot with my Secondary Master disconnected. Otherwise my BIOS refused to boot the machine because my CD-ROM was then a Slave on a cable without any Master. Your mileage may vary. :-) So I decided to leave my CD-ROM disconnected, as this is a server and I need it to boot even with a failed drive more than I need the convenience of keeping the CD-ROM connected. I can of course connect the CD-ROM when I need it as long as I have a working Master drive on its cable with it or set it to Master.

^

IX. Setting up a RAID 1 Swap device

I created a swap RAID device as follows:

(I have a 1000MB hda2 and a 1000MB hdc2, both as type 'fd' created with 'cfdisk', that I will use as md1 for swap.)

(Or you can skip RAID for swap entirely: just create an ordinary swap partition on an empty partition of each disk in your RAID set.)

Add a Swap entry in /etc/fstab, just after root (/) partition line. Example line to add to /etc/fstab:

/dev/md1 none swap sw 0 0
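The fstab line above presumes that /dev/md1 already exists and has been initialised as swap. Here is a minimal sketch of those steps, assuming the hda2 and hdc2 partitions described above are unused and both of type 'fd'. The run helper, the setup_md_swap function and the DRY_RUN variable are illustrative names. By default the script only prints the commands so you can review them first.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command.
# Set DRY_RUN=0 and run as root to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build a two-disk RAID 1 array, write a swap signature on it, enable it.
setup_md_swap() {
    md=$1; part_a=$2; part_b=$3
    run mdadm --create "$md" --level=1 --raid-disks=2 "$part_a" "$part_b"
    run mkswap "$md"
    run swapon "$md"
}

setup_md_swap /dev/md1 /dev/hda2 /dev/hdc2
```

With swapon done by hand you can confirm the swap is active with 'cat /proc/swaps' before relying on the reboot.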

Reboot and the boot sequence should start up the Swap when it reads /etc/fstab.

reboot

You can argue whether swap should be on raid. A large colo admin mentions that he does not use swap on raid. Keep it as simple as possible. You decide.

^

X. Performance Optimizations

For every IDE drive, use hdparm to turn on DMA (-d1) and 32-bit I/O support (-c3):

hdparm -d1 -c3 /dev/hda /dev/hdc


Use bonnie++ to measure software RAID performance.

Ideally every drive is a master on its own channel, because drives that share a cable also share that channel's total bandwidth.

In practice I just put as many hard drives in the system as possible, and I have not encountered a slowdown from having disks on the same master/slave channel.

^

XI. Disaster Recovery

(These directions are untested, I need to adapt them to mdadm instead of raid2 --luke)

So what to do if you can't get your root RAID1 filesystem to boot? Here is a straightforward way to get to your md0:
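A minimal sketch with mdadm, untested like the rest of this section. It assumes you have booted some rescue system that provides mdadm and RAID 1 kernel support, that the array components are /dev/hda1 and /dev/hdc1, and that /mnt/md0 is a free mount point. The run helper and the rescue_mount_md function are illustrative names, and by default the script only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command.
# Set DRY_RUN=0 and run as root to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Assemble the array from whichever component partitions are given and
# mount it. --run asks mdadm to start the array even if a member is
# missing, i.e. in degraded mode.
rescue_mount_md() {
    md=$1; mnt=$2
    shift 2
    run mdadm --assemble --run "$md" "$@"
    run mkdir -p "$mnt"
    run mount "$md" "$mnt"
}

rescue_mount_md /dev/md0 /mnt/md0 /dev/hda1 /dev/hdc1
```

If one disk is dead, list only the surviving partition, for example "rescue_mount_md /dev/md0 /mnt/md0 /dev/hdc1". Once mounted you can fix lilo.conf or copy data off.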

^

XII. Quick Reference

DON'T JUST LOOK AT THIS QUICK REFERENCE. Understand the rest of the document.

Quick Reference -- setting up bootable system on /dev/md0 using /dev/hda and /dev/hdc as RAID 1 component disks

Verify RAID savvy Kernel. (1) You should see the RAID "personalities" your Kernel supports:

cat /proc/mdstat

dmesg | grep -i RAID

(This will show you if RAID is compiled into the kernel, or detected as a module from initrd.) /etc/modules will not list RAID if the Kernel has RAID compiled in instead of loaded as modules. Use lsmod to list the currently loaded modules; this will show any RAID modules that are loaded.

(2) You should NOT see any RAID modules in /etc/modules (If you do, review step 2 of Procedure):

cat /etc/modules
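Check (1) can also be done by a script rather than by eye. This sketch assumes the "Personalities :" line in /proc/mdstat names each supported level in square brackets, e.g. [raid1]. The function name has_raid1_personality is made up for this example.

```shell
#!/bin/sh
# Succeed (exit 0) if the Personalities line of an mdstat-style file
# lists raid1. Reads /proc/mdstat unless another file is given.
has_raid1_personality() {
    grep -q '^Personalities :.*\[raid1\]' "${1:-/proc/mdstat}"
}
```

Typical use: "has_raid1_personality || echo 'no RAID 1 support in this Kernel'".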

Copy partitions hda to hdc:

sfdisk -d /dev/hda | sfdisk /dev/hdc

Create array:

mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/hdc1

Copy data:

cp -ax / /mnt/md0

Example /etc/lilo.conf entry for 1 disk RAID device:

boot=/dev/hda
image=/vmlinuz
label=RAID
read-only
#our new root partition.
root=/dev/md0

Add second disk to array:

mdadm --add /dev/md0 /dev/hda1

Example final /etc/lilo.conf entry:

boot=/dev/md0
root=/dev/md0
#this writes the boot signatures to either disk.
raid-extra-boot=/dev/hda,/dev/hdc
image=/vmlinuz
label=RAID
read-only

Useful 'mdadm' commands

Always zero the superblock of a device before adding it to a RAID device. Why? Because the disks decide what array they are in based on the disk-id information written on them. Zero the superblock first in case the disk was part of a previous RAID device. Also, if a partition was part of a previous RAID device, it appears to store the size of its previous partition in the signature. Zeroing the superblock before adding it to a new RAID device takes care of cleaning that up, too.

Erase the MD superblock from a device:

mdadm --zero-superblock /dev/hdx

Remove disk from array:

mdadm --set-faulty /dev/md1 /dev/hda1
mdadm --remove /dev/md1 /dev/hda1

Replace failed disk or add disk to array:

mdadm --add /dev/md1 /dev/hda1

(That will resynchronize the array: the data is copied from the existing disk onto the new disk. No separate format step is needed.)

Create mdadm config file:

echo "DEVICE /dev/hda* /dev/hdc*" > /etc/mdadm/mdadm.conf
mdadm --brief --detail --verbose /dev/md0 >> /etc/mdadm/mdadm.conf
mdadm --brief --detail --verbose /dev/md1 >> /etc/mdadm/mdadm.conf

To stop the array completely:

mdadm -S /dev/md0

^

XIII. Troubleshooting


The main problems people encounter are:

The Kernel must have RAID support compiled in, or loaded correctly from the initrd.

You will actually have two RAID configurations. First you boot the degraded RAID volume (one disk 'missing'), then you add in the original disk, and then you boot the final RAID configuration.

Performance is too slow:

See Performance Optimizations
^

XIV. Raid Disk Maintenance


You need to configure RAID monitoring so that errors are noticed.

Once configured, it will email you when it detects an error.

Once a failed disk is detected, remove it from the array and then add it back in.

Create an mdadm.conf file (see "Useful 'mdadm' commands" in the Quick Reference).

You can also configure a hot spare that will come online if a disk fails.

TODO: finish directions on SMART monitoring, and on configuring mdadm to monitor disks and hot spares.
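Until those directions are written, a crude check can be run by hand or from cron. This is only a sketch, not a replacement for mdadm's own monitoring. It assumes /proc/mdstat marks a missing member with "_" in the status field, e.g. [_U] or [U_]. The function name degraded_arrays is made up for this example.

```shell
#!/bin/sh
# Print the name of every md array whose status field shows a missing
# member. Reads /proc/mdstat unless another file is given.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }    # remember the current array
        /\[[U_]*_[U_]*\]/  { print name }   # "_" marks a missing disk
    ' "${1:-/proc/mdstat}"
}
```

No output means every array has all its members. Any name printed is an array that needs attention.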

^

References

RAID 1 Root HowTo PA-RISC
http://www.pa-RISC-linux.org/faq/RAIDboot-howto.html

Lilo RAID Configuration
http://lists.debian.org/debian-user/2003/debian-user-200309/msg04821.html

Grub RAID Howto
http://www.linuxsa.org.au/mailing-list/2003-07/1270.html

Building a Software RAID System in Slackware 8.0
http://slacksite.com/slackware/RAID.html

Root-on-LVM-on-RAID HowTo
http://www.midhgard.it/docs/lvm/html/install.disks.html

Software RAID HowTo
http://unthought.net/Software-RAID.HOWTO/Software-RAID.HOWTO.txt

HowTo - Install Debian Onto a Remote Linux System
http://trilldev.sourceforge.net/files/remotedeb.html

Kernel Compilation Information and good getting started info for Debian
http://newbiedoc.sourceforge.net

Initrd information and Raid Disaster Recovery
http://www.james.rcpt.to/programs/debian/raid1/

^