Tibor's Musings

Software RAID

How to install, monitor and repair Software RAID on Debian GNU/Linux.

Installing Software RAID on Debian GNU/Linux

Care must be taken when installing Software RAID 1 on Debian/woody onto the boot partition. One of the best recent guides was written by Marcus Schoppen and is available at http://wwwhomes.uni-bielefeld.de/schoppa/raid/woody-raid-howto.html. With small changes, you can follow his procedure and install Software RAID 1 onto a non-RAID system remotely.

Some comments on his guide:

  • in step 1, it is better to do this:

    $ sfdisk -d /dev/sda > partitions.sda        # dump sda's partition table
    $ cp -a partitions.sda partitions.sdb
    $ perl -pi -e 's,/sda,/sdb,g' partitions.sdb # rename the devices in the copy
    $ sfdisk /dev/sdb < partitions.sdb           # apply the table to sdb
    
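To double-check that the two partition tables now really match, you can compare a fresh dump of sdb against the edited copy (a quick sanity check using the dump files from above):

    $ sfdisk -d /dev/sdb | diff - partitions.sdb # no output means the tables match
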
  • in step 7, it is better to do this:

    $ mount -v /dev/md0 /mnt # let's start with md0
    $ cd /  # since md0=/, see note below
    $ find . -xdev | cpio -pm /mnt # copy this filesystem only (-xdev), preserving times (-m)
    $ umount /mnt
    

Repeat this for each filesystem (md0=/, md1=/var, md2=/tmp, ...). But beware: cpio and mirrordir do not work for files greater than 2 GB! You have to use cp for those; see the sketch below.
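
To spot such files in advance, something like the following can help (a sketch assuming GNU find; the file path is a made-up example):

    $ find . -xdev -type f -size +2097152k # list files on this filesystem larger than 2 GB
    $ cp -a ./path/to/bigfile /mnt/path/to/bigfile # then copy each such file with cp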

  • in step 9, it is not necessary to make a boot floppy if you do:

    $ cp /etc/lilo.conf /tmp # to keep "good" lilo.conf handy
    $ vi /tmp/lilo.conf # and add the root argument there, like this:
    #                        image=/boot/....
    #                          label=Linux
    #                          root=/dev/md0
    #                          read-only
    $ raidstop /dev/md0 # otherwise may have problems
    $ raidstop /dev/md1 # stop raid for each mdX filesystem
    $ lilo -C /tmp/lilo.conf
    $ reboot # FIRST REBOOT if you started from RAID-capable kernel
    

This way the whole installation can be done remotely. (Tested.)
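
Incidentally, lilo can test-drive the temporary configuration before anything is written to the boot sector (its -t switch performs a dry run):

    $ lilo -t -C /tmp/lilo.conf # test mode: parses the config and writes nothing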

  • in step 11, do not put the partition argument in lilo.conf. An alternative working configuration is as follows:

    $ cat lilo.conf
    lba32
    restricted
    boot=/dev/md0
    root=/dev/md0
    install=/boot/boot-menu.b
    map=/boot/map
    password=foobar
    delay=20
    vga=normal
    raid-extra-boot="/dev/sda,/dev/sdb"
    default=Linux
    image=/vmlinuz
            label=Linux
            read-only
    image=/vmlinuz.old
            label=LinuxOLD
            read-only
            optional
    

After step 11 is done, do the SECOND (and final) REBOOT. You are done.
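
After that final reboot, it is worth verifying that the system really runs from the RAID devices:

    $ df / # the root filesystem should now live on /dev/md0
    $ cat /proc/mdstat # every volume should show [UU], see the next section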

Software RAID Runtime Monitoring

All our Linux servers run Software RAID-1 disks, so one should watch out for disk failures. The command to check the status of the RAID arrays is:

$ cat /proc/mdstat

and should show "[UU]" for each volume when everything is fine:

$ cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 sdb1[1] sda1[0]
      96256 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0]
      995904 blocks [2/2] [UU]

md2 : active raid1 sdb5[1] sda5[0]
      586240 blocks [2/2] [UU]

md3 : active raid1 sdb6[1] sda6[0]
      995904 blocks [2/2] [UU]

md4 : active raid1 sdb7[1] sda7[0]
      2931712 blocks [2/2] [UU]

md5 : active raid1 sdb8[1] sda8[0]
      5855552 blocks [2/2] [UU]

md6 : active raid1 sdb9[1] sda9[0]
      6457984 blocks [2/2] [UU]

unused devices: <none>

If this is not the case, read the section on repairing below.

Note that our machines usually run the mdadm monitoring daemon, which periodically checks the health of the RAID devices and alerts root by email if it spots something wrong.
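
If you need to set this up on a new machine, the monitoring daemon can be configured roughly like this (a sketch; the mail address and the config location are the usual Debian ones, but check your system):

$ grep MAILADDR /etc/mdadm/mdadm.conf # who receives the alert emails
MAILADDR root
$ mdadm --monitor --scan --daemonise # watch all arrays listed in mdadm.conf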

Software RAID Repairing

How to repair a degraded RAID device? If you do:

$ cat /proc/mdstat

and you see a line containing [U_], such as:

md3 : active raid1 sdb6[1] sda6[0]
      979840 blocks [2/2] [UU]

md4 : active raid1 sda7[0]
      2931712 blocks [2/1] [U_]

then it means that md4 is running in degraded mode and that sdb7 has crashed.

First, check whether the disk is physically okay: look into /var/log/messages and search for lines like:

$ sudo grep I/O /var/log/messages
Sep 15 02:32:06 pcwebc00 kernel:  I/O error: dev 08:21, sector 139017744
Sep 15 02:32:32 pcwebc00 kernel:  I/O error: dev 08:21, sector 139017752

If you see this, then the disk should be physically replaced before continuing, and the new disk repartitioned exactly like the old one (or like the disk it is going to mirror).
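
To repartition the replacement disk exactly like the healthy one, the sfdisk trick from the installation section above can be reused (assuming sda is the healthy disk and sdb the replacement):

$ sfdisk -d /dev/sda | sfdisk /dev/sdb # clone the partition layout onto the new disk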

(Sometimes the kernel detects the disk as faulty by itself and marks it with (F) in the /proc/mdstat output, for example:

$ sudo cat /proc/mdstat
[...]
md6 : active raid1 sdb9[1](F) sda9[0]
      12329792 blocks [2/1] [U_]

and you can double-check that /var/log/messages indeed indicates an I/O error:

Jan  4 00:16:26 pcdh90 kernel: scsi2: ERROR on channel 0, id 1, lun 0, CDB: Read (10) 00 01 a8 d9 ef 00 00 50 00
Jan  4 00:16:28 pcdh90 kernel: Info fld=0x1a8da02, Current sd08:19: sense key Medium Error
Jan  4 00:16:28 pcdh90 kernel: Additional sense indicates Unrecovered read error
Jan  4 00:16:28 pcdh90 kernel:  I/O error: dev 08:19, sector 16661768

which calls for a physical disk examination, as stated above.)
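
In the (F) case, the failed partition must first be removed from the array before it (or its replacement) can be added back. Using md6/sdb9 from the example above:

$ sudo raidhotremove /dev/md6 /dev/sdb9

or, if you use mdadm:

$ sudo mdadm /dev/md6 -r /dev/sdb9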

If you don't see any symptoms of a disk failure, then you may repair the RAID device using the same disk and the same partition.

To repair the RAID /dev/md4 of the example above, do:

$ sudo raidhotadd /dev/md4 /dev/sdb7

or, if you use mdadm instead of raidtools, like this:

$ sudo mdadm /dev/md4 -a /dev/sdb7

and watch the progress:

$ cat /proc/mdstat
md4 : active raid1 sdb7[2] sda7[0]
      2931712 blocks [2/1] [U_]
      [====>................]  recovery = 21.4% (629440/2931712) finish=1.5min speed=25177K/sec

After a while, the RAID should be repaired:

$ cat /proc/mdstat
md4 : active raid1 sdb7[1] sda7[0]
      2931712 blocks [2/2] [UU]
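
As a final check, mdadm can print the detailed state of the repaired array; the State line should read clean or active:

$ sudo mdadm --detail /dev/md4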

You are done.
