SUMMARY: RAID 0+1: LSM or LSM+AdvFS ?

From: A. Mahendra Rajah (Mahendra.Rajah@URegina.CA)
Date: Tue May 14 2002 - 18:36:31 EDT


   WOW... I have received several detailed replies to my query
   which was:

       Should I use LSM alone to create RAID 0+1 filesets or
       LSM for RAID 1 and AdvFS for RAID 0?

>>>>> All the replies advised me to use LSM for both <<<<<

   My sincere thanks go to the following for taking the time to
   explain:

        Ballowe, Charles
        alan @ cpqcorp.net
        Jeffrey Hummel @ albemarle
        E. Richard Glazier
        Degerness, Mandell
        Nemholt, Jesper Frank
        Dr. Thomas.Blinn@Compaq.com

   ..................... My question ..........................
   I have been tasked with setting up our disk farm in a
   RAID 0+1 configuration using software alone (no hardware
   RAID card!).

   We have an ES40 running Tru64 V5.1A + pk1 that has 3 SCSI
   buses serving 16 disks. We run Oracle & Banner on this
   ES40 and all the file systems are AdvFS type.

   We have full licenses to run LSM and AdvFS. LSM is the
   only means of setting up RAID 1 (mirroring), but for RAID
   0 (striping) I am thinking that I could:

   1. Use striped plexes in LSM, or

   2. Create every domain on a few disks and join them
       using AdvFS's 'addvol' command. (A rough sketch of
       both options follows below.)
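
   As a rough sketch of the two options (device, volume, and
   domain names here are hypothetical, and exact volassist
   attributes vary by LSM version; see volassist(8)):

       # Option 1: one striped LSM volume under one AdvFS domain
       volassist make datavol 8g layout=stripe nstripe=4
       mkfdmn /dev/vol/rootdg/datavol data_dmn

       # Option 2: plain volumes joined into a multi-volume domain
       mkfdmn /dev/vol/rootdg/vol01 data_dmn
       addvol /dev/vol/rootdg/vol02 data_dmn
       balance data_dmn    # redistribute files across the volumes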

   I would like to solicit opinions from the list as to the
   best method to use. I think method 2 is more dynamic in
   that I can add more volumes as needed and also use
   AdvFS's 'balance' utility to balance out the contents of
   the domain. I could even stripe a large file across
   multiple volumes. (I am used to AdvFS commands and maybe
   I am biased!)

   I don't see any advantage in doing 1, but the AdvFS
   Administration manual (Page 4-17) warns not to use AdvFS
   striping if LSM is already in use.

   Comments, suggestions or horror stories, please?

   ............................................................
From: "Ballowe, Charles"

   These are just my thoughts, not authoritative, and I have not
   done any testing.

   I think that using LSM 1+0 is better than LSM+AdvFS. Using LSM
   only, the I/O for all files will be striped across the
   mirrorsets and should increase performance. Using AdvFS you
   only gain that striping for large files that you specify to
   stripe. Also, with AdvFS, you will have to monitor file sizes
   and make sure that large files get striped because otherwise
   they could fill the disk that they reside on and cause
   problems (and they don't have to be particularly large to get
   there if there are lots of them). It seems to me that the
   administrative overhead gets larger with AdvFS. I don't know
   whether AdvFS or LSM striping provides better performance
   overall.

   ............................................................
From: alan @ cpqcorp.net

   I'd use LSM for both mirroring and striping. AdvFS striping
   can be useful, but often inconvenient to use (you can only
   stripe(8) an empty file, then write to it). If the system can
   support them, I'd get more SCSI adapters and spread the I/O
   out more.
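
   As an illustration (the path is hypothetical; see stripe(8)):

       touch /mnt/ora/big01.dbf       # stripe(8) only works on an empty file
       stripe -n 2 /mnt/ora/big01.dbf # stripe it across 2 volumes of the domain
       # ... now write data to the file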

   A multi-volume domain using AdvFS is useful independent of
   how those volumes are constructed.

   ............................................................
From: Jeffrey Hummel

   Add each entire disk to LSM (at 3:00 am, you need really good
   notes on your configuration not to get confused by slices).
   Beyond that, read about logs and the maximum number of group
   entries. For growth reasons, I bump the number of blocks for
   both of these by one.
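
   For example (disk name hypothetical; voldiskadd prompts for
   the disk group):

       voldiskadd dsk10    # add the entire disk, not a slice, to LSM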

   ............................................................
From: E. Richard Glazier

   The AdvFS book is right - you want to stay away from 'addvol'
   if possible. The idea of RAID is to have redundancy /
   stability / hardiness / etc. If one disk (or part of a disk)
   in an AdvFS multi-volume domain goes bad, the whole file
   domain is shot. The more disks you have in the domain, the
   higher your vulnerability. It's impossible to recover even
   using AdvFS tools. You could have the AdvFS multi-volume
   domain mirrored with hardware RAID, but it sounds like that's
   not an option. Even in that case, I wouldn't want to do it.
   I think addvol is for a "worst case" fix when you need to
   throw another disk at a domain that's filling up.

   ............................................................
From: "Degerness, Mandell"

   My only comment is that LSM striping will give you the speed
   advantage automatically for files larger than the stripe
   width. AdvFS "striping" requires that you manually stripe any
   large files you want to have performance gains on, and
   re-stripe them if they get re-written.

   You have to weigh this against the advantage of dynamic
   expansion of the AdvFS file domain (which you could do in any
   case).

   ............................................................
From: alan @ cpqcorp.net

   My question: Will LSM balance the striped set automatically?

   Define "balance". Striping distributes the data of the
   logical device it presents evenly among all the devices being
   used. But, it presents what looks like a disk to the higher
   layers of the operating system. How a file system or database
   system organizes the data may cause more data to be allocated
   to some back-end devices than others. The I/O load may
   create "hot-spots" on some back-end devices and not on others.
   The consumers of such a "disk" have no clue how the data is
   organized on the disk.

   Balance in the AdvFS sense tries to allocate data to a
   multiple-volume domain so that each volume has about the same
   amount of data. This doesn't do anything for balancing the
   I/O load.

   My question: Use LSM for striping and mirroring?

   I have a DEC 3000 running Digital UNIX V3.2D-1 and do have the
   user space set up this way. It is old, not remotely year 2000
   compliant, but as a file server is stable enough for the
   purpose.

   If I had the choice, I'd prefer to use a mix of hardware RAID
   and software RAID. Having both simply allows more choices.
   Hardware subsystems that support mirroring usually do a better
   job of handling disk failures and allow sparing. Some will
   even detect that a disk may be failing and replace it with a
   spare while it can still be read, never having to regenerate
   the data, just copy it.

   However, the disadvantage of a hardware subsystem is that it
   often has a single "wire" back to the host, making it a single
   point of failure. This is less common in SAN configurations.
   The single wire/connection can also limit performance to that
   of the interconnect, where host-based striping can use more
   than one interconnect.

   Each piece has to be viewed as a tool that offers some
   benefit. LSM mirroring, independent of anything else, offers
   redundancy and availability when properly configured. LSM
   striping offers performance benefits when properly configured
   (which may include I/O load balancing, unless the load happens
   to create hot-spots; once you recognize that, the
   configuration can often be changed to spread those out more).
   AdvFS space management offers the ability to add and remove
   space dynamically.

   Each is a tool that can be used together or independently.

   My question: If I already have LSM striping enabled, wouldn't
   creating a multi-volume AdvFS domain confuse LSM?

   How would it? AdvFS (basically) doesn't know whether the
   underlying device is an LSM volume or a simple plain old
   (large) SCSI disk. AdvFS and LSM are independent pieces. LSM
   (mostly) doesn't know how its LBN space is being used, whether
   by a file system, database system or some other consumer of
   disk space.

   ............................................................
From: "Nemholt, Jesper Frank" <JesperFrank.Nemholt@hp.com>

   There are pros & cons for them both/all.

   I assume your 3 SCSI busses are connected to 3 independent BA
   boxes or some other type of storage cabinet. If this is the
   case, the first thing is to eliminate single points of failure
   by always mirroring (in LSM) a disk on one bus with a disk on
   another bus. This gives you a mirrorset that is able to
   survive a complete failure of not just one disk, but also the
   whole path to that disk. LSM uses parallel read access for
   mirrorsets, so the read performance you get from 2 disks in an
   LSM mirrorset more or less equals the read performance of a
   stripeset of 2 disks. Secondly, when putting the 2 disks in
   the mirrorset on separate SCSI busses, you also get a faster
   path to the combined device.
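
   A minimal sketch of such a cross-bus mirror (disk and volume
   names are hypothetical; attribute spellings vary by LSM
   version, see volassist(8)):

       # mirror a disk on bus A with a disk on bus B
       volassist make mirr01 36g layout=mirror nmirror=2 dsk1 dsk9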

   Next thing is the partitioning. With 16 disks you end up with
   8 mirrorsets. I would not partition these at all. Instead I
   would make one AdvFS domain out of each mirrorset and then, if
   necessary, create filesets with quotas. It's much more
   flexible than partitioning in LSM. Secondly, when you start to
   partition striped or mirrored disks and use these partitions
   to add space to various AdvFS domains, you will bit by bit
   cause yourself a performance problem, because these partitions
   will likely have different I/O patterns although they are
   situated on the same physical devices. This means the physical
   devices will start spending more time seeking and less time
   reading & writing. That shows up as high disk service time,
   wait time, active queue and wait queue (in Collect).

   I would not use AdvFS striping at all. It's not flexible, and
   often it's more important to spread out the Oracle load on
   many independent devices rather than having everything on a
   few striped devices. A few striped devices may be very fast in
   sequential I/O, but a typical Oracle database doesn't generate
   pure sequential reads & writes in normal usage unless there's
   only one user logged on and he/she only executes one query at
   a time.

   Regarding the AdvFS balance command: it doesn't really stripe,
   it just distributes data equally. It's only the stripe command
   that does real striping, and it works only on a per-file
   basis. Its usage is rather limited. LSM striping is better, as
   it does full striping.
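
   To illustrate the difference (domain and file names are
   hypothetical):

       balance ora1_dmn          # spread space usage evenly across volumes
       stripe -n 2 /mnt/bigfile  # stripe one (still empty) file across 2 volumes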

   So to summarize, what I would do :

   Mirror the 16 disks 1 to 1 in LSM and always mirror with a
   disk on another SCSI bus. Mirror the disks as they are,
   without partitioning them.

   Make 1 AdvFS domain on each of the 8 LSM mirrorsets.

   Make AdvFS filesets as needed, and make sure to spread out
   Oracle data & index files so they're balanced between the 8
   domains, and try to put index files on other domains than the
   datafiles they relate to. AdvFS filesets are by far the most
   flexible way of "partitioning" disks.

   Put soft & hard quotas on each fileset so people don't start to
   "steal" space from each other.

   This will give you a clean & simple solution. Later, if you
   run out of space, add more disks, mirror them in LSM like the
   previous ones, and use addvol to increase your domains, or make
   domains (better for distributed performance).
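
   A hedged sketch of those steps for one of the 8 mirrorsets
   (all names are hypothetical; fileset quotas are set with
   chfsets(8), whose exact flags I leave to the man page):

       volassist make mirr01 36g layout=mirror nmirror=2 dsk1 dsk9
       mkfdmn /dev/vol/rootdg/mirr01 ora1_dmn
       mkfset ora1_dmn data
       mount -t advfs ora1_dmn#data /oracle/data1
       showfsets ora1_dmn    # verify the fileset (and later its quotas)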

   And now the horror story :

   I've seen a machine with an average of 10-20 partitions
   (subdisks) on each disk in LSM (it was a Sun, so read Veritas
   Volume Manager). Later these partitions were used to stripe &
   mirror volumes. These volumes then ended up as filesystems.
   The result is that the same (poor) disk serves some 10-20
   filesystems, each with completely different I/O patterns.
   Result : Performance problem... big performance problem.

   ............................................................
From: "Dr. Thomas.Blinn@Compaq.com"

   Our team took a look at your question and asked me to pass
   this along:

   An AdvFS development engineer responds ...

   I would use LSM for the entire RAID 0+1 configuration. While
   AdvFS supports file striping and multi-volume domains, there
   are no real benefits in breaking up the RAID 0+1 functionality
   between AdvFS and LSM. The comment in the AdvFS Admin guide
   regarding AdvFS and LSM striping refers specifically to using
   AdvFS's file striping capability in addition to LSM striping
   (the two could negate each other).

   My recommendation would be to use one LSM volume (striped
   mirror set) for each domain needed. The only drawback would be
   if/when you would need to add storage to the domain. At that
   point the only option in V5.1A would be to add a second LSM
   volume (striped mirror set) to the domain. FYI, in a future
   release of Tru64 we will be providing support to dynamically
   grow the file system without adding additional volumes.
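
   In V5.1A that growth path might look like this (volume and
   domain names are hypothetical; attribute spellings vary by
   LSM version):

       volassist make datavol2 16g layout=stripe nstripe=2
       volassist mirror datavol2                # add the mirror plex
       addvol /dev/vol/rootdg/datavol2 ora1_dmn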

   [ from me: You'll be able to expand the LSM volume using
   standard LSM facilities, and the AdvFS domain above it will
   just magically expand to use the available space; this will be
   simpler to manage over time than having multiple "volumes" in
   the domain. -- Tom ]

   .................... END of responses ......................


