L9 LTO Performance Problem

From: David Eisner (cradle@umd.edu)
Date: Wed Jul 31 2002 - 19:26:29 EDT


Apologies in advance for the length, but I wanted to include
as much pertinent information as I have:

Setup:

    Ultra 30 running Solaris 2.6, 256 MB Ram, with
    two dual differential Ultra/Wide SCSI host bus adapters
    
    Two A1000s, each connected to its own connector
    on the first SCSI HBA.
    
    StorEDGE L400 tape library (a rebranded EXABYTE EXB-220 with
                                two Mammoth drives)
        connected to one connector on the second SCSI HBA.

    StorEDGE L9 LTO tape library (a rebranded HP SureStore 1/9)
        connected to the other connector on the second SCSI HBA.
    
    RaidManager 6.1.1 software
    Legato Networker 6.1.2
    
    Ultra 30 PROM version: OBP 3.27.0 2000/08/23 15:43
    L9 Loader firmware version 2.33.S
    L9 Drive firmware version E1AV

    Note: an LED on one of the Mammoth drives indicates a hardware failure.
    That drive has been disabled in Networker.

Problem:

I recently installed the L9 library and was hoping to see a significant
performance improvement over the L400: the uncompressed transfer rate
is 3 MB/sec for the L400, and the L9 (Ultrium) is supposed to manage
15 MB/sec.

To test it out, I did a full backup of the A1000's, about 137 GB.
This took 23 hours, or roughly 1.7 MB/sec. Yuck. To see
what would happen, I tarred a 2.5 GB directory directly to
the drive and got about 13.7 MB/sec, which seemed
about right. Note: I'm using /dev/rmt/2cbn to access the tape drive.
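
As a sanity check on the arithmetic above, here is a minimal sketch
that recomputes the backup rate and shows how long 137 GB would take at
the two rated speeds. It assumes 1 GB = 1024 MB, and the function names
are just illustrative.

```python
# Back-of-the-envelope throughput check for the figures quoted above.
# Assumption: 1 GB = 1024 MB (binary units).

def throughput_mb_per_sec(size_gb: float, hours: float) -> float:
    """Average transfer rate in MB/sec for size_gb moved in the given hours."""
    return size_gb * 1024 / (hours * 3600)


def hours_at_rate(size_gb: float, mb_per_sec: float) -> float:
    """Hours needed to move size_gb at a sustained mb_per_sec."""
    return size_gb * 1024 / mb_per_sec / 3600


full_backup = throughput_mb_per_sec(137, 23)
print(f"full backup of 137 GB in 23 h: {full_backup:.2f} MB/sec")
print(f"137 GB at 15 MB/sec (L9 rated): {hours_at_rate(137, 15):.1f} hours")
print(f"137 GB at 3 MB/sec (L400 rated): {hours_at_rate(137, 3):.1f} hours")
```

In other words, the first full backup ran slower than even the L400's
rated speed.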

After speaking briefly with a Legato tech, I was told to do this:

    Please edit your /kernel/drv/st.conf file and add these three lines.
    
    tape-config-list=
    "HP Ultrium", "HP Ultrium", "LTO_Ultrium",
    LTO_Ultrium = 1,0x36,262144,0xd639,4,0x00,0x00,0x00,0x40,3;
    
    Save the file. This will set your LTO drive's block size to 256KB.
    
    Then...
    
    # cd /dev/rmt
    # rm *
    # drvconfig -i st (Or a reboot with boot -r)
    
    Verify the devices with ...
    # tapes
    # /etc/LGTOuscsi/inquire

After figuring out the hard way that the tape-config-list should
end with a semi-colon (thanks Legato), the backup performance was
much better: about 13.6 MB/sec.
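
For reference, the fields in the LTO_Ultrium property quoted above can
be split out as in the sketch below. The field order (version, type,
block size, options, number of densities, density codes, default
density index) is my reading of the version-1 layout in the st(7D) man
page, so treat the labels as an assumption; parse_st_property is a
made-up helper name.

```python
# Split a version-1 st.conf drive property into named fields.
# Assumption: field order is version, type, bsize, options, density count,
# density codes..., default density index (per my reading of st(7D)).

def parse_st_property(value: str) -> dict:
    """Parse the right-hand side of 'LTO_Ultrium = 1,0x36,...;'."""
    tokens = value.strip().rstrip(";").split(",")
    # int(tok, 0) accepts both decimal ("262144") and hex ("0xd639") tokens
    fields = [int(tok.strip(), 0) for tok in tokens]
    version, drive_type, bsize, options, n_dens = fields[:5]
    densities = fields[5:5 + n_dens]
    default_index = fields[5 + n_dens]
    return {
        "version": version,
        "type": drive_type,
        "bsize": bsize,
        "options": options,
        "densities": densities,
        "default_index": default_index,
    }


cfg = parse_st_property("1,0x36,262144,0xd639,4,0x00,0x00,0x00,0x40,3;")
print(f"type=0x{cfg['type']:x} bsize={cfg['bsize']} "
      f"({cfg['bsize'] // 1024} KB) options=0x{cfg['options']:x}")
print("densities:", [hex(d) for d in cfg["densities"]],
      "default index:", cfg["default_index"])
```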

Then I tried recovering some of this data (the 2.5 GB directory mentioned
before). Including the initial seek time, the throughput on the recover
was about 4 MB/sec. Eyeballing the recover progress after the initial
seek, it worked out to between 5 and 6 MB/sec.
I expect the recover to be slower than backup since the data
isn't necessarily contiguous on the tape, but this still seems
slow compared to the (uncompressed) 15MB/sec figure.

So I looked at this:

  http://www.sunmanagers.org/pipermail/sunmanagers/2002-March/011480.html

and changed the blocksize in the LTO_Ultrium property above from
262144 to 0 (variable block size). I haven't done a full backup
and restore with Networker using the new settings yet, but
the tar extraction is still slow, about 5 MB/sec.

I also tried doing a dd of a big file (actually a tar of the
2.5 GB directory) directly to the tape. If I use the default
ibs and obs (512 bytes), speed is about 3.35 MB/sec (what
you'd expect with such a small block size). If I set obs=256K,
I get about 9 MB/sec. Going in the other direction, if
I dd that from the tape to disk (now with ibs=256K), I get
about 5.9 MB/sec throughput.
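
The dd measurements above could be approximated with something like the
following sketch, which copies a file in fixed-size blocks and reports
MB/sec. It is not what I actually ran (that was dd); timed_copy is a
hypothetical helper, and the demo at the bottom only copies a temporary
1 MB file so that it can be run anywhere.

```python
import os
import tempfile
import time


def timed_copy(src_path, dst_path, block_size=256 * 1024):
    """Copy src to dst in block_size chunks; return (bytes_copied, MB/sec)."""
    copied = 0
    start = time.monotonic()
    # buffering=0: each read()/write() is issued as one system call, so the
    # block size requested here is the block size the device sees.
    with open(src_path, "rb", buffering=0) as src, \
            open(dst_path, "wb", buffering=0) as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            written = dst.write(block)
            if written != len(block):
                raise OSError(f"short write: {written} of {len(block)} bytes")
            copied += len(block)
    elapsed = time.monotonic() - start
    rate = copied / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
    return copied, rate


# Demo on a throwaway 1 MB file (no tape drive needed).
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "src.bin")
    dst_file = os.path.join(tmp, "dst.bin")
    with open(src_file, "wb") as f:
        f.write(os.urandom(1024 * 1024))
    copied, rate = timed_copy(src_file, dst_file)
    same = open(src_file, "rb").read() == open(dst_file, "rb").read()
    print(f"copied {copied} bytes at {rate:.1f} MB/sec, identical={same}")
```

Against the real drive, the destination would be /dev/rmt/2cbn for the
write test, or the source for the read test.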

Another conundrum: I wanted to be sure that the st.conf changes
were being picked up, so I wrote a program to do the MTIOCGETDRIVETYPE
ioctl. But whether I have the blocksize set to 262144 or 0, I still
get 0 reported for the block size. Also, the device options and
the density don't match what is specified in the st.conf:

    Drive information:
    name = 'HP Ultrium'
    vendor/id = 'HP Ultrium'
    drive type = 0x36
    block size = 0
    options = 0xf639
    density 0: 0x0
    density 1: 0x0
    density 2: 0x0
    density 3: 0x40
    default density: 0x18

I think the st.conf change *is* getting picked up, though, because
I changed the "pretty print" value in the tape-config-list to
"HPOUltrium", and it was reflected by the drive-type ioctl.
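
To make the mismatch concrete, the sketch below compares the st.conf
values against what the ioctl reported. The last check is only a guess
on my part: it tests whether the reported default density (0x18) is the
configured index (3) shifted left by three bits, the way the
MT_DENSITY bits are laid out in sys/mtio.h. I have not confirmed that
the driver stores it this way.

```python
# Values from the st.conf entry vs. the MTIOCGETDRIVETYPE output above.
configured = {
    "type": 0x36,
    "bsize": 262144,
    "options": 0xd639,
    "densities": [0x00, 0x00, 0x00, 0x40],
    "default_index": 3,
}
reported = {
    "type": 0x36,
    "bsize": 0,
    "options": 0xf639,
    "densities": [0x00, 0x00, 0x00, 0x40],
    "default_density": 0x18,
}

# Which option bits differ between the two?
diff_bits = configured["options"] ^ reported["options"]
differing = [bit for bit in range(32) if (diff_bits >> bit) & 1]
print(f"options differ by 0x{diff_bits:x} (bit positions {differing})")

print("densities match:", configured["densities"] == reported["densities"])
print("block size matches:", configured["bsize"] == reported["bsize"])

# Guess (unconfirmed): default density is reported as index << 3.
shifted = configured["default_index"] << 3
print(f"index 3 << 3 = 0x{shifted:x}, reported = "
      f"0x{reported['default_density']:x}")
```

So the options differ in exactly one bit, and the density mismatch may
be nothing more than an encoding difference.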

Finally: If I run /etc/LGTOuscsi/inquire, it reports all
the scsi devices, including the A1000's at the end, and then hangs.
Meanwhile, in the kernel log and on the console,
I get this never-ending error message:

 Jul 31 11:34:45 cannes.umd.edu unix: pseudo0: invalid op (11) from rdnexus1
 Jul 31 11:37:35 cannes.umd.edu last message repeated 16994 times
 Jul 31 11:37:35 cannes.umd.edu unix: pseudo0: invalid op (11) from rdnexus1
 Jul 31 11:44:15 cannes.umd.edu last message repeated 39979 times
 Jul 31 11:44:15 cannes.umd.edu unix: pseudo0: invalid op (11) from rdnexus1
 Jul 31 11:50:55 cannes.umd.edu last message repeated 39938 times
  
There's a similar problem reported in Solaris Bug ID 4171107, but
in that case <n> for rdnexus<n> increments, and it doesn't repeat like this.
So I try not to run inquire anymore. Other than this, there are no errors
in the logs.
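
For a sense of how fast that message is being generated, here is a
rough sketch that works out the rate from the "last message repeated"
lines quoted above. The year (2002) is taken from the date of this post
and is only needed to parse the timestamps.

```python
import re
from datetime import datetime

LOG = """\
Jul 31 11:34:45 cannes.umd.edu unix: pseudo0: invalid op (11) from rdnexus1
Jul 31 11:37:35 cannes.umd.edu last message repeated 16994 times
Jul 31 11:37:35 cannes.umd.edu unix: pseudo0: invalid op (11) from rdnexus1
Jul 31 11:44:15 cannes.umd.edu last message repeated 39979 times
Jul 31 11:44:15 cannes.umd.edu unix: pseudo0: invalid op (11) from rdnexus1
Jul 31 11:50:55 cannes.umd.edu last message repeated 39938 times
"""

rates = []
prev_time = None
for line in LOG.splitlines():
    # The first 15 characters are the syslog timestamp, e.g. "Jul 31 11:34:45".
    stamp = datetime.strptime("2002 " + line[:15], "%Y %b %d %H:%M:%S")
    match = re.search(r"last message repeated (\d+) times", line)
    if match and prev_time is not None:
        seconds = (stamp - prev_time).total_seconds()
        if seconds > 0:
            # Repeats logged since the previous line, per second.
            rates.append(int(match.group(1)) / seconds)
    prev_time = stamp

for rate in rates:
    print(f"{rate:.1f} messages/sec")
```

Each interval comes out at roughly 100 messages per second.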

Thanks to everyone who has read this far. Questions:

1. Do I have a performance problem? Or is 5MB/sec the best I can
   expect when recovering data from an LTO drive?

2. Should I remove the bad drive in the L400? Could it be causing
   problems even though it's on another scsi bus?

3. I know I should upgrade RM to 6.22.1 (maybe it would solve my invalid
   op error). But could this be causing the problem with the L9?

4. What's up with the ioctl not reporting the same data as st.conf?

5. The steps recommended by the Legato tech for resetting the st
   parameters only work if I do a boot -r, which I'd prefer not to do.
   What's the right way to get the st driver to reload st.conf? Do
   I need to do a rem_drv/add_drv before the drvconfig?

Thanks again.

-David

------------------------+--------------------------+
David Eisner | E-mail: cradle@umd.edu |
CALCE EPSC | Phone: 301-405-5341 |
University of Maryland | Fax: 301-314-9269 |
------------------------+--------------------------+
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


