SUMMARY: 130gig file, poor backup performance & high IOWaits

From: Tim Chipman (chipman@ecopiabio.com)
Date: Wed Jul 09 2003 - 10:49:55 EDT


Hi All,

Many thanks for the responses I've gotten (in no particular order) from:
Paul Roetman, Hichael Morton, Jay Lessert and also Vadim Carter (of
AC&NC, HWVendorCo for the Jetstor disk array). Please see the end of
this email for text of replies.

Bottom line // range of suggestions include:

-> There should *not* be an OS limitation / cache issue causing the
observed problem; folks report manipulating large (90+ gig) files
without seeing this type of problem.

-> In future, as a workaround, request that the DBA do dumps to "many small
files" rather than "one big file". This is apparently possible in
Oracle8 (although not as easy as it used to be in Ora7, I'm told?) and
is a decent workaround.

-> Possibly, depending on the data type of the tables being dumped, subsequent
(or inline... via named pipes) compression using gzip MAY result in
Oracle dump files that are smaller / more manageable. [Alas, in my case the
large table being dumped holds very dense binary data that compresses poorly.]

-> Confirm that small-file backup performance on the system is OK NOW (yes,
it was); that the filesystem isn't corrupt (it is a logging filesystem and
fsck'ed itself cleanly / quickly after yesterday AM's freeze-crash-reboot);
and that the large file isn't corrupt (believed OK since fsck was clean).

However, it gets "better". I did more extensive digging on google /
sunsolve using "cadp160" as the search term, since this was cited in a
message logged shortly before the system froze / hung yesterday AM (when
loading began to pick up Monday AM as users came in to work). What
I've learned is **VERY GRIM**, assuming I can believe it all. i.e.,

-> The cadp160 driver [the ultra160 SCSI kernel module driver] on Solaris x86
has a long history of being buggy & unreliable, especially at times of
significant load on the disk. This can result in such a fun range of
things as data corruption, terrible performance, freezes/reboots, etc.
There are entries in sunsolve which date back to 2000 and are as
recent as May 31, 2003 which are in keeping with these problems, including
such things as:

BugID: Description:
4481205 cadp160 : performance of cadp160 is very poor
4379142 cadp160: Solaris panics while running stress tests

-> There is a "rather interesting" posting I located via google which
appears to have been made by someone claiming to be the original
developer of a low-level SCSI driver module commonly used in Solaris,
one which is the basis of many other such drivers subsequently developed
(Bruce Adler; the driver is GLM). If this posting is true, it suggests that
Sun has known about this problem with cadp160 for quite a long time;
that it came about for absurd reasons; and that it is quite disgusting
that it remains unresolved. And .. IF this story is true, then it
certainly suggests that the cadp160 driver needs to be rewritten from
scratch, and that until this happens, it should **NEVER** be anywhere
near a production server. For anyone interested in the details, the
(long) posting / sordid tale is available at the URL,

http://archives.neohapsis.com/archives/openbsd/2002-02/0598.html

So. As a temporary workaround, I believe I'll add an entry to
/etc/system reading "exclude: drv/cadp160" - which should force the use
of the older (apparently more reliable) cadp driver - albeit at
non-ultra160 performance, but hopefully far more stable / less
buggy. After making this change I'll be doing some trivial tests (i.e.,
attempt to copy the 130gig file between slices; re-initiate the
netbackup job) and observing the performance and iowait loading. My
expected / hoped-for result is better performance / less iowait
loading.
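
For the record, here is roughly what I have in mind - a minimal sketch only,
assuming the "exclude:" directive behaves as documented and that a
reconfiguration reboot is enough to get the older cadp driver bound; the file
paths below are placeholders, not my real mount points:

   # 1) append the exclude line to /etc/system
   echo "exclude: drv/cadp160" >> /etc/system

   # 2) reconfiguration reboot so the driver change takes effect
   touch /reconfigure
   reboot

   # 3) confirm which driver actually got bound
   modinfo | grep -i cadp
   prtconf -D | grep -i cadp

   # 4) trivial test: copy the big file between slices & watch iowait while it runs
   cp /u02/oradump/big_export.dmp /u03/scratch/ &
   iostat -xn 5
   mpstat 5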

I hope this summary is of some use to others. In the unlikely event that
anyone from Sun reads this, I would encourage you to inquire
about when the cadp160 driver redevelopment will begin :-)

Thanks,

Tim Chipman

====paste====original text of replies======
...

Have you thought about compressing the file on the fly? With database
exports, you generally get over 90% compression.

Create a bunch of named pipes (eg file[0-20].dmp), and a bunch of gzip processes:
   mkfifo file0.dmp
   gzip < file0.dmp > file0.dmp.gz &

then export the database to the pipes...
   exp ... file=(file0.dmp, file1.dmp, .. ,file20.dmp) \
        filesize=2147483648 ....

that way you end up with a bunch of ~200 meg compressed files to back up,
and even if you do uncompress them, they are smaller than two gig.

I have a script that generates all this on the fly, and cleans up after
itself if you are interested.

note: import can be done straight from the compressed files using the
same pipe system!
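
For example, the import side might look something like this (a sketch only;
the pipe / file names are placeholders, and it assumes an imp that accepts the
same file=() list that exp does):

   # recreate the pipes and feed them from the compressed dumps
   mkfifo file0.dmp
   gunzip -c file0.dmp.gz > file0.dmp &

   # then point imp at the pipes as usual
   imp ... file=(file0.dmp, file1.dmp, .. ,file20.dmp) ...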

Cheers

----------------

Have you confirmed ~12MB/s *now* with a 10GB file in the same file
system as your 100+GB file?
...
Do you have any interesting non-default entries in /etc/system?

I've manipulated 90GB single files on SPARC Solaris 8 (on vxvm
RAID0+1 volumes) with no performance issues.

Are you positive the RAID5 volume is intact (no coincidental failed
subdisk)?

...
You *could* try bypassing the normal I/O buffering by backing it up
with ufsdump, which will happily do level 0's of subdirectories, if you
ask. Not very portable, of course.
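
A minimal sketch of both the throughput check and the ufsdump approach (the
paths and tape device below are placeholders):

   # rough read-throughput check: ~10GB test file in the same filesystem
   mkfile 10240m /u02/oradump/testfile
   time dd if=/u02/oradump/testfile of=/dev/null bs=1024k

   # level 0 ufsdump of just the subdirectory holding the export
   ufsdump 0f /dev/rmt/0n /u02/oradump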

------------------

...

The DBA should be able to split the dump into smaller files for backup.
...
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


