How to diagnose scsi/disk error on M80

From: Green, Simon (Simon.Green@EU.ALTRIA.COM)
Date: Mon Jul 14 2003 - 11:25:09 EDT


One of our M80s is giving me a bit of trouble.

Last week, one of the internal SCSI disks started to act up: hardware errors,
some LVM I/O errors and some stale partitions.
This was hdisk1, which was half of the mirrored rootvg, so no immediate
danger: I just dropped all the copies and removed it from rootvg,
preparatory to replacing it again.

I then realised that this was actually a replacement disk, installed the
week before after its predecessor exhibited similar problems.

Obviously I was concerned at that point that maybe it was actually a SCSI
problem, so I ran some diagnostics.

Diags on scsi1 showed no problems. Although the error log analysis showed a
disk error, a certify of the disk was clean.

After running several certifies on the disk without problems, I put it back
into rootvg and re-built the mirrors, to see what would happen. That was on
Wednesday.

For several days, nothing much happened. I had two block-relocations. Then
this afternoon it failed again: this time it even got declared missing!

The ELA shows a fault again, but certify says it's OK. Diags against scsi1
show no problems.

I'm running a format & certify of the disk now. What else can I do? I'm
particularly interested in how I might confirm that the SCSI adapter is OK.
(Please note: I'm in England and the server's in Switzerland.)
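
A rough way to see whether the logged errors point at the disk alone or also
at the adapter is to tally error-log entries by resource name. The following
is only a minimal sketch in Python, assuming the text comes from the `errpt`
summary listing (columns IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME,
DESCRIPTION). The sample entries are invented placeholders, not taken from
this server, and `tally_by_resource` is a hypothetical helper name.

```python
from collections import Counter

# Invented sample laid out like the `errpt` summary listing.
# Identifiers, timestamps and descriptions are placeholders, not real entries.
SAMPLE_ERRPT = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
AAAAAAAA   0714161503 P H hdisk1         PLACEHOLDER DISK ERROR
BBBBBBBB   0714161403 P H hdisk1         PLACEHOLDER DISK ERROR
CCCCCCCC   0714161303 U H LVDD           PLACEHOLDER LVM ERROR
AAAAAAAA   0709101203 T H hdisk1         PLACEHOLDER DISK ERROR
"""


def tally_by_resource(errpt_text):
    """Count error-log entries per (resource name, error class) pair."""
    counts = Counter()
    for line in errpt_text.splitlines():
        # The first five columns are single tokens; the description may contain spaces.
        fields = line.split(None, 5)
        # Skip blank lines, short lines and the header row.
        if len(fields) < 5 or fields[0] == "IDENTIFIER":
            continue
        _ident, _stamp, _type, err_class, resource = fields[:5]
        counts[(resource, err_class)] += 1
    return counts


if __name__ == "__main__":
    # On a real system the text would come from saved `errpt` output instead.
    for (resource, err_class), n in tally_by_resource(SAMPLE_ERRPT).most_common():
        print(f"{resource:<12} class={err_class} entries={n}")
```

If every hardware entry names hdisk1 and none names scsi1, that is suggestive
rather than conclusive: a marginal cable or backplane slot could still show up
only as disk errors.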

I suppose this _could_ just be another faulty disk, but I'd be a lot happier
if I could find something which showed me exactly what was broken.

Simon Green
Altria ITSC Europe Ltd
