[HPADM] Summary 10.20 - Filesystems increased beyond 128 GB ** UPDATE **

From: Julian Rogan (Julian.Rogan@Unilever.com)
Date: Thu Feb 05 2004 - 11:54:47 EST


Hi,
I had loads of good help and feedback on this issue:

From Bill Hassell:

Unfortunately, the 128 GB limit was never enforced until a recent patch.
The reason there is a 128 GB limit is that there is unstable code beyond
that limit that can cause corruption. I don't think it can ever be fixed
without days of work using fsdb.

On the further point of the disparity in sizes (as shown by bdf) between
the old, corrupted filesystem and the new copy, David Antoch pointed out
that the sum of the output of the du command actually equals the size of
the new filesystem, so the problem is really a corrupt superblock in the
original filesystem reporting the wrong capacity.
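
For reference, the cross-check is simply to compare the "used" figure that
bdf reads from the superblock with an independent walk of the tree. A quick
sketch (mount point assumed to be /oradbs):

bdf /oradbs        # "used" column comes from the superblock summary
du -sk /oradbs     # adds up allocated blocks by walking the directory tree

If the two disagree wildly, suspect superblock damage rather than missing data.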

Many thanks to the others who replied with advice.

I will add some of the replies here for completeness:

Craig Johnson:

The error 28 indicates a lack of space on the filesystem needed to increase
its size (catch-22). You shouldn't have run the fsck on the filesystem; you
should have simply removed something, rerun the fsadm, and you would have
been fine. I suspect the I/O errors were related to the application
complaining about the lack of space.
 
As far as fixing it goes, make sure it is unmounted, then run fsck -F vxfs
-y /dev/vg01/oradbs. See if that helps. I'm concerned about the message
"cannot perform log replay".

Jim Turner:

AFAIK, you'll need JFS 3.3 (instead of the factory-delivered JFS 3.1) to get
a filesystem bigger than 128 GB. I've done that on one of my 11.00 boxes,
and I presume it is available for 10.20. I believe JFS 3.3 is a free
upgrade.
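
To see which JFS release is actually installed before attempting this,
something like the following should work (a sketch; the exact product names
vary by bundle and OS release):

swlist -l product | grep -i -e jfs -e vxfs    # look for JFS / OnLineJFS 3.3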

Steve Hamilton:

I don't think that you can trust the 'bdf' output - it reads the superblock,
which invariably is all messed up at this point in time.

In my opinion, if the subdirs are similarly sized, then you should be OK...

Bill Hassell again on sparse files:

Really, really common circumstance, especially with databases. The files are
perfectly fine. When you copy something, it is read serially, something
that is seldom done in a database. Database files are created as 'sparse'
files. A simple example: create a new file with one 100-byte record. Then,
using lseek, write one more 100-byte record at record position 1,000,000
(a byte offset of 100,000,000). The total occupied space is 200 bytes. All
the in-between records are not defined. Unix is happy with that. So you
back up or copy the file and WOW, it changes from a 200-byte file into a
100-megabyte file (100-byte records x 1,000,000).

You see, when Unix reads the file, undefined records are still returned
as a string of zeros. So when you copy it, you create a stream of zeros
for the undefined records in the original file. Both files are *exactly*
the same as far as content. Checksums, record counts, they are exactly
the same.
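
You can reproduce the effect with dd on just about any Unix box (a sketch;
the file names are made up):

# write a single 100-byte record at record position 1,000,000
# (dd seeks over 1,000,000 records of 100 bytes, then writes one)
dd if=/dev/zero of=/tmp/sparse.demo bs=100 seek=1000000 count=1

ls -l /tmp/sparse.demo      # apparent size: roughly 100 MB
du -k /tmp/sparse.demo      # blocks actually allocated: a few KB

# a plain cp reads the holes back as zeros, so on most classic cp
# implementations the copy really does occupy the full ~100 MB
cp /tmp/sparse.demo /tmp/sparse.copy
du -k /tmp/sparse.copy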

thanks again.
Julian

-----Original Message-----
From: Julian Rogan [SMTP:Julian.Rogan@Unilever.com]
Sent: 04 February 2004 19:04
To: 'hpux-admin@dutchworks.nl'
Subject: RE: [HPADM] 10.20 - Filesystems increased beyond 128 GB ** UPDATE **

I have had a few very interesting replies, which I will summarize later.
In the meantime, I have had an interesting development.

After I copied the Oracle data from the read-only (corrupted) filesystem I
found I had approx 16 GB MORE data in the new
filesystem:

/dev/vg01/oradbs 163840000 63084053 94458701 40% /oradbs
/dev/vg01/oradbsnew 89620480 79731763 9270712 90% /oradbsnew

However if I use "du" to compare the disk usage of the subdirectories I
get the following:

du -sk /oradbs*/mnt/EKA
18221746 /oradbs/mnt/EKA
18161282 /oradbsnew/mnt/EKA

du -sk /oradbs*/mnt/EKATST
13915403 /oradbs/mnt/EKATST
13915395 /oradbsnew/mnt/EKATST

du -sk /oradbs*/mnt/RLINK
47383706 /oradbs/mnt/RLINK
47365634 /oradbsnew/mnt/RLINK

du -sk /oradbs*/mnt/RLINKDEV
266321 /oradbs/mnt/RLINKDEV
266321 /oradbsnew/mnt/RLINKDEV

i.e., du is showing LESS in the new filesystem than in the original. However,
as you can see, it is not 16 GB less.
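
Summing those du figures turned out to be the key: the old tree adds up to
roughly 79.8 million KB and the new one to roughly 79.7 million KB, which
matches the new filesystem's bdf "used" figure (79,731,763 KB) rather than
the old one (63,084,053 KB). A quick way to do the sum (assuming everything
lives under the mnt directories shown above):

du -sk /oradbs/mnt/* | awk '{sum += $1} END {print sum " KB"}'
du -sk /oradbsnew/mnt/* | awk '{sum += $1} END {print sum " KB"}'
bdf /oradbs /oradbsnew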

I unmounted and remounted the new filesystem, but there was no change.

I now don't know whether I can trust the copy.
Any advice out there?

thanks

Julian

-----Original Message-----
From: Julian Rogan [SMTP:Julian.Rogan@Unilever.com]
Sent: 04 February 2004 17:11
To: hpux-admin@dutchworks.nl
Subject: [HPADM] 10.20 - Filesystems increased beyond 128 GB

Hi,

I have just done the following (after forgetting about the 128 GB limit on
filesystems).

I increased a logical volume to 160 GB and then issued the fsadm command
to increase the filesystem, roughly as sketched below.
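
For context, the sequence was roughly the following (a sketch only - the
exact commands were not quoted, and the sizes assume 160 GB = 163840 MB =
163840000 one-KB sectors):

# extend the logical volume to 160 GB
lvextend -L 163840 /dev/vg01/oradbs

# grow the mounted VxFS filesystem to match (OnlineJFS, size in 1 KB sectors)
fsadm -F vxfs -b 163840000 /oradbs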

I got the message:

fsadm: /etc/default/fs is used for determining the file system type
fsadm: /dev/vg01/roradbs is currently 81920000 sectors - size will be
increased
fsadm: attempt to resize /dev/vg01/roradbs failed with errno 28

The filesystem started spewing out I/O errors and we lost access to the
applications. I tried an fsck and got the following output:

fsck -y -o full /dev/vg01/oradbs
fsck: /etc/default/fs is used for determining the file system type
intent log marked bad in super-block
cannot perform log replay
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
pass3 - checking reference counts
pass4 - checking resource maps
free block count incorrect 100755947 expected 84011499 fix? (ynq)y
free extent vector incorrect fix? (ynq)y
OK to clear log? (ynq)y
set state to CLEAN? (ynq)y

Mount still fails with the filesystem marked corrupt.
I have managed to mount the filesystem in read-only mode and am currently
copying the data over to a new filesystem.

So finally to my question: is there a way to fix the filesystem?

regards,
Julian
