rad_gh_regions on ES80

From: Joachim Jaeckel (tru64-unix-managers@jj-it.de)
Date: Fri Jun 18 2004 - 13:47:13 EDT


Dear Tru64 Managers,

We are trying to use rad_gh_regions on an ES80.

System config:
ES80, 6 CPU, 12 GB in 3 Drawers.
Tru64 TruCluster V5.1B PK3 NHD7 + aio Patch
Sybase database with nearly 6 GB shared memory

MBM> show memory
Cab  Drw  CPU  Memory Size
  0    0    0  2048MB
  0    0    1  2048MB
  0    1    0  2048MB
  0    1    1  2048MB
  0    2    0  2048MB
  0    2    1  2048MB

Total Physical Memory: 12288MB (12.000GB)

We tried the following sysconfigtab entries:

ipc:
        # sybase does not start if ssm_threshold is not zero
        ssm_threshold=0
vm:
        # new_wire_method=0 suggested by HP support
        new_wire_method=0
        rad_gh_region[0]=2048
        rad_gh_region[2]=2048
        rad_gh_region[4]=2048

The system crashes on boot:
Loading vmunix ...
Loading text at 0xffffffff00000000
Loading data at 0xffffffff01000000

Sizes:
text = 9859808
data = 2167616
bss = 4011920
Starting at 0xffffffff00013e20

Loading vmunix symbol table ... [2409832 bytes]
bcm: DEGXA driver V1.0.21 NUMA lanlog
GH value too large
Setting GH size to 1951Meg for RAD 0.
GH value too large
Setting GH size to 1983Meg for RAD 2.
GH value too large
Setting GH size to 1983Meg for RAD 4.
Alpha boot: available memory from 0x7dbde000 to 0x2480000000
Compaq Tru64 UNIX V5.1B (Rev. 2650); Wed Jun 9 17:27:50 CEST 2004
physical memory = 12288.00 megabytes.
available memory = 6098.89 megabytes.
using 24324 buffers containing 190.03 megabytes of memory

trap: invalid memory read access from kernel mode

    faulting virtual address: 0x00000000000000a8
    pc of faulting instruction: 0xffffffff0005c838
    ra contents at time of fault: 0xffffffff0005c714
    sp contents at time of fault: 0xffffffffffff7660

panic (cpu 0): kernel memory fault

To me, this means the system has adjusted the rad_gh_regions values
downward.
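
As a rough cross-check of that reading, the adjusted sizes can be totalled
straight from the boot messages. The following is only a minimal Python
sketch built from the figures quoted above; the function name
adjusted_gh_sizes and the variable names are made up for illustration and
are not part of any Tru64 tool.

```python
import re

# Boot messages copied from the console output quoted above.
BOOT_LOG = """\
GH value too large
Setting GH size to 1951Meg for RAD 0.
GH value too large
Setting GH size to 1983Meg for RAD 2.
GH value too large
Setting GH size to 1983Meg for RAD 4.
"""

# Values requested in sysconfigtab (MB per RAD), as listed above.
REQUESTED_MB = {0: 2048, 2: 2048, 4: 2048}

# Figures printed by the kernel during the same boot.
PHYSICAL_MB = 12288.00
AVAILABLE_MB = 6098.89


def adjusted_gh_sizes(log):
    """Return {rad: size_in_mb} parsed from 'Setting GH size' lines."""
    pattern = re.compile(r"Setting GH size to (\d+)Meg for RAD (\d+)\.")
    return {int(rad): int(size) for size, rad in pattern.findall(log)}


adjusted = adjusted_gh_sizes(BOOT_LOG)
for rad in sorted(adjusted):
    print(f"RAD {rad}: requested {REQUESTED_MB[rad]} MB, "
          f"kernel set {adjusted[rad]} MB")

total_requested = sum(REQUESTED_MB.values())
total_adjusted = sum(adjusted.values())
print(f"total requested: {total_requested} MB")
print(f"total adjusted:  {total_adjusted} MB")

# Memory not reported as available, and how much of that the GH regions
# account for. What the remainder is used for is not shown in the log.
not_available = PHYSICAL_MB - AVAILABLE_MB
print(f"not available:   {not_available:.2f} MB")
print(f"not GH:          {not_available - total_adjusted:.2f} MB")
```

With these figures the total drops from 6144 MB requested to 5917 MB
(about 5.78 GB). Whether that still covers the Sybase shared memory
depends on the exact segment size, which is given here only as
"nearly 6 GB".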

Q1: What values should we supply? Is there a pointer to documentation?
Q2: Why does the system crash?

To repair the sysconfigtab, we tried an interactive boot:
P00>>>boot -fl is
...
Enter: <kernel_name> [option_1 ... option_n]
  or: ls [name]['help'] or: 'quit' to return to console
Press Return to boot 'vmunix'
# vmunix vm:rad_gh_regions[0]=0 vm:rad_gh_regions[2]=0 vm:rad_gh_regions[4]=0
Loading vmunix ...
...
Loading vmunix symbol table ... [2409832 bytes]
bcm: DEGXA driver V1.0.21 NUMA lanlog
Kernel argument vm:rad_gh_regions[0]=0
Kernel argument vm:rad_gh_regions[2]=0
Kernel argument vm:rad_gh_regions[4]=0
GH value too large
Setting GH size to 1951Meg for RAD 0.
GH value too large
Setting GH size to 1983Meg for RAD 2.
GH value too large
Setting GH size to 1983Meg for RAD 4.
Alpha boot: available memory from 0x7dbde000 to 0x2480000000
Compaq Tru64 UNIX V5.1B (Rev. 2650); Wed Jun 9 17:27:50 CEST 2004
physical memory = 12288.00 megabytes.
available memory = 6098.89 megabytes.
using 24324 buffers containing 190.03 megabytes of memory

trap: invalid memory read access from kernel mode

Q3: We specified new values for rad_gh_regions on the vmunix command line.
The kernel reports that it sees these parameters, but then ignores them.
What could be going wrong?

We had to boot from CD to repair the sysconfigtab (boot -sc does not work
in a cluster).

The reason for this tuning attempt is a Sybase database that performs very
badly on this system. CPU system time is a factor of 10-20 higher than
CPU user time, and the system is heavily overloaded.
The same application works well on a much smaller ES40, but with an older
Tru64 version and an older Sybase version.

Q4: Any other hints? Both HP and Sybase have a case open, but there is no
result yet.

Greetings,

--
Joachim Jaeckel
System Consultant


This archive was generated by hypermail 2.1.7 : Sat Apr 12 2008 - 10:50:01 EDT