SUMMARY: Tru64 5.1b pk4 and big pages memory allocation + oracle

From: David J. DeWolfe (sxdjd@ts.sois.alaska.edu)
Date: Tue Feb 08 2005 - 13:25:52 EST

All;

I received 2 responses, one from Thomas Sjolshagen of HP Tru64 Unix
engineering and one from Rob Leadbeater. Thomas' response was:

>As of right now (PK4), BigPages are still "being improved" and we
>(still) recommend you stick with rad_gh_regions.

Thomas also pointed out that big pages, like rad_gh_regions and gh_chunks,
use Granularity Hints:

>A minor nit, really: BigPages too (as well as rad_gh_regions and
>gh_chunks) uses "Granularity Hints". GHs are a HW feature allowing a
>single entry in the Translation Lookaside Buffer (TLB) to represent
>regions of memory (up to 4MB) rather than the "traditional" 1 entry == a
>single 8K page, thus reducing the number of TLB updates required to
>address large memory configurations.
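
To put rough numbers on Thomas' point, here is a small sketch of the
arithmetic. The 8K page size and the 4MB maximum region size come from his
note; the 2G figure is the approximate SGA size from the original post
below. The function name is made up for illustration, and the calculation
assumes the whole region is mapped at a single granularity, which a real
system may not do.

```python
# Back-of-the-envelope TLB entry count, under the assumptions stated above.
PAGE_SIZE = 8 * 1024              # "traditional" 8K page
GH_REGION_SIZE = 4 * 1024 * 1024  # largest granularity-hint region (4MB)
SGA_SIZE = 2 * 1024 ** 3          # ~2G SGA, as in the original post


def tlb_entries(region_bytes, mapping_bytes):
    """Entries needed to map region_bytes in mapping_bytes units, rounded up."""
    return -(-region_bytes // mapping_bytes)


base_entries = tlb_entries(SGA_SIZE, PAGE_SIZE)
gh_entries = tlb_entries(SGA_SIZE, GH_REGION_SIZE)

print("8K pages:  ", base_entries, "entries")
print("4MB hints: ", gh_entries, "entries")
print("reduction: ", base_entries // gh_entries, "x")
```

Under these assumptions a 2G SGA needs 262144 entries with 8K pages but only
512 with 4MB regions, which is why any of the GH-based mechanisms can reduce
TLB pressure for a large SGA.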

Rob's response was:

>We've got a similar(ish) set up to yourself, and have had bad experiences
>of Big Pages...
>
>We're running a pair of ES47 Model 2's each with 2 CPUs and 8GB ram,
>running 5.1B PK4 with Oracle 9i and 10g RAC instances, connected to
>EVA5000 storage. Both nodes are clustered using dual memory channel.
>
>Following a performance review by HP, we were advised to set
>vm_bigpg_enabled = 1
>
>We duly did this only to have both machines panicking with the big page
>related errors, so vm_bigpg_enabled = 0 was restored very quickly.
>
>Unfortunately I didn't read the information here:
>http://h30097.www3.hp.com/unix/erp/BU040915_EW02.html
>
>until well after the event. That says essentially that it shouldn't happen
>on PK4 but they've been unable to find a cause. If it does happen get the
>crash dumps sent off to software support.
>
>I have been toying with the idea of turning big pages back on to see if we
>could get a crash dump to send to support but I doubt I'll get permission
>to do so with everything now being in production :-(
>
>I guess the bottom line is, if you can afford the potential for some
>downtime try it. You should find out pretty quickly if you're going to
>have problems...

Thanks very much to Thomas and Rob for their responses. Based on this
information we will not be implementing big pages on our GS1280 cluster. My
original post is included below.

Thanks again.

>All;
>
>We, the University of Alaska, are running several Oracle 9i databases on a
>pair of GS1280s which are each hardware partitioned into 2 identical
>nodes of 16G RAM and 8 1.1GHz CPUs each. All 4 nodes are clustered
>together via dual memory channel rails. The storage in use is an EVA 5000
>running VCS 3.010 (soon to be upgraded to 3.020) connected to the nodes
>via dual redundant fabrics. Our 2 largest production Oracle databases have
>SGAs of ~2G. We have several other pre-production databases running on
>the cluster which are much smaller.
>
>When we went live with this setup almost a year ago HP support indicated
>that big pages memory allocation was where we wanted to be, but at that
>time (Patch Kit 2 days) there were issues with big pages and they
>recommended that we use granularity hints. We did so, wiring 2G of memory
>on each node via setting rad_gh_regions[0] through rad_gh_regions[7] to
>512 in sysconfigtab.cluster. We also set gh_fail_if_no_mem = 0 to allow
>our databases to consume > 2G should the need arise. After applying PK3 we
>began looking at implementing big pages until we saw all the big page
>related fixes in the PK4 release notes, not to mention the summary at:
>
>http://www.ornl.gov/lists/mailing-lists/tru64-unix-managers/2004/03/msg00029.html
>
>which talks about PK3 breaking big pages. We have recently applied PK4 and
>are once again in the process of looking at implementing big pages. We've
>been rock solid using granularity hints and I must confess that we're a
>little bit gun shy regarding big pages. We have had big pages enabled on 1
>of our internal test clusters (4 DS10s) but the size of the test
>databases and load on the DS10s pale in comparison to the production
>GS1280 cluster.
>
>Other than hoping to learn from others' experiences, I do have 1 specific
>question. I presume that given our 4 node cluster we could implement big
>pages 1 node at a time which, as I see it, would give us the ability to
>run some pre-production databases using big pages. If all went well with
>that we could relocate one of our production databases to a node using big
>pages and if all did not go well with that we could relocate it back to
>one of the nodes using granularity hints. If all did go well, we'd just
>continue to implement big pages on the remaining nodes.
>
>I would be interested in hearing from anyone, particularly anyone running
>Oracle, who is running 5.1b and who has implemented big pages. Any advice,
>thoughts, concerns, do's, don'ts, etc. would be most welcome.
>
>TIA and I will summarize.
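
For anyone wanting to reproduce the granularity hints setup described in the
post above, the following sketch prints the kind of stanza involved. The
attribute names and values (rad_gh_regions[0] through [7] set to 512,
gh_fail_if_no_mem = 0) are taken from the post. The "vm:" stanza header, the
indentation, and the helper name gh_stanza are assumptions made for
illustration, so check them against your own sysconfigtab before using any
of it.

```python
def gh_stanza(num_rads=8, value_per_rad=512, fail_if_no_mem=0):
    """Build a sysconfigtab-style stanza for the settings in the post.

    The "vm:" header is an assumption; the post only names the attributes
    and their values, not the subsystem they live under.
    """
    lines = ["vm:"]
    for rad in range(num_rads):
        # One entry per RAD index, 0 through num_rads - 1.
        lines.append(f"    rad_gh_regions[{rad}] = {value_per_rad}")
    lines.append(f"    gh_fail_if_no_mem = {fail_if_no_mem}")
    return "\n".join(lines)


stanza = gh_stanza()
print(stanza)
```

The output is meant to be reviewed and merged by hand rather than written
straight into sysconfigtab.cluster.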

David
mailto:sxdjd@ts.sois.alaska.edu
