Stored Procedure Maintenance (sp_recompile, query trees and the like)

A common DBA maintenance task is to issue frequent "update statistics"
and "sp_recompile" commands on a particular server.  What does it mean
exactly to do this, and what is going on?

- Update Statistics : causes Sybase to re-initialize
the stored quantitative statistics about data that it keeps (distribution,
frequency of values, etc).  The optimizer uses these statistics to 
complile query plans and to select the proper index to use to return
data for queries.

- sp_recompile : causes any stored procedure or trigger 
to have its stored query plan dropped and recompiled the next time it
is executed.

Why are they run together?  After you update the statistics of a table,
the Optimizer might end up using a different Index strategy to resolve
queries.  Sybase, for speed and performance, stores a compiled version
of a stored procedure's SQL code and query plan instead of re-checking
SQL syntax, re-compiling and re-resolving query plans each time.  Thus,
if the data has changed, and if the best index strategy has changed,
and if the stored procedure isn't re-compiled...it won't take advantage
of the most efficient Optimizer choices.  Thus after updating statistics,
you MUST issue an sp_recompile on the table as well.  

---
Notes
- sp_recompile will result in a performance degredation after being run,
as stored procedures will need to be recompiled and their query strategies
re-analyzed, slowing the execution for the end user.  Also, If a large
scale update-stats/sp_recompile job is run, it can flush out the 
memory buffers (and their frequently-accessed data), causing additional
performance lags as normally-memory resident pages of data are re-retreived.

- Creating a new non-clustered index will automatically run update index
statistics for that particular index.  Creating a new clustered index will
not only re-create all non-clustered indexes, but will issue a full
update statistics on the table.

- stored procedures that depend on statistics for accuracy (sp_spaceused,
sp_ls, eg) will not be accurate unless statistics have been recently updated.

---
Improvements from 11.0.3
- ASE 11.5 introduces some new alternatives to the update statistics command;
x update partition statistics: if you have a partitioned table, the Optimizer
can issue parallel scans to process queries.  These statistics are used
to determine query strategies inre: parallel scanning.
x update all statistics; updates index and partition statistics (basically,
it runs a plain vanilla update statistics command if your table isn't
partitioned).  Avoids making two calls to update both index and partition
stats on a table.

- ASE 11.9.2 introduces these new features: 

x Statistics are now kept on a per-column basis, not a per-index basis

x update statistics: can now be passed a column name to just update
stats per column (defaults to every indexed column in the table).  To
update statistics a particular column, you must manually specify each 
column name.  To run stats on *every* column, use update all statistics.
***If you just run update statistics, you will only update stats for
the Leading column in the index!!  This is a huge change from 11.5 and
below, and could be causing all sorts of performance problems in locations
that did not modify their maintenance scripts after upgrading.

x update index stats; passed an index name or a table name; updates all
columns that are used in an index, or all indexed columns.

x update all statistics: change in functionality; now this command creates
Optimizer stats for *every* column in the table, as well as issuing an
update partition statistics command (if the table is partitioned).

x delete statistics; used to force the statistics to be deleted, as is
necessitated by the dropping of an index.  Also used to delete statistics
for specific columns (you must then run update stats of some kind to 
restore/recreate them)
x optdiag; displays statistics from the systabstats and sysstatistics
system tables, which store Optimizer statistics.  Gives incredibly detailed
information for troubleshooting advanced problems, and gives the ability
to manually override statistics information.  (See Sybooks 11.9.2 New
Functionality manual for information).

x reorg command: with the advent of RLL, the efficiencies of page storage
are eliminated.  Reorg helps reclaim unused/wasted space by reorganizing
the table.  This is important to this topic because of the "reorg rebuild"
command.  Reorg Rebuild reclaims unused page space, rewrites the rows
according to the clustered index, and drops/recreates all indexes on a 
table.

- when upgrading to 11.9.2, it is recommended to issue a "delete statistics"
command and then re-creating indexes.  This is because one particular 
statistic (specifically, the number of "steps" for the histogram) is kept
going forward as a default value, and is not ever overwritten by the issuance
of normal, maintenance intended "update statistics" commands.

---
Query Tree Growth

In older versions of Sybase 4.9.2, the use of sp_recompile was subject 
to some caution, because the query trees of stored procedures grew
with each recompilation.  This was a problem, because the query plan's
size (and the query tree's size) was limited to 64K.  So, 
after a few recompiles the query plan could grow so large that the proc 
wouldn't recompile (and thus wouldn't execute).  This is Error #703.

(note; i've seen in the literature that this limit was 64 PAGES, not 64K.)

What causes a Stored Procedure to grow? 
x drop/recreate table SP based on
x sp_recompile table
x execute the proc w/ recompile
x you bind a new rule/default to the table
x reload the database the table/SP are in.

With the advent of Sybase 10, the 64K limit was lifted, and with Sybase 
11, the problems associated w/ tree growth were mostly eliminated.  The only
limiting factor with stored procedures (and any object which has code
stored in the syscomments table like triggers, rules, defaults, views
and constraints) is a 16mb limit on the size of the text block that
comprises the user-defined object code.  Query trees will still grow 
slightly, but should never cause problems.  At an 
extreme level, the proc will grow so large as to fill up
the procedure cache; The only resolution is to drop and recreate the
proc.  You'll see serious performance degradation as you reach upper
limits of the procedure cache, and you'll see Error #701 when you've
reached the limit.  (Procedure cache %, amounts and usage can be
seen with sp_configure and sp_sysmon).

---
Re-compilation versus Re-Resolution
- Re-compilation is a physically ordered command; sp_recompile, exec
proc with recompile, or create proc with recompile options.
- Re-resolution is the re-creating of a query plan due to an underlying
table change (which is major enough to make the Optimizer believe the
existing query plan is invalid).  Database loads, table recreates, and
rule/default unbindings/rebindings are the main causes.

Both result in query tree growth, and thus query plan growth.

---
Doing update statistics/sp_recompile maintenance intelligently

- Its relatively easy to write a script that will iteratively issue
these two commands for each table in a database (in fact, see the Nightly
section for a perl script example).  However, a better way is to analyze
the tables and selectively choose which tables have the most data
activity on a day-to-day basis, and only update those stats nightly.
Other tables can wait for a weekly maintenance job (or longer, if they're
incredibly static tables).  

If you're on 11.9.2, you have much more flexibility in statistics now.
I'd suggest a schedule like this:

initially: update all statistics on all tables (this might take a while)
nightly: update index stats on selected (frequently changing) indexes
weekly: update all stats on selected (frequently changing) tables
 and update index stats on more slower changing tables
monthly/quarterly: update all stats on more slower changing tables, consider
 running reorg rebuilds, or actually delete stats and update all stats again.

---
How big are my query trees?

execute this code; it will give page counts for each proc.  Individual
query results won't help much; analyze growth trends over a period of
a few days/weeks (depending on server activity) to see if a proc warrants
dropping/recreating to clean up its query tree.

SELECT  ObjectName = object_name(id), 
        PageCount = (count(*)/8) + 1
FROM    sysprocedures 
GROUP BY object_name(id) 
ORDER BY object_name(id) 

(thanks to Teresa Larsen, posted to Sybase-L 10/31/00)