You are here

Life in the Fast Lane

This article describes how to keep your Geodatabase performing well.  We will focus on Oracle for this article, but most of the concepts are also applicable to SQL Server.  Also, I am not going to give an overview of Versioning, that has been done ad nauseum.  For that, check out SSP's great series Versioning for Dummies or Esri's article on the same topic.  You should be familiar with the basic concepts of versioning before jumping into this deep discussion. 

States, Versions and Lineages

The first point to make about keeping a geodatabase performing well is that the number of states (and state_lineages) should be kept to a minimum.  That is, the number of records in the SDE.STATES table should be as close as possible to the number of records in the SDE.VERSIONS table.


SQL> select count(*) from sde.versions;

  COUNT(*)
----------
        23

SQL> select count(*) from sde.states;

  COUNT(*)
----------
        28

SQL> select count(*) from sde.state_lineages;

  COUNT(*)
----------
       108

The secret to keeping the number of STATES and STATE_LINEAGES records low is to RECONCILE, RECONCILE, and RECONCILE.


Recommended Reconcile Order

Reconciling is the process of merging edits from one version (source version) into another (target version).  As users edit a Geodatabase, edit change creates a new STATE, and the STATE_LINEAGES defines the list of those changes.  In order to keep the number of STATES and STATE_LINEAGES small, you must reconcile any edits posted to Default back into all the other versions. 

It's important to note that the versions should be reconciled in the correct order.

     See appendix 1 for VBA code to get Recommended Reconcile Order (pre-10.1)

Any conflicts that arise during the reconcile process should be resolved immediately, before the next version is reconciled.

At ArcGIS 10.1, the ArcPy module has a ReconcileVersions method that can automate the reconcile process while using the Recommended Reconcile Order.

     See appendix 2 for Python code to Batch Reconcile Versions (10.1)

Nightly Tuning (Compress, Indexes and Statistics)

DBAs are creatures of the night, and they LOVE down time on their servers, so they can get some  tuning accomplished while they have exclusive access.  Spatial databases are no exception, and there are some basic tuning tasks that your Geodatabase should go thru each night.


1) Compress – A compress is an operation that performs two key tasks.  First, it removes unreferenced states (it keeps the SDE.STATE_LINEAGES table small), while also removing records in the Delta tables.  Second, it moves any records that it can from the Delta tables up to the Base tables.


There are several ways to run a compress.  ArcCatalog has an interactive tool for running the compress, but this is not ideal because it doesn't allow the compress to run as a scheduled task.  Therefore, the recommended approach for installations prior to ArcGIS 10.1 is to run the ArcSDE command line:

sdeversion -o compress -u sde -p <password> -i esri_sde -N


This command can easily be placed in a batch file or shell script file and run as a scheduled task.


2) Indexes – Running that nightly compress helps to keep row counts low in the database.  Typically, a database can see 10% or more of their rows moving during a compress operation.  This wreaks havoc on the indexes for those tables.  So typically after running a compress, a DBA should look for indexes that need to be rebuilt.  Most GIS customers have found that rebuilding indexes at least weekly will help keep their database at peak performance.


Note:  There has been much speculation by the DBA community as to whether or not indexes should be rebuilt on a schedule, and that debate is beyond the scope of this article. For more detailed insight to the index rebuilding discussion, please visit Oracle's Support site and look for Doc ID: 989093.1.


3) Statistics – Once the compress has been performed, and any possible indexes rebuilt, the most important step in maintaining database performance is to analyze the schemas.  When Oracle is asked to perform a query, the Optimizer tries to determine the most efficient way to respond to that query. 

In order to accomplish that, the Optimizer needs some information about the tables and indexes involved in that query:  row count, row length, index levels and leaf blocks, etc.  These are the types of statistics that need to be refreshed after all the nightly tuning operations are complete.  As of Oracle 10.2, Oracle has an automatic job (GATHER_STATS_JOB) that will gather statistics for schema objects if those statistics are either missing or stale (out of date).


The following PL/SQL code block is used to perform steps 2 and 3 above (rebuilding all indexes and updating DBMS statistics):


declare
    CURSOR owner_cur IS
        SELECT DISTINCT(OWNER) OWNER from SDE.TABLE_REGISTRY ORDER BY OWNER;

    CURSOR index_cur IS
        SELECT owner, index_name FROM dba_indexes
            WHERE owner in (select distinct(owner) from sde.table_registry)
        and INDEX_TYPE = 'NORMAL' ORDER BY owner, index_name;

    SQL_STMT VARCHAR2(200);

begin
    FOR IndexRec in index_cur LOOP
        SQL_STMT := 'alter index ' || IndexRec.owner || '.' ||
                   IndexRec.index_name || ' rebuild nologging';
        EXECUTE IMMEDIATE SQL_STMT;
    END LOOP;

    FOR OwnerRec in owner_cur LOOP
        dbms_stats.gather_schema_stats (OwnerRec.owner, CASCADE=>TRUE);
    END LOOP;
end;
/


Oracle Tracing

Even when the above performance tuning recommendations are followed, there are still occasional user  complaints that surface.  When they do, the DBA should be prepared to troubleshoot!


Oracle provides the ability to capture all the SQL queries, how long it took to respond to those queries, and how it answered the query.  This is called an Oracle Trace (or more specifically, an Oracle 10046 trace event).  This is the single most useful tool a DBA has when trying to solve performance issues.


There are several methods of capturing an Oracle trace file.  We are going to focus this article from a DBA perspective, so we will use only Oracle tools to capture and analyze the trace. 


Note: It is VERY important that the trace file be properly scoped.  We don't want to trace EVERYTHING the user does, we only want to trace a small portion of their activity that they consider “slow”.  For example, users typically claim that the following types of activities are slow:  “opening a specific stored display”, or “adding a transformer”.  So, before we start tracing the user activity, the user should have their session ready up to the point of starting that activity.  This is a coordinated effort between user and DBA.


Once the user is ready to start their activity, they alert the DBA.  The user should not start their activity until the DBA gives them a green light.  The DBA needs time to find their session and begin the trace.  So in our example, user “Jeff” has complained that drawing service points is slow.  The user opens his stored display, and gets ready to draw Service Points, then tells the DBA to trace his session.  The DBA can use the following to start the Oracle trace:

Step 1)  Find the Process ID (or PID) of our session:

SQL> select pid from v$process where addr = (select paddr from
v$session where username = 'JEFF');

       PID
----------
        70


Step 2) Use OraDebug to attach to the session:

SQL> oradebug setorapid 70
Oracle pid: 70, Windows thread id: 9784, image: ORACLE.EXE (SHAD)


Step 3) Use OraDebug to start tracing:

SQL> oradebug event 10046 trace name context forever, level 12;
Statement processed.


Step 4)  The user performs the activity, and immediately tells the DBA when it's complete.  Upon completion of the activity, the DBA will use the following commands to stop the Oracle trace:

SQL> oradebug event 10046 trace name context off;
Statement processed.


Step 5) Locating the trace file

SQL> oradebug tracefile_name
C:\ORACLE\diag\rdbms\gisprod\gisprod\trace\gisprod_ora_9784_JEFF.trc



Once we know the trace file name, we can use Oracle's TKPROF utility to format the trace file into more human-readable text.  We can also use TKPROF to sort the output, and/or generate an explain plan.  Here is one example of using TKPROF on the tracefile:

tkprof gisprod_ora_9784_JEFF.trc jeff.txt sort=fchela explain=sde/sde@gisprod

This will create an ouptut textfile called “jeff.txt”.  The output is sorted by Elapsed Time to Fetch each statement, and will generate an explain plan to show us how Oracle is solving each query.  Continuing with our example, here is the first SQL statement in our output file:

SELECT  /*+  USE_NL(V__2108)  INDEX(SHAPE F1001_UK1) */  SHAPE,  SubtypeCD  ,
  V__2108.eminx,V__2108.eminy,V__2108.emaxx,V__2108.emaxy ,SHAPE.fid,
  SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid 
FROM
 (SELECT /*+ LEADING */ b.SHAPE,b.SUBTYPECD  ,S_.sp_fid,S_.eminx,S_.eminy,
  S_.emaxx,S_.emaxy  FROM (SELECT  /*+ INDEX(SP_ S1001_IX1) */ DISTINCT
  sp_fid, eminx, eminy, emaxx, emaxy FROM ARCFM.S1001 SP_  WHERE SP_.gx >= :1
  AND SP_.gx <= :2 AND SP_.gy >= :3 AND SP_.gy <= :4 AND SP_.eminx <= :5 AND
  SP_.eminy <= :6 AND SP_.emaxx >= :7 AND SP_.emaxy >= :8)S_,
  ARCFM.ServicePoint b WHERE S_.SP_FID = b.SHAPE AND b.OBJECTID NOT IN
  (SELECT /*+ HASH_AJ */ SDE_DELETES_ROW_ID FROM ARCFM.D2108 WHERE DELETED_AT
  IN (SELECT l.lineage_id FROM SDE.state_lineages l WHERE l.lineage_name =
  :source_lineage_name AND l.lineage_id <= :source_state_id) AND SDE_STATE_ID
  = :"SYS_B_0") UNION ALL SELECT /*+ LEADING */ a.SHAPE,a.SUBTYPECD  ,
  S_.sp_fid,S_.eminx,S_.eminy,S_.emaxx,S_.emaxy  FROM (SELECT  /*+ INDEX(SP_
  S1001_IX1) */ DISTINCT sp_fid, eminx, eminy, emaxx, emaxy FROM ARCFM.S1001
  SP_  WHERE SP_.gx >= :1 AND SP_.gx <= :2 AND SP_.gy >= :3 AND SP_.gy <= :4
  AND SP_.eminx <= :5 AND SP_.eminy <= :6 AND SP_.emaxx >= :7 AND SP_.emaxy >=
   :8)S_, ARCFM.A2108 a,SDE.state_lineages SL WHERE S_.SP_FID = a.SHAPE AND
  (a.OBJECTID, a.SDE_STATE_ID) NOT IN (SELECT /*+ HASH_AJ */
  SDE_DELETES_ROW_ID,SDE_STATE_ID  FROM ARCFM.D2108 WHERE DELETED_AT IN
  (SELECT l.lineage_id FROM SDE.state_lineages l WHERE l.lineage_name =
  :source_lineage_name AND l.lineage_id <= :source_state_id) AND SDE_STATE_ID
  > :"SYS_B_1") AND a.SDE_STATE_ID = SL.lineage_id AND SL.lineage_name =
  :source_lineage_name AND SL.lineage_id <= :source_state_id ) V__2108,
  ARCFM.F1001 SHAPE  WHERE V__2108.SHAPE = SHAPE.fid


callcountcpuelapseddiskquerycurrentrows
-------------------------------------------------------
Parse00.000.000000
Execute30.000.000000
Fetch1480.350.39079932014622
------------------------------------------------
total1510.350.40079932014622

There's some UGLY looking SQL in there, but the end result is that our database is fetching 14,622 rows (in this case, from ARCFM.ServicePoint feature class) in under ½ a second.  This is an example of a database performing well!  The only problem with this user operation is that the user is probably drawing too many ServicePoint records, and should set a proper scale dependency on the layer.


Appendix 1:  VBA code to generate the Recommended Reconcile Order

Public Sub RecOrder()
On Error GoTo EH:
    ' Get our Application object
    Dim pApp As IGxApplication
    Set pApp = ThisDocument.Parent
   
    ' Get our selected object.
    If Not TypeOf pApp.SelectedObject Is IGxObject Then
        MsgBox "Error 1.  Exiting."
        Exit Sub
    End If
   
    Dim pObj As IGxObject
    Set pObj = pApp.SelectedObject
   
    ' Check for a database connection.
    Dim pDB As IGxDatabase
    If Not TypeOf pObj Is IGxDatabase Then
        ' Check for a database feature class/dataset
        If Not TypeOf pObj.Parent Is IGxDatabase Then
            MsgBox "Please select an Oracle database connection and retry."
            Exit Sub
        Else
            Set pDB = pObj.Parent
        End If
    Else
        Set pDB = pObj
    End If
   
    ' QI for the workspace
    Dim pWorkSpace As IWorkspace
    Set pWorkSpace = pDB.Workspace
   
    ' Make sure it's a remote database.
    If pWorkSpace.Type <> esriRemoteDatabaseWorkspace Then
        MsgBox "Please select an Oracle spatial database connection and retry."
        Exit Sub
    End If

    Dim pVWS As IVersionedWorkspace2
    Set pVWS = pWorkSpace
   
    Dim pEnumVersion As IEnumVersionInfo
    Set pEnumVersion = pVWS.RecommendedReconcileOrder
   
    Dim pVI As IVersionInfo
    pEnumVersion.Reset
    Set pVI = pEnumVersion.Next
   
    Do Until pVI Is Nothing
        Debug.Print pVI.VersionName
        Set pVI = pEnumVersion.Next
    Loop
  
    Exit Sub
EH:
    MsgBox Err.Description
End Sub




Appendix 2:  Python Code for batch Reconciling versions (requires ArcGIS 10.1 or later)

import arcpy
try:
    # set a variable for the workspace
    workspace = 'c:\\connections\\gisprod_sde.sde'

    # get a list of connected users.
    userList = arcpy.ListUsers(workspace)
    print userList

    #disconnect all users from the database.
    arcpy.DisconnectUser(workspace, "ALL")

    # set the workspace
    arcpy.env.workspace = workspace

    #block new connections to the database.
    arcpy.AcceptConnections(workspace, False)

    # Get a list of versions to pass into the ReconcileVersions tool.
    versionList = arcpy.ListVersions(workspace)

    # Execute the ReconcileVersions tool.
    arcpy.ReconcileVersions_management(workspace, "ALL_VERSIONS", "sde.DEFAULT", versionList, "LOCK_ACQUIRED", "NO_ABORT", "BY_OBJECT", "FAVOR_TARGET_VERSION", "POST", "DELETE_VERSION", "c:/temp/reconcilelog.txt")

    # Run the compress tool.
    arcpy.Compress_management(workspace)

    #Allow the database to begin accepting connections again
    arcpy.AcceptConnections(workspace, True)

except Exception, e:
    print e
    print arcpy.getmessages()
    arcpy.AcceptConnections(workspace, True)


Author Information

Add new comment