Confidence and Control: Managing SDE Versions

September 2, 2014 — SSP Innovations

The multi-user, versioned geodatabase is a powerful architecture and framework for geographic data storage and management.  The ArcGIS geodatabase is successfully deployed at hundreds, if not thousands, of Esri user sites across the globe.

For many organizations, there comes a point when SDE version management becomes overwhelming and seemingly uncontrollable. The administrator develops a growing unease, sometimes to the point of losing sleep to nightmares of “pinned state trees”.  End-users and managers start to grumble (“why is the system so slow and bulky?”), posting work becomes cumbersome, and the message is clear: “you’re falling behind!!”

This unwanted condition is typically marked by a large number of versions (hundreds, thousands, even tens of thousands) that accumulate and stew in the geodatabase, backlogged there by unresolved conflicts, lax geodatabase administration and various other causes, until one day someone realizes, “things aren’t right with this database”.

Many can relate to this almost universal rite-of-geodatabase-passage in one’s ArcGIS journey.  Oh sure, there are those high-achievers that always have their game on, that stay ahead of the version processing curve, and then there are those single-user database elitists (oh the envy), but for many of us mere mortals, there comes a time when the version-conflict-non-management tsunami is too much to bear – “if only I woulda, I shoulda!!”

Over the course of versioned geodatabase history, there have been advances in the core software, numerous tools and utilities developed and distributed, and an evolving set of best practices, all intended to give administrators a better grip on managing versions and conflict resolution.  This post highlights some recent work by SSP that complements this collective base of knowledge, experience and toolsets in the effort to bring confidence and control to SDE version management.

For further background and technical insight on these topics, please refer to the wide-ranging set of materials available from Esri and related SSP Posts such as Versioning for Dummies – Part One, Part Two, Part Three, Part Four, and Life in the Fast Lane.

SSP has always been interested in addressing the challenges of managing the geodatabase and recently had the opportunity to assist an organization confronted with a significant backlog and bottleneck of version and conflict resolution processing (up to 1,500 versions in the database).  To address the problem, SSP’s Nightly Batch Suite (“NBS”) was leveraged and expanded with new modules.

NBS is a programming framework and set of configurable applications built and maintained by SSP, effective at monitoring and administering ArcSDE geodatabases (please refer to the Nightly Batch Suite product page for further information).

For this particular effort, two NBS applications were developed: 1) one that mines the SDE instance to gather statistics on versions and reports this information for review and decision-making by the organization (which in turn defines the input parameters for the second application), and 2) one that processes versions based on those input parameters, reconciling each version, automatically resolving conflicts, and then submitting the version to the ArcFM™ Geodatabase Manager™ Posting Queue.

Let’s break this down into further detail and explanation.

 

#1) SDE Version Analysis and Reporting Application

A primary function of the Version Analysis and Reporting application is to query and gather current state and statistical properties of each version in the geodatabase.  The data items collected for each version are listed below:

  • Total Number of Recorded Edits – the total number of inserts, updates and deletes at the row level, giving the administrator the volume and type of changes made in the version.
  • Number of Edits by Table/Field – each table/field combination that has been edited will be counted. This count should be equal to or greater than the total record count above, because it is taken at the field level and a single edited record can touch more than one field.
  • Number of Edits by User – using the metadata fields “CreationUser/LastUser” from edited records, the processing will obtain counts of the record edits by user. This indicates which user performed the edits, which may drive decision-making based on confidence levels among various editors or editing processes.
  • Total Number of Conflicts – if a version is in conflict, a total number of conflicts will be provided.
  • Number of Conflicts by Table/Field – each record in conflict will be analyzed to determine the field(s) in conflict.  These counts will be summarized in the format [TableName].[FieldName] along with the count of conflicts in the version.
  • Creation Date – the creation date of the version, based on the Esri version creation date.
  • Version Age – the age (in days) of the version using the Esri creation date as basis for the calculation.
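
As a rough illustration of how the edit-related categories above relate to one another, here is a minimal Python sketch. It is not the NBS implementation and does not call any Esri API; the `Edit` record, the `version_stats` function and the sample table, field and user names are all hypothetical, and it assumes the row-level edits have already been extracted from the version.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Edit:
    """One row-level edit (hypothetical structure, for illustration only)."""
    table: str
    kind: str      # "insert", "update" or "delete"
    fields: tuple  # names of the fields touched by this edit
    user: str      # e.g. the LastUser value on the edited record


def version_stats(edits, created, today):
    """Summarize one version's edits into the report categories."""
    by_kind = Counter(e.kind for e in edits)
    # Field-level count: one edited row can contribute several entries.
    by_field = Counter(f"{e.table}.{f}" for e in edits for f in e.fields)
    by_user = Counter(e.user for e in edits)
    return {
        "total_edits": len(edits),
        "edits_by_kind": dict(by_kind),
        "edits_by_table_field": dict(by_field),
        "edits_by_user": dict(by_user),
        "creation_date": created.isoformat(),
        "age_days": (today - created).days,
    }


# Made-up sample data.
edits = [
    Edit("Transformer", "update", ("KVA", "PhaseDesignation"), "jsmith"),
    Edit("Transformer", "insert", ("KVA",), "jsmith"),
    Edit("ServicePoint", "delete", ("Status",), "batch_loader"),
]
stats = version_stats(edits, created=date(2014, 7, 1), today=date(2014, 9, 2))
```

In this sample the three row-level edits produce four table/field entries, which shows why the field-level count is never smaller than the row-level total as long as every edit touches at least one field.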

In addition to the version-level data categories listed above, the program will gather the following SDE instance-level statistics:

  • Total Number of Versions Processed – the total number of versions loaded by the application for processing.
  • Number of Versions Processed Successfully – the number of versions that were processed from start to finish without error. The number effectively indicates the number of versions where statistics were fully gathered.
  • Number of Versions with Errors – an indication and count of versions that error-out, for whatever reason, during processing and will need to be reviewed manually.
  • Number of Versions in Conflict – the total number of versions that were successfully processed that are currently in conflict.
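
The instance-level figures can be derived from the per-version results. The sketch below assumes each processed version yields a small dictionary with a `name`, an `error` (text, or `None` on success) and a `conflicts` count; that structure and the function name `summarize_run` are assumptions made for illustration, not the format NBS uses.

```python
def summarize_run(results):
    """Roll per-version results up into instance-level counts.

    results: list of dicts with 'name', 'error' (str or None) and
    'conflicts' (int). This shape is hypothetical.
    """
    succeeded = [r for r in results if r["error"] is None]
    return {
        "versions_processed": len(results),
        "processed_successfully": len(succeeded),
        "with_errors": len(results) - len(succeeded),
        # Only successfully processed versions are counted as in conflict.
        "in_conflict": sum(1 for r in succeeded if r["conflicts"] > 0),
    }


# Made-up sample data.
results = [
    {"name": "WO_1001", "error": None, "conflicts": 0},
    {"name": "WO_1002", "error": None, "conflicts": 4},
    {"name": "WO_1003", "error": "lost connection", "conflicts": 0},
]
summary = summarize_run(results)
```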

The other primary function of the Version Analysis and Reporting application is to provide the collected data and statistics in a report format – a file that can be persisted and used for further review and decision-making. An HTML file format was used, given the volume and nature of the data.  

The HTML format allowed for link-navigation within the file and the nesting of statistical information on a per-version basis, providing for the expansion and closing of the nested data elements.  Sample content from the reports is provided below.

[Image: Version List in HTML File]

[Image: HTML Sample Content 2]

[Image: HTML Sample Content 3]
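
One way to get in-file links plus expandable per-version sections is sketched below in Python. How the actual report produces its expand/collapse behavior is not described in this post; using the HTML `<details>` and `<summary>` elements is an assumption made here, and `render_report` and the sample version names are hypothetical.

```python
from html import escape


def render_report(versions):
    """Build a single HTML page: a linked version list, then one
    collapsible section per version.

    versions: list of dicts with 'name' and 'stats' (a flat dict).
    """
    toc = []
    sections = []
    for i, v in enumerate(versions):
        anchor = f"v{i}"
        name = escape(v["name"])
        # Entry in the version list that jumps to the version's section.
        toc.append(f'<li><a href="#{anchor}">{name}</a></li>')
        rows = "".join(
            f"<li>{escape(str(k))}: {escape(str(val))}</li>"
            for k, val in v["stats"].items()
        )
        # <details> gives expand/collapse without any script.
        sections.append(
            f'<details id="{anchor}"><summary>{name}</summary>'
            f"<ul>{rows}</ul></details>"
        )
    return (
        "<html><body><ul>" + "".join(toc) + "</ul>"
        + "".join(sections) + "</body></html>"
    )


# Made-up sample data.
report_html = render_report([
    {"name": "SDE.WO_1001", "stats": {"total_edits": 3, "age_days": 63}},
    {"name": "SDE.WO_1002", "stats": {"total_edits": 0, "age_days": 45}},
])
```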

Of course, the purpose of the report is to use the collected data to make decisions on how to categorize and process (clean up) versions in a bulk manner.  The administrator will use the report to identify patterns and criteria, and those decisions will drive the definition of a configuration file that will be used by the second NBS application, Version Conflict Resolution & Posting.

For example, the administrator will identify versions with no conflicts and edit statistics that provide confidence that the version (and many others like it) can be targeted for automatic posting.

Another category is versions whose conflicts can be resolved wholesale in favor of either the parent or the child version. The rule might key off a particular data element such as Date Modified, favor trusted user XYZ over other user sessions, or give mass update process X priority over the edits in conflict.

The report alone has proven to be of great value: version information presented with the data elements listed above resonates with the geodatabase admin team, and SDE versions stand out for immediate action (e.g. the oodles of versions with zero edits, zero conflicts and an age of 40-plus days = delete).
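
The kind of bulk categorization described above can be expressed as a few simple rules. The Python sketch below is only an illustration: the `triage` function and its category labels are hypothetical, the 40-day threshold is taken from the example in the text, and a real organization would tune both the rules and the threshold to its own standards.

```python
def triage(total_edits, total_conflicts, age_days, stale_days=40):
    """Assign a version to a processing bucket (hypothetical labels)."""
    if total_edits == 0 and total_conflicts == 0:
        # Empty versions: delete once they are old enough, otherwise leave.
        return "delete" if age_days >= stale_days else "keep"
    if total_conflicts == 0:
        # Edits but no conflicts: candidate for automatic posting.
        return "post"
    # Conflicts present: needs automatic or manual resolution first.
    return "resolve_conflicts"
```

For example, `triage(0, 0, 45)` falls into the delete bucket, while `triage(12, 0, 10)` is a posting candidate and `triage(12, 2, 10)` needs conflict resolution first.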

 

#2) Version Conflict Resolution & Posting Application

A set of configurations managed by the SDE administrator is used to run the Version Conflict Resolution and Posting application. The configuration determines the set of versions to process and how they should be processed, abiding by the business rules and standards of the organization.  Specific actions by the application include:

  • For a submitted list of version names that do not currently have conflicts, these will be submitted directly to the ArcFM™ GDBM posting queue.
  • For a submitted list of version names that have conflicts that should be automatically resolved, the application will do so and then optionally place them in the ArcFM™ GDBM posting queue.

For the processing path that entails automatic conflict resolution, a combination of rules specified in the application’s configuration file is used.  The SDE version-level parameters are used to automate conflict resolution decisions for a particular batch of versions that will be processed in like manner.

The administrator has complete control over how conflicts will be resolved; the NBS application will loop through all specified versions and will attempt a reconcile on each.

During the reconcile the conflicts can be interpreted at the field level. The underlying attributes (including geometry/shape) can be updated to either the child or the parent value based on configuration.  The conflict can then be effectively removed from the queue.  

The end result of the application process is for all conflicts to be resolved for the specified versions. Once processing reports that no conflicts are present, the version is released to be submitted for entry to the GDBM posting queue. 
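
To make the field-level idea concrete, here is a minimal Python sketch of resolving one conflicting row in favor of the parent or child value per field. It does not use ArcObjects, ArcFM or the NBS configuration format; `resolve_row`, the `RULES` dictionary and the sample rows are hypothetical, and the rule keys simply borrow the [TableName].[FieldName] convention from the report.

```python
def resolve_row(table, parent_row, child_row, conflict_fields, rules,
                default="parent"):
    """Return the resolved row and the side chosen for each field.

    Starts from the child's row so its non-conflicting edits are kept,
    then overwrites each conflicting field where the parent is favored.
    """
    merged = dict(child_row)
    decisions = {}
    for field in conflict_fields:
        side = rules.get(f"{table}.{field}", default)
        if side not in ("parent", "child"):
            raise ValueError(f"unknown side {side!r} for {table}.{field}")
        if side == "parent":
            merged[field] = parent_row[field]
        decisions[field] = side
    return merged, decisions


# Hypothetical rules: keep the child's KVA edit, keep the parent's geometry.
RULES = {"Transformer.KVA": "child", "Transformer.SHAPE": "parent"}

parent = {"KVA": 50, "SHAPE": "POINT (10 20)", "Owner": "Utility"}
child = {"KVA": 75, "SHAPE": "POINT (11 21)", "Owner": "Utility"}
merged, decisions = resolve_row(
    "Transformer", parent, child, ["KVA", "SHAPE"], RULES
)
```

Returning the per-field decisions alongside the resolved row is a design choice for this sketch: it leaves a record of what was decided automatically, which can be reviewed before the version is released to the posting queue.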

An example of the input configuration with settings to process three versions for automatic resolution of conflicts:

[Image: Config Sample Content]

The NBS applications described in this post are executables that can be scheduled to run in a batch/unattended mode. However, given external variables related to hardware, operating system, RDBMS, memory and/or other environment properties, the application should be monitored for unintended stoppage.
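
One simple way to watch for unintended stoppage is to launch the batch executable from a small wrapper that enforces a time limit and checks the exit code. The Python sketch below is a generic illustration using only the standard library; `run_monitored` is a hypothetical helper, and the command and time limit passed to it are placeholders rather than anything specific to NBS.

```python
import subprocess
import sys


def run_monitored(cmd, timeout_s):
    """Run a batch command and report 'ok', 'failed(<code>)' or 'timeout'."""
    try:
        # subprocess.run kills the child and raises if the limit is exceeded.
        done = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if done.returncode == 0 else f"failed({done.returncode})"


# Placeholder command: a trivial Python process stands in for the batch job.
status = run_monitored([sys.executable, "-c", "pass"], timeout_s=60)
```

A scheduler can call a wrapper like this and send a notification whenever the result is anything other than `ok`.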

The processing can be quite extensive, depending on the database, the number of versions and the condition of versions (number of edits, conflicts, etc.).  For further information or follow-on questions, please leave a comment or contact SSP. Thanks for reading!


2 comments

  • I was using GDBM to manage our small number of versions (no more than 5-10 a day) to reconcile them at night, until I discovered ESRI’s built in GDB management tools. I just built a couple of python scripts to kick out all stale users, reconcile, compress and rerun statistics on my DB every day. Performance seems to have skyrocketed since I have done this.. Now I wish I just could figure out how to run Orphan Version Cleanup as a scheduled task instead..

    And now I have thrown out GDB Manager.. I never liked how complicated it was, and how inconsistent it would be with running..
