Migrating Data into the Utility Network from PLS-CADD

January 13, 2020 — Shannon Smith

Project Overview

Migrating data into a Utility Network environment may seem like a challenging experience, but the process can be streamlined with Python. A recent project of mine involved taking data created in PLS-CADD projects and transferring it into an existing Utility Network implementation, importing new features and updating the features that were already in the GIS. The task can be daunting at first, but by using the ArcPy library and ArcGIS Server feature services, we can design a straightforward process that works for many data sources beyond PLS-CADD. This blog covers, at a high level, the steps we took to accomplish the task, along with observations and issues we encountered during development.

Preparation & Staging

The general gist of how this data is translated comes down to asking a few questions about how the data is formatted now and how it needs to be defined in the Utility Network. Which fields do we need to migrate, and what do they look like today? Is there any additional information we need to calculate or change during the import? Do we have any static, reliably unique fields we can use to easily match features?

Broadly speaking, having an end-goal definition of what the data will look like is a fundamental part of the design, simply because it defines the components of the migration that may need to change to fit the data we are working with. This is where we can revisit our data models, remove any legacy fields that are no longer maintained, and add new fields to features for future planning. A data-source matrix helps identify what our data will look like once it has been brought in and finalized.

For example, we might identify that the Electric Support Poles in our source data are Structure Junctions in the Utility Network. We then define that the "Structure Height" field in the Utility Network should be populated with data from the "Height" field in our source. Building out a map like this for each feature is instrumental in determining whether data was brought over correctly for QA to review. Microsoft Excel or any similar software works very well for planning this out.

Processing

Now that we have a plan for how we want our data to translate, we can get into the programming side of the task. The overall approach breaks down into four simple steps: import, reprocess, stage, and commit.

Step One – Import the Source Data to a Geodatabase

In this project, we used XML as the format for our source data. This is important to note because it means this pattern can be expanded and applied to any GIS data that can be exported as XML, not just PLS-CADD. The first major step is to import the data from the XML into a feature class in a geodatabase. In our project, this meant converting the XML tables into Python dictionaries and then translating them into standalone tables and feature classes.
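
To make this concrete, here's a minimal sketch of that import step. The XML element and field names below are placeholders for whatever your PLS-CADD export actually contains; the general pattern is to parse the XML into dictionaries and then load them into a staging feature class with an insert cursor.

```python
# A minimal sketch of Step One, assuming a hypothetical export where each
# <structure> element carries x/y coordinates and a few attribute values.
import xml.etree.ElementTree as ET
import arcpy

def import_structures(xml_path, gdb, fc_name="SourceStructures"):
    """Read structure records from the XML export into Python dictionaries,
    then load them into a staging feature class in a file geodatabase."""
    records = []
    for elem in ET.parse(xml_path).getroot().iter("structure"):   # element name is an assumption
        records.append({
            "source_id": elem.findtext("structure_number"),        # field names are assumptions
            "height": float(elem.findtext("height") or 0),
            "x": float(elem.findtext("x")),
            "y": float(elem.findtext("y")),
        })

    fc = arcpy.management.CreateFeatureclass(
        gdb, fc_name, "POINT",
        spatial_reference=arcpy.SpatialReference(4326))[0]
    arcpy.management.AddField(fc, "SOURCE_ID", "TEXT")
    arcpy.management.AddField(fc, "HEIGHT", "DOUBLE")

    with arcpy.da.InsertCursor(fc, ["SHAPE@XY", "SOURCE_ID", "HEIGHT"]) as cursor:
        for rec in records:
            cursor.insertRow(((rec["x"], rec["y"]), rec["source_id"], rec["height"]))
    return fc
```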

Step Two – Reprocess the Features

The second step is to reprocess this imported data according to the plan. In practice, this likely involves several scripts, one to appropriately handle each type of feature being imported. In our project, this consisted of one script to reprocess Electric Transmission Lines and another to handle Structure Junctions.

These scripts are where we reshape features to fit our expectations: culling unused fields by skipping them, adding new fields with specific domains, recalculating values for individual features, and so on. It is also the perfect opportunity to identify associations between features that need to be connected! While reprocessing the data to fit the Utility Network schema, we can also build standalone tables that record which features are associated and how, which we bring in later to generate our associations programmatically.
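
As an illustration, a reprocessing script can be as simple as an update cursor that walks the staging feature class and fills the Utility Network fields from the source fields. The field names below are hypothetical; they stand in for whatever your data-source matrix calls for.

```python
# A sketch of a per-feature-class reprocessing script. The mapping mirrors the
# data-source matrix from the planning phase; all field names are illustrative.
import arcpy

FIELD_MAP = {"HEIGHT": "structureheight"}   # source field -> Utility Network field (assumed names)

def reprocess_structures(staging_fc):
    existing = {f.name for f in arcpy.ListFields(staging_fc)}
    for un_field in FIELD_MAP.values():
        if un_field not in existing:
            arcpy.management.AddField(staging_fc, un_field, "DOUBLE")

    src_fields = list(FIELD_MAP.keys())
    un_fields = list(FIELD_MAP.values())
    with arcpy.da.UpdateCursor(staging_fc, src_fields + un_fields) as cursor:
        for row in cursor:
            for i in range(len(src_fields)):
                # Copy the source value across; a real script would also convert
                # units or translate values into domain codes here.
                row[len(src_fields) + i] = row[i]
            cursor.updateRow(row)
```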

What defines an association?
In ArcGIS Pro, we define associations using the interface controls available to us. This provides a user-friendly, intuitive way of isolating specific features and associating them together through attachment, containment, or connectivity associations. As a bonus, the Utility Network also checks these associations against a set of rules to ensure they are legitimate associations between valid features.

These associations are defined and stored in their own table, consisting of a few fields that reliably identify both features in a given association. Since we're building associations programmatically, we must be able to generate CSV rows that populate each of the following fields (see the sketch after this list):

  • (FROM/TO) (Defining two features per association)
    • NETWORKSOURCEID
    • GLOBALID
    • TERMINALID
  • ASSOCIATIONTYPE
  • ISCONTENTVISIBLE
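
Here's a rough sketch of how those rows can be generated with the standard csv library. The column headers below simply mirror the fields listed above; the exact names expected by the import are worth confirming against your ArcGIS release.

```python
# A sketch of building the association rows programmatically. The values for each row
# come from the association tables built during reprocessing; the header names are
# assumptions based on the fields listed above.
import csv

ASSOCIATION_FIELDS = [
    "FROMNETWORKSOURCEID", "FROMGLOBALID", "FROMTERMINALID",
    "TONETWORKSOURCEID", "TOGLOBALID", "TOTERMINALID",
    "ASSOCIATIONTYPE", "ISCONTENTVISIBLE",
]

def write_associations(rows, csv_path):
    """rows: iterable of dicts keyed by ASSOCIATION_FIELDS."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=ASSOCIATION_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```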

The difference between programmatically generating these associations and manually building them is the set of barriers that prevent a user from attempting to create an invalid association. Since we're effectively stepping away from the geoprocessing tools and the Utility Network to create these associations, we do not have those barriers in place. As a result, we only find out whether we've properly generated our associations once we attempt to import them into the Utility Network. If any of the data is improperly formatted or represents an association incorrectly, the ImportAssociations geoprocessing tool will identify that issue and inform the user, rolling back any associations that would otherwise have been imported.

Step Three – Stage the New Features in a Version

The third step is to transfer the file geodatabase data into the published Utility Network and review it. In this project, we took an approach that explicitly uses the feature services provided by our ArcGIS Server implementation.

We opted to push the new features to a new, user-labeled version so that we could perform QAQC on the results without potentially compromising data integrity in the default version. To do this properly, we use the Version Management Server to request a list of all current versions, scan it to make sure the version name is unique, and create the new version if so. However, most of the ArcPy tools do not support targeting a specific version. Since we're creating a new version during the process, we store the new version's GUID and use the feature service endpoints available for each layer to perform our edits, since those endpoints expose a field for specifying the intended version by GUID.
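
A hedged sketch of that versioning step is below. The endpoint names and response structure follow my understanding of the Version Management Server REST API, so treat them as assumptions to verify against your server; token handling is also omitted.

```python
# A sketch of checking existing versions and creating a new one through the
# Version Management Server REST endpoints (URL shape and keys are assumptions).
import requests

def create_version(service_url, version_name, token):
    vms = f"{service_url}/VersionManagementServer"

    # Ask for the current versions and make sure our name is not already taken.
    existing = requests.post(f"{vms}/versions",
                             data={"f": "json", "token": token}).json()
    names = [v.get("versionName", "") for v in existing.get("versions", [])]
    if any(name.split(".")[-1] == version_name for name in names):
        raise ValueError(f"Version {version_name} already exists")

    # Create the version and keep its GUID for the feature service edits later.
    created = requests.post(f"{vms}/create",
                            data={"f": "json", "token": token,
                                  "versionName": version_name,
                                  "accessPermission": "private"}).json()
    return created["versionInfo"]["versionGuid"]
```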

The approach we take for each feature class might seem somewhat peculiar, but it makes sense in the context of importing features from an external data source. We start by using a geoprocessing tool to dump our features out to a CSV file, then read that CSV back into memory with the Python csv library. Remember earlier, when I asked whether there was a unique and reliable field in the data we're importing? This is where it comes into play, in situations where you have pre-existing data that you intend to update during the import.
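
In code, that dump-and-reload step might look something like this; the tool choice and output names are illustrative.

```python
# A sketch of exporting the staged feature class to CSV with a geoprocessing tool,
# then pulling every row back into memory as a list of dictionaries.
import csv
import arcpy

def load_features_as_dicts(staging_fc, out_folder):
    arcpy.conversion.TableToTable(staging_fc, out_folder, "staged_features.csv")
    with open(f"{out_folder}/staged_features.csv", newline="") as f:
        return list(csv.DictReader(f))
```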

ApplyEdits and Updating Features:
When sending a POST request to the ApplyEdits endpoint, one may pass the features to add, update, or delete in JSON format. Among many other options, one important option is "Use Global IDs." Without this option enabled, imported features are assigned new Global IDs in the UN, different from those in the original feature class. Enabling it requires that added features come with their own Global IDs, which is extremely important for programmatically generating associations and bringing them over as well, since associations are bound to features by Global ID. Enabling it also requires that updated and deleted features are specified by Global ID instead of Object ID.
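
A minimal sketch of such a request is below. The parameter names follow the feature service ApplyEdits REST documentation as I understand it; how you reference the target version (name versus GUID) is something to confirm for your environment.

```python
# A sketch of pushing adds/updates to a layer's applyEdits endpoint against the QA
# version, with Global IDs preserved so associations can be matched later.
import json
import requests

def apply_edits(layer_url, adds, updates, version, token):
    """adds/updates: lists of {'attributes': {...}, 'geometry': {...}} dictionaries."""
    params = {
        "f": "json",
        "token": token,
        "gdbVersion": version,          # target the QA version rather than DEFAULT
        "useGlobalIds": "true",         # keep the Global IDs we assigned or matched
        "rollbackOnFailure": "true",
        "adds": json.dumps(adds),
        "updates": json.dumps(updates),
    }
    return requests.post(f"{layer_url}/applyEdits", data=params).json()
```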

Bearing that in mind, if the data being imported is intended to update existing features, we'll have to re-assign the Global IDs and Object IDs of our features and specify that they're updates wherever applicable. We can do this now because we have every feature of the feature class stored in memory as a list of Python dictionaries! The task becomes much easier if the source data has a uniquely identifiable field that matches a field already in the Utility Network, like another GUID field.
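
For example, assuming a hypothetical SOURCE_GUID field that exists both in the source data and in the published layer, the matching step might look like this:

```python
# A sketch of matching imported rows to features already in the Utility Network by a
# reliably-unique source field, so existing features become updates carrying their
# current Global IDs and everything else becomes an add. Field names (SOURCE_GUID,
# GLOBALID) and attribute-key casing are assumptions.
import requests

def split_adds_and_updates(rows, layer_url, token, match_field="SOURCE_GUID"):
    # Query the published layer for the match field and Global ID of every feature.
    resp = requests.post(f"{layer_url}/query",
                         data={"f": "json", "token": token, "where": "1=1",
                               "outFields": f"{match_field},GLOBALID",
                               "returnGeometry": "false"}).json()
    existing = {f["attributes"][match_field]: f["attributes"]["GLOBALID"]
                for f in resp.get("features", [])}

    adds, updates = [], []
    for row in rows:
        feature = {"attributes": dict(row)}
        if row.get(match_field) in existing:
            # Re-assign the Global ID so applyEdits treats this row as an update.
            feature["attributes"]["GLOBALID"] = existing[row[match_field]]
            updates.append(feature)
        else:
            adds.append(feature)
    return adds, updates
```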

Bringing the associations into the UN:
Once we’ve brought all our features into the new version, we can also bring in our associations that we’ve generated. We can either use another POST request or dump the corrected version of the associations to a new CSV file and use the Import Associations Geoprocessing Tool. However, we’ll need to ensure that our associations will import properly before we do.

If any association in our imported data is invalid because no rule supports it, both approaches will fail. If this comes up, it's likely just a matter of setting the FROMGLOBALID and TOGLOBALID correctly in the association table. Both approaches will also fail if either of the following conditions is true:

  • An association exists in the import data that does not have a valid FROMGLOBALID or TOGLOBALID field (either feature in the association does not exist).
  • An association already exists in the table with the same FROMGLOBALID and TOGLOBALID fields.

To make sure the import runs properly, we check each row for both conditions before importing. If a row falls under either scenario, we remove it from the import so that the remaining valid associations can be imported successfully.
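
Putting that together, a pre-import check might look like the sketch below. It assumes you've already collected the set of valid Global IDs in the target version and the FROM/TO pairs already present in the association table, and the Import Associations call should be checked against your ArcGIS release.

```python
# A sketch of the pre-import check: drop rows whose features do not exist in the target
# version and rows that duplicate an existing association, then hand the cleaned CSV to
# the Import Associations geoprocessing tool. valid_globalids and existing_pairs are
# assumed to have been built from queries against the services.
import csv
import arcpy

def import_valid_associations(rows, valid_globalids, existing_pairs, csv_path, utility_network):
    cleaned = []
    for row in rows:
        pair = (row["FROMGLOBALID"], row["TOGLOBALID"])
        if row["FROMGLOBALID"] not in valid_globalids or row["TOGLOBALID"] not in valid_globalids:
            continue   # one of the two features is missing; the import would fail
        if pair in existing_pairs:
            continue   # this association already exists in the association table
        cleaned.append(row)

    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(cleaned)

    # Import Associations (Utility Network toolbox); confirm the parameter order
    # for your ArcGIS release.
    arcpy.un.ImportAssociations(utility_network, csv_path)
```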

QAQC, Reconcile and Post

Now that everything has been brought into a version in the Utility Network, we can properly review the data and verify that it was translated correctly. By running traces, reviewing features, and performing various queries, we can ensure that our data is reliable and available as expected before moving to post the new features into default.

In Summary

Even with the new concepts the Utility Network introduces, it is very reasonable to migrate from any other GIS system into the Utility Network if proper planning is done before development begins. Understanding the fundamental components and how they interact is a critical aspect of this type of work, but the approach is general and can be expanded to handle most types of GIS data migration.


Shannon Smith

Software Engineer

2 comments

  • I did something similar to this back in the “Aughties” for a different utility and GIS. Back then, there was only a limited amount of data available to extract from PLS-CADD. To get a more complete model, you had to delve into some proprietary stuff, which PLS-CADD forbade.

    Did you work with PLS-CADD to open up the format, or do they now publish in either OGC or other XML formats with a well-documented schema?

  • Shannon Smith says:

    Most of the work on this project was done with the data provided to us in the appropriate format, per the client's needs. However, I spoke with our contact who manages the data for their PLS-CADD implementation to gain further insight into their workflow for this export process.

    In this case, we are able to specify which tables we want included in the XML export on a per-table basis, generally allowing us to export only the data we know we'll need in ArcGIS to complete the migration. It seems there are certain fields that cannot be exported within these tables, which we have had to manage as a post-processing step (either by recalculating or through manual QC).
