SSP Replay Speed: Recent Product Improvements

December 8, 2017, by Matthew Stuart and Colton Frazier

Many of you are familiar with the SSP All Edits State Zero work we have done for years. (For those of you who are not, or who may want a refresher, check out this article: All Edits and State Zero Combo Platter.) We recently changed the name of “SSP All Edits State Zero” to “SSP Replay,” a name that better represents the specific nature of the work we do. (For a complete list of rebranded product names, check out the press release: SSP Innovations Rebrand is Announced Ahead of GeoConX 2017.) As part of that rebrand, we have also been making many improvements to our process, and we will document some of them here in the coming months. First up: SSP Replay speed improvements.

Why did we want to enhance SSP Replay speed?

One constant question we get is “How long will this take?” To make things faster, we built a multi-process interface that lets us process multiple versions at once. With these improvements, estimates that once ran 30+ hours are now coming in at about four hours. These timings can also be achieved on relatively low-resource environments. Depending on the resources dedicated to the application and database servers we are running against, they can turn out even better.
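To give a rough idea of what "processing multiple versions at once" can look like, here is a minimal Python sketch. It is an illustration only, not the actual SSP Replay code: `extract_edits`, `replay_versions`, and the version names are hypothetical, and the per-version work is replaced by a placeholder. The point is simply that each worker takes the next pending version from a shared pool of work, and the number of workers is a single parameter.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def extract_edits(version_name):
    """Hypothetical stand-in for the per-version extraction step.

    The real extraction is not described in this article, so this
    placeholder just returns a fake edit count (the length of the name).
    """
    return version_name, len(version_name)


def replay_versions(version_names, num_processes, executor_cls=ProcessPoolExecutor):
    """Process every version, running at most num_processes at a time.

    executor_cls is an assumption of this sketch: it defaults to a
    process pool, but a thread pool can be swapped in for quick testing.
    """
    results = {}
    with executor_cls(max_workers=num_processes) as pool:
        # Submit one task per version; idle workers pick up the next one.
        futures = [pool.submit(extract_edits, name) for name in version_names]
        for future in as_completed(futures):
            name, edit_count = future.result()
            results[name] = edit_count
    return results
```

A call such as `replay_versions(["version_1", "version_2"], num_processes=4)` returns a dictionary mapping each version name to its (placeholder) edit count. When using the default process pool in a script, make that call from under an `if __name__ == "__main__":` guard.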

Using the multi-process code, it took a matter of hours.

We have done a full extraction for two clients and the results have been great.

  • Client 1: We extracted ~250,000 edits from ~2,000 versions in 2.5 hours running 10 processes simultaneously.
  • Client 2: We extracted ~400,000 edits from ~7,000 versions in 6 hours running 12 processes simultaneously.
For Client 2, the numbers came in right around 10 times faster than using a single process. The beauty is that we can adjust the number of processes with a single config setting to match the environment we are running on. We can also add more processes while the others are running, and each new process will pick up the next unprocessed version. Given production-grade environments, we can get 25+ processes going in parallel. The bottleneck, of course, is either memory/CPU on the client machine or the database server’s processing power (if whatever we are running is database intensive, such as reconciles).
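For readers who like to see the arithmetic, the short Python sketch below works out the throughput implied by the two runs above. The inputs are the approximate figures quoted in this article, so the outputs are rough as well. The `Run` class and the variable names are just for illustration, and the "implied single-process time" assumes the roughly 10x figure for Client 2 applies to the whole job.

```python
from dataclasses import dataclass


@dataclass
class Run:
    edits: int        # approximate number of edits extracted
    versions: int     # approximate number of versions processed
    hours: float      # wall-clock duration of the run
    processes: int    # processes running simultaneously

    def edits_per_hour(self):
        return self.edits / self.hours

    def versions_per_hour(self):
        return self.versions / self.hours


client_1 = Run(edits=250_000, versions=2_000, hours=2.5, processes=10)
client_2 = Run(edits=400_000, versions=7_000, hours=6.0, processes=12)

# Client 2 was reported as roughly 10 times faster than a single process.
speedup = 10
implied_single_process_hours = client_2.hours * speedup
# Fraction of ideal linear scaling: speedup divided by process count.
parallel_efficiency = speedup / client_2.processes

for label, run in (("Client 1", client_1), ("Client 2", client_2)):
    print(f"{label}: {run.edits_per_hour():,.0f} edits/hour, "
          f"{run.versions_per_hour():,.0f} versions/hour")
print(f"Client 2 implied single-process time: {implied_single_process_hours:.0f} hours")
print(f"Client 2 parallel efficiency: {parallel_efficiency:.0%}")
```

In other words, 12 processes delivering about a 10x speedup is roughly 83% of perfect linear scaling, which is consistent with the memory/CPU and database bottlenecks mentioned above.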

Above is a screenshot of the multi-process code in action.

In future articles, we will talk about additional Replay improvements we will be making.

Matthew Stuart

Director, SI Delivery

Colton Frazier

Senior Software Engineer
