In part two of this series, published last month, we covered all the gory details of what the state id is within versioning, how state ids are generated for every edit we make in a version, how state ids are used to match up adds, deletes, and updates, and how the path from a state id back to state 0 creates the state lineage. We summed it up by explaining that a version name is simply an easy-to-remember, text-based name for a state id/lineage.
In this month’s installment we want to take a look at what happens to the states in a database when we reconcile, post, and ultimately compress our geodatabase. If you’ve stuck with me this far you probably have a pretty good idea that there are many, many different state lineages in a geodatabase that has a lot of versioned edits. For the sake of digging into this topic, I want to use the example of a very simple geodatabase. In this geodatabase there is currently only a single version, SDE.Default:
Because there are no edits in the database, the current state id of Default is 0, which indicates that it is rendering data directly from the base business tables and that the A&D tables are currently empty.
Now I am going to create a second version as a child of Default and I am going to add three feature records to this version. As we saw in the last article, each edit is assigned its own state id but when I save the edit session, all of those edits are assigned the same state id corresponding to the max state id used in the version. For simplicity we will assume this version used state ids 1, 2, and 3. So when I save this version, my geodatabase now looks like this:
Ok, now let’s take it another step further and create a third version with three more new features. These edits will get state ids 4, 5, and 6 and when I save the version the state id points to 6:
If I’ve already lost you, please go back and read the first two installments of this series, where we explain what a version is and how the state ids are assigned (no offense intended). Otherwise, let’s keep going. Each of the two child versions above has its own state lineage. Child1 has a state lineage of 3 → 0 and Child2 has a state lineage of 6 → 0. Please remember that these are two different lineages that meet back at the common state id of 0.
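To make the two lineages concrete, here is a minimal Python sketch. It is a toy model, not the actual geodatabase schema: the `PARENT_OF` and `VERSIONS` dictionaries and the `lineage` function are hypothetical names standing in for the states and versions tables.

```python
# Toy model only: hypothetical stand-ins for the states and versions tables.
# Each saved state records its parent state; state 0 is the base tables.
PARENT_OF = {3: 0, 6: 0}
VERSIONS = {"Default": 0, "Child1": 3, "Child2": 6}


def lineage(state_id, parent_of):
    """Walk from a state back to state 0 and return the path."""
    path = [state_id]
    while path[-1] != 0:
        path.append(parent_of[path[-1]])
    return path


for name, state_id in VERSIONS.items():
    print(name, " -> ".join(str(s) for s in lineage(state_id, PARENT_OF)))
```

The only state the two child lineages share is 0, which is why neither child can see the other's edits.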
So now we want to go ahead and reconcile Child1 to Default. To do this we open Child1 in ArcMap and click the reconcile button on the versioning toolbar. A reconcile operation syncs any edits that have been posted to Default down into Child1. In our example above, there are no additional edits available in Default at this time. Our next step is to post the edits in Child1 up to Default by clicking the post button on the versioning toolbar. When we do this, Default is essentially synchronized with Child1. What this means behind the scenes is that both Default and Child1 point to the same state id of 3:
The other major change we see here is that Default is no longer representative of state 0. It is a common misconception that Default always represents state 0. However, in the above example we see that we have now separated out state 0 because Child2 still needs to reference the state 0 data without the posted edits that exist within Child1 and Default (i.e., state id 3). All three versions still eventually trace back to state 0, but our state tree has changed.
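In the simplified picture used in this article, a post boils down to moving the target version's pointer to the child's state. The sketch below models only that pointer move; `post`, `lineage`, and the dictionaries are hypothetical names, and the real post operation does more work than this.

```python
# Toy model only: hypothetical stand-ins for the states and versions tables.
PARENT_OF = {3: 0, 6: 0}
versions = {"Default": 0, "Child1": 3, "Child2": 6}


def lineage(state_id, parent_of):
    """Walk from a state back to state 0 and return the path."""
    path = [state_id]
    while path[-1] != 0:
        path.append(parent_of[path[-1]])
    return path


def post(child, target, versions):
    """Return a copy of the version table with target pointing at child's state."""
    updated = dict(versions)
    updated[target] = versions[child]
    return updated


after_post = post("Child1", "Default", versions)
for name, state_id in after_post.items():
    print(name, lineage(state_id, PARENT_OF))
```

After the post, Default and Child1 both sit on state 3, while Child2 still hangs directly off state 0.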
If you are using a versioned geodatabase you are certain to have heard about an SDE compress operation. Most of us know that a compress operation maintains the health of our database by compressing the state tree. But do you really know what that means? Many folks believe that the compress moves all posted edits from the A&D tables into the base business tables. In our example above we've posted edits from Child1 into Default. If we ran an SDE compress against the geodatabase in its current state, what would happen? The answer is absolutely nothing. In this case, no edits can be moved (compressed) into the base tables because the geodatabase must maintain the current state tree to enable Child2 to render state 0 plus its own three edits (state ids 4, 5, and 6). We can run the compress operation over and over again, but it won’t do a darned thing.
To put this further into perspective, we could create another 50 versions in the geodatabase, create edits, and then reconcile and post those edits into Default. Each time we post, the state id of Default will be updated to reference a lineage including the new edits. And we can run the compress after each post but it will continue to do nothing as long as Child2 is referencing state 0 directly.
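One way to picture why the compress has nothing to do is to ask which states appear in every version's lineage. The sketch below is a toy check, not the real compress algorithm; `common_states` and the state ids 10, 11, and 12 for the extra posts are assumptions made up for illustration.

```python
# Toy model only: hypothetical stand-ins for the states and versions tables.
def lineage(state_id, parent_of):
    """Walk from a state back to state 0 and return the path."""
    path = [state_id]
    while path[-1] != 0:
        path.append(parent_of[path[-1]])
    return path


def common_states(versions, parent_of):
    """States other than 0 that appear in every version's lineage."""
    lineages = [set(lineage(s, parent_of)) for s in versions.values()]
    return set.intersection(*lineages) - {0}


# Child1 has been posted to Default; Child2 has not been reconciled.
parent_of = {3: 0, 6: 0}
versions = {"Default": 3, "Child1": 3, "Child2": 6}
before = common_states(versions, parent_of)

# Post more edits into Default: each post hangs a new state off Default's lineage.
for new_state in (10, 11, 12):
    parent_of[new_state] = versions["Default"]
    versions["Default"] = new_state
after = common_states(versions, parent_of)

print(before, after)
```

In both cases the result is empty: as long as Child2's lineage goes straight from 6 to 0, no state is shared by every version, so there is nothing to fold into the base tables.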
Each time we create an edit, records are added to the A&D tables. And as records pile up in the A&D tables, the database has to work harder and harder to render a specific state lineage, because it always has to start with state 0 and then apply all of the edits that exist within the specific lineage. The more edits you have... the slower the database responds. Eventually your system will grind to a halt with performance that is unbearable.
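The cost described above can be sketched as a loop: start from the base rows, then apply the edits for every state in the lineage. This is a simplified illustration that handles adds only (no deletes or updates), and `BASE_ROWS`, `ADDS`, and `render` are hypothetical names with made-up rows.

```python
# Toy model only: rows are plain strings, and only adds are modeled.
BASE_ROWS = {"pole-100", "pole-101"}  # hypothetical rows already in the base table
ADDS = {3: {"edit1", "edit2", "edit3"}, 6: {"edit4", "edit5", "edit6"}}
PARENT_OF = {3: 0, 6: 0}


def lineage(state_id, parent_of):
    """Walk from a state back to state 0 and return the path."""
    path = [state_id]
    while path[-1] != 0:
        path.append(parent_of[path[-1]])
    return path


def render(state_id):
    """Start with state 0 (the base rows), then apply each state's adds in order."""
    rows = set(BASE_ROWS)
    for state in reversed(lineage(state_id, PARENT_OF)):
        rows |= ADDS.get(state, set())
    return rows


print(sorted(render(3)))
print(sorted(render(6)))
```

Every extra state in a lineage is one more set of A&D rows to apply on each draw, which is the work that grows as edits pile up.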
We were called into a small utility a couple of years ago that only had two editors. But they had a case similar to the above scenario with a single version that was directly referencing state 0. After many months, they ended up with 45,000 edits in one of their ADD tables and the system performance was just awful. They couldn’t understand how this was possible... because they compressed every day. It’s an important lesson that hopefully makes more sense now. The compress is only effective if your state tree has been fully reconciled.
So to continue our exploration, let’s forget the extra 50 versions and get back to our original example. We have two child versions with edits and have posted Child1 to Default. Here is the same picture as above for reference:
Next, we'll open Child2 and perform a reconcile against Default. This time there are edits that exist within Default (state id = 3) that do not exist within Child2 (state id = 6). When we reconcile, the software essentially performs a mini edit session and moves edits 1, 2, and 3 down into Child2.
However, these are considered new edits in Child2 and they are therefore given new state ids of 7, 8, and 9. When we then save Child2, the state id is now 9, corresponding to the final edit that was reconciled:
Our state tree has once again been modified because we reconciled down the edits from Default into Child2. Child2 will now have a state lineage of 9 → 3 → 0 whereas both Child1 and Default have a state lineage of 3 → 0. Keep in mind that there are still edits within Child2 that do not exist within Child1 or Default (edits 4, 5, and 6). BUT this reconcile operation has brought the two child versions back into a common lineage. They both traverse through state 3 to get to state 0.
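The effect of this reconcile on the state tree can be sketched as re-rooting Child2 on Default's current state. This is a toy model following the simplified picture in this article; `reconcile` is a hypothetical function, and the new state id 9 is passed in by hand rather than generated.

```python
# Toy model only: hypothetical stand-ins for the states and versions tables.
def lineage(state_id, parent_of):
    """Walk from a state back to state 0 and return the path."""
    path = [state_id]
    while path[-1] != 0:
        path.append(parent_of[path[-1]])
    return path


def reconcile(child, target, versions, parent_of, new_state):
    """Toy reconcile: give the child a new state whose parent is the target's state."""
    parent_of = dict(parent_of)
    versions = dict(versions)
    parent_of[new_state] = versions[target]
    versions[child] = new_state
    return versions, parent_of


parent_of = {3: 0, 6: 0}
versions = {"Default": 3, "Child1": 3, "Child2": 6}

versions, parent_of = reconcile("Child2", "Default", versions, parent_of, new_state=9)
for name, state_id in versions.items():
    print(name, lineage(state_id, parent_of))
```

In this sketch the old state 6 is simply left behind with no version pointing at it; what matters is that every version's lineage now passes through state 3.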
In the geodatabase, all of the edits still exist in the A&D tables and no modifications have been made to the base business tables... yet.
Now we will run the compress operation once again. This time our state tree has been fully reconciled and our child versions share a common lineage from state id 3 to state id 0. This is important because it indicates that state id 3 is no longer needed because it contains edits that are common to ALL versions in the geodatabase.
When we run the compress, the software recognizes that state id 3 is obsolete. The compress then moves all of the edits associated with state id 3 (edits 1, 2, and 3) from the A&D tables into the base business tables. They become part of state 0. The compress then deletes the state with id 3 from the states table because it is no longer a referenced state.
Our state tree has now been simplified and looks like this:
The term "compress" is pretty accurate, because it has shortened the state tree by moving common edits into the base business tables and deleting the unreferenced states. Note that Default is now once again referencing state 0. Version Child2 still references state 9 because it has outstanding edits that have not been posted, but its state lineage has been shortened to 9 → 0.
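The compress described above can be sketched in a few lines. This is a simplified model of the idea, not Esri's actual algorithm: it folds every state shared by all versions into the base rows, deletes those states, and re-points whatever referenced them. The `compress` function and the row names are hypothetical.

```python
# Toy model only: adds-only rows, hypothetical table stand-ins.
def lineage(state_id, parent_of):
    """Walk from a state back to state 0 and return the path."""
    path = [state_id]
    while path[-1] != 0:
        path.append(parent_of[path[-1]])
    return path


def compress(versions, parent_of, adds, base_rows):
    """Fold states common to every version into the base rows and drop them."""
    lineages = [set(lineage(s, parent_of)) for s in versions.values()]
    obsolete = set.intersection(*lineages) - {0}

    base_rows = set(base_rows)
    for state in obsolete:
        base_rows |= adds.get(state, set())

    def skip(state):
        # Step past any state that has just been folded into the base rows.
        while state in obsolete:
            state = parent_of[state]
        return state

    new_versions = {name: skip(s) for name, s in versions.items()}
    new_parent_of = {s: skip(p) for s, p in parent_of.items() if s not in obsolete}
    new_adds = {s: rows for s, rows in adds.items() if s not in obsolete}
    return new_versions, new_parent_of, new_adds, base_rows


versions = {"Default": 3, "Child1": 3, "Child2": 9}
parent_of = {3: 0, 9: 3}
adds = {3: {"edit1", "edit2", "edit3"}, 9: {"edit4", "edit5", "edit6"}}

versions, parent_of, adds, base = compress(versions, parent_of, adds, set())
print(versions, parent_of, sorted(base))
```

After the call, Default and Child1 point at state 0, Child2's lineage is 9 → 0, and edits 1, 2, and 3 live in the base rows instead of the adds table.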
As the number of edits decreases in the A&D tables, database performance goes back up and we can now go to sleep happy because we know our compress has been effective.
The concept remains exactly the same when you have 50 or more versions. As soon as you can fully reconcile all posted edits down into all child versions, you will cause intermediate states to become obsolete. Each state corresponds to edits in the A&D tables and when you compress, all of the edits corresponding to obsolete states get moved into the base business tables and all of the obsolete states get deleted from the geodatabase. I may sound like I am saying the same thing over and over again... I am. But it’s a very important concept which can make all the difference in the health of your geodatabase.
And now, I want to make a final comment about versioning conflicts. A conflict occurs when the same record (or, more specifically, the same attribute on the same record) has been edited in two different versions. The conflict appears after we post the first version and attempt to reconcile the second version. The reconcile operation detects the conflict and halts.
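The two flavors of conflict (same record versus same attribute on the same record) can be illustrated with a small comparison. This is only a sketch of the idea, not how the software detects conflicts internally; `find_conflicts`, the object ids, and the attribute names are all made up for illustration.

```python
# Toy model only: each version's edits are {object id: set of edited attributes}.
def find_conflicts(edits_a, edits_b, by_attribute=False):
    """Return the object ids edited in both versions.

    With by_attribute=True, a record only conflicts when both versions
    changed at least one of the same attributes on it.
    """
    shared = set(edits_a) & set(edits_b)
    if by_attribute:
        shared = {oid for oid in shared if edits_a[oid] & edits_b[oid]}
    return sorted(shared)


child1_edits = {101: {"STATUS"}, 102: {"OWNER"}}
child2_edits = {101: {"HEIGHT"}, 102: {"OWNER"}, 103: {"STATUS"}}

print(find_conflicts(child1_edits, child2_edits))
print(find_conflicts(child1_edits, child2_edits, by_attribute=True))
```

Record 101 was touched in both versions but on different attributes, so it only counts as a conflict under the stricter record-level definition.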
I don’t want to get into how to resolve conflicts... there is plenty of existing documentation out there on that topic. The more important point is that it halts the reconcile operation on that version. And as we just reviewed, when a reconcile is not performed on even a single version it can cause all sorts of performance problems.
The result of this case is exactly the same as our example of not reconciling a version for a long period of time. This is why it is vitally important that you resolve those conflicts on a regular basis, which will allow the reconcile operation to complete and your compresses to be effective.
If you work in a large organization that typically has more than a few versions in your geodatabase at any given time (cough, cough Telvent Designer™ users) it is imperative that you have a process to detect and report your conflicts regularly (daily is great). Ideally this is an automated process so you can use your manpower efficiently. You should attack the oldest outstanding conflicts first because they will represent the state ids that are holding back the compress from doing its job. It’s not important to have a geodatabase without any conflicts (in fact it’s almost impossible) but it is important to resolve the older conflicts on a regular schedule to keep the system motoring (dare I say speeding) along. If you need direction or assistance with any of this, feel free to give us a shout. We deal with it every day. Off my conflict soapbox...
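As a rough illustration of the "oldest conflicts first" rule, the sketch below sorts a conflict report so the versions that have gone longest without a successful reconcile come out on top. The report rows, version names, and dates are entirely hypothetical; in practice they would come from whatever process you use to detect conflicts.

```python
from datetime import date

# Hypothetical report rows:
# (version name, date of last successful reconcile, has open conflicts)
REPORT = [
    ("Design_1042", date(2011, 3, 14), True),
    ("Design_0877", date(2010, 11, 2), True),
    ("Design_1101", date(2011, 4, 1), False),
    ("Design_0930", date(2011, 1, 20), True),
]


def conflict_worklist(report):
    """Versions with open conflicts, oldest last-reconcile date first."""
    blocked = [row for row in report if row[2]]
    return [name for name, _, _ in sorted(blocked, key=lambda row: row[1])]


for name in conflict_worklist(REPORT):
    print(name)
```

Working the list from the top down clears the versions most likely to be pinning old states in place.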
In summary, we’ve covered how the reconcile, post, and compress operations affect the states in your geodatabase and why your compress may or may not be working effectively. The state tree in your geodatabase is constantly changing but with a little art and a bit more science you can master the state tree and keep it firmly rooted in your organization. I know that was way too cheesy but it’s late at night and I’m up writing about state ids! I often wonder how much separation there is between a versioning geek and a geeky member of The Lonely Island.
Anyhow... in the next article we will begin taking a look at Esri multi version views - what they are, how they work, and how to use them in your business. Until then, happy versioning.