Are you a GIS manager who wants and/or needs to understand Esri versioning and has been searching the Internet for an Esri versioning overview? You’re in luck.
Esri versioning is one of the most common topics we are asked about in the consulting field and a good one to unpack here in layman’s terms. We created this whiteboard video (and our free Versioning eBook) to help you understand versioning as well as provide you some useful version management tips and tools.
There are many articles out there that define what versioning is but we’re going to attempt to get behind the scenes to look at how versioning works.
In Esri’s simple definition, versioning “is the mechanism that enables concurrent multiuser geodatabase editing in Esri ArcSDE geodatabases.” This is a good place to start. Versioning is one of the true benefits of enterprise GIS because it allows multiple users to be editing the same geographic area and even the same database record at the same point in time. Each user edits the data within their own version in the geodatabase and then Esri ArcSDE provides the tools to merge those edits into the master public version.
Hi, my name is Skye Perry with SSP Innovations, and I'm here to talk to you today a little about one of my favorite topics over the years: Esri versioning. This is a topic we've written many articles on, and we've released an e-book on versioning that will get you way more detail and information than we are covering today. I wanted to start off with a brief overview of the concept of versioning to get you started. Versioning is all about the Esri geodatabase, and if you are watching this, you are likely using an Esri geodatabase that has versions.
Let's talk about what happens behind the scenes when we use an enterprise geodatabase with versioning. We are going to start off here in our Esri geodatabase. Imagine we are going to create a new feature class. In our case we are going to create a feature class called Electric.Pole, where Electric is our data owner. When we create this, it creates the table within the geodatabase. One of the key things I want to point out is that it's firing an insert into the table registry, which is a table that manages all the other tables, and it's basically going to give our new table an ID. We are going to say the ID in this case is equal to 123, just to keep it simple. So we have ID 123 assigned to Electric.Pole.
Now this table is not versioned yet; we haven't taken that step. The next piece we want to talk about is registering it as versioned (that's an ArcCatalog operation). When we do that, it creates two additional tables as well as some other objects such as sequences, but there are two tables I really want to call out to you. The first still belongs to Electric (I'll skip that part on the board here). It's Electric.A123: an A, followed by that same ID, 123, for the add table.
The second important table that it creates is called D123. The D is for delete. So when we talk about the tables, we really have three separate tables that we're utilizing in versioning. The base table is Electric.Pole. We have Electric.A123, which is our add table, and Electric.D123, which is our delete table. We call these the add and delete tables for short. Versioning runs heavily off of a thing called the state ID, which is used to manage edits through time.
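To make the naming concrete, here is a minimal Python sketch of the steps described so far. It is a toy in-memory model, not Esri's implementation: the class `ToyGeodatabase` and its methods are hypothetical names invented for illustration, and the starting ID of 123 simply mirrors the whiteboard example.

```python
from itertools import count


class ToyGeodatabase:
    """Toy model of the table registry (hypothetical, for illustration only)."""

    def __init__(self, first_id=123):
        self._ids = count(first_id)   # next registration ID to hand out
        self.table_registry = {}      # table name -> registration ID
        self.tables = {}              # table name -> list of rows

    def create_feature_class(self, name):
        # Creating a feature class inserts a row into the table registry.
        reg_id = next(self._ids)
        self.table_registry[name] = reg_id
        self.tables[name] = []
        return reg_id

    def register_as_versioned(self, name):
        # Registering as versioned creates the add and delete tables,
        # named after the owner and the base table's registration ID.
        owner = name.split(".")[0]
        reg_id = self.table_registry[name]
        add_table = f"{owner}.A{reg_id}"
        delete_table = f"{owner}.D{reg_id}"
        self.tables[add_table] = []
        self.tables[delete_table] = []
        return add_table, delete_table


gdb = ToyGeodatabase()
reg_id = gdb.create_feature_class("Electric.Pole")
add_table, delete_table = gdb.register_as_versioned("Electric.Pole")
print(reg_id, add_table, delete_table)
```

The point of the sketch is only that the A and D table names are derived from the registration ID, so you can always trace a delta table back to its base table through the registry.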
It's very important that each of these tables records that state ID as new records are added to it. So this is the foundation of versioning: using the registry ID in the names of the A and D tables. The next thing to look at is how versioned edits get applied to these tables. On the right, I've simulated three different edit types which are commonly made: adding a new record, deleting a record, and updating a record. We are going to talk about them all in the context of Electric.Pole.
If we add a new record, we've created a version (to be clear, within ArcMap), we've started an edit session, and we've placed a new pole onto the map. At that point in time we are actually writing to the A table. It's a simple record, and it goes into the A table, not into the base table, because this is a version. We still have to be able to see the data as it existed beforehand, and this edit has not been posted (it has not been pushed up). So it's just a simple record into the A table. Now deletes, as you can imagine, go pretty much the same way. Delete number two comes down here to the D table. This allows us to record the deletion of an existing record; imagine a pole that already existed up here in the base table. The D record says we have deleted that record. Both of these records, as they are entered, capture state IDs within the versioning tree.
Now our third one gets a little complicated. This is a change to existing data: an update to an existing record. It could be an update to that add we already made. It could be an update to an existing pole in the base table. As you may be thinking ahead, when this update comes in, it first has to put in a delete, and it also has to put in an add. We deleted the old record and added a new version of that record, so we have two individual records making up one update event. All of those changes can be in a single version, and they are maintained as a group of edits, again, using state IDs. This data, however, is isolated to that version.
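The three edit types can be sketched in a few lines of Python. This is a simplified toy, assuming each edit gets its own new state ID and that rows are plain tuples; the function names and row layouts are made up for illustration and are not the real A and D table schemas.

```python
from itertools import count

base = {1: {"height_ft": 40}}   # base table: object ID -> attributes
adds = []                       # A table rows: (state_id, object_id, attributes)
deletes = []                    # D table rows: (state_id, object_id)
_next_state = count(1)          # toy state ID generator


def add_record(object_id, attributes):
    # A new record is written only to the A table.
    state_id = next(_next_state)
    adds.append((state_id, object_id, attributes))
    return state_id


def delete_record(object_id):
    # A delete is written only to the D table.
    state_id = next(_next_state)
    deletes.append((state_id, object_id))
    return state_id


def update_record(object_id, attributes):
    # An update is a delete of the old row plus an add of the new row,
    # grouped together under the same state ID.
    state_id = next(_next_state)
    deletes.append((state_id, object_id))
    adds.append((state_id, object_id, attributes))
    return state_id


add_record(2, {"height_ft": 35})      # place a new pole
update_record(1, {"height_ft": 45})   # change an existing base-table pole
delete_record(2)                      # remove the pole we just added

print(len(adds), len(deletes), base)
```

Notice that the base table is never touched: after three edits there are two rows in the A table, two rows in the D table, and the original pole is still sitting unchanged in the base table.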
If I go back and look at a new version, or maybe at the SDE.Default version, I'm only seeing the base table plus whatever states have already been posted up. Let's move forward and talk about what happens with these edits. If you are familiar with versioning, you've certainly heard of SDE, for spatial database engine; that's the SDE user in the database. We have the Default version. This is your top-level version. Say we have no edits in this system yet: this would be the single source of truth, everything would be pointing here, and it would probably be reading right out of our base tables. Underneath SDE.Default, with the concept of versioning, we have many different edit versions.
Let's just say: one, two, and three. In this case, this one was already edited previously. As we pull up these versions, they can't see all of those other edits. That's because SDE.Default, even though it has a name (and all versions have a name), at its core points to a state ID that tells us what edits are included. So you are seeing the state ID here and also in our add and delete tables. The state ID is really important; we have an article dedicated to it because it's the core of how versioning works. So SDE.Default says it's looking at this state ID, and that determines which edits are included. We take the base table and, depending on where my state ID is at that point in time, apply various add records and delete records to render a view of the data. So as you might imagine, as we progress through time we have many different versions applying many different edits to our pole feature class.
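Here is a rough Python sketch of how a view could be rendered from a version's state ID. It assumes a simple tree where every state knows its parent, and it applies each state's deletes before its adds; the names `lineage`, `render_view`, and `parent_of` are hypothetical, and the real geodatabase does this with database views and system tables rather than Python loops.

```python
def lineage(state_id, parent_of):
    """Return the chain of states from the root down to state_id."""
    path = []
    while state_id is not None:
        path.append(state_id)
        state_id = parent_of.get(state_id)
    return list(reversed(path))


def render_view(base, adds, deletes, state_id, parent_of):
    """Start from the base table and apply the edits in the state's lineage."""
    rows = dict(base)
    for state in lineage(state_id, parent_of):
        for s, object_id in deletes:
            if s == state:
                rows.pop(object_id, None)
        for s, object_id, attributes in adds:
            if s == state:
                rows[object_id] = attributes
    return rows


base = {1: "pole 1", 2: "pole 2"}
parent_of = {0: None, 1: 0, 2: 0}           # states 1 and 2 both branch off state 0
adds = [(1, 3, "pole 3"), (2, 2, "pole 2 (taller)")]
deletes = [(2, 2)]                           # state 2 updated pole 2: delete + add
versions = {"SDE.Default": 0, "one": 1, "two": 2}

views = {name: render_view(base, adds, deletes, state, parent_of)
         for name, state in versions.items()}
for name, view in views.items():
    print(name, view)
```

Each version sees the same base table but a different set of delta rows: version one sees its new pole, version two sees its taller pole 2, and SDE.Default sees neither.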
These tables here, the delta tables, are going to get bigger as more edits are applied. So the question really becomes: “How do edits get from the A and D tables into the base table?” That's really important for the performance of our geodatabase. When I've edited a single version (which references a state ID, a set of edits), I will eventually reconcile and post that version up. That post operation takes the edits from here and says, make these publicly available. Some people say that when I post, those edits get moved to the base table, but that's a common misconception.
What's really happening in this post is that it's just re-referencing SDE.Default, the public top-level version, to a new state ID: a state ID that now includes the edits that have been posted. But again, other versions may reference an earlier point in time, before those edits were there. So the state IDs, the core of versioning, are lining up, but the post is just re-referencing SDE.Default. So how do we get those edits down into the base table? The key operation we have to undertake first is a reconcile. We don't have to post all versions to get things into the base table; that's another misconception. What we do is take the edits in this state ID that have been pushed up to SDE.Default and reconcile them down.
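The misconception about post is easy to see in a toy model. The sketch below assumes a post does nothing more than move the SDE.Default pointer, and that the version must already contain Default's current state (in other words, it has been reconciled). The function `post` and the dictionaries are hypothetical illustrations, not Esri's API.

```python
def lineage(state_id, parent_of):
    """Return the set of states from state_id up to the root."""
    states = set()
    while state_id is not None:
        states.add(state_id)
        state_id = parent_of.get(state_id)
    return states


def post(versions, parent_of, name, target="SDE.Default"):
    # Toy rule: the version must already include the target's current state.
    if versions[target] not in lineage(versions[name], parent_of):
        raise ValueError("reconcile before posting")
    # The post itself only re-references the target to the version's state.
    versions[target] = versions[name]


base = {1: "pole 1"}
adds = [(1, 2, "pole 2")]            # version "one" added pole 2 at state 1
parent_of = {0: None, 1: 0}
versions = {"SDE.Default": 0, "one": 1}

post(versions, parent_of, "one")
print(versions, base, adds)
```

After the post, SDE.Default points at state 1, so the new pole is publicly visible, yet the base table still holds only pole 1 and the edit is still sitting in the A table.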
Reconcile is an Esri operation, most often done through Esri ArcMap or maybe through a batch process. We reconcile the edits down to all of these versions: one, two, and three. Of course, if you have 1000 versions, you have to get them into all 1000 versions. Now, reconcile is a hot topic on its own. If we have a single version that is un-reconciled, we don't have a full reconcile across the system. The same goes if we have a single version that is in conflict. A conflict means that an edit in, say, version three conflicts with one of the states I'm trying to pull down from the posted version. That will not allow the reconcile to work.
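As a rough illustration of what "in conflict" means, the sketch below flags any record that was changed both in the version being reconciled and in the states being pulled down from SDE.Default. This is a deliberately crude, row-level rule with hypothetical names (`changed_ids`, `find_conflicts`); the actual conflict detection in the geodatabase is more nuanced than this.

```python
def changed_ids(adds, deletes, states):
    """Object IDs touched by any add or delete row in the given states."""
    added = {object_id for s, object_id, _ in adds if s in states}
    deleted = {object_id for s, object_id in deletes if s in states}
    return added | deleted


def find_conflicts(adds, deletes, version_states, default_states):
    """Records edited on both sides since the two lineages diverged."""
    return (changed_ids(adds, deletes, version_states)
            & changed_ids(adds, deletes, default_states))


# State 1 (already posted to Default): pole 7 was updated (delete + add).
# State 2 (version three, not yet reconciled): pole 7 deleted, pole 9 added.
adds = [(1, 7, "pole 7, 45 ft"), (2, 9, "pole 9")]
deletes = [(1, 7), (2, 7)]

conflicts = find_conflicts(adds, deletes, version_states={2}, default_states={1})
print(conflicts)
```

Pole 9 reconciles cleanly because only version three touched it, while pole 7 was changed on both sides, so someone has to decide which edit wins before that version counts as reconciled.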
That basically invalidates a full reconcile. The hardest part to understand is that if you have 1000 versions and one version in that state tree is not reconciled, you're done. It doesn't matter; you have to get them all. This is why we harp so much on fixing your conflicts as the data is reconciled down on a daily basis, to make sure that this is a continuous process. You may have conflicts every day, but if you resolve them, the process keeps moving forward. When you eventually get that full reconcile, and the edits in this state ID have been reconciled down to all versions in your state tree, it invalidates those states. Only at that point in time can we come up and run the operation that we call a compress.
This can be done in Esri ArcCatalog, from the command line, or by a batch application. The compress effectively takes the state IDs and looks for any invalidated states that are no longer needed because they have been reconciled down to all versions across the state tree. As it finds those, it can then look at the A and D tables: in our case the pole A and D tables, in your case many more. It can take those edits and push them from the A and D tables up to the base table (Electric.Pole). Now again, just to beat a dead horse a little bit, if any one version in this state tree has not been reconciled, either on purpose or perhaps because it has conflicts, those state IDs do not get invalidated.
I can run this compress 10 times and it will not matter. It will not affect the state tree, because it has to hold on to those old states; we have not invalidated them. The reconcile operation is what invalidates state IDs, and that is what allows the compress to be effective. An effective compress reduces the row count of the A and D tables and therefore keeps the performance of my geodatabase humming right along. It's very important for all of these operations to work in tandem. These are very complex topics, and that's why we have a series of articles, and now the e-book, coming out on the topic. But hopefully this scenario gives you an overview and maybe gets you a little bit more interested.
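To tie reconcile and compress together, here is a small Python sketch of why one un-reconciled version blocks everything. It assumes a state can be compressed only when it is in the lineage of every version, and that state IDs increase along a lineage; `compress`, `lineage`, and the data layout are hypothetical stand-ins, not the real compress algorithm.

```python
def lineage(state_id, parent_of):
    """Return the set of states from state_id up to the root."""
    states = set()
    while state_id is not None:
        states.add(state_id)
        state_id = parent_of.get(state_id)
    return states


def compress(base, adds, deletes, versions, parent_of):
    """Move delta rows into the base table for states shared by all versions."""
    shared = set.intersection(
        *(lineage(state, parent_of) for state in versions.values()))
    moved = 0
    for state in sorted(shared):             # toy assumption: IDs grow over time
        for s, object_id in deletes:
            if s == state:
                base.pop(object_id, None)
                moved += 1
        for s, object_id, attributes in adds:
            if s == state:
                base[object_id] = attributes
                moved += 1
    # Drop the rows that now live in the base table.
    adds[:] = [row for row in adds if row[0] not in shared]
    deletes[:] = [row for row in deletes if row[0] not in shared]
    return moved


base = {1: "pole 1"}
adds = [(1, 2, "pole 2")]                    # posted edit living at state 1
deletes = []
parent_of = {0: None, 1: 0}
versions = {"SDE.Default": 1, "two": 0, "three": 1}   # "two" is not reconciled

first = compress(base, adds, deletes, versions, parent_of)
versions["two"] = 1                          # toy reconcile: "two" catches up
second = compress(base, adds, deletes, versions, parent_of)
print(first, second, base, adds)
```

The first compress moves nothing because version two still needs the old state. Once version two has been reconciled, the same compress moves the row into the base table and the A table shrinks.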
If you are running one of these systems, and you are the GIS manager or perhaps a GIS analyst, it's one of the most important topics for you to be familiar with, both to ensure your geodatabase performs well and to ensure your data is getting to the right location, so that your geodatabase works as efficiently as possible. Thanks for your time today; hopefully this helped.