The Pitch: Synthetic Stereo Photos for Geospatial Asset Realignment (SSPGAR)
What I proposed to the Shark Tank members was to hook a camera and a GPS receiver up to an Nvidia Jetson Nano, then write a bunch of code to run an object detection model trained to identify overhead electric utility assets, and use each object’s position in multiple images to triangulate where the asset sits on the earth. Then, using the known locations of the assets, move the data in a GIS to the actual location of each asset. I’ve seen lots of companies’ overhead electric data, and when it comes to spatial accuracy, there is some good, some bad, and some ugly. If it’s really ugly, this proposal won’t work; it’s not a silver bullet. But if the data is ~mostly~ right (the GIS data is off by a few dozen feet or less), I think this method will easily get the spatial accuracy within a few feet of each asset’s true position. Having spatially accurate GIS data is critical to numerous aspects of operating a utility, from right-of-way maintenance and easement management to the accuracy of outage management systems, and everything in between.
Getting the Model Trained
Once the idea was chosen to be developed, Justin and I got started. I took my trusty GoPro Hero 7 Black for a ride on my dashboard and snapped an image every second while I drove around town. I downloaded those images to my computer and then got busy “annotating” them with a program called labelImg. Annotating involves looking at each image and drawing a box around everything you want the AI model to be able to identify. Once I had the images all annotated, I gave them to Justin. He did a bunch of research, and YOLOv4 seemed like the way to go. He used some Python in Google’s Colab environment to train a YOLOv4 model to recognize the devices, and once the training was done, he gave me the model files. I then converted the model to an ONNX model, and converted the ONNX model to a TensorRT model so it would run REALLY fast on the Nvidia Jetson Nano.
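
For the curious, here’s a minimal sketch of what that last conversion step can look like, assuming TensorRT’s Python API roughly as it shipped in the JetPack releases for the Nano (newer TensorRT versions have since replaced build_engine with build_serialized_network); the file names are just placeholders:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path="yolov4.onnx", engine_path="yolov4.trt"):
    """Parse an ONNX model and serialize a TensorRT engine for the Jetson Nano."""
    builder = trt.Builder(TRT_LOGGER)
    # YOLO ONNX exports use explicit batch dimensions
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX file")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 28    # 256 MiB, modest enough for the Nano
    config.set_flag(trt.BuilderFlag.FP16)  # the Nano's GPU benefits from FP16

    engine = builder.build_engine(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine.serialize())
```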

A Setback Occurs, Time to Rethink
I had originally planned on taking this whole show on the road. I was going to put the Jetson Nano into the car, hooked up to a GPS antenna and a webcam, and capture the images and GPS readings and do the object detection all from the vehicle. Well, that didn’t work out so well. I got all the code working in my office, pointed the webcam at a monitor showing a pic of a pole, and it all worked. I was pumped! However, when I moved everything out to the vehicle, the images I was getting from my webcam were SUPER blurry, and the object detection model performed terribly as a result. I’m not going to lie, I was devastated; I really thought this was going to work and had never considered this as a risk. In the name of making progress, I pivoted. Instead of building a whole on-vehicle rig to do the image capture, GPS capture, and object detection in the field, I went back to the GoPro, this time mounted to the roof with a suction cup mount. With the camera’s top unobstructed by the windshield, the GPS coordinates in the image metadata turned out to be reliable. I set the camera back up to take time-lapse photos, and after I drove around, I dumped the images onto the Jetson Nano sitting on a desk in my office. In the end, although this method isn’t as fancy, I think it may be better: attaching a GoPro to a car and then copying files from an SD card is much easier than setting up the whole rig in the vehicle, and a back-office process would have been needed afterward anyway.
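
Pulling those coordinates out of the photos is pretty straightforward; here’s a minimal sketch, assuming the exifread package (the helper names are my own):

```python
import exifread  # pip install exifread

def _to_degrees(ratios):
    # EXIF stores latitude/longitude as [degrees, minutes, seconds] rationals
    d, m, s = [float(r.num) / float(r.den) for r in ratios]
    return d + m / 60.0 + s / 3600.0

def read_gps(image_path):
    """Return (lat, lon) from a photo's EXIF block, or None if no GPS fix was stored."""
    with open(image_path, "rb") as f:
        tags = exifread.process_file(f, details=False)
    try:
        lat = _to_degrees(tags["GPS GPSLatitude"].values)
        lon = _to_degrees(tags["GPS GPSLongitude"].values)
    except KeyError:
        return None
    if tags["GPS GPSLatitudeRef"].values[0] != "N":
        lat = -lat
    if tags["GPS GPSLongitudeRef"].values[0] != "E":
        lon = -lon
    return lat, lon
```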

Triangulating Objects
With the TensorRT model ready to find some electric facilities in the images, we needed to figure out where the devices were in the world. I wrote some code to extract the GPS coordinates from the images, but knowing where the image was taken is only a small part of the process; the objects in the image are not where the camera is. I used the location of each bounding box produced by the model to create a line from where the camera was when the image was taken, pointing in the direction of the object identified in the image. I did that for lots of images and found where the lines from the cameras to the detected objects intersected each other. With the intersection points known, I ran a clustering algorithm on the points to find clusters, then computed the centroid of each cluster to do some fuzzy triangulation of the objects’ locations.
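
Here’s a rough sketch of that geometry, with a few loud assumptions: camera positions have been projected to a planar coordinate system (like UTM) so distances are in meters, the bearing to a detection is approximated from the camera heading plus the bounding box’s horizontal offset across the field of view (ignoring the GoPro’s considerable lens distortion), and DBSCAN stands in for whichever clustering algorithm you prefer:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def detection_bearing(camera_heading_deg, box_center_x, image_width, hfov_deg):
    """Bearing to a detection: camera heading plus the bounding box's
    horizontal offset from image center, scaled by the horizontal field of view."""
    offset = (box_center_x / image_width) - 0.5
    return (camera_heading_deg + offset * hfov_deg) % 360.0

def bearing_vector(bearing_deg):
    """Unit vector for a compass bearing in a planar frame (x = east, y = north)."""
    b = np.radians(bearing_deg)
    return np.array([np.sin(b), np.cos(b)])

def intersect_rays(p1, bearing1, p2, bearing2):
    """Intersect rays p1 + t*d1 and p2 + u*d2; None if near-parallel
    or if the intersection falls behind either camera."""
    d1, d2 = bearing_vector(bearing1), bearing_vector(bearing2)
    A = np.column_stack([d1, -d2])
    if abs(np.linalg.det(A)) < 1e-6:
        return None
    t, u = np.linalg.solve(A, p2 - p1)
    if t <= 0 or u <= 0:
        return None
    return p1 + t * d1

def triangulate(observations, eps_m=2.0, min_samples=3):
    """observations: [(camera_xy, bearing_deg), ...] for one asset class.
    Returns the cluster centroids: the fuzzy-triangulated asset positions."""
    points = []
    for i in range(len(observations)):
        for j in range(i + 1, len(observations)):
            p = intersect_rays(observations[i][0], observations[i][1],
                               observations[j][0], observations[j][1])
            if p is not None:
                points.append(p)
    if not points:
        return []
    points = np.array(points)
    labels = DBSCAN(eps=eps_m, min_samples=min_samples).fit_predict(points)
    return [points[labels == k].mean(axis=0) for k in set(labels) if k != -1]
```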

Moving the Existing GIS Data
Knowing where things are is great, but we aren’t done yet: we haven’t adjusted the existing GIS data. I connected the poles identified by the model with lines, broken wherever a primary conductor would branch off the mainline. Then I wrote an algorithm that takes into account the direction and length of each line, as well as the number of devices we were able to identify with the model, and determines the characteristics of the line sections. I applied that algorithm to the data we generated from the detections (the “simulated” data) as well as to the existing data in the GIS. Once the characteristics were computed, I ran a similarity comparison between the simulated data and the existing GIS data in the same area. Using those results, we can determine which section of the simulated data, with the correct spatial locations, corresponds to which section of the data in the GIS. Then it was as simple as creating a “link” between the existing GIS data and the simulated data and using the “rubbersheet” tool to move the existing GIS data to the proper place on the map.
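
I won’t claim this is exactly how the comparison works, but here’s one plausible shape for it: each section gets a signature of its length, azimuth, and per-class device counts, and each simulated section links to the GIS section with the smallest weighted difference (the class list and weights here are made up for illustration):

```python
import numpy as np

CLASSES = ["pole", "transformer", "fuse"]  # hypothetical detector classes

def signature(length_m, azimuth_deg, device_counts):
    """Describe a line section by its length, direction, and per-class device counts."""
    return np.array([length_m, azimuth_deg] +
                    [float(device_counts.get(c, 0)) for c in CLASSES])

def distance(a, b, weights):
    """Weighted difference between two section signatures; azimuth wraps at 360."""
    d = np.abs(a - b)
    d[1] = min(d[1], 360.0 - d[1])
    return float(np.dot(weights, d))

def match_sections(detected, gis, weights=np.array([0.05, 0.5, 1.0, 1.0, 1.0])):
    """For each detected section, find the most similar GIS section.
    detected/gis: dicts of {section_id: signature}. Returns the links to rubbersheet."""
    return {d_id: min(gis, key=lambda g_id: distance(d_sig, gis[g_id], weights))
            for d_id, d_sig in detected.items()}
```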

Results
The preliminary results of the effort are, in my professional opinion, nothing short of astonishing. I’ve computed lots of pole locations that are less than 3 feet from their actual positions, using nothing more than a camera that cost a few hundred dollars; no multi-thousand-dollar GPS receiver needed. That’s not to say this effort is done and perfect. Because the GoPro doesn’t store the direction the camera was facing when an image was taken, I’m synthesizing that information from the GPS readings between images, and as a result the results are less than perfect while the vehicle is turning; perhaps switching to another camera would solve that issue. More training of the model would improve accuracy on the object detection front, and it could certainly benefit from learning to identify more types of assets. On the GPS front, in heavily tree-covered areas or in the urban jungle with lots of tall buildings, I’d expect the GoPro’s GPS reliability to drop dramatically, and perhaps another sensor suite may be necessary. Additionally, it may be beneficial to convert this to work directly with an MPEG video stream instead of time-lapse photos, but that’ll have to wait for another day!
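
For reference, synthesizing the heading between two GPS fixes is just the standard forward-azimuth formula; here’s a minimal sketch, and it’s exactly the thing that goes wrong mid-turn, since the camera no longer points along the line between fixes:

```python
import math

def heading_between(lat1, lon1, lat2, lon2):
    """Initial bearing from GPS fix 1 to fix 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0
```
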
What do you think?