Recently, SSP had a chance to write a rather complex Python program for one of our outstanding clients. Due to the complexity of the processing involved, we learned a lot about Python along the way, and wanted to share those best practices with you.
This is part 1 of a two-part series.
The code is available as a Python template here.
Using a Configuration File
Every (decent) programmer knows that some things should NOT be hardcoded into a script. Specifically, things that change between uses of the script, such as paths, connection strings, passwords, etc., should be stored in a configuration file and read at runtime. This makes the script much more reusable.
Python includes a good configuration file module called ConfigParser. To use it, simply import the module at the top of your script:

import ConfigParser

Then you can create a ConfigParser object and call its read() method to open the configuration file.
config = ConfigParser.ConfigParser()
config_file = CheckArgs(sys.argv[1:])
config.read(config_file)
In this example, the config_file parameter is passed into the Python script as an argument. The format of the ConfigParser file is well documented. Basically, you can define different sections of the file using square bracket delimiters. If you only use one section, you should call it [DEFAULT].
[DEFAULT]
# Path related variables
base_dir: e:/GIS
scripts_dir: %(base_dir)s/scripts
logfile_path: %(base_dir)s/logs
logging_level: INFO
connection_dir: %(base_dir)s/connections
source_connection_dir: %(connection_dir)s/source
destination_connection_dir: %(connection_dir)s/destination
workspace: e:/fgdb/workspace
This example defines the DEFAULT section of the configuration file. You can add as many sections as you need, just make sure to mark them using the square brackets [SECTION_NAME].
A couple other helpful notes about the config file:
- One variable can reference another using the syntax: myvar: %(myothervar)s
- Lists can be created using the syntax: mylist: ['item1', 'item2', 'item3']
- Booleans can be created using the syntax: myboolean: True
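The variable-interpolation syntax is easy to try out. Here is a minimal sketch with hypothetical values mirroring the DEFAULT section above (it uses the Python 3 spelling of the module, configparser, and read_string() in place of read() so no file on disk is needed):

```python
import configparser  # the Python 3 name; Python 2 spells it ConfigParser

# Hypothetical config text mirroring the DEFAULT section above
cfg_text = """
[DEFAULT]
base_dir: e:/GIS
scripts_dir: %(base_dir)s/scripts
"""

config = configparser.ConfigParser()
config.read_string(cfg_text)  # read_string() parses text directly

# %(base_dir)s is expanded when the value is retrieved
print(config.get("DEFAULT", "scripts_dir"))  # e:/GIS/scripts
```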
Finally, for any variables you need to set in your Python code, you retrieve them using the ConfigParser.get() method.
workspace = config.get("DEFAULT", "workspace")
In this example, the workspace variable would be set to “e:/fgdb/workspace”. If you are retrieving a list or Boolean from the config file, you need to make sure to use the “eval” function, so the proper type is returned:
myvar = eval(config.get("DEFAULT", "myboolean"))
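One caveat: eval will execute arbitrary code that happens to be in the config file. If you only need literals such as booleans and lists, the standard library's ast.literal_eval is a safer substitute. A minimal sketch (the raw strings stand in for what config.get() would return):

```python
import ast

raw_boolean = "True"                      # what config.get() would return
raw_list = "['item1', 'item2', 'item3']"

# literal_eval only parses Python literals; it refuses arbitrary expressions
myboolean = ast.literal_eval(raw_boolean)
mylist = ast.literal_eval(raw_list)

print(myboolean)  # True
print(mylist)     # ['item1', 'item2', 'item3']
```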
Using a Logfile (Correctly)
Many of the Python examples I've found across the web use simple PRINT statements for debugging output. While that can be easy and handy at times, learning how to use a proper logfile can make coding much easier and simplify your deployment from TEST into PRODUCTION. With PRINT statements, I eventually find myself going back through the script to comment/uncomment them as needed.
Python includes a good logging module (appropriately enough called “logging”). The logging module allows you to set a LOGGING_LEVEL as one of (DEBUG, INFO, WARNING, ERROR).
Building off the ConfigParser example above, we can include the logging module with:
import ConfigParser, logging

# Get the logfile_path from our config file
logfile = config.get("DEFAULT", "logfile_path") + "/python_script.log"

# Get the logging_level from our config file
logging_level = config.get("DEFAULT", "logging_level")

# Based on our logging_level, set our logging.basicConfig
if logging_level == "DEBUG":
    logging.basicConfig(filename=logfile, filemode='w', format='%(asctime)s %(message)s',
                        datefmt='%m/%d/%Y %I:%M:%S %p', level=logging.DEBUG)
elif logging_level == "INFO":
    logging.basicConfig(filename=logfile, filemode='w', format='%(asctime)s %(message)s',
                        datefmt='%m/%d/%Y %I:%M:%S %p', level=logging.INFO)
The “format” option specified in the above example will print the date and time at the beginning of each line of the logfile. Very handy!
Now, to use the logging, you make "logging" function calls instead of PRINT statements. The nice part is that they will now be filtered based on your logging_level, and the levels are downward inclusive: DEBUG is the most detailed logging, while ERROR is the least detailed. So if your logging_level is set to DEBUG, it will print any statements that use logging.debug, logging.info, logging.warning, and logging.error.
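You can see the level filtering in action with a quick sketch (a named logger is used here so the example is self-contained; the same rule applies to the root logger configured by logging.basicConfig):

```python
import logging

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)  # comparable to logging_level = "INFO"

# At INFO, debug messages are filtered out; info and above pass through.
print(logger.isEnabledFor(logging.DEBUG))    # False
print(logger.isEnabledFor(logging.INFO))     # True
print(logger.isEnabledFor(logging.WARNING))  # True
print(logger.isEnabledFor(logging.ERROR))    # True
```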
Some helpful hints with logging:
- Start each function with a call: logging.debug("Entering myFunction")
- Every "except" block should include: logging.error("Exception in myFunction")
- Include a "finally" block in each function that includes: logging.debug("Exiting myFunction")
- While you're writing your code, set your logging_level to DEBUG. Once you go to Production, set it to INFO or even WARNING.
Here is an example function with logging statements used throughout. Notice that within a single function, we can include logging.debug, logging.info, and logging.error calls.
def CopyFeatureClasses(fclist, fgdb, STATUS):
    try:
        logging.debug(" Enter CopyFeatureClasses")
        for src_fc in fclist:
            cls_name = src_fc[src_fc.rfind(".")+1:]
            logging.debug(" cls_name = " + cls_name)
            if arcpy.Exists(fgdb + "/" + cls_name):
                logging.debug(" Found. Skipping.")
            else:
                logging.info(" Copy feature class::" + cls_name)
                arcpy.Copy_management(src_fc, fgdb + "/" + cls_name)
    except:
        logging.error(" *** Exception in CopyFeatureClasses ***")
        logging.error(arcpy.GetMessages(2))
        STATUS = 'Failed'
    finally:
        logging.debug(" Exiting CopyFeatureClasses")
        return STATUS
Now to call that function, we can use the following:
fclist = arcpy.ListFeatureClasses()
logging.info("Copying feature classes...")
STATUS = CopyFeatureClasses(fclist, fgdb, STATUS)
The logging module will allow you to add a stream handler for additional flexibility. For example, if you want to show all logging.error calls on the console as well as capture them in the logfile, you can add a StreamHandler after setting your logging.basicConfig as follows:
console = logging.StreamHandler()
console.setLevel(logging.ERROR)
logging.getLogger('').addHandler(console)
I find it very helpful to carry a global STATUS variable throughout my code. I pass that variable into and back from every function call. For example:
def myFunction1(var1, STATUS):
    try:
        logging.debug("Entering myFunction1 - var1::" + var1)
        # do something with var1
    except:
        logging.error(" *** Exception in myFunction1 ***")
        logging.error(arcpy.GetMessages(2))
        STATUS = 'Failed'
    finally:
        logging.debug(" Exiting myFunction1")
        return STATUS

def myFunction2(var2, STATUS):
    try:
        logging.debug("Entering myFunction2 - var2::" + var2)
        # do something with var2
    except:
        logging.error(" *** Exception in myFunction2 ***")
        logging.error(arcpy.GetMessages(2))
        STATUS = 'Failed'
    finally:
        logging.debug(" Exiting myFunction2")
        return STATUS
Then every call should look like this:
STATUS = myFunction1(var1, STATUS)
STATUS = myFunction2(var2, STATUS)
If you need to check that STATUS between calls, you can; it depends on what the functions are doing and whether they rely on each other for success. In many cases, I want my script to continue processing, so I don't check the status between calls: I let it keep going and give me a global STATUS at the end.
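If two functions do depend on each other, the check between calls is straightforward. A minimal sketch, with toy stand-ins for myFunction1 and myFunction2 (the forced failure in myFunction1 is just for illustration):

```python
import logging

def myFunction1(var1, STATUS):
    try:
        raise ValueError("forced failure for illustration")
    except:
        logging.error(" *** Exception in myFunction1 ***")
        STATUS = 'Failed'
    finally:
        return STATUS

def myFunction2(var2, STATUS):
    # useful work would only happen when this is reached
    return STATUS

STATUS = 'Succeeded'
STATUS = myFunction1("var1", STATUS)
if STATUS == 'Failed':
    logging.error("myFunction1 failed; skipping myFunction2")
else:
    STATUS = myFunction2("var2", STATUS)

print(STATUS)  # Failed
```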
Also notice the "try-except-finally" block in each function. It should go without saying that this is a Python "best practice".
Lastly, another note on error handling. Typically, my “except” blocks will include several pieces of information:
except:
    logging.error(" *** Exception in ProcessDatabase ***")
    logging.error(arcpy.GetMessages(2))
    logging.error(sys.exc_info())
    STATUS = 'Failed'
    SendLogfile(logfile, config, STATUS)
If you are using the arcpy module in your function, your "except" block should call arcpy.GetMessages(2) in order to log the arcpy error messages.
You should also include a call to “sys.exc_info()”, as this will give you any generic python exceptions (such as Windows file system errors, divide by zero errors, etc).
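For example, a generic Python error with no arcpy involved still gets captured; sys.exc_info() returns a (type, value, traceback) tuple describing the active exception. A quick sketch:

```python
import sys
import logging

try:
    result = 1 / 0  # a generic Python error
except:
    exc_type, exc_value, exc_tb = sys.exc_info()
    logging.error(sys.exc_info())
    print(exc_type.__name__)  # ZeroDivisionError
```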
If you’re using the global STATUS variable, make sure you set that to ‘Failed’. And if it’s critical, you can even call the “SendLogfile” function (defined in part 2 of this series), to email the logfile to those who might be interested in the results.