Recently, SSP had a chance to write a rather complex Python program for one of our outstanding clients. Due to the complexity of the processing involved, we learned a lot about Python along the way, and wanted to share those best practices with you.
This is part 1 of a two-part series.
The code is available as a Python template here.
Using a Configuration File
Every (decent) programmer knows that some things should NOT be hardcoded into a script. Specifically, things that change between uses of the script, such as paths, connection strings, passwords, etc., should be stored in a configuration file and read at runtime. This makes the script much more reusable.
Python includes a good configuration file module called ConfigParser. To use it, simply import the module at the top of your script:

import ConfigParser

Then you can create a ConfigParser object and call its read() method to open the configuration file.
config = ConfigParser.ConfigParser()
config_file = CheckArgs(sys.argv[1:])
config.read(config_file)
In this example, the config_file parameter is passed into the Python script as an argument. The format of the ConfigParser file is well documented. Basically, you can define different sections of the file using square bracket delimiters. If you only use one section, you should call it [DEFAULT].
[DEFAULT]
# Path related variables
base_dir: e:/GIS
scripts_dir: %(base_dir)s/scripts
logfile_path: %(base_dir)s/logs
logging_level: INFO
connection_dir: %(base_dir)s/connections
source_connection_dir: %(connection_dir)s/source
destination_connection_dir: %(connection_dir)s/destination
workspace: e:/fgdb/workspace
This example defines the DEFAULT section of the configuration file. You can add as many sections as you need, just make sure to mark them using the square brackets [SECTION_NAME].
A couple other helpful notes about the config file:
- One variable can reference another using the syntax: myvar: %(myothervar)s
- Lists can be created using the syntax: mylist: ['item1', 'item2', 'item3']
- Booleans can be created using the syntax: myboolean: True
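The variable-interpolation syntax is easy to try out. Here is a minimal sketch with hypothetical values mirroring the DEFAULT section above (it uses the Python 3 spelling of the module, configparser, and read_string() in place of read() so no file on disk is needed):

```python
import configparser  # the Python 3 name; Python 2 spells it ConfigParser

# Hypothetical config text mirroring the DEFAULT section above
cfg_text = """
[DEFAULT]
base_dir: e:/GIS
scripts_dir: %(base_dir)s/scripts
"""

config = configparser.ConfigParser()
config.read_string(cfg_text)  # read_string() parses text directly

# %(base_dir)s is expanded when the value is retrieved
print(config.get("DEFAULT", "scripts_dir"))  # e:/GIS/scripts
```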
Finally, for any variables you need to set in your Python code, you retrieve them using the ConfigParser.get() method.
workspace = config.get("DEFAULT", "workspace")
In this example, the workspace variable would be set to “e:/fgdb/workspace”. If you are retrieving a list or Boolean from the config file, you need to make sure to use the “eval” function, so the proper type is returned:
myvar = eval(config.get("DEFAULT", "myboolean"))
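One caveat: eval will execute arbitrary code that happens to be in the config file. If you only need literals such as booleans and lists, the standard library's ast.literal_eval is a safer substitute. A minimal sketch (the raw strings stand in for what config.get() would return):

```python
import ast

raw_boolean = "True"                      # what config.get() would return
raw_list = "['item1', 'item2', 'item3']"

# literal_eval only parses Python literals; it refuses arbitrary expressions
myboolean = ast.literal_eval(raw_boolean)
mylist = ast.literal_eval(raw_list)

print(myboolean)  # True
print(mylist)     # ['item1', 'item2', 'item3']
```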
Using a Logfile (Correctly)
Many of the Python examples I've found across the web use simple PRINT statements for debugging output. While that can be easy and handy at times, learning how to use a proper logfile can make coding much easier and simplify your deployment from TEST into PRODUCTION. With PRINT statements, I eventually find myself going back through the script to comment/uncomment them as needed.
Python includes a good logging module (appropriately enough called “logging”). The logging module allows you to set a LOGGING_LEVEL as one of (DEBUG, INFO, WARNING, ERROR).
Building off the ConfigParser example above, we can include the logging module with:
import ConfigParser, logging

# Get the logfile_path from our config file
logfile = config.get("DEFAULT", "logfile_path") + "/python_script.log"

# Get the logging_level from our config file
logging_level = config.get("DEFAULT", "logging_level")

# Based on our logging_level, set our logging.basicConfig
if logging_level == "DEBUG":
    logging.basicConfig(filename=logfile, filemode='w', format='%(asctime)s %(message)s',
                        datefmt='%m/%d/%Y %I:%M:%S %p', level=logging.DEBUG)
elif logging_level == "INFO":
    logging.basicConfig(filename=logfile, filemode='w', format='%(asctime)s %(message)s',
                        datefmt='%m/%d/%Y %I:%M:%S %p', level=logging.INFO)
The “format” option specified in the above example will print the date and time at the beginning of each line of the logfile. Very handy!
Now, to use the logging, you make "logging" function calls instead of PRINT statements. The nice part is that they will now be filtered based on your logging_level, and the levels are downward inclusive: DEBUG is the most detailed logging, while ERROR is the least detailed. So if your logging_level is set to DEBUG, it will print any statements that use logging.debug, logging.info, logging.warning, and logging.error.
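You can see the level filtering in action with a quick sketch (a named logger is used here so the example is self-contained; the same rule applies to the root logger configured by logging.basicConfig):

```python
import logging

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)  # comparable to logging_level = "INFO"

# At INFO, debug messages are filtered out; info and above pass through.
print(logger.isEnabledFor(logging.DEBUG))    # False
print(logger.isEnabledFor(logging.INFO))     # True
print(logger.isEnabledFor(logging.WARNING))  # True
print(logger.isEnabledFor(logging.ERROR))    # True
```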
Some helpful hints with logging:
- Start each function with a call: logging.debug("Entering myFunction")
- Every "except" block should include: logging.error("Exception in myFunction")
- Include a "finally" block in each function that includes: logging.debug("Exiting myFunction")
- While you're writing your code, set your logging_level to DEBUG. Once you go to Production, set it to INFO or even WARNING.
Here is an example function with logging statements used throughout. Notice that within a single function, we can include logging.debug, logging.info, and logging.error calls.
def CopyFeatureClasses(fclist, fgdb, STATUS):
    try:
        logging.debug(" Enter CopyFeatureClasses")
        for src_fc in fclist:
            cls_name = src_fc[src_fc.rfind(".")+1:]
            logging.debug(" cls_name = " + cls_name)
            if arcpy.Exists(fgdb + "/" + cls_name):
                logging.debug(" Found. Skipping.")
            else:
                logging.info(" Copy feature class::" + cls_name)
                arcpy.Copy_management(src_fc, fgdb + "/" + cls_name)
    except:
        logging.error(" *** Exception in CopyFeatureClasses ***")
        logging.error(arcpy.GetMessages(2))
        STATUS = 'Failed'
    finally:
        logging.debug(" Exiting CopyFeatureClasses")
        return STATUS
Now to call that function, we can use the following:
fclist = arcpy.ListFeatureClasses()
logging.info("Copying feature classes...")
STATUS = CopyFeatureClasses(fclist, fgdb, STATUS)
The logging module will allow you to add a stream handler for additional flexibility. For example, if you want to show all logging.error calls on the console as well as capture them in the logfile, you can add a StreamHandler after setting your logging.basicConfig as follows:
console = logging.StreamHandler()
console.setLevel(logging.ERROR)
logging.getLogger('').addHandler(console)
I find it very helpful to carry a global STATUS variable throughout my code. I pass that variable into and back from every function call. For example:
def myFunction1(var1, STATUS):
    try:
        logging.debug("Entering myFunction1 - var1::" + var1)
        # do something with var1
    except:
        logging.error(" *** Exception in myFunction1 ***")
        logging.error(arcpy.GetMessages(2))
        STATUS = 'Failed'
    finally:
        logging.debug(" Exiting myFunction1")
        return STATUS

def myFunction2(var2, STATUS):
    try:
        logging.debug("Entering myFunction2 - var2::" + var2)
        # do something with var2
    except:
        logging.error(" *** Exception in myFunction2 ***")
        logging.error(arcpy.GetMessages(2))
        STATUS = 'Failed'
    finally:
        logging.debug(" Exiting myFunction2")
        return STATUS
Then every call should look like this:
STATUS = myFunction1(var1, STATUS)
STATUS = myFunction2(var2, STATUS)
If you need to check that STATUS between calls, you can; it depends on what the functions are doing and whether they rely on each other for success. In many cases, I want my script to continue processing, so I don't check the status between calls: I let it keep going and give me a global STATUS at the end.
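If two functions do depend on each other, the check between calls is straightforward. A minimal sketch, with toy stand-ins for myFunction1 and myFunction2 (the forced failure in myFunction1 is just for illustration):

```python
import logging

def myFunction1(var1, STATUS):
    try:
        raise ValueError("forced failure for illustration")
    except:
        logging.error(" *** Exception in myFunction1 ***")
        STATUS = 'Failed'
    finally:
        return STATUS

def myFunction2(var2, STATUS):
    # useful work would only happen when this is reached
    return STATUS

STATUS = 'Succeeded'
STATUS = myFunction1("var1", STATUS)
if STATUS == 'Failed':
    logging.error("myFunction1 failed; skipping myFunction2")
else:
    STATUS = myFunction2("var2", STATUS)

print(STATUS)  # Failed
```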
Also notice the "try-except-finally" block in each function. It should go without saying that this is a Python "best practice".
Lastly, another note on error handling. Typically, my “except” blocks will include several pieces of information:
except:
    logging.error(" *** Exception in ProcessDatabase ***")
    logging.error(arcpy.GetMessages(2))
    logging.error(sys.exc_info())
    STATUS = 'Failed'
    SendLogfile(logfile, config, STATUS)
If you are using the arcpy module in your function, your "except" block should call arcpy.GetMessages(2) in order to log the arcpy error messages.
You should also include a call to “sys.exc_info()”, as this will give you any generic python exceptions (such as Windows file system errors, divide by zero errors, etc).
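For example, a generic Python error with no arcpy involved still gets captured; sys.exc_info() returns a (type, value, traceback) tuple describing the active exception. A quick sketch:

```python
import sys
import logging

try:
    result = 1 / 0  # a generic Python error
except:
    exc_type, exc_value, exc_tb = sys.exc_info()
    logging.error(sys.exc_info())
    print(exc_type.__name__)  # ZeroDivisionError
```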
If you’re using the global STATUS variable, make sure you set that to ‘Failed’. And if it’s critical, you can even call the “SendLogfile” function (defined in part 2 of this series), to email the logfile to those who might be interested in the results.