17.3. Parsing command line options with getopt

The getopt module helps a script parse complex command line options. Example 17.5 shows how.

Example 17.5. Getopt example

This example shows a piece of code that handles the options of a program called, say, filteralig, which takes an alignment and filters out some sites according to specific criteria. Options have the form '-o1 value1 -o2 value2', where 'o1' and 'o2' are the names of the options and value1 and value2 are their values; some options, such as -c and -h, are simple flags that take no value. For instance, you can call this script as follows:
filteralig -t0.7 -f2,3 align_file
	
import sys
import getopt

def usage(prog="filteralig"):                                             (1)
    print("""
filteralig : filter sites in alignments

 filteralig [-ch] [-t <threshold>] [-f <frames>] [-i <cols>] <alignment>

 -h                  print this message

 -c                  print column numbers of the original alignment

 -t <threshold>      filter all columns with a conservation above <threshold>
 -f <frames>         filter all codon positions of frames
                     possible values 1, 2, 3
                     for more than one use syntax: '1,2'
 -i <cols>           filter these columns
                     syntax: give a string with the column numbers
                     separated by ','

 <alignment>         the file has to be in clustalw format

 """)

o, a = getopt.getopt(sys.argv[1:], 'ct:f:i:h')                            (2)
opts = {}
for k,v in o:                                                             (3)
    opts[k] = v
if '-h' in opts:                                                          (4)
    usage(); sys.exit(0)
if len(a) < 1:                                                            (5)
    usage(); sys.exit("alignment file missing")
	  
1

A usage function is very useful to help the user in case of an error on the command line.
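
As a minimal sketch of how such an error might be detected: getopt.getopt raises getopt.GetoptError when it meets an unknown option or an option given without its required value. The helper name parse_or_none is an assumption made for illustration and is not part of the original script.

```python
import getopt
import sys

def parse_or_none(argv, optstring='ct:f:i:h'):
    """Return (options, arguments), or None if the command line is invalid.

    Hypothetical helper, named here for illustration only.
    """
    try:
        return getopt.getopt(argv, optstring)
    except getopt.GetoptError as err:
        # err describes the problem (unknown option, missing value)
        sys.stderr.write("error: %s\n" % err)
        return None

print(parse_or_none(['-x', 'align_file']))   # unknown option: None
print(parse_or_none(['-t']))                 # -t without a value: None
print(parse_or_none(['-c', 'align_file']))   # ([('-c', '')], ['align_file'])
```

In a real script, the except branch would typically call usage() and leave with a non-zero exit status instead of returning None.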

2

The first parameter of the getopt function should be the list of arguments the script has been called with, not including the script name (which is available in sys.argv[0]); hence the slice sys.argv[1:].

The second parameter is a string describing the expected options. Here it is 'ct:f:i:h', which means that the following options are available: c, t, f, i and h. A ':' placed just after an option letter means that the option expects a value. For instance, the '-t' option requires a threshold value. See the usage!

The getopt function returns a tuple whose first element is a list of (option, value) pairs. The second element is the list of program arguments left after the option list was stripped. Here, a filename for an alignment file is expected.
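
To make the return value concrete, here is a small sketch that calls getopt.getopt on hand-written argument lists instead of sys.argv[1:]. The argument lists are illustrative; they mirror the sample command line shown earlier.

```python
import getopt

optstring = 'ct:f:i:h'

# A value may be attached to the option letter or given as the next word.
attached = getopt.getopt(['-t0.7', '-f2,3', 'align_file'], optstring)
separate = getopt.getopt(['-t', '0.7', '-f', '2,3', 'align_file'], optstring)

options, arguments = attached
print(options)                # [('-t', '0.7'), ('-f', '2,3')]
print(arguments)              # ['align_file']
print(attached == separate)   # True
```

Options that take no value, such as -c, appear in the list with an empty string as their value.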

3

Storing (option, value) pairs in a dictionary.
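
The following sketch shows one possible way to use such a dictionary. Since the list of pairs is already in (key, value) form, dict() gives the same result as the explicit loop. All values are strings, so they have to be converted before use. The variable names and the default threshold of 1.0 are assumptions made for this sketch, not part of the original program.

```python
import getopt

pairs, rest = getopt.getopt(['-c', '-t0.7', '-f2,3', 'align_file'],
                            'ct:f:i:h')
opts = dict(pairs)   # same result as the explicit for loop

# Values arrive as strings; convert them before use.
# The default of 1.0 is an assumption made for this sketch.
threshold = float(opts.get('-t', 1.0))
frames = [int(f) for f in opts['-f'].split(',')] if '-f' in opts else []
show_columns = '-c' in opts   # flags are tested by their presence

print(threshold)      # 0.7
print(frames)         # [2, 3]
print(show_columns)   # True
```

Note that a dictionary keeps only the last value when the same option is given several times on the command line.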

4

If the user has entered a -h, help is printed.

5

Has the user provided a filename? If so, it is available in a[0].