15.3. Recursive data structures

In the Chapter 10, we have worked with examples using nested list. If we examine them a little bit more, we will see that a nested list can be defined recursively.

Nested list

A nested list is an ordered collection of elements that are:

  • either items

  • or nested lists

The items define the content, whereas the lists define the structure of the collection.

How can we use this?  Let's use phylogenetic trees as examples for such a recursive list structure. Figure 15.3 shows an example of a phylogenetic tree topology that can be implemented in Python as follows:

>>> tree = [[[['human', 'chimpanzee'], 'gorilla'], 'orang-utan'], 'gibbon']
      
Figure 15.3 shows the representation of tree.

Figure 15.3. A phylogenetic tree topology

Figure 15.4. Tree representation using a recursive list structure

Working with recursive structures.  Assume that we would do something for each branching point in the above tree. A simple traversal using for or while is no more possible, because both loop structures handle only one level of the list. But with recursive functions we can do such things with the following strategy:

Procedure 15.4. tree_traversal(tree)

INPUT: a tree tree

  1. do what you have to do when you enter the branching point
  2. if tree is a list:
    1. *** recursion step ***
    2. for each element of tree:
      1. tree_traversal(element)
  3. otherwise:
    1. *** end condition of the recursion, here the leaves of the tree ***
  4. do what you have to do when you leave the branching point

Doing something when you enter the branching point, is also known as preorder traversal of a tree, whereas doing something when leaving a branching point is also called postorder traversal of a tree.

We will give two example applications of this strategy for our tree. The first one prints all species of the trees:

    tree = [[[['human', 'chimpanzee'], 'gorilla'], 'orang-utan'], 'gibbon']

import types

def print_species(tree):
    if type(tree) is types.ListType:
        for child in tree:
            print_species(child)
    else:
        print tree



>>> print_species(tree)
human
chimpanzee
gorilla
orang-utan
gibbon
  

and the second one that prints for a tree or a binary tree [2] the two populations that are split at this point.

    tree = [[[['human', 'chimpanzee'], 'gorilla'], 'orang-utan'], 'gibbon']

import types

def splits(tree):
    if type(tree) is types.ListType:
        all_leaves = []
        for child in tree:
            child_leaves = splits(child)
            print child_leaves,
            all_leaves += child_leaves
        print
        return all_leaves
    else:
        return [ tree ]
    

def binary_splits(tree):
    if type(tree) is types.ListType:
        left = binary_splits(tree[0])
        right = binary_splits(tree[1])
        print left, right
        return left + right
    else:
        return [ tree ]
    



>>> splits(tree)
['human'] ['chimpanzee']
['human', 'chimpanzee'] ['gorilla']
['human', 'chimpanzee', 'gorilla'] ['orang-utan']
['human', 'chimpanzee', 'gorilla', 'orang-utan'] ['gibbon']
['human', 'chimpanzee', 'gorilla', 'orang-utan', 'gibbon']
  

Finally, we will show the code that separates the work to do from the tree traversal itself by functions. The traversal function tree_traversal does not change for the task. The user has only to redefine the functions do_prework, do_postwork and do_leafwork.

    import types

def do_prework(node):
    print "prework:", node

def do_postwork(node):
    print "postwork:", node

def is_leaf(node):
    return type(node) is not types.ListType
        
def do_leafwork(leaf):
    print "that is a leaf:", leaf
    
def tree_traversal (node):
    do_prework(node)
    
    if not is_leaf(node):
        for child in node:
            tree_traversal(child)
    else:
        do_leafwork(node)

    do_postwork(node)




>>> tree=[[[['human', 'chimpanzee'], 'gorilla'], 'orang-utan'], 'gibbon']
>>> tree_traversal(tree)
prework: [[[['human', 'chimpanzee'], 'gorilla'], 'orang-utan'], 'gibbon']
prework: [[['human', 'chimpanzee'], 'gorilla'], 'orang-utan']
prework: [['human', 'chimpanzee'], 'gorilla']
prework: ['human', 'chimpanzee']
prework: human
that is a leaf: human
postwork: human
prework: chimpanzee
that is a leaf: chimpanzee
postwork: chimpanzee
postwork: ['human', 'chimpanzee']
prework: gorilla
that is a leaf: gorilla
postwork: gorilla
postwork: [['human', 'chimpanzee'], 'gorilla']
prework: orang-utan
that is a leaf: orang-utan
postwork: orang-utan
postwork: [[['human', 'chimpanzee'], 'gorilla'], 'orang-utan']
prework: gibbon
that is a leaf: gibbon
postwork: gibbon
postwork: [[[['human', 'chimpanzee'], 'gorilla'], 'orang-utan'], 'gibbon']