18.4. Combining objects

A program is generally not built on a single object, but rather on a combination of several objects that interact with each other. In the DNA example, we can have several other kinds of objects: protein sequences, motifs, ... The protein object could be either created from an initial amino acid sequence, or computed by the translate method of the DNA class. Protein objects could have methods specific to protein sequences, such as hydrophobicity or molecular weight. These objects will be instances of the class Protein.

First of all, let us look at how we could represent a simplified Protein object. Figure 18.3 shows an object having one method, mw, and two attributes, name and seq.

Figure 18.3. A Protein object.

The definition of the Protein class follows:

class Protein:

    weight = {"A":71.08,"C":103.14 ,"D":115.09 ,"E":129.12 ,"F":147.18 ,"G":57.06 ,"H":137.15 ,"I":113.17 ,"K":128.18 ,"L":113.17 ,"M":131.21 ,"N":114.11 ,"P":97.12 ,"Q":128.41 ,"R":156.20 ,"S":87.08 ,"T":101.11,"V":99.14 ,"W":186.21 ,"Y":163.18 ,"X": 110}

    default_prosite_file = 'prosite.dat'

    def __init__(self, name=None, seq=None):
        self.name = name
        # use the string method; an omitted seq becomes an empty sequence
        self.seq = (seq or '').upper()

    def mw(self):
        molW = 0
        for aa in self.seq:
            molW += Protein.weight[aa]
     
        # add one water molecule for the free ends of the chain
        molW += 18.02
        # convert from Da to kDa
        molW = molW / 1000
        
        return molW

    def setname(self, name):
        self.name = name
	  
Note that the class starts with the definition of two variables, weight and default_prosite_file. As we will see later, these class variables are available to all instances of the class.
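To make the idea of a class variable concrete, here is a minimal sketch. It assumes Python 3 and, for brevity only, a weight table cut down to two residues (the values are the ones from the table above); it is an illustration, not the full class.

```python
class Protein:
    # Class variable: defined once, shared by every instance.
    # Only two residues are listed here to keep the sketch short.
    weight = {"A": 71.08, "G": 57.06}

    def __init__(self, name=None, seq=None):
        self.name = name
        self.seq = (seq or "").upper()

    def mw(self):
        molW = 0
        for aa in self.seq:
            molW += Protein.weight[aa]
        molW += 18.02        # add water
        return molW / 1000   # convert to kDa


p1 = Protein(name="first", seq="ag")
p2 = Protein(name="second", seq="GGA")

# Both instances see the very same dictionary object.
print(p1.weight is p2.weight is Protein.weight)   # True

# (71.08 + 57.06 + 18.02) / 1000
print(round(p1.mw(), 5))                           # 0.14616
```

Because weight belongs to the class, the table is not copied into each protein; every instance reads the single dictionary stored on Protein.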

Now, a DNA object knows how to translate itself, right? So it would be smarter for the translate method of the DNA class to return a Protein object. The new definition of the translate method is:

    def translate(self, frame=0):
        """
        frame: 0, 1, 2, -1, -2, -3
        """
        if frame < 0:
            # negative frames are read on the reverse complement strand
            seq = self.revcompl()
            frame = abs(frame) - 1
        else:
            seq = self.seq

        if frame > 2:
            return ''

        protseq = ''

        # read the sequence codon by codon, starting at the frame offset
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i+3]
            protseq += Standard_Genetic_Code[codon]

        new_protein = Protein(name=self.name + " translation", seq=protseq)

        return new_protein
         
Look at the returned value: it is now a Protein object. The argument for the seq parameter of the Protein class's __init__ method is the value of the newly computed protseq, and the argument for the name parameter is constructed from the DNA object's name.
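The frame handling in translate can be looked at in isolation. The sketch below assumes Python 3; the functions revcompl and codons are hypothetical stand-alone helpers written for this illustration, not part of the DNA class. It lists the codons that would be read for a given frame, using the same arithmetic as the method above.

```python
def revcompl(seq):
    """Reverse complement of an upper-case DNA string (hypothetical helper)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


def codons(seq, frame=0):
    """List the codons read for frame 0, 1, 2, -1, -2 or -3."""
    if frame < 0:
        # negative frames: switch strand, then map -1, -2, -3 to 0, 1, 2
        seq = revcompl(seq)
        frame = abs(frame) - 1
    if frame > 2:
        return []
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


print(codons("ATGGCCAAA", 0))    # ['ATG', 'GCC', 'AAA']
print(codons("ATGGCCAAA", 1))    # ['TGG', 'CCA']
print(codons("ATGGCCAAA", -1))   # ['TTT', 'GGC', 'CAT']
```

The upper bound len(seq) - 2 in range guarantees that every slice is a complete codon, so trailing bases that do not fill a triplet are ignored.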

In the Protein object, we might also want to keep a reference to the initial DNA object, which can help when analyzing the protein sequence later. Figure 18.4 shows the DNA and Protein objects, and the link between them. Now, the Protein object has three attributes: name, seq and dna.

Figure 18.4. Protein and DNA objects.

The __init__ method of the Protein class is now:

    def __init__(self, name=None, seq=None, dna=None):
        self.name = name
        # use the string method; an omitted seq becomes an empty sequence
        self.seq = (seq or '').upper()
        self.dna = dna
         
The final code for Protein object instantiation in the translate method is now:

        new_protein = Protein(name=self.name + " translation", seq=protseq, dna=self)
        return new_protein
         
Look at the value provided for the dna parameter of the Protein __init__ method. It is a reference to the DNA object itself, i.e. self. Notice that none of the parameters is mandatory (except self, of course). In particular, the dna parameter does not have to be provided when the Protein is created directly from a file, as opposed to translated from a DNA object.
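The link between the two objects can be sketched as follows. This is a simplified illustration, assuming Python 3: the DNA class is reduced to forward frames only, and PARTIAL_GENETIC_CODE is a hypothetical four-codon stand-in for the full Standard_Genetic_Code table.

```python
# Partial codon table, for illustration only (hypothetical name);
# a real table covers all 64 codons.
PARTIAL_GENETIC_CODE = {"ATG": "M", "GCC": "A", "AAA": "K", "TAA": "*"}


class Protein:
    def __init__(self, name=None, seq=None, dna=None):
        self.name = name
        self.seq = (seq or "").upper()
        self.dna = dna  # reference to the source DNA object, if any


class DNA:
    def __init__(self, name=None, seq=None):
        self.name = name
        self.seq = (seq or "").upper()

    def translate(self, frame=0):
        # Forward frames only (0, 1, 2), to keep the sketch short.
        protseq = ""
        for i in range(frame, len(self.seq) - 2, 3):
            protseq += PARTIAL_GENETIC_CODE[self.seq[i:i + 3]]
        return Protein(name=self.name + " translation", seq=protseq, dna=self)


gene = DNA(name="toy gene", seq="atggccaaataa")
translated = gene.translate()
standalone = Protein(name="toy protein", seq="MAK")

print(translated.name)          # toy gene translation
print(translated.seq)           # MAK*
print(translated.dna is gene)   # True
print(standalone.dna)           # None
```

The translated protein holds a reference to the DNA object, not a copy of it, so translated.dna.seq gives access to the original nucleotide sequence, while a protein created directly simply has dna set to None.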

Exercise 18.2. A Point class (continued)

Add a distance method to our Point class that computes the distance between two points:

>>> p1 = Point(2,3)
>>> p2 = Point(3,3)
>>> p1.distance(p2)
1.0
	    
Solution 18.2