A program is generally not built on a single object, but rather on a combination of several objects that interact. In the DNA example, we can have several other kinds of objects: protein sequences, motifs, and so on. A protein object could either be created from an initial amino acid sequence, or computed by the DNA translate method. Protein objects could have methods specific to protein sequences, such as hydrophobicity or molecular weight. These objects will be instances of the class Protein.
First of all, let us look at how we could represent a simplified Protein object. Figure 18.3 shows an object having one method, mw, and two attributes, name and seq.
The definition of the Protein class follows:
    class Protein:

        # average residue masses in daltons, shared by all instances
        weight = {"A": 71.08, "C": 103.14, "D": 115.09, "E": 129.12, "F": 147.18,
                  "G": 57.06, "H": 137.15, "I": 113.17, "K": 128.18, "L": 113.17,
                  "M": 131.21, "N": 114.11, "P": 97.12, "Q": 128.13, "R": 156.20,
                  "S": 87.08, "T": 101.11, "V": 99.14, "W": 186.21, "Y": 163.18,
                  "X": 110}
        default_prosite_file = 'prosite.dat'

        def __init__(self, name=None, seq=None):
            self.name = name
            # upper() is a string method; seq may be omitted (None)
            self.seq = seq.upper() if seq is not None else None

        def mw(self):
            molW = 0
            for aa in self.seq:
                molW += Protein.weight[aa]
            # add one water molecule for the free ends of the protein
            molW += 18.02
            # convert to kDa
            molW = molW / 1000
            return molW

        def setname(self, name):
            self.name = name
Notice that the class starts with the definition of the weight and default_prosite_file variables. As we will see later, these class variables are available to all instances of the class.
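As a minimal sketch of what "available to all instances" means, the following reduced version of the class (the weight table is cut down to three entries purely for illustration) checks that two instances share one and the same weight dictionary, while name and seq stay separate for each object:

```python
class Protein:
    # Class variable: defined once in the class body, shared by all instances.
    # Reduced to three entries for this illustration only.
    weight = {"A": 71.08, "G": 57.06, "X": 110}

    def __init__(self, name=None, seq=None):
        # Instance variables: each object gets its own name and seq.
        self.name = name
        self.seq = seq


p1 = Protein(name="first", seq="GA")
p2 = Protein(name="second", seq="AAG")

# Both instances and the class refer to the very same dictionary object.
same_table = p1.weight is p2.weight is Protein.weight
print(same_table)

# The instance attributes, in contrast, are independent of each other.
print(p1.seq)
print(p2.seq)
```

Because the table is looked up through the class, it is stored only once, however many Protein objects are created.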
Now, a DNA object already knows how to translate itself, so it makes sense for the translate method of the DNA class to return a Protein object. The new definition of the translate method is:
    def translate(self, frame=0):
        """
        frame: 0, 1, 2, -1, -2, -3
        """
        if frame < 0:
            # negative frames are read on the reverse complement
            seq = self.revcompl()
            frame = abs(frame) - 1
        else:
            seq = self.seq
        if frame > 2:
            # invalid frame
            return ''
        protseq = ''
        # read one codon (3 bases) at a time, starting at the frame offset
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i+3]
            protseq += Standard_Genetic_Code[codon]
        new_protein = Protein(name=self.name + " translation", seq=protseq)
        return new_protein
Look at the returned value: it is now a Protein object. The argument for the seq parameter of the Protein class's __init__ method is the newly computed protseq, and the argument for the name parameter is built from the DNA object's name (self.name).
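The interaction between the two classes can be tried out with the following self-contained sketch. It makes several simplifying assumptions: the DNA class is reduced to a constructor and a forward-frame translate, and Standard_Genetic_Code is a hypothetical four-codon excerpt rather than the full table.

```python
# Hypothetical four-codon excerpt of the genetic code, for illustration only.
Standard_Genetic_Code = {"ATG": "M", "GCT": "A", "TGG": "W", "TAA": "*"}


class Protein:
    def __init__(self, name=None, seq=None):
        self.name = name
        self.seq = seq.upper() if seq is not None else None


class DNA:
    def __init__(self, name=None, seq=None):
        self.name = name
        self.seq = seq.upper() if seq is not None else None

    def translate(self, frame=0):
        # Forward frames only (0, 1, 2) in this sketch.
        protseq = ""
        for i in range(frame, len(self.seq) - 2, 3):
            protseq += Standard_Genetic_Code[self.seq[i:i + 3]]
        # The result is a Protein object, not a plain string.
        return Protein(name=self.name + " translation", seq=protseq)


gene = DNA(name="demo", seq="atggcttggtaa")
prot = gene.translate()
print(prot.name)  # demo translation
print(prot.seq)   # MAW*
```

The caller receives an object on which Protein methods can be called directly, instead of a string that would first have to be wrapped.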
In the Protein object, we might also be interested in keeping the reference to the initial DNA object. This can help to analyze the protein sequence later. Figure 18.4 shows the DNA and Protein objects, and the link between them. Now, the Protein object has 3 attributes: name, seq and dna.
The __init__ method of the Protein class is now:
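    def __init__(self, name=None, seq=None, dna=None):
        self.name = name
        # upper() is a string method; seq may be omitted (None)
        self.seq = seq.upper() if seq is not None else None
        # reference to the DNA object this protein was translated from, if any
        self.dna = dna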
The final code for the instantiation of the Protein object in the translate method is now:
    new_protein = Protein(name=self.name + " translation", seq=protseq, dna=self)
    return new_protein
Look at the value provided for the dna parameter of the Protein __init__ method. It is a reference to the DNA object itself, i.e. self.
Notice that none of the parameters is mandatory (except self, of course). In particular, the dna parameter does not have to be provided when the Protein is created directly from a file, as opposed to being translated from a DNA object.
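The two ways of creating a Protein can be compared in a short sketch. Both classes are stripped down to their constructors, and the names and sequences used are made up for the example:

```python
class DNA:
    def __init__(self, name=None, seq=None):
        self.name = name
        self.seq = seq


class Protein:
    def __init__(self, name=None, seq=None, dna=None):
        self.name = name
        self.seq = seq
        self.dna = dna  # stays None unless a DNA object is supplied


gene = DNA(name="demo", seq="ATGGCTTGG")

# Case 1: protein created directly (for instance read from a file).
from_file = Protein(name="P1", seq="MAW")

# Case 2: protein created as translate would, with a back-reference.
translated = Protein(name=gene.name + " translation", seq="MAW", dna=gene)

print(from_file.dna is None)   # True
print(translated.dna is gene)  # True
print(translated.dna.seq)      # ATGGCTTGG
```

Since dna holds a reference and not a copy, the protein can reach the original DNA sequence later through translated.dna.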
Exercise 18.2. A Point class (continued)
Add a distance method to our Point class that computes the distance between two points:
    >>> p1 = Point(2,3)
    >>> p2 = Point(3,3)
    >>> p1.distance(p2)
    1.0
Solution 18.2