18.5. Classes and objects in Python: technical aspects

The aim of this section is to clarify technical aspects of classes and objects in Python.

18.5.1. Namespaces

Classes and instances have their own namespaces, which are accessible with the dot ('.') operator. As illustrated by Figure 18.5, these namespaces are implemented by dictionaries, one for each instance, and one for the class (see also [Martelli2002]).

Figure 18.5. Classes and instances namespaces.

Instance attributes.  As we have learnt, a class may define attributes for its instances. For example, attributes of s1, such as the name, are directly available through the dot operator:

>>> s1.name
'seq1'
The dictionary for the instance attributes is also accessible by its __dict__ variable, or the vars() function:
>>> s1.__dict__
{'seq': 'aaacaacttcgtaagtata', 'name': 'seq1'}
>>> vars(s1)
{'seq': 'aaacaacttcgtaagtata', 'name': 'seq1'}
The dir() command lists more attributes:
>>> dir(s1)
['__doc__', '__init__', '__module__', 'gc', 'translate', 'name', 'seq']
because it is not limited to the dictionary of the instance. It also displays the attributes of the instance's class and, recursively, the attributes of the base classes of that class (see Section 19.4). You can add attributes to an instance that were not defined by the class, such as the annotation in the following:
>>> s1.annotation = 'an annotation'
>>> s1.__dict__
{'seq': 'aaacaacttcgtaagtata', 'name': 'seq1', 'annotation': 'an annotation'}
Many object-oriented programming languages do not let you add attributes on the fly! Be aware that this facility should be used carefully: instances of the same class then no longer share the same list of attributes, and so no longer have the same behaviour, at least if you consider that the list of attributes defines a behaviour. This is not the same as having a different state, that is, different values for the same attributes. But this matter is probably a topic of discussion.
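
The lookups described above can be tried out with a minimal sketch. The DNA class below is a hypothetical stand-in for the class used in this chapter (only a constructor and a gc method are assumed), and the code is written for Python 3, where dir() returns many more names than in the listings above.

```python
class DNA:
    """Hypothetical stand-in for the chapter's DNA class."""

    def __init__(self, name, seq):
        self.name = name
        self.seq = seq

    def gc(self):
        """Return the fraction of 'g' and 'c' in the sequence."""
        return (self.seq.count('g') + self.seq.count('c')) / len(self.seq)


s1 = DNA('seq1', 'aaacaacttcgtaagtata')

# The instance namespace only holds what was assigned on the instance.
instance_names = sorted(vars(s1))     # vars(s1) is the same dictionary as s1.__dict__

# dir() also walks the class and its base classes, so it sees 'gc' as well.
visible_names = dir(s1)

# Adding an attribute on the fly changes the dictionary of this instance only.
s1.annotation = 'an annotation'
s2 = DNA('seq2', 'acgt')
s2_has_annotation = hasattr(s2, 'annotation')
```

Here instance_names contains only the two names set by the constructor, whereas visible_names also contains gc, and s2 is unaffected by the attribute added to s1.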

Class attributes.  It is also possible to define attributes at the class level. These attributes will be shared by all the instances (Figure 18.6). You define such attributes in the class body part, usually at the top, for legibility:

class Protein:
    weight = {"A": 71.08, "C": 103.14, "D": 115.09, "E": 129.12, "F": 147.18, "G": 57.06, "H": 137.15, "I": 113.17, "K": 128.18, "L": 113.17, "M": 131.21, "N": 114.11, "P": 97.12, "Q": 128.41, "R": 156.20, "S": 87.08, "T": 101.11, "V": 99.14, "W": 186.21, "Y": 163.18, "X": 110}
    default_prosite_file = 'prosite.dat'
To access these attributes, you use the dot notation on the class:
>>> Protein.default_prosite_file
'prosite.dat'
>>> Protein.weight
{"A":71.08,"C":103.14 ,"D":115.09 ,"E":129.12 ,"F":147.18 ,"G":57.06 ,"H":137.15 ,"I":113.17 ,"K":128.18 ,"L":113.17 ,"M":131.21 ,"N":114.11 ,"P":97.12 ,"Q":128.41 ,"R":156.20 ,"S":87.08 ,"T":101.11,"V":99.14 ,"W":186.21 ,"Y":163.18 ,"X": 110}

Figure 18.6. Class attributes in class dictionary

You can also access this attribute through an instance:
>>> p1.default_prosite_file
'prosite.dat'
You cannot change the class attribute through the instance, though:
>>> p1.default_prosite_file = 'myfile.dat'                                (1)
>>> Protein.default_prosite_file
'prosite.dat'

(1) This just creates a new default_prosite_file attribute for the p1 instance, which masks the class attribute.
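
The masking effect can be sketched as follows. The Protein class is reduced here to the single class attribute discussed above, and p2 is a second, hypothetical instance added only to show that other instances are not affected. The code is written for Python 3.

```python
class Protein:
    # Class attribute: stored once, in the class dictionary.
    default_prosite_file = 'prosite.dat'


p1 = Protein()
p2 = Protein()

# Reading through an instance falls back on the class dictionary.
before = p1.default_prosite_file

# Assigning through an instance creates an instance attribute that
# masks the class attribute, for p1 only.
p1.default_prosite_file = 'myfile.dat'
masked = (p1.default_prosite_file,
          p2.default_prosite_file,
          Protein.default_prosite_file)

# Deleting the instance attribute makes the class attribute visible again.
del p1.default_prosite_file
after = p1.default_prosite_file
```

To really change the shared value, the assignment has to be made on the class itself, as in Protein.default_prosite_file = ... .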

The class attributes are displayed by the pydoc command, as opposed to the instance attributes (see Section 18.5.5).

The methods of a class are also referenced in the class dictionary, but what is their value actually? As shown in Figure 18.6, the class dictionary entries for methods point to standard Python functions. When you access a method through an instance name, such as in p1.mw, there is an intermediate data structure that itself points to the class and the instance. Objects in this data structure are called bound methods:

>>> s1.gc
<bound method DNA.gc of <__main__.DNA instance at 0x4016a56c>>
They are said to be bound, because they are bound to a particular instance, and know about it. This is really important, for it is the only way to know what to do, and on which object to operate.

Figure 18.7. Classes methods and bound methods
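
The relation between the function stored in the class dictionary and the bound method can be inspected directly. This is a minimal sketch for Python 3, with a hypothetical stand-in for the DNA class; it relies on the __self__ and __func__ attributes that Python 3 bound methods expose.

```python
class DNA:
    """Hypothetical stand-in for the chapter's DNA class."""

    def __init__(self, name, seq):
        self.name = name
        self.seq = seq

    def gc(self):
        """Return the fraction of 'g' and 'c' in the sequence."""
        return (self.seq.count('g') + self.seq.count('c')) / len(self.seq)


s1 = DNA('seq1', 'aaacaacttcgtaagtata')

plain_function = DNA.__dict__['gc']   # what the class dictionary stores
bound = s1.gc                         # built when the attribute is accessed

# The bound method remembers both the instance and the function.
knows_instance = bound.__self__ is s1
knows_function = bound.__func__ is plain_function

# Calling the bound method is the same as passing the instance explicitly.
same_result = bound() == plain_function(s1)
```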

18.5.2. Objects lifespan

Once it is created, an object's lifespan depends on whether there are still references to it. In contrast to the local variables of a function, an object can still exist after exiting the function or method where it was created, as long as there is a valid reference to it, as shown in the following example:

class C1: pass

class C2:
    def show(self):
        print "I am an instance of class ", self.__class__

def create_C2_ref_in(p):
    p.c2 = C2()               # (1)

c1 = C1()
create_C2_ref_in(c1)          # (2)
c1.c2.show()                  # (3)

(1) This function creates an instance of class C2 and stores its reference in an attribute of p, an instance of class C1.

(2) Creation of the C2 instance by calling create_C2_ref_in.

(3) This statement displays: "I am an instance of class __main__.C2"

As you can observe, the C2 instance exists after exiting the create_C2_ref_in function. It will exist as long as its reference remains in the c1.c2 attribute. If you issue:

c1.c2 = None
there will be no reference left to our C2 instance, and it will be automatically deleted. The same would happen if you issued an additional call to the create_C2_ref_in function:
create_C2_ref_in(c1)
It would overwrite the preceding reference to the former C2 instance, which would then be deleted. You can check this by asking the c1.c2 reference for its identity, for instance with the built-in id() function, before and after the call.
Of course, another way to remove the reference to an object is to use the del statement, which deletes the attribute rather than the object itself:
del c1.c2
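
One way to observe the deletion is to watch the object through a weak reference. This is only a sketch under two assumptions: it uses the standard weakref module, which the example above does not use, and it relies on the reference counting of the CPython interpreter, which reclaims an object as soon as its last reference disappears. Other interpreters may reclaim it later. The code is written for Python 3.

```python
import weakref


class C1:
    pass


class C2:
    pass


def create_C2_ref_in(p):
    p.c2 = C2()


c1 = C1()
create_C2_ref_in(c1)

# A weak reference lets us watch the object without keeping it alive.
watcher = weakref.ref(c1.c2)
alive_while_referenced = watcher() is c1.c2

# A second call rebinds c1.c2, so the first C2 instance loses its
# only strong reference.
create_C2_ref_in(c1)
first_instance_gone = watcher() is None

# del removes the attribute (the reference), not the object as such.
del c1.c2
attribute_removed = not hasattr(c1, 'c2')
```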

18.5.3. Objects equality

By default, the equality of two instances cannot be tested with the == operator. If we come back to our DNA class:

>>> a = DNA('seq1', 'acaagatgccattgtc')
>>> b = DNA('seq1', 'acaagatgccattgtc')
>>> a.__dict__
{'name': 'seq1', 'seq': 'acaagatgccattgtc'}
>>> b.__dict__
{'name': 'seq1', 'seq': 'acaagatgccattgtc'}
>>> a == b
>>> a.__dict__ == b.__dict__
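The first comparison is false, although the two dictionaries are equal: by default, == does not compare the attributes of the instances. This means that the equality operator must be defined by the programmer. We will see the __eq__ special method later in Section 19.3.3.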

Instance identity means that two objects are in fact the same object, or more exactly, that two variables refer to the same object.

>>> a = DNA('seq1', 'acaagatgccattgtc')
>>> b = a
>>> b == a
>>> b is a
Both expressions are true: as a rule for Python objects, identity implies equality.
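
The difference between equality and identity can be summarized in a short sketch, written for Python 3. DNA is a hypothetical stand-in for the chapter's class, and ComparableDNA is an illustrative subclass showing one possible definition of __eq__; the chapter's own treatment of special methods comes in Section 19.3.3.

```python
class DNA:
    """Hypothetical stand-in for the chapter's DNA class."""

    def __init__(self, name, seq):
        self.name = name
        self.seq = seq


class ComparableDNA(DNA):
    """Same data, plus a value-based __eq__ (one possible definition)."""

    def __eq__(self, other):
        if not isinstance(other, ComparableDNA):
            return NotImplemented
        return vars(self) == vars(other)


a = DNA('seq1', 'acaagatgccattgtc')
b = DNA('seq1', 'acaagatgccattgtc')
default_equal = (a == b)             # no __eq__ defined: falls back on identity
same_state = (vars(a) == vars(b))    # the two dictionaries compare equal

c = ComparableDNA('seq1', 'acaagatgccattgtc')
d = ComparableDNA('seq1', 'acaagatgccattgtc')
custom_equal = (c == d)              # uses the __eq__ defined above
still_distinct = (c is not d)        # equality does not imply identity

alias = a                            # a second name for the same object
alias_identical = (alias is a)
```

Note that in Python 3 a class that defines __eq__ without defining __hash__ produces unhashable instances, which matters if the objects are to be used as dictionary keys.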

18.5.4. Classes and types

Types in Python include integers, floating-point numbers, strings, lists, dictionaries, etc. Basically, types and classes are very similar. There is a general difference between them, however: there are literals for built-in types, such as:

'a nice string'
[7, 'a', 45]
whereas there is no literal for a class. The reason for this difference between types and classes is that you can define a predicate for recognizing expressions of a type [Wegner89], whereas with a class you cannot: you can only define collections of objects built after a template.

As shown in Figure 18.8, the Python type() function can be used to know whether a variable refers to a class or to an instance. It will simply answer ClassType or InstanceType, as defined in the types module, but it will not tell you which class an instance belongs to.

Figure 18.8. Types of classes and objects.

18.5.5. Getting information on classes and instances

It is important to know what classes are available, what methods are defined for a class, and what arguments can be passed to them. First, classes are generally defined in modules, and the modules you want to use should have some documentation to explain how to use them. Then, you have the pydoc command that lists the methods of the class, and describes their parameters. The following command displays information on the DNA class, provided it is in the sequence.py file:

	  pydoc sequence.DNA
See also the documentation of the enclosing module, which might bring additional information about related components. This may be important when several classes work together, as described in Section 18.4.

When you consult the documentation of a class with the pydoc command, you will usually see a list of strange method names, such as __str__ or __getitem__. These are special methods that redefine operators, and they will be explained in the next chapter on object-oriented design (Section 19.3.3).

Caution: instance attributes are not listed by pydoc, since they belong to the instances rather than to the class. That is why they should be described in the documentation string of the class. If they are not, which sometimes happens, run the Python interpreter and create an instance, then ask for its dictionary or use the dir() command:

>>> s1 = DNA()
>>> dir(s1)
['__doc__', '__init__', '__module__', 'gc', 'revcompl', 'translate', 'name', 'seq']

Information on instances.  There are some mechanisms to know about the class of a given instance. You can use the special attribute __class__:

>>> s1 = DNA()
>>> s1.__class__
<class __main__.DNA at 0x81d1d64>
You can even use this information to create other instances:
>>> s2 = s1.__class__()
>>> s2
<__main__.DNA instance at 0x8194ca4>
This can be useful if you need to create an object of the same class as another source object, without knowing the class of the source object. You can also ask whether an object belongs to a given class:
>>> isinstance(s1,DNA)
As mentioned above, the Python type() function will not provide the class of an instance, but just InstanceType (see Figure 18.8).
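
These mechanisms can be combined in a small sketch. It is written for Python 3, with a hypothetical DNA class whose constructor arguments are optional, a hypothetical Protein class used only for contrast, and an illustrative helper named empty_like. Be aware that under Python 3 the behaviour of type() differs from what is described above: it returns the class of the instance itself.

```python
class DNA:
    """Hypothetical stand-in; the constructor arguments are optional here."""

    def __init__(self, name=None, seq=None):
        self.name = name
        self.seq = seq


class Protein:
    """Hypothetical second class, used only for contrast."""


def empty_like(source):
    """Create a new object of the same class as source (illustrative helper)."""
    return source.__class__()


s1 = DNA()
s2 = empty_like(s1)                  # works without naming the DNA class

same_class = s2.__class__ is s1.__class__
is_dna = isinstance(s2, DNA)
is_protein = isinstance(s2, Protein)
type_is_class = type(s1) is DNA      # holds in Python 3 only
```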