19.3. Abstract Data Types

19.3.1. Definition

What is an abstract data type? Well, a data type is what we have just presented in chapter Chapter 18: in Python, as in many object-oriented language, it is a class. So, why abstract? One of the main objectives in component building, as we have seen in section Section 19.2, is to provide components that a programmer (you, your colleagues, or any programmer downloading your code) can be confident in. In order to get this programmer, who is a client of your code, confident into your class, you have to make it as stable as possible, and the best method to get a stable class is to define it at a level where no implementation decision is visible. In other words, defining a class should consist in defining a data type providing some abstract services, whose interface are clearly defined in term of parameters and return value and potential side effects. The advantages of this approach are the following:

  • The implementor of the class can change the implementation for maintenance, bug fixes or optimization reasons, without disturbing the client code.
  • The data type is defined as a set of high-level services with a semantic contract defining both the output that is provided and the input that is required from the client. This actually correspond to the general definition of a type in programming: a type is defined as a set of possible values and a set of operations available on these values.

Among the methods defined in the interface of a data type, there are methods that build or change the state of the corresponding object, which are called constructors and modificators, and there are methods to access the object, which are called accessors.

Example 19.1. A Stack

Our example is a stack. It is directly inspired from [Meyer97], chapter 6.

Stacks serve to pile up objects of any kind, and take them out according to the rule "last in, first out" (LIFO). Imagine a pile of plates at home, in your cupboard, the safest way to take a plate out is to take the last one put in (Figure 18.8). Such structures are omni-present in programing, they serve to move along graphs, to compile programs, etc...

Figure 19.2. A stack

Conceiving an abstract data type (ADT) for the stack consists in describing the functions needed making abstraction of the implementation choice that will be made at the end (and that may change). For instance, you might use a python List to implement a stack. But will you add elements at the beginning or at the end of the list? This concrete decision is out of concern for the user of your class, who should be taken away from knowing about this. Indeed in all cases, the basic services are the same, and can be given common names, that do not "betray" how the list is used "inside" of the code. The set of services is listed in Table 19.1. In this table, the description of the services includes:

  • which input is required,
  • what the service will return as output,
  • which results occur, including changes within the internal state of the object.
  • A short description is provided, sometimes providing more details on the context and conditions under which the service is executed.

Table 19.1. Stack class interface

NameDescriptionInputOutputChanges
putplaces the item on top of the stacka stack and an item the stack with the item added
removeif the stack is not empty, removes the top item a stack the stack without the top item
itemif the stack is not empty, returns the item on top of the pile, without removing ita stackthe top item 
emptytells whether the stack is emptya stacka Boolean  
makecreates a new stack a stacka new stack is created

The description provided by Table 19.1 should suffice for the client to use the class. What is more important here, is that the client does not have to know anything about the internal representation of the stack.

In this ADT accessors are: item and empty, constructor is make , and modificators are: put and remove.

The next step,once this description of the ADT is made for the client, consists in specifying the ADT, that is, describing in formal or mathematical language what the functions are doing. This involves four steps:

  • Types: this corresponds in Python to the classes that are used for the ADT. In our example there is one class, Stack.
  • Functions: That is their names and the input and output they have, together with the changes that might occur, as shown in Table 19.1, first four columns.
  • Axioms: The rules to which the functions obey, and that are sufficient to describe them. For the stack, these axioms are, for any stack s and element e:
    • item(put(s, e))=e
    • remove(put(s, e))=s
    • empty(make) is True
    • empty(put(s, e) is False
  • Pre-conditions: The functions we need will not operate under all conditions. For example, you cannot remove an element from an empty stack. These pre-conditions can be enumerated as follows for the stack:
    • remove(s) requires : not empty (s)
    • item(s) requires not empty(s)

Axioms and pre-conditions formulate more precisely what is said in the fifth column of Table 19.1

19.3.2. Information hiding

It is a good design practice to protect an instance local data. Why is that? The first reason is that it makes the interface of the class more easy to grasp. The client just knows what needs to be known in order for the component to be used properly. A second reason is that class data should rather be handled by the class methods: there might be some coherence to be maintained among different attributes, that a method can check. Another important reason is to avoid the risk of collisions, that could happen between the class and the potential base classes, as we will explain in Section 19.4, devoted to inheritance.

The problem is that there is no real mechanism in Python to prevent a client code from accessing to an instance attributes, as there are in other languages (such as Java or C++), where you can declare private and public attributes. Python provides just the following: if you name a variable with a '__' prefix and no '_' suffix, such as:

          __list = []
	
within a class named, e.g: Stack, Python automatically changes it to: _Stack__list. You can still access it, but you are aware that you are probably doing something odd, and it can help to avoid name collisions.

Another way to distinguish public and private attributes and methods is to prefix the private ones with a single '_' character. This does not provoke any automatic addition by Python, but it is warning the reader that this part of the code is private. Another style matter is the naming of classes, where it is a good idea to capitalize the first letter of class names, to distinguish them from variable or function names. Be aware finally that some methods in Python are framed by double '__' on both side, they are called special methods and described in the next p aragraph.

The lesson to be learnt is that attributes you want to be accessible should be mentionned explicitely, and documented as such.

Now that the ADT is specified, that we know which attributes and methods are public or private, We can turn to the implementation part of our example. As shown in Example 19.2, we have decided to use a Python list, and to add new elements to the end of this list (another option wold have been to insert the new element at the beginning of the list).

Example 19.2. Stack class

		
class Stack:
    """
    A class to handle stacks
    """

    def __init__(self):
        self._list = []
        
    def put(self, item):
        """
        - places an item on top of the stack
        - does not return anything
        """
        self._list.append(item)

    def remove(self):
        """
        - if not empty, removes the top item, i.e the last item placed on the stack
        else does not do anything
        - does not return anything
        """
        if len(self._list) > 0:
            self._list.pop()

    def item(self):
        """
        - if not empty, returns the item on top of the stack
        without removing it
        else does not do anything
        """
        if len(self._list) > 0:
            return self._list[-1]

    def empty(self):
        """
        - tells if the stack is empty
        - returns a Boolean
        """
        return len(self._list) == 0



              

Exercise 19.1. ADT for the Point class

Describe the ADT for the class Point.

19.3.3. Using special methods within classes

The usual Python operators, such as '+', the '[ ]' operator to access an item or a slice, the assignment '=' operator, the equality '==' operator, but also print etc..., are defined with special methods that have different definitions for different types. Indeed adding two integers is not the same as adding two strings. It is very convenient to redefine these special methods inside the classes created with Python. The very basic special methods encountered in most classes is the one allowing to use print. Indeed, what happens if you try to print your DNA object?

>>> s1 = DNA('seq1', 'acaagatgccattgtc')
>>> print s1
<__main__.DNA instance at 0x6fd50>
          
We would prefer a more informative output, providing the name of the sequence and the sequence itself (a sort of Fasta formatted output):
>>> print s1
>seq1
acaagatgccattgtc
          
How do we get this? As mentioned above, we just have to add a special method to our DNA class, called __str__, which must return a printable character string:
    def __str__(self):
	return ">" + self.name + "\n" + self.seq + "\n"
          

Such special methods have obligate names, and are framed by '__', such as __str__ for print. A set a very precious pages in the book 'Python. Essential reference' concerns all the tables (3.7 to 3.10) listing these obligate names. Here below in Table 19.2, we will list a small number of examples.

Table 19.2. Some of the special methods to redefine Python operators

OperatorSpecial methodParametersDescription
[]__getitem__self, kaccess to item or slice specified by k
==__eq__self, othertest for equality between 2 objects
.__getattr__self, nameaccess to a non existing attribute named name
. and = __setattr__self, name, valuemodification of an attribute
len()__len__selflength computation
print__str__selfstring form for instance value
del__del__selfdeletion of an instance

Example 19.3. Defining operators for the DNA class

Remember what we said in Section 18.5.3: different objects, even with the same state, i.e with the same attributes values, are considered as not equal by Python. The following statements show that even with the same values for theur attributes, two objects will not considered as equal:
>>> s1 = DNA("seq1", "acaagatgccattgtc")
>>> s2 = DNA("seq1", "acaagatgccattgtc")
>>> s1 == s2
False
	    
There is a way however to specify how the == operator should behave for your class. We need to define ourselves what equality means for our class. The following code redefines the == operator as returning True whenever the name and seq attributes are equal.
class DNA:

    ...

    def __eq__(self, other):
        return self.name == other.name and self.seq == other.seq

>>> s1 = DNA(name='seq2', seq='acaagatgccattgtcccccggcctcctgctgctgctgctctccggggcca')
>>> s2 = DNA(name=s1.name, seq=s1.seq)
>>> s1 == s2
True
>>> s1 is s2
False
   
Another important operator that could be useful for sequences is the [] operator. For strings, this operator enables to access a character at a specific position. What we would like fo a DNA object is to access the ith character of the seq attribute:
>>> s1[3]
'a'
	    
This operator can be defined by the __getitem__ special method:
class DNA:

    ...

    def __getitem__(self, i):
        return self.seq[i]

>>> print s1[3]
'a'
   

Exercise 19.2. Operators for the DNA class

Which additional operators could you define for DNA class? Implement 1 or 2 of them.