With the introduction of Abstract Base Classes, Python once again shows its nature of a very innovative and flexible language. It is interesting to see how such a remarkable feature has been introduced into the language by a pure Python module. This demonstrates that Python is built in a way that is very open to changes, thanks to its foundations in pure polymorphism based on delegation.
Many Python programmers overlooked Abstract Base Classes and the classes in the
collections module, which are one of the simplest and useful applications of the concept. Sure enough, this is not a feature that you will use every day or that will change the way you are programming in Python. But neither is it something you shall discard before understanding what it brings into the language, and what sort of problems it can solve for you.
Python is a dynamically-typed object-oriented language strongly based on delegation, so its approach to problems is intrinsically polymorphic. This means that Python deals mostly with the behaviour of objects and not with their structure. The well-known EAFP protocol (it's Easier to Ask Forgiveness than Permission) comes from this approach. This code
try: someobj except TypeError: # object is not subscriptable ...
does not check if the object is a
list or a
dictionary (both would allow the
 notation), but if the object can be accessed by key (or index). When you accept a parameter in a function Python does not specify the type (leaving aside type hints) because you are not interested in accepting a given type or one of its derived types. You are interested in accepting something that provides the methods you will use.
The behaviour, in an object-oriented environment, is the run-time interface of the object. This is different from the static interface, which is the collection of the methods provided by the object. The run-time interface is the actual interface the object shows when it is used, and this encompasses the methods provided by its class, but also methods provided by parent classes, the metaclass and other entry points provided by
Sometimes, however, you need to perform complex checks, such as "it behaves like a list". How can you test this condition? You could test if the incoming object has some standard methods, but this is not only incomplete but also wrong. For example, I could write the following test
try: obj.append obj.count obj.extend obj.index obj.insert [...] except AttributeError: [...]
which tries to cover the methods a list-like object shall provide. This test, however, accepts objects that do not really behave like a list such as
class FakeList: def fakemethod(self): pass def __getattr__(self, name): if name in ['append', 'count', 'extend', 'index', 'insert', ...]: return self.fakemethod
It is unlikely that you will write such a class, but this shows you one of the potential pitfalls of the previous test, which is wrong because it tries to rely on the structure instead of testing the behaviour. The temptation to rely on
isinstance() is big
if isinstance(someobj, list): ...
If possible this approach is worse than before, since it tests the exact type. Even if
issubclass() are smart enough to walk through the hierarchy of parent classes, this excludes evary class that behaves like a list does not inherit from it.
Back to delegation
The idea proposed in PEP 3119 to face this problem is very elegant, and leverages the very nature of Python: that of being strongly based on delegation. The solution, implemented in Python 3 and backported to Python 2.7, changes the nature of the two
issubclass() builtins. Now the first thing that
isinstance() does is to call the
__instancecheck__() method of the queried class, basically giving it the chance to answer the call with a different algorithm than the standard one. The same happens for
issubclass(), which becomes
__subclasscheck__(). So the following code
does no more perform a pure external check of the relationship between
myclass. The first thing that
issubclass() does now is the following
This is very natural because, after all,
someclass is the best source of judgement about being a subclass of itself.
A new type of subclass
With the introduction of delegation-based instance and subclass checks, Python provides a new type of subclass, and thus a new way to relate classes together. Now a subclass may be a real subclass, obtained using inheritance
class ChildClass(ParentClass): pass
or can be a virtual subclass, obtained through registration
The difference between a real and a virtual subclass is very simple: a real subclass knows its relationship with the parent class through its
__bases__ attribute, and can thus implicitly delegate the resolution of missing methods. A virtual subclass knows nothing about the class that registered it, and nowhere in the subclass will you find something that links it to the parent class. Thus, a virtual parent class is useful only as a categorization.
Abstract Base Classes
Classes that can register other classes, thus becoming virtual parents of those, are called in Python Abstract Base Classes, or ABCs.
The name of this new language element is important. ABCs are first of all classes, just like any other class you can create in Python, and they can be subclassed in the usual way to create taxonomies. They are also meant to be base classes, that is classes that represent fundamental behaviours or categories. Last, they are abstract. This has a very precise meaning in Python and will be the subject of the last part of this post.
The classes provided by the
collections module are Abstract Base Classes, and they set themselves as virtual parents of some base types in the same module. If you check the
_collections_abc.py file in your Python 3 installation (for example in
/usr/lib/python3.4/_collections_abc.py) you will find code like this
[...] Sequence.register(tuple) Sequence.register(str) Sequence.register(range) [...] MutableSequence.register(list) [...]
Sequence and the
MutableSequence ABCs register some built-in types of Python.
It is very important to understand that registering a class does not imply any form of check about methods or attributes. Registering is just the promise that a given behaviour is provided by the registered class.
To demonstrate this let me provide you a very simple example made using one of the
>>> import collections >>> class MyClass(): ... pass ... >>> issubclass(MyClass, collections.Sequence) False >>> collections.Sequence.register(MyClass) <class '__main__.MyClass'> >>> issubclass(MyClass, collections.Sequence) True >>>
As you can see, the
MyClass class is initially not recognized as a subclass of
collections.Sequence, but after the registration
True, even if the class is still empty.
How to create ABCs
The example given by the official documentation is very simple and clear
from abc import ABCMeta class MyABC(metaclass=ABCMeta): pass MyABC.register(tuple) assert issubclass(tuple, MyABC) assert isinstance((), MyABC)
All you need to do is to create a class and use the
ABCMeta metaclass provided by the
abc module and you will obtain a class that has the
register() method and a suitable implementation of
__instancecheck__(). Checking again the
_collections_abc.py file you can see that this is exactly the way the
collections classes are implemented
[...] class Hashable(metaclass=ABCMeta): ... class Iterable(metaclass=ABCMeta): ... [...]
Are you scared of metaclasses?
Metaclasses are a strange topic in Python. Most of the times the advice given to the novice is "Don't use them", like they were an error of the language and something that shall be avoided.
I don't think so. As a matter of facts I definitely disagree with such position, for many reasons.
First of all, if you are programming in Python its better for you to understand everything Python provides you, both the good and the bad parts. Programming languages are tools, and you shall know their strengths and their limitations. Most of the times what we call "limitations" are just features that become a restraint just because we are not aware of them. The C language, for example, is not object-oriented. Is this a strength or a limitation? Python provides you a very powerful inspection mechanism. Is this a strength or a limitations? I could give countless other examples.
Second, powerful features are the one you should know better. After all, we use a language for the unique features it provides, not for the features it shares with other languages. I use Python because of its powerful polymorphism implementation, not because of loops and inheritance. Those are provided by Java and C++, too, for example. I write a device driver in C because of the closeness to the machine language and its speed, not because of the
float types, which are provided by many other languages. So, since powerful features are what let the language do what others cannot, those are the ones you have to master.
var keyword. You cannot afford being ignorant about those limitations, otherwise your software will be buggy. So, while studying the
var, which is the dangerous button of the whole device.
Back to Python. Metaclasses are not a last-minute feature put into the language just for fun. They are the foundation of Python itself, and the relationship between
type is something so beautiful that it is a pity that basically no one talks about it. So, plase stop complaining against metaclasses and telling people that they are dangerous or complex.
Metaclasses are part of the language. And they are not complex to understand.
Why metaclasses for ABCs?
If you program in Python you should be somehow familiar with classes and instances. You know that when you build an instance you use a class (like a blueprint) and that the class can put things into the instance. For example
# Class definition class Child(): def __init__(self): self.answer = 42 # Link instance and class c = Child() # Use the instance assert c.answer == 42
Now, when you build a class you use a metaclass (like a blueprint) and the metaclass can put things into the class.
# Metaclass definition class NewType(type): def __init__(self, name, bases, namespace): self.answer = 42 # Link class and metaclass class Child(metaclass=NewType): pass # Use the class assert Child.answer == 42
Sounds complex? Not at all, in my opinion. If you check the two examples you will see that they are exactly the same thing, the first referring to the instance-class relationship, the second to the class-metaclass one.
This is all you need to understand metaclasses. When you build a class you need to put things into it, for example you need to put the
__getattribute__ or the
__new__() methods. This is done by the metaclass, which is usually
type for every class. Indeed, if you check the
__class__ attribute into a class you get exactly this
>>> int <class 'int'> >>> int.__class__ <class 'type'> >>>
Metaclasses and MRO
A slightly advanced annotation: when I say that the metaclass puts the methods into the class I'm simplifying the whole thing. As a matter of fact, like a class provides methods to the instance at runtime through the
__class__ attribute and the MRO protocol, the metaclass provides methods for the class. Attributes, instead, are put inside the class by the
__init__ methods of the metaclass.
Let us review the MRO mechanism for instances and classes first. When you call a method on an instance Python automatically looks for that method in the instance first, then in the parent class and in every class in its hierarchy.
So in this example
class GrandParent(): pass class Parent(GrandParent): pass class Child(Parent): pass
calling the method
get_name() on an instance of
Child will look for it first into the
Child class, then into
GrandParent, in this order. Finally, it will check
What happens to the MRO when a class of this hierarchy defines a different metaclass? For example
class NewType(type): pass class GrandParent(): pass class Parent(GrandParent): pass class Child(Parent, metaclass=NewType): pass
In this case everything works as usual, but after checking
object the MRO will also check the
NewType metaclass (and its ancestors).
So, metaclasses can act as mixins, and they are queried only at the end of the usual MRO. This is exactly what happens using multiple inheritance if
NewType were a standard parent class that does not have
GrandParent as ancestors.
Metaclasses are not part of the MRO however, since the MRO just deals with standard inheritance. If you check the MRO of the
Child class, you will see that the metaclass is not included
>>> Child.mro() [<class '__main__.Child'>, <class '__main__.Parent'>, <class '__main__.GrandParent'>, <class 'object'>] >>> Child.__class__ <class '__main__.NewType'> >>>
Why are ABC called
abstract? ABCs can be instantiated, so they are after all not pure interfaces (like Java ones, for example)
>>> import abc >>> class MyABC(metaclass=abc.ABCMeta): ... pass ... >>> m = MyABC() >>>
They may however define some methods as abstract, using the
abc.abstractmethod decorator, which prevents the class from being instantiated if the method is not implemented. Let me give you a simple example: I define an Abstract Base Class with and abstract method
class MyABC(metaclass=abc.ABCMeta): @abc.abstractmethod def get(self): pass
and try to instantiate it. Python complains
>>> m = MyABC() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: Can't instantiate abstract class MyABC with abstract methods get
I am forced to create a new class that inherits from
MyABC and implements the method
class Concrete(MyABC): def get(self): return 1
Now I can instantiate the class.
>>> c = Concrete() >>>
Check the official documentation of Abstract Base Classes for a complete description of
What about the behaviour?
So, what happened to the Python tenet "Check the behaviour and not the structure"? With collections, after all, we dropped the EAFP protocol, going back to a Look Before You Leap approach. Are we going against the philosophy at the very base of the language?
It is very interesting to see what Guido van Rossum, creator of the Python language, says about this in PEP 3119: Invocation means interacting with an object by invoking its methods. Usually this is combined with polymorphism, so that invoking a given method may run different code depending on the type of an object. Inspection means the ability for external code (outside of the object's methods) to examine the type or properties of that object, and make decisions on how to treat that object based on that information. [...] In classical OOP theory, invocation is the preferred usage pattern, and inspection is actively discouraged, being considered a relic of an earlier, procedural programming style. However, in practice this view is simply too dogmatic and inflexible, and leads to a kind of design rigidity that is very much at odds with the dynamic nature of a language like Python.
So the point is that forcing the use of a pure polymorphic approach sometimes can lead to solutions that are too complex or even incorrect. The key words here, in my opinion, are "dogmatic", "inflexible", and "rigidity", opposed to "dynamic nature". I really like this flexibility in a language and in its author.
if isinstance(obj, collections.Sequence) is not EAFP, neither is any conditional test you may write. Nevertheless, no one would replace conditional tests with a pure EAFP approach, simply because sometimes those tests are more readable. This is the exact purpose of collections in Python and ABCs in general: to allow parts of the code to be simpler.
I hope this post helped you understand that Abstract Base Classes, and in particular the standard collections, are useful and easy to understand. Metaclasses are also not that scary and dangerous, even if using them obviously requires some skill.
The official documentation of the
abc module is very well written. Here you find the version for Python 3.5. I also suggest to read the original PEP 3119 and the related PEP 3141 for a deeper understanding of the topic.