Why I don’t like Python

Lee Phillips

May 6th, 2020

The title is a bit of a lie. I think Python is a good, practical language, and relatively easy to learn and to teach. Its vast collection of libraries lets you slap together a program to do virtually anything you want. It doesn’t have the beauty of APL or Lisp, and certainly nothing like the speed or expressive power of Julia, but it’s a tool anyone can use to make something that solves a real-world, non-trivial problem in a day or two. And if you’re programming the kind of website that uses forms and a database, it has the Django web framework, which alone might be a good enough reason to learn the language.

For me, what sometimes makes Python a pain in the neck is that it is an “object oriented” language. Its objects are not the awesome message-passing objects as they were originally envisioned and implemented in Smalltalk. These “objects” are class based, and are attractively outfitted with multiple inheritance. As an individual programmer, you never need to write a single class. I certainly don’t. I use Python as a functional language. I create a program from many small, composable functions acting on lists, tuples, and dictionaries. This is not a criticism. Python was designed in the late 1980s, when most people thought that classes were the way to go. They may still be the best choice for certain types of problems, but now there is growing regret directed at the need to live with and maintain large systems built from piles of classes all keeping secrets and inheriting from each other.

My beloved Django is a good example. If I want to customize its behavior, I need to wander through a maze of magical framework code, and end up littering my own code with opaque references to this and that and super and meta. I don’t bother any more, and just write my own code to replace the parts of Django that I want to be different, bypassing larger and larger parts of the framework. Despite all this, Django is such a mature and solid piece of engineering that I am loath to abandon it.

As I said, the power of Python is the ability to leverage its libraries, including the ones added by so many people over the years. And here is the problem for the user of the language who might want to avoid the class-based paradigm and write functional code. The libraries are written by real Python programmers, which means that they are object oriented. They often don’t mesh well with code written in the functional style, are a pain to modify, and do things you might not expect.

Here is a little example that, I hope, shows you some of what I mean. I needed to write a program that reads, parses, and writes bibtex files. These are plain-text databases of publications familiar to anyone who uses the LaTeX document preparation system. The database stores the bibliographic information about some list of publications; you just mention its location in your LaTeX document, and you can get citations and a reference section typeset in any style you need. Naturally, there’s a Python library for parsing bibtex: it’s called bibtexparser. Did I mention that there is a library for everything? That’s why we’re using Python. This library can read the file and parse it into a special object, the BibDatabase. This object has a property called entries, which is a list of dictionaries, where each dictionary contains the bibliographic information for one publication, with keys like author, title, etc. There is a special key called ID that uniquely identifies each publication. It’s chosen by the author as a mnemonic tag for use in citations. The documentation of this library is, frankly, not very good, and I had to experiment to see just how it works. I found that I could modify the entries, or delete from or add to the list, and my alterations were reflected when writing the new bibtex file back to disk. So far, so good.
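To make the shape of that data concrete, here is a minimal sketch written as plain Python data rather than through the library. The IDs and field values are invented for illustration, and a real file parsed by the library may carry other keys as well. Only the layout, a list of dictionaries each with an ID key, follows the description above.

```python
# Hypothetical entries in the shape described above: a list of dictionaries,
# one per publication, each with a unique "ID" key. All values are made up.
entries = [
    {"ID": "doe2019", "author": "Doe, Jane", "title": "A First Example", "year": "2019"},
    {"ID": "roe2020", "author": "Roe, Richard", "title": "A Second Example", "year": "2020"},
]

# Modify one entry in place.
entries[0]["title"] = "A First Example, Revised"

# Add a new publication to the list.
entries.append(
    {"ID": "poe2021", "author": "Poe, Pat", "title": "A Third Example", "year": "2021"}
)

# Delete a publication by its ID, keeping the same list object.
entries[:] = [entry for entry in entries if entry["ID"] != "roe2020"]

print([entry["ID"] for entry in entries])  # ['doe2019', 'poe2021']
```

These are the three kinds of change (modify, add, delete) that the text says survive a round trip through the list.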

The BibDatabase object has another way to represent the data. It has an entries_dict property, too, which is a dictionary of dictionaries. Each publication appears as a dictionary with a key that is the publication’s ID. In other words, it’s almost the same as the entries list, but with the data arranged a little differently. Instead of a list of dictionaries, it’s a dictionary whose values are dictionaries and whose keys are the IDs. This makes sense, and is the most natural way to represent this data.
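The relationship between the two arrangements can be sketched in a few lines of plain Python. The function name index_by_id and the sample data are hypothetical, and this is not the library's own code. It only shows how a list of dictionaries can be re-keyed by ID, and that the inner dictionaries are then shared between the two views.

```python
def index_by_id(entries):
    """Return a dictionary mapping each entry's "ID" to the entry itself."""
    return {entry["ID"]: entry for entry in entries}


# Made-up sample data in the list-of-dictionaries arrangement.
entries = [
    {"ID": "doe2019", "author": "Doe, Jane", "title": "A First Example"},
    {"ID": "roe2020", "author": "Roe, Richard", "title": "A Second Example"},
]

entries_dict = index_by_id(entries)

# The inner dictionaries are the same objects in both arrangements,
# so a change made through one view is visible through the other.
entries_dict["doe2019"]["year"] = "2019"
print(entries[0]["year"])  # 2019
```

Because the comprehension stores references rather than copies, editing a publication through the dictionary view also edits it in the list.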

At some point in my program, it became convenient to use the entries_dict version of the data rather than the entries view of it. I found that I could alter the contents of any of the dictionaries, and the changes, again, were faithfully preserved when I converted the BibDatabase into a file on disk. But then I tried to add a new publication by adding an entry to the entries_dict dictionary. The new publication did not appear in the output file.

The documentation was no help. I started reading the source code, and, no surprise, it was based on classes. Not transparent functions transforming data structures, but opaque classes with super-secret methods for doing everything. When the user asks for the entries or the entries_dict, or adds items to them, private code is run behind the scenes, as is common in class-based systems. These secret methods and properties add a publication to the object when you append to its entries, but not when you enlarge its entries_dict. It was easy to modify my code to take this into account. But I wished I had known about it in advance and not wasted an hour or two puzzling over it.
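Here is a toy model of how this kind of behavior can arise. ToyDatabase and written_ids are hypothetical names, and nothing below is copied from bibtexparser. The sketch assumes that the list is the source of truth, that the dictionary is a cached view built from the list behind a property, and that whatever writes the file looks only at the list.

```python
class ToyDatabase:
    """Hypothetical stand-in for a library object; not bibtexparser's code."""

    def __init__(self, entries):
        self.entries = list(entries)  # assumed source of truth
        self._entries_dict = {}       # private cache behind the property

    @property
    def entries_dict(self):
        # Build the ID-keyed view from the list on first access, then reuse it.
        if not self._entries_dict:
            self._entries_dict = {entry["ID"]: entry for entry in self.entries}
        return self._entries_dict


def written_ids(db):
    """Stand-in for a file writer: it consults only the entries list."""
    return [entry["ID"] for entry in db.entries]


db = ToyDatabase([{"ID": "doe2019", "title": "A First Example"}])

# Editing an existing publication works: the inner dictionary is shared.
db.entries_dict["doe2019"]["title"] = "A Revised Example"
print(db.entries[0]["title"])  # A Revised Example

# Adding a publication to the view does not touch the list, so it is lost.
db.entries_dict["roe2020"] = {"ID": "roe2020", "title": "A Second Example"}
print(written_ids(db))  # ['doe2019']

# The workaround: append to the list, which is what the writer reads.
db.entries.append(db.entries_dict["roe2020"])
print(written_ids(db))  # ['doe2019', 'roe2020']
```

In a model like this, the fix on the caller's side is a single append to the list, but nothing in the public interface tells you that the dictionary is only a derived view.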

This is a typical Python library, and this is why using Python can be so annoying. If you want to use the library in a way a little different from how its creator intended, you may trip over surprising behavior, and will have to wade through a maze (sometimes; this library happens to be quite simple) of class-based source code to figure out what all the secret methods are doing. It may be a matter of taste, but I find using and understanding libraries in Clojure or Julia to be a refreshing contrast.
