Python imports

Daniel Pope

Though I've been using Python for 10 years I still occasionally trip over the magic of the import statement. Or rather the fact that it is completely unmagical.

The statement

import lemon.sherbet

does a few simple things, effectively:

Unless it's already imported, creates a module object for lemon and evaluates lemon/__init__.py in the namespace of the module object.
Unless it's already imported, creates a module object for sherbet, evaluates lemon/sherbet.py in the namespace of the module object, and assigns the sherbet module to the name sherbet in lemon.
Assigns the lemon module to the name lemon in the importing module's namespace (__main__, if this is the script being run).

(Obviously, I'm omitting a lot of the details, such as path resolution, sys.modules and import hooks.)
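To make those three steps concrete, here is a minimal sketch of them in plain Python. It is not how the real import machinery is implemented: SOURCES, loaded and naive_import are hypothetical names invented for this illustration, the module source lives in a dict rather than in files, and loaded stands in for sys.modules.

```python
import types

# Hypothetical stand-ins for the files on disk: module name -> source text.
SOURCES = {
    "lemon": "flavour = 'sour'",
    "lemon.sherbet": "fizzy = True",
}

loaded = {}  # plays the role of sys.modules in this sketch


def naive_import(dotted_name, namespace):
    """Roughly mimic `import a.b` executed inside `namespace`."""
    parts = dotted_name.split(".")
    parent = None
    for depth, part in enumerate(parts, start=1):
        full_name = ".".join(parts[:depth])
        if full_name not in loaded:
            # Create the module object and evaluate its source in its namespace.
            module = types.ModuleType(full_name)
            loaded[full_name] = module
            exec(SOURCES[full_name], module.__dict__)
            if parent is not None:
                # Bind the submodule as an attribute of its parent package.
                setattr(parent, part, module)
        parent = loaded[full_name]
    # Only the top-level name is bound in the importing namespace.
    namespace[parts[0]] = loaded[parts[0]]


ns = {}
naive_import("lemon.sherbet", ns)
print(sorted(ns))                 # ['lemon']
print(ns["lemon"].sherbet.fizzy)  # True
```

Note that ns ends up with a single new name, lemon; sherbet is reachable only as an attribute of that module object.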

This basic mechanism has some strange quirks. Suppose the full source tree contains:

├── lemon
│   ├── __init__.py
│   ├── curd.py
│   ├── soda.py
│   └── sherbet.py
└── curd_machine.py

And curd_machine.py contains

import lemon.curd

At first glance, I find it odd that this code works:

import curd_machine
import lemon.sherbet
print(lemon)
print(lemon.curd)

I can access lemon, but I didn't explicitly import it. Of course, this happens because the import lemon.sherbet line ultimately puts the lemon module into my current namespace.
I can also access lemon.curd without explicitly importing it. This is simply because the module structure is stateful. Something else assigned the lemon.curd module to the name curd in the lemon module. I've imported lemon, so I can access lemon.curd.
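Both quirks can be reproduced with a throwaway copy of the tree. The sketch below assumes Python 3 and that all the files are empty apart from curd_machine.py; it writes them to a temporary directory (my addition, purely so the example is self-contained) and then runs the same imports.

```python
import importlib
import os
import sys
import tempfile

# Recreate the source tree from above in a temporary directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "lemon"))
files = {
    "lemon/__init__.py": "",
    "lemon/curd.py": "",
    "lemon/soda.py": "",
    "lemon/sherbet.py": "",
    "curd_machine.py": "import lemon.curd\n",
}
for relative_path, source in files.items():
    with open(os.path.join(root, *relative_path.split("/")), "w") as handle:
        handle.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

import curd_machine   # as a side effect, binds curd on the lemon module
import lemon.sherbet  # binds the name lemon here

print(lemon.curd.__name__)          # lemon.curd
print(hasattr(lemon, "soda"))       # False
print("lemon.soda" in sys.modules)  # False
```

lemon.soda exists on disk, but nothing has imported it, so it is neither in sys.modules nor an attribute of lemon.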

I'm inclined to the view that relying on either of these quirks would be relatively bad practice, resulting in more fragile code, so it's useful to be aware of them.

The former of these quirks also affects Pyflakes. In my IDE, Pyflakes highlights names that I use without having defined them. But it fails to spot obvious mistakes like this:

import lemon.sherbet
print(lemon.soda)

which when run will produce an error:

AttributeError: 'module' object has no attribute 'soda'

There's still nothing mysterious about this; Pyflakes only sees that lemon is defined, and has no idea whether lemon.soda is a thing.
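A rough illustration of why a name-level check is satisfied here, using the standard ast module. This is only a sketch of that kind of check, not Pyflakes' actual implementation; SNIPPET, bound, used, undefined and attributes are names made up for the example.

```python
import ast
import builtins

SNIPPET = "import lemon.sherbet\nprint(lemon.soda)\n"

tree = ast.parse(SNIPPET)

# Names bound by import statements: `import a.b` binds only `a`.
bound = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Import):
        for alias in node.names:
            bound.add(alias.asname or alias.name.split(".")[0])

# Bare names that the code reads.
used = {
    node.id
    for node in ast.walk(tree)
    if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
}

# Attribute accesses, which cannot be resolved without running the imports.
attributes = [node.attr for node in ast.walk(tree) if isinstance(node, ast.Attribute)]

undefined = used - bound - set(dir(builtins))
print(bound)       # {'lemon'}
print(undefined)   # set()
print(attributes)  # ['soda']
```

Every bare name resolves, so nothing is flagged; soda only ever appears as an attribute, and whether it exists depends on which imports have run.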

I think the reason this breaks in my mind is a leaky abstraction in my working mental model. I tend to think of the source tree as a static tree of declarative code, parts of which I can map into the current namespace to use. It isn't this, though; it is an in-memory structure that is built lazily. And it isn't mapped into a namespace: the namespace just gets the top-level names, and my code traverses the structure from there.

Maybe I formed my mental models long ago when I used to program more Java, where the import statement does work rather more like I've described. I wonder if people with no experience of Java are less inclined to think of it like I do?
