Typing in the context of dynamic languages 1: Types and subtypes in Python

In this article, we will discuss adding static typing on top of dynamically typed languages by looking at the case of Python. Most of the ideas proposed here apply equally to TypeScript, PHP, you name it. I only chose Python because that is what I’m familiar with.

In the second article of this series we will talk about variance in Python, and in the third article we will go deeper into defining custom types. But first, let’s focus on types and subtypes in Python.

Typing is a topic on which there are a lot of contradictory definitions and information out there, so for the sake of clarity, I will start with a couple of definitions that we will use in this article. If your definitions are a little different, these concepts will still probably be valid with a little adaptation.

The two concepts usually open to discussion are strong vs weak typing and dynamic vs static typing. In this article we talk about static typing when the type-checking happens at compile time, and dynamic typing when it happens at execution time. Independently of that, weak typing means that a type mismatch can be resolved by an implicit cast, whereas in strong typing, a type error is just an error and will stop compilation or execution.

It is important to understand that dynamically typed languages can be strongly typed (like Python) or weakly typed (like JavaScript). In Python, a line such as 1 + "1" will result in an error, whereas JavaScript will implicitly cast the number to a string and evaluate it as "11". In that context, adding static typing on top of the language will prevent different kinds of bugs.
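As a minimal sketch of the Python side of this comparison, the snippet below shows that the addition fails at execution time instead of being silently converted. The helper name try_add is made up for this illustration.

```python
def try_add(a, b):
    """Return a + b, or the name of the exception that Python raises."""
    try:
        return a + b
    except TypeError as exc:
        # Strong typing: Python refuses to guess a conversion for us.
        return type(exc).__name__


print(try_add(1, 1))         # 2
print(try_add(1, "1"))       # TypeError
print(try_add(1, int("1")))  # 2, because the conversion is explicit
```

The error only shows up when the line is actually executed, which is exactly the kind of problem a static checker can report earlier.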

In Python, it will mostly prevent exceptions from being raised at execution time, and in JavaScript it will mostly prevent wrong values from being computed. In practice, both languages get a little of both benefits.

One last concept to keep in the back of our minds is that type systems have different properties, one of which is completeness. When a type system does not have completeness, it means that some expressions don’t have a type that can be inferred for them.

Unfortunately, when a static type system is added on top of a dynamic language, such expressions occur quite commonly, meaning that you can never get the full safety of some languages that are built with static typing from the ground up. You will need to help the type checker by telling it some of the types, but as a human, you can make wrong assumptions.
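To illustrate, here is a small sketch of a value whose type a checker cannot infer: data parsed from JSON. The function load_port and the layout of the configuration are assumptions made up for this example. The annotation on port is our own claim, which a static checker will simply trust, so we keep a dynamic check as a safety net.

```python
import json


def load_port(raw: str) -> int:
    """Parse a JSON config and return its port (hypothetical config layout)."""
    config = json.loads(raw)  # the checker cannot know the shape of this value
    port: int = config["port"]  # our assumption, which a static checker trusts
    # Dynamic check, in case the assumption turns out to be wrong.
    if not isinstance(port, int):
        raise TypeError(f"port should be an int, got {type(port).__name__}")
    return port


print(load_port('{"port": 8080}'))  # 8080
try:
    load_port('{"port": "8080"}')
except TypeError as exc:
    print(exc)  # port should be an int, got str
```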

But still, although it is not perfect and you will still need to rely on dynamic type checking as well, adding static typing on top will be the icing on the cake, and may be a gateway drug to more complete type systems, like in Rust, C# or Haskell.

How to use types in Python

With this short introduction to typing out of the way, let’s focus on typing in Python.

Essentially, in Python, everything is an object, and every object has a type. Types themselves are first-class objects and can often be called like functions. Let’s look at an example:

l = [1, 2, 3]
print(type(l))
print(type(l)())

<class 'list'>
[]

As you can see, we can get the type of a list, and use it as a constructor to create a new list, so a type is indeed a first-class object that could be stored in a variable and manipulated at runtime. What about its type?

print(type(type(l)))
print(type(type(type(l))))

<class 'type'>
<class 'type'>

So the type of list is type, whose type is itself type. Python has this concept of metaclasses that we will not go any deeper into in this article. Let’s just remember that all types have a type themselves, which derives from type.
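As a small sketch of what “first-class” means in practice, types can be stored in a container and called later like any other callable. The names constructors and Point are made up for this illustration.

```python
# Types are ordinary objects: they can be stored in containers and called later.
constructors = {"list": list, "dict": dict, "int": int}

empty_values = {name: cls() for name, cls in constructors.items()}
print(empty_values)  # {'list': [], 'dict': {}, 'int': 0}


# Every class, built-in or user-defined, is an instance of type.
class Point:
    pass


print(isinstance(list, type))   # True
print(isinstance(Point, type))  # True
print(type(Point) is type)      # True
```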

The dynamic typing part of Python, which is the only one really built into the language, uses this mechanism to tell the type of objects at runtime and check the validity of some operations. However, this information is not available for static typing outside of execution.

Python has also introduced type annotations that do nothing in themselves but can be used by external tools both statically and dynamically. We will assume a recent version of Python (3.10+). For older versions, you will need to import things from the typing module (List instead of list, Union instead of |). If you use an older version than that, you should probably consider updating anyway!

# We specify the types of the arguments and the return value of the function.
# For *keys and **extra we only annotate the type of each item: the checker
# already knows that keys is a tuple and that extra is a dictionary with
# string keys.
def create_dict(value: int, *keys: str, **extra: int) -> dict[str, int]:
    """Create a dictionary of string keys and integer values"""
    # we can specify the types of variables too
    d: dict[str, int] = {key: value for key in keys} | extra
    return d

Again, adding these annotations doesn’t do much in itself, and you will need to get your code through third-party tools like Mypy or Pyright to perform static type analysis.
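The following sketch shows what “doesn’t do much in itself” means. The function double is a made-up example: its annotations are stored on the function object, where tools can read them, but the interpreter does not enforce them.

```python
from typing import get_type_hints


def double(x: int) -> int:
    return x * 2


# The annotations are stored on the function object...
print(get_type_hints(double))  # {'x': <class 'int'>, 'return': <class 'int'>}

# ...but the interpreter does not enforce them at execution time.
print(double("ab"))  # abab
```

A static checker is expected to reject the second call, while Python itself happily runs it.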

Subtyping in Python

One important aspect of code analysis is subtyping, meaning which types are included in another type. To illustrate this, we will use the numbers module from the standard library.

import numbers
 
isinstance("1", numbers.Number)  # False
isinstance(1, numbers.Number)  # True
isinstance(1.5, numbers.Number)  # True
isinstance(1.5 + 5j, numbers.Number)  # True
# (how awesome is it that Python has built-in complex numbers ?)

As we can see, int, float and complex are all subtypes of numbers.Number: every integer, float or complex number is also a Number.

This can happen in several situations:

  • We have a class hierarchy, in which case the derived classes are subtypes of the base classes.
  • We have defined a new type that is a union of several types, such as basenum = int | float. In this case, basenum is the supertype and int and float the subtypes.
  • We are using generic types, which we will discuss in more detail in the next article, on variance.
  • Structural subtyping, aka “duck typing”, which we will discuss in the part about protocols in an upcoming article.
  • We have defined new types and declared them as subtypes of another type, which we will cover also in the same article.

Let’s remember that, although all of the above can be checked statically, no static checker can check for behavioural subtyping. Behavioural subtyping means that any property that holds for a supertype must also hold for all of its subtypes, which is better known as Barbara Liskov’s “substitution principle”. That’s why we developers need to be careful.
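Here is a sketch of a subclass that is a perfectly valid subtype for a static checker, since the signatures are identical, but that breaks the behaviour promised by its base class. The classes Counter and CappedCounter are made up for this illustration.

```python
class Counter:
    """Supertype: increment() always raises the value by one."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> None:
        self.value += 1


class CappedCounter(Counter):
    """Same signatures, but silently stops counting at 2."""

    def increment(self) -> None:
        if self.value < 2:
            self.value += 1


def count_three(counter: Counter) -> int:
    # Relies on the documented behaviour of Counter.increment().
    for _ in range(3):
        counter.increment()
    return counter.value


print(count_three(Counter()))        # 3
print(count_three(CappedCounter()))  # 2
```

Nothing in the annotations distinguishes the two classes, so only tests or careful reading can catch this kind of problem.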

What I call precise typing

Writing good types demands some experience and discipline. It is very easy, when faced with a difficulty, to just use typing.Any without giving it a second thought, but then we won’t gain much from type checking. It is especially difficult to use types on values from libraries over which we have no control.

The type checker does some inference (guessing types), but to gain the most benefits, you need to give it precise instructions about what you expect. If you do not specify the type that you expect a function to return, the type checker can guess what will be returned, but not whether or not it was expected.
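As a sketch of the difference, consider the two made-up functions below. For first_word, which has no return annotation, a checker either infers str | None (Pyright does this kind of inference) or falls back to Any (Mypy's behaviour for unannotated return types, as far as I know). Neither tells us whether returning None was intended. In first_word_checked, the -> str annotation states the expectation, so a branch returning None would be reported.

```python
def first_word(text: str):
    # No return annotation: the checker can only guess what we meant.
    words = text.split()
    return words[0] if words else None


def first_word_checked(text: str) -> str:
    # The annotation states our expectation: this function never returns None.
    words = text.split()
    if not words:
        raise ValueError("text contains no word")
    return words[0]


print(first_word("hello world"))          # hello
print(first_word(""))                     # None
print(first_word_checked("hello world"))  # hello
```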

The more precise the information, the better the type checking will be. Let’s look at the following example:

basenumber = int | float
i: basenumber = 1
 
def next(n: int) -> basenumber:
    return n + 1
 
next(i)


When we run Mypy on this, we get the following:

foo.py:7: error: Argument 1 to "next" has incompatible type "Union[int, float]"; expected "int"
Found 1 error in 1 file (checked 1 source file)

The problem is that we have not given i the most precise type possible, meaning the “deepest” subtype that still describes it. The following version is better, although still not perfect:

basenumber = int | float
i: int = 1
 
def next(n: int) -> basenumber:
    return n + 1
 
next(i)

Now Mypy will not complain anymore: by declaring that i is an integer, a subtype of basenumber, we have given it a more precise type. We can already see that our next function could work with floats, so we have not yet given it its most precise type.

To do so, in next week’s article we will talk about variance, a property of generic types that expresses how subtyping relations relate to each other. Stay tuned!
