May 27, 2019

Pushing static type checking in Python one step further

Type annotations can help catch tricky bugs in your Python code. In this article, you’ll find out how to turn them up to eleven.

Note: although there are multiple static type checkers for Python available nowadays (such as pyre and pyright), this article assumes the usage of mypy.

In my experience with legacy codebases that lack proper test coverage, adopting static type checking has done a lot to preserve the sanity of both the developers and the code. However, there are ways to push static type checks a bit further than what is offered by default.

Static type checking 101

This section goes through the basics of how static analysis with mypy works. If you already have experience with it, you can safely skip it.

Let’s take a look at a custom function that accepts a dict as a parameter and also returns a dict.

def check_flights(flights: dict) -> dict: ...

Now, there is a feature request: this function has to accept a list of dictionaries instead of a simple dictionary, and also add a new parameter next to the existing one:

from typing import List
def check_flights(flights: List[dict], strict: bool) -> List[dict]:
    ...

This function is already used in multiple places in the code, so all of its call sites need to be adjusted accordingly.

Ideally, every call site is covered by tests, which means that any omitted one is discovered straight away. Unfortunately, this ideal and reality are very often two disjoint sets, so a mistake can slip in.

This is where static analysis comes into play. Mypy goes through the whole codebase, including branches that tests never execute, and checks whether the actual code matches the function signatures and declared types.

Let’s consider one forgotten usage of our function:

if rare_condition: # following code path not covered by tests
    check_flights({"id": 123}) # old version of our function

Mypy is going to point out two errors here — a mismatching type of the first parameter, and also a missing second parameter. We can see the exact file and line where the errors occurred.

flights.py:10: error: Too few arguments for "check_flights"
flights.py:10: error: Argument 1 to "check_flights" has incompatible type "Dict[Any, Any]"; expected "List[Dict[Any, Any]]"

Great! As you can see, the errors were caught before the code went into production, even though our tests, which never reach this branch, gave us a false sense of safety.
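To make the fix concrete, here is a minimal sketch of the updated function together with a corrected call site. The function body and the `rare_condition` stand-in are assumptions made for illustration only; the article shows just the signature of `check_flights`, not its implementation.

```python
from typing import List


def check_flights(flights: List[dict], strict: bool) -> List[dict]:
    # Hypothetical body: the article only shows the signature.
    # In strict mode, flights without an "id" key are dropped.
    if strict:
        return [flight for flight in flights if "id" in flight]
    return list(flights)


rare_condition = True  # stand-in for the rarely hit branch

if rare_condition:
    # Corrected call: a list of dicts plus the new `strict` argument.
    checked = check_flights([{"id": 123}], strict=True)
```

With the call written this way, both errors reported above should disappear, because the argument types and the argument count match the new signature.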

Static type checking 102

The previous section covered the basic usage of static type checking. Now let’s see an example of what else we can do with it.

Handling floating-point arithmetic

Floating-point numbers are imprecise, which could be a problem when you are working, for example, with money. Here is a simple example:

>>> 100 * 1.12
112.00000000000001

As you can see, the result is not what we would expect. In Python, this issue can be solved by using the Decimal type instead:

>>> from decimal import Decimal
>>> Decimal("100") * Decimal("1.12")
Decimal('112.00')

That’s better. However, if a float is passed to Decimal, the result is unsatisfactory again:

>>> Decimal("100") * Decimal(1.12) # float passed here
Decimal('112.0000000000000106581410364')

The solution is easy: never pass float numbers as a parameter to Decimal. You can try to keep enforcing this rule manually and live in fear that you or your colleagues forget about it or do it by accident. Nevertheless, a better alternative is to enforce this rule automatically by utilising static types.
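The manual version of this rule can be sketched as follows. The comparison shows that a `Decimal` built from a float carries over the float's binary rounding error, while one built from a string does not. The variable names are only illustrative.

```python
from decimal import Decimal

from_string = Decimal("1.12")
from_float = Decimal(1.12)  # inherits the binary approximation of 1.12

# The two values differ, because 1.12 has no exact binary representation.
values_differ = from_string != from_float

# If a float is all you have, going through str() first recovers the
# short decimal form that Python prints for that float.
via_str = Decimal(str(1.12))

total = Decimal("100") * via_str
```

Converting through `str()` is a workaround for values that already arrive as floats; where possible, it is safer to keep such values as strings or integers from the start.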

However, there is still an issue: as far as mypy is concerned, float is a valid argument for Decimal. We need to convince it otherwise.

Mypy uses so-called stubs to define the types for both the standard library and third-party libraries. This means we can define our own stub for the Decimal type, one that won’t accept float as a parameter.

There is an official collection of stubs called typeshed, where we can find a stub for Decimal. Copy the file to a location where you want to keep it (in this case, a directory named custom_typeshed) and open it for editing:

$ git clone https://github.com/python/typeshed/
$ mkdir -p ./custom_typeshed
$ cp typeshed/stdlib/2and3/decimal.pyi ./custom_typeshed/decimal.pyi
$ pico ./custom_typeshed/decimal.pyi

Now we’ll modify decimal.pyi to fit our needs:

class Decimal(object):
     # original signature of Decimal.__new__()
-    def __new__(cls: Type[_DecimalT], value: _DecimalNew = ..., context: Optional[Context] = ...) -> _DecimalT: ...
     # our new signature with `value` type restriction
+    def __new__(cls: Type[_DecimalT], value: Union[str, int] = ..., context: Optional[Context] = ...) -> _DecimalT: ...

With this change, we have narrowed the types that can be passed to Decimal: only str and int values are accepted.

To make mypy take our stub into account, add the following directive to the mypy config (mypy.ini), where the mypy_path value is the path to the directory with our modified decimal.pyi:

[mypy]
mypy_path = ./custom_typeshed

Let’s prepare a short file, example.py, to check that everything works as expected:

from decimal import Decimal
result = Decimal(100) * Decimal(1.12) # `float` should not pass

Run mypy with our config file:

$ mypy ./example.py --config-file ./mypy.ini

We should see the following error:

$ mypy ./example.py
example.py:2: error: Argument 1 to "Decimal" has incompatible type "float"; expected "Union[str, int]"

Now any time we pass an undesired type to Decimal, it will be caught automatically, so we can focus on more important things in life instead ^^
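As a slightly larger sketch of how code might look once the stub is in place, the example below keeps floats out of `Decimal` at the boundary of the program. The function names `parse_price` and `add_markup` are hypothetical, and the comments describe what mypy would be expected to report with the modified stub; at runtime the code behaves as usual.

```python
from decimal import Decimal


def parse_price(raw: str) -> Decimal:
    # Accepted by the restricted stub: `str` is one of the allowed types.
    return Decimal(raw)


def add_markup(price: Decimal, percent: int) -> Decimal:
    # `int` is allowed too, so the percentage is passed as a whole number.
    return price * (Decimal(100 + percent) / Decimal(100))


price = parse_price("100")
total = add_markup(price, 12)

# With the modified stub, mypy is expected to reject a line like this:
# bad = Decimal(1.12)
```

One thing to keep in mind: narrowing `value` to `Union[str, int]` likely also flags other inputs that the original alias allowed, such as passing an existing `Decimal` instance. Whether that matters depends on the codebase, and the union in the stub can be widened accordingly.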

What’s your experience with type annotations? Have you found any interesting or unexpected uses for them? Let me know in the comments, or join me at Kiwi.com, so you can tell and show me in person.
