Simple recursive structures with Python

2022 August 27

Something I’ve been leaning more into is the idea of recursive structures to teach induction and recursion. This is very obvious in computer science and programming classes. However, a lot of introductions to inductive proof in discrete mathematics still rely on examples dealing with sequences and sums. The classic example is the proof that $\sum\limits_{i=0}^n = \frac{n(n+1)}2$. But this isn’t particularly satisfying or motivating.

A natural starting point

Recall that (due to Peano) a natural number is defined as either:

zero, or
the successor of some natural number.

The key idea is that stuctural recursion, and by extension structural induction, follows the definition of our type. For example, the definition of addition is:

if $m$ is zero, then $m + n = n$.
if $m$ is the successor of a natural number $p$ (denote this by $S(p)$), then \[m+n = S(p) + n = S(p+n).\]

And similarly, any inductive proof about these objects will break down in the same way: you prove something about the base case and you prove something about the inductive case, which is built up in a specific way. This gives us a nice unifying theory of recursion and induction, which is important when we do more complicated things than just messing around with sequences.

Something that I could rely on in the undergraduate program was that students had a pretty good grasp of recursion via the introductory sequence because we explicitly build types for definitions like this. Then any function we want to define as a program follows straight from our mathematical definitions. In Typed Racket, this looks like the following.

(define-struct Succ
  (pred : Natural))
(define-type Natural (U 'Zero Succ))

(: plus (-> Natural Natural Natural))
(define (plus m n)
  (match m
    ['Zero n]
    [(Succ p) (Succ (plus p n))]))

Unfortunately, we will not be using Typed Racket very soon. Luckily, I was already working on such examples for the master’s version of the course, where students were most familiar with Python. For my first few runs of discrete math for the master’s students, I showed them this Racket code. Even if you don’t know the language, it’s pretty self explanatory since it looks identical to the definitions.

Our move to Python in our intro classes meant that I took a closer look at the world of Python for the first time in a long while. So really, this is more of a story about me re-learning Python. One surprising thing I learned was that Python introduced native type annotations (though curiously, they declined to include any built-in static type checking). This made me wonder whether I could come up with a version of this example that students could actually run on their own.

After some poking around, I arrived at the following.

from dataclasses import dataclass

@dataclass
class Succ:
    pred: Natural

Natural = Succ | None

def plus(m: Natural, n: Natural) -> Natural:
    match m:
        case None:
            return n
        case Succ(p):
            return Succ(plus(p,n))

This is a really nice implementation that looks almost identical to the one in Typed Racket. It’s certainly less spooky for those who may be intimidated by Lisp’s parentheses (a common gripe). However, there are some noticeable complications in using less common language features: type annotations, dataclasses, match statements (introduced in Python only surprisingly recently!). This is a problem for very new programmers.

While type definitions make things nice and explicit, they’re not strictly necessary—after all, neither Peano himself or Racket had these to begin with. For example, the (untyped) Racket version of this is typically written

(define (plus m n)
  (cond 
    [(symbol=? m 'Zero) n]
    [(Succ? m) (Succ (plus (Succ-pred m) n))]))

So is there a way to make this idea work “Pythonically” with the base language? That is: no type annotations, no dataclasses, no match. Originally, I had thought that this was impossible, but a bit of thinking proved otherwise.

def plus(m,n):
    if m == ():
        return n
    else:
        p, = m
        return (plus(p,n),)

There’s no type definition, because we don’t need one. So what are we doing here? Product types are just tuples and Python has tuples built in. Because of that, working with tuples is actually really convenient (thanks, tuple unpacking). Here, our definition of the natural numbers is:

zero, represented by the empty tuple (), or
a 1-tuple (n,) containing a natural number n.

The only slight hitch was trying to figure out what the syntax for a single-item tuple was. The problem is that parentheses do double duty in grouping terms in expressions, so (x) means x. It turns out the right way to do this is (x,) and the associated tuple unpacking assignment is x,. Problem solved!

Lisp-like lists

Taking this idea actually gives us some very nice recursive lists. Recall that a list is defined to be either:

empty, or
a pair containing a value and a list.

For example, in Typed Racket, we had the following implementation for lists and the function append for joining two lists.

(define-struct Cons
  ([first: Any]
   [rest: List]))
(define-type List (U 'Empty Cons))

(define (append xs ys)
  (match xs
    ['Empty ys]
    [(Cons head tail) (Cons head (append tail ys))]))

A typed Python version would look like the following.

from dataclasses import dataclass
from typing import Any

@dataclass
class Cons:
    first: Any
    rest: List

List = None | Cons

def append(xs: List, ys: List) -> List:
    match xs:
        case None:
            return ys
        case Cons(head, tail):
            return Cons(head, append(tail, ys))

Note that for simplicity’s sake, we used Any for the types above. If we wanted to go wild and properly attempt polymorphism, we’d have to do more and introduce type variables and such in Python, (notice that we had to import Any as a type name).

Again, we remember that pairs are just 2-tuples, so taking () to be an empty list, we arrive at the following definition.

def append(xs, ys):
    if xs == ():
        return ys
    else:
        head, tail = xs
        return (head, append(tail, ys))

Built-in lists

Okay, so this is still maybe a bit weird and artificial, since we're putting these things in tuples. But what if I want to program recursively with built-in lists? Is there an idiomatic way to do this? For instance, it's easy enough to write something like

def append(xs: list, ys: list) -> list:
    if xs == []:
        return ys
    else:
        head = xs[0]
        tail = xs[1:]
        return [head] + append(tail, ys)

The goal is to avoid using list indices. Surprisingly, there is a way to do this (again, this is my re-learning Python era).

def append(xs: list, ys: list) -> list:
    if xs == []:
        return ys
    else:
        head, *tail = xs
        return [head] + append(tail, ys)

It turns out that somewhere along the line, Python added an iterator unpacking operator. This is quite neat and gets us to 95% of the way to where we want to be. You can even use it with pattern matching:

def append(xs: list, ys: list) -> list:
    match xs:
        case []:
            return ys
        case head, *tail:
            return [head] + append(tail, ys)