# Turning bottom-up into top-down with Reverse State

13 June, 2018

I love bound - it makes De Bruijn indices mindlessly easy. I also love Plated for all sorts of whole-program transformations. I think they’re two indispensible tools for working with programming languages. Unfortunately, they’re not compatible.

The Scope datatype in `bound` is very safe. The type prevents you from creating invalid De Bruijn terms, like `λ. 3`. This means that you can’t write useful instances of `Plated` for types which contain a `Scope`. When it comes to choosing between `bound` and `Plated`, I choose `Plated` - because we can use it to build functionality similar to `bound`.

## Write some code!

### Warning: the code in this post is subtly broken. See the reddit thread

Let’s get some boilerplate out of the road. Here is a datatype for lambda calculus, with De Bruijn indices (`B`), as well as free variables (`F`). Notice that lambda abstraction (`Abs`) doesn’t give a name to the function argument, which means that only `B`s can reference them. This is called the “locally nameless” approach.

``````{-# language DeriveGeneric #-}

import Control.Lens.Plated (Plated(..), gplate, transformM)
import GHC.Generics (Generic)

data Expr
= F String
| B Int
| App Expr Expr
| Abs Expr
deriving (Eq, Show, Generic)

instance Plated Expr where
plate = gplate``````

The core of the `bound`-like API will be two functions:

• `abstract :: String -> Expr -> Expr`
• `instantiate :: Expr -> Expr -> Maybe Expr`

Let’s do `abstract` first.

`abstract name expr` finds all the `F name` nodes in an `expr` and replaces them with the appropriate De Bruijn index, then wraps the final result in an `Abs`. The “appropriate index” is the number of `Abs` constructors that we passed on the way.

For example, `abstract "x" (F "x")` evaluates to `Abs (B 0)`, because we passed zero `Abs` constructors to get to the `"x"`, then wrapped the final result in an `Abs`. `abstract "y" (Abs (App (B 0) (F "y")))` evaluates to `Abs (Abs (App (B 0) (B 1)))` because we passed one `Abs` to get to the `"y"`, then wrapped the final result in an `Abs`.

“Do this everywhere” usually means transform `:: Plated a => (a -> a) -> a -> a` is appropriate. Though in this case, it doesn’t give us any way to count the number of `Abs` it passes. Instead we will use transformM `:: (Monad m, Plated a) => (a -> m a) -> a -> m a` with State.

Here’s how that looks:

``````abstract_attempt_1 :: String -> Expr -> Expr
abstract_attempt_1 name = Abs . flip State.evalState 0 . transformM fun
where
fun :: Expr -> State.State Int Expr
fun (F name')
| name == name' = B <\$> State.get
| otherwise = pure \$ F name'
fun (Abs expr) = Abs expr <\$ State.modify (+1)
fun expr = pure expr``````

If you see a free variable with the name we’re abstracting over, replace it with a De Bruijn index corresponding to the number of binders we’ve seen. If you see an `Abs`, increment the counter. If you see something else, don’t do anything special.

This is the right idea, but it doesn’t work because the `transform` family of functions act from the bottom up. When it sees a free variable it can abstract over, it will replace it with `B 0`, then go upwards through the tree, incrementing the counter. This is the reverse of what we want.

## etatS

Enter Reverse State. In reverse state, `get` accesses the state of the computation after it, not before it. Using regular state, `execState (modify (+1) *> modify (*2)) 0` will evaluate to `2`, because you set the state to zero, add one, then multiply by two. Using reverse state, the output is `1`, because you set the state to zero, multiply by two, then add one.

This means that if we swap regular state for reverse state in `abstract`, `get` refers to a state which is only calculated after bubbling all the way to the top, and counting all the `Abs` constructors.

So the correct code looks like this:

``````abstract :: String -> Expr -> Expr
abstract name = Abs . flip Reverse.evalState 0 . transformM fun
where
fun :: Expr -> Reverse.State Int Expr
fun (F name')
| name == name' = B <\$> Reverse.get
| otherwise = pure \$ F name'
fun (Abs expr) = Abs expr <\$ Reverse.modify (+1)
fun expr = pure expr``````

The logic remains the same, except now the state transformations run backwards.

Now for `instantiate`.

`instantiate (Abs body) x` substitutes `x` into the appropriate positions in `body`, and wraps the final result in a `Just`. If the first argument to `instantiate` is not an `Abs`, then the result is `Nothing`. We substitute `x` everywhere we find a `B` that contains the number of binders we have passed.

For example, `instantiate (Abs (B 0)) (F "x")` evaluates to `Just (F "x")`, because we found a `B 0` when we had passed zero binders (the outer `Abs` doesn’t count). `instantiate (Abs (Abs (App (B 0) (B 1)))) (F "y")` evaluates to `Just (Abs (App (B 0) (F "y")))`, because we found a `B 1` when we had passed one binder. The `B 0` is not replaced because at that point, we had passed one binder, and zero is not one.

We have the same problem as with `abstract`: counting binders proceeds from the top down, but `transformM` works from the bottom up. We can use reverse state again to solve this. Here’s the code:

``````instantiate :: Expr -> Expr -> Maybe Expr
instantiate (Abs body) x = Just \$ Reverse.evalState (transformM fun body) 0
where
fun :: Expr -> Reverse.State Int Expr
fun (B n) = do
n' <- Reverse.get
pure \$
if n == n'
then x
else B n
fun (Abs expr) = Abs expr <\$ Reverse.modify (+1)
fun expr = pure expr
instantiate _ _ = Nothing``````

And there we have it: a `bound`-like API for a datatype using `Plated`.

I think there are two pressing issues when comparing this code to `bound`: correctness and generalisation. This approach allows you to write bogus terms, like `Abs (B 3)`, whereas `bound` does not. I’m okay with this, because I highly value the tools `Plated` provides. Additionally, the `bound` combinators work over any term as long as it is a `Monad`, so `abstract` and `instantiate` only have to be written once, whereas we haven’t presented any means for generalisation of the `Plated` approach.

This is easily fixed: in a follow-up post, I’ll write about how we can use Backpacky Prisms to provide `abstract` and `instantiate` as library functions.