Saturday, June 8, 2013

Abstracting Abstract Machines in Maude

I've been helping a colleague with his research in analyzing languages with higher-order temporal contracts. While doing this, we ditched Promela (the language of explicit-state model checker, SPIN) in favor of the far more expressive rewriting engine, Maude, that just so happens to have a pretty good LTL model checking engine in it.

This was my first time using Maude. It's a quirky little language, far different from PLT Redex, my semantics engineering tool of choice. For one, it is not an embedded DSL. Redex allows me to escape to Racket whenever I need to do some more complicated computation that I don't need to model using reduction relations. Maude is strictly a term rewriter, and thus, certain tasks are tricky. This post is meant to be a tutorial for those of you who want to try Maude for prototyping analyses, considering its performance might justify the loss in expressiveness for you.

My intention is not to teach you all of Maude, because that would be silly. I will point out some pitfalls I fell into while getting AAM with store widening to work. We'll just model the untyped lambda calculus for brevity.

Gotcha #1: if you want to define a type and get a list of that type in the same module, you can't. The reason for this is that you need prove whatever type you want to get a list of fits into the theory the the LIST "functional" module expects, which just happens to be the trivial theory. This proof must be given at the top level using a "view," and not nested within the defining module.

Workaround: Since type declarations and the definitions of their variants don't have to be in the same module, we can have a stub module ToDefine that declares all the types we want to have views of.

mod ToDefine is
 pr STRING .
 sorts VAR Addr Time Expr Value Storable Point State StateStamp Frame Kont HeapDelta .
 subsorts VAR < String .
 op mt : -> Kont [ctor] .
 op kont : Frame Addr -> Kont [ctor] .
 subsorts Kont < Value < Storable .

I've thrown in other types (sorts) that we'll need later. This subsort thing is another funky feature of Maude: you can monkey-patch types by just saying something else is in that type. The reason we order Kont less than Value instead of just both less than Storable is that there should be a least sort for each term, in order to have unambiguous parsing (ya, parsing is type-directed). If you write everything in prefix form, this doesn't matter, but the mix-fix that Maude offers can get confused. A lot of the time, if your program doesn't typecheck, you get an unhelpful error that it has no parse. The op keyword declares an operation, and the ctor attribute means that we should treat it like a constructor of that type. mt is a constant, so we don't have to reference it by mt().

The "pr" thing is short for "protecting," which is like a require, not an include. It's for protecting equations/rules in the module from this monkey-patching stuff.

The boilerplate to show all these sorts are in the TRIV (trivial) theory follows. The names for views is the same as the sort just for notational convenience, so we can say Set{Value} instead of Set{ValueView}.

view VAR from TRIV to ToDefine is sort Elt to VAR . endv
view Addr from TRIV to ToDefine is sort Elt to Addr . endv
view Expr from TRIV to ToDefine is sort Elt to Expr . endv
view Value from TRIV to ToDefine is sort Elt to Value . endv
view Storable from TRIV to ToDefine is sort Elt to Storable . endv
view Point from TRIV to ToDefine is sort Elt to Point . endv
view State from TRIV to ToDefine is sort Elt to State . endv
view StateStamp from TRIV to ToDefine is sort Elt to StateStamp . endv

The way to alias instantiations of types, say a map from variables to addresses, is to define an entirely new module and use some funky import renaming syntax:

mod Env is
 pr ToDefine .
 pr MAP{VAR, Addr} * (sort Map{VAR, Addr} to Env) .
mod AStorable is
 pr ToDefine .
 pr SET{Storable} * (sort Set{Storable} to AStorable) .

And since we want to have a map from addresses to sets of storable entities (Abstract storables, or AStorable), we again have to do the view boilerplate:

view AStorable from TRIV to AStorable is sort Elt to AStorable . endv

Continuing on, we write our AST constructors (expressions are labeled so we can distinguish them for continuations' allocated addresses):
mod Syntax is
 pr Env .
 pr AStorable .
 pr INT .
 pr MAP{Addr, AStorable} * (sort Map{Addr, AStorable} to Heap) .
 op Var : Int VAR -> Expr [ctor] .
 op Lam : Int VAR Expr -> Expr [ctor] .
 op App : Int Expr Expr -> Expr [ctor] .

 op CloV : VAR Expr Env -> Value [ctor] .

The semantics I present a bit differently from the original AAM papers to make store-widening much easier and efficient to do. I separate changing the store into its own step by using a "apply this store change" state. This allows incremental updates to the shared store while stepping all the states in the worklist. Another difference is that I don't nondeterministically step to all values of a variable, since that is wasteful in terms of time and space - the precision gain is non-existent. The meta-functions alloc and tick from the paper are given the entire state to use, which introduces a circularity if you try to parameterize the semantics by a theory containing Addr, Time, alloc and tick, because states are defined by the semantics being instantiated. Oops. I've instead given the functions much less information, but it ends up being enough for commonly used abstract addressing schemes.

Gotcha #2: For a reason I don't fully understand, functional modules are not allowed to define rewrite rules. That means we couldn't parameterize the semantics by the AAM meta-functions and spaces even if we break the circularity. They can, however, define conditional memberships in a sort, so one could define the ReachableStates sort and encode the semantics as a backchaining one instead of a forward chaining one, but that defeats the purpose of a forward analysis. Due to this limitation, I present a very simple allocation and time-stamping scheme: 0CFA.

mod CESK is
 pr Syntax .
 pr AStorable .
 pr SET{Value} * (sort Set{Value} to AVal) .
 pr MAP{Addr, AStorable} * (sort Map{Addr, AStorable} to Heap) .

 op Eval : Expr Env Kont Time -> Point [ctor] .
 op Continue : Kont AVal Time -> Point [ctor] .

 op ar : Expr Env -> Frame [ctor] .
 op fn : AVal -> Frame [ctor] .

 op none : -> HeapDelta [ctor] .
 op binds : Addr AStorable -> HeapDelta [ctor] .

 op wn : Point Heap -> State [ctor] .
 op wd : Point Heap HeapDelta -> State [ctor] .

 op applyΔ : Heap HeapDelta -> Heap .
 op join : Heap Addr AStorable -> Heap .

The first part of the module here defines what states look like: they are composed of a point in the abstract state space, and its associated heap. We add a heap with wn ("wide to narrow"). The heap change constructor (wd) contains the heap itself so that we can apply the change as a separate step. These declarations out of the way, we enter another boilerplate land: meta-variables. Matching equations must have an idea what sort of patterns they're dealing with. Thus a variable, s, standing for a state must either be declared as "always used to match a state" (var s : State) or directly in each matching pattern (s:State). I personally like to have my equations clutter-free, just using naming conventions to know what I'm working with. I like this feature of the language, but I'd appreciate a way to export common meta-variables. Every time I want to match against an Expr in a different module, I have do declare e stands for an Expr. It gets copy/pasty quickly.

 vars ℓ : Int .
 vars e e0 e1 : Expr .
 vars ρ : Env .
 vars σ σ' : Heap .
 vars x : VAR .
 vars t t' : Time .
 vars v : Value .
 vars av av1 : AVal .
 vars ss ss' : AStorable .
 vars f : Frame .
 vars κ : Kont .
 vars a b c : Addr .
 vars δ : HeapDelta .
 vars ς̂ : Point .

I use unicode in my programs. That's right. The last couple steps before giving the rewrite rules that define the semantics are the meanings of join and allocation. First, changing the store. There is a bit of a gotcha here, that you can't update a map in-place; you must first match against the key you wish to change to see if it's already there, and then extend the rest with the key mapping to its new value. It's not too bad with the associative-commutative matching in Maude.

 eq join((σ, a |-> ss'), a, ss) = (σ, a |-> (ss', ss)) .
 eq join(σ, a, ss) = (σ, a |-> ss) [owise] .
 eq applyΔ(σ, none) = σ .
 eq applyΔ(σ, binds(a, ss)) = join(σ, a, ss) .

The owise attribute is nice. We can have a catch-all case if all previously defined cases didn't match - just like functional programming. Now comes allocation. In 0CFA there is no time. It's constant. That's why we add a singleton sort Unit less than Time, where the only constructor is the initial time. Variables are their own addresses, and continuations are allocated to the label of the expression that creates them.

 op vaddr : VAR -> Addr [ctor] .
 op kaddr : Int -> Addr [ctor] .
 sort Unit .
 subsort Unit < Time .
 op t0 : -> Unit [ctor] .

 op allocVar : VAR Time -> Addr .
 op allocKont : Int Time -> Addr .
 op tick : Value AVal Kont Time -> Time .
 eq allocVar(x, t) = vaddr(x) .
 eq allocKont(ℓ, t) = kaddr(ℓ) .
 eq tick(v, av, κ, t) = t .

Finally the semantics. Since the multitude of values at any one address aren't really needed until a function call, we keep all values in superposition until we really need to observe which value we looked up. The associative-commutative matching on the set of values in the fn frame gives us that resolution of non-determinism.

 rl [Var-lookup] : wn(Eval(Var(ℓ,x), ρ, κ, t), σ) => wd(Continue(κ, σ[ρ[x]], t), σ, none) .
 crl [Fn-eval] : wn(Eval(App(ℓ, e0,e1), ρ, κ, t), σ) =>
                 wd(Eval(e0, ρ, kont(ar(e1,ρ), b), t), σ, binds(b, κ))
                 if b := allocKont(ℓ, t) .
 rl [Lam-eval] : wn(Eval(Lam(ℓ, x, e), ρ, κ, t), σ) => wd(Continue(κ, CloV(x,e,ρ), t), σ, none) .
 rl [Arg-eval] : wn(Continue(kont(ar(e,ρ), a), av, t), σ) => wd(Eval(e,ρ, kont(fn(av), a), t), σ, none) .
 crl [Apply] : wn(Continue(kont(fn((CloV(x,e,ρ), av)), a), av1, t), σ) =>
               wd(Eval(e,(ρ, x |-> b), kont(f, c), t'), σ, binds(b, av1))
               if b := allocVar(x, t)
               /\ kont(f, c), ss := σ[a]
               /\ t' := tick(CloV(x,e,ρ), av1, kont(f, c), t) .

 rl [Heap-change] : wd(ς̂, σ, δ) => wn(ς̂, applyΔ(σ, δ)) .

So that's the simple part of Maude. The fun reflective stuff comes next, when we want to widen the store to get something that runs in polynomial time. The scheme here is to represent all states we have seen (at which heap), the states we still need to step (the "frontier"), and the shared heap. Well, the heaps get pretty big, and checking if a state has been seen before would need to compare equality on a large object. Instead, we use Shivers' trick and track how many times we've changed the store, and represent the stores in the seen set as just a number. The store monotonically increases, so this is perfectly sound. This is where the StateStamp type comes in: just a point and an Int. First a couple aliases:

mod Seen is pr ToDefine . pr SET{StateStamp} * (sort Set{StateStamp} to Seen) . endm
mod Frontier is pr ToDefine . pr LIST{Point} * (sort List{Point} to Frontier) . endm

The wide semantics will be reusing the original (or narrow) semantics we previously defined. In order to do that, we need to have an internalized notion of what it means to search for a possible reduct for a term. To do this, we use metaSearch, reflect up the CESK module, inject our state to step with the current shared heap, and let it search for all 1-step reductions. It's pretty cool. Redex doesn't have a notion of meta-reduction/search; I instead have to escape to Racket and use apply-reduction-relation, completely thwarting the debugging tools Redex offers, like traces. But first! Some boilerplate.
mod WIDE is
 pr CESK .
 pr Seen .
 pr Frontier .
 pr SET{Term} * (sort Set{Term} to TermSet,
                op _`,_ to _u_) .

 sort System .
 op sys : Frontier Heap Int Seen -> System [ctor] .
 op wt : Point Int -> StateStamp [ctor] .
 op Fail : -> State [ctor] .

 vars S S' S0 S1 SC SP : Seen .
 var ς̂ : Point .
 vars F F' F0 F1 ς̂s ς̂s' ς̂s'' : Frontier .

 vars name x y : VAR .
 vars e e0 e1 e2 : Expr .
 vars ς ς' : State .
 vars a a0 a1 : Addr .
 vars ss : AStorable .
 var i : Int .
 var b : Bool .
 var v : Value .
 vars ρ ρ0 ρ1 : Env .
 vars σ σ' σ0 σ1 σc σc' σp : Heap .
 vars δ δ' : HeapDelta .
 vars κ κ0 κ1 : Kont .
 vars vs vs' vs'' : AVal .
 vars δ? δ?' : Bool .

The reason we add a Fail state is so that we give the "reflect down" operation (downTerm from META-LEVEL) a fallback in the same expected kind, incase we can't reify a term that metaSearch produces back into a State. This shouldn't happen, but it's there so that downTerm is happy. A change from the previous semantics is that we build up changes to the store instead of apply them one-by-one. Also, we have to keep track if any of these "changes" actually do change the store. We thus have new join and apply operations that also return a Boolean saying if there was a change. We of course have to add a type and constructor for this - call it Heap/Δ?.

 sort Heap/Δ? .
 op hδ : Heap Bool -> Heap/Δ? [ctor] .

 op $join/δ? : Heap Addr AVal Bool -> Heap/Δ? .
 ceq $join/δ?((σ, a |-> vs'), a, vs, δ?) = hδ((σ,a |-> vs''), δ?')
      if vs'' := (vs' , vs)
      /\ δ?' := δ? or-else (vs'' =/= vs') .
 eq $join/δ?(σ, a, vs, δ?) = hδ((σ,a |-> vs), true) [owise] .

 op $applyΔ/δ? : HeapDelta Heap Bool -> Heap/Δ? .
 eq $applyΔ/δ?(none, σ, δ?) = hδ(σ, δ?) .
 eq $applyΔ/δ?(binds(a, ss), σ, δ?) = $join/δ?(σ, a, ss, δ?) .

The magic happens next. The heavy hitter is allOneStepAux, which takes the term to step, which search solution it's on (I'll explain), the pattern to match against (just "a state"), the shared heap, the number of heap changes, whether there has been a change for this step, a partial seen set, the next frontier, and the current seen set. What I mean by "search solution" is (and this tripped me up for a couple hours) which of the many possible search solutions, since this interface doesn't return a set of all solutions. Instead, you have to keep polling for solutions until it returns "failure," which I suppose I could see as a reasonable interface if users don't necessarily need all solutions (but really it's unsettling to me).

 op $step-all : Frontier Heap Heap Int Bool Seen Frontier -> System .

 vars T P : Term .
 vars N σt σt' : Int .

 op CESKMod : -> Module .
 eq CESKMod = upModule('CESK, false) .

 sort StepTuple .
 op stpl : Frontier Heap Bool Seen -> StepTuple .

 op metaTermToStateSet : Term Heap Int Bool Seen Frontier -> StepTuple .
 op allOneStepAux : Term Int Term Heap Int Bool Seen Frontier Seen -> StepTuple .

 eq metaTermToStateSet(T, σ, σt, δ?, S, F) =
    allOneStepAux(T, 0, 'ς1:State, σ, σt, δ?, empty, F, S) .

 op retag : Seen Int -> Seen .
 eq retag(empty, σt) = empty .
 eq retag((wt(ς̂, σt), S), σt') = wt(ς̂, σt'), retag(S, σt') .

 ceq allOneStepAux(T, N, P, σ, σt, δ?, SC, F, SP) =
     stpl(F, σ, δ?, (SP, SC))
     if metaSearch(CESKMod, T, P, nil, '+, 1, N) == failure .
 ceq allOneStepAux(T, N, P, σ, σt, δ?, SC, F, SP) =
     allOneStepAux(T, N + 1, P, σ1, σt, δ?', S', F', SP)
     if wd(ς̂, σ0, δ) := downTerm(getTerm(metaSearch(CESKMod, T, P, nil, '+, 1, N)), Fail)
     /\ hδ(σ1, δ?') := $applyΔ/δ?(δ, σ, δ?)
     /\ S' := if δ? then wt(ς̂, σt + 1), SC
              else if δ?' then retag(SC, σt + 1)
                   else wt(ς̂, σt), SC fi
     /\ F' := if δ?' or-else not wt(ς̂, σt) in SP then ς̂ F else F fi [owise] .

 eq $step-all(nil, σp, σc, σt, δ?, S, F) = sys(F, σc, (if δ? then (σt + 1) else σt fi), S) .
 ceq $step-all(ς̂ ς̂s, σp, σc, σt, δ?, S, F) = $step-all(ς̂s, σp, σc', σt, δ?', (S, S'), F')
     if stpl(F', σc', δ?', S') := metaTermToStateSet(upTerm(wn(ς̂, σp)), upTerm(ς̂), σc, σt, δ?, S, F).

 op CFA : System -> System [iter] .
 eq CFA(sys(F, σ, σt, S)) = $step-all(F, σ, σ, σt, false, S, nil) .

The allOneStepAux function will consider all seen states to be at the current store's timestamp until it encounters a change, at which point it will do a one-time retagging of all the states just marked as seen. This is why there is a partial seen set, SC, so that we only retag states seen in just this step. The arguments to metaSearch are the module it's searching in, the starting term, the pattern it's matching, any side conditions on the pattern (none), a mode of search (+ is 1 or more), the max depth of search (max 1), and which search solution we're after. *phew*

Apart from the boilerplate, this was a fairly pleasant experience. I haven't scaled up my toying around with it to see if it is competitive with my hand-rolled implementation of an analysis for R4RS Scheme, but I wouldn't be surprised either way. This thing is a beast for its rewriting efficiency, but needing to reflect each term for each step, considering that they don't use direct reflection (as far as I can tell), that has got to be expensive. It's been a fun experiment, though.

Monday, February 18, 2013

Is your analysis concrete?

In the area of program analysis, it is all-too-easy to marry yourself to a model of computation that does not capture the entire behavior of the language you're analyzing before you even begin. The classical data flow analysis algorithms from the Fortran folks have been mostly successful for all the compiler hackers out there, but computer security has become such a real danger that we can't just use the work of hackers (though their work is appreciated). Before I talk more about program analysis, I want to say a few words on reliable computing.

Reliable computing and programming languages

If we are to be confident about any property of a program that we write, not only should it pass a large test suite, but it should also have that property proven either automatically, by an interactive proof process resulting in a machine-checkable proof, or provided directly by the language it is written in.

Herein lies the problem. Computers speak only one language - machine code. In modern computers, machine code is a binary representation of some low-level imperative language. In just under 100% of the time, programs are not written in machine code, so we employ compilers - program translators - to change a nicer language to machine code. Sometimes we even chain these compilers together so that there is a higher level intermediate code that even higher level languages can target, so that different kinds of translation technology can be used at the different levels of abstraction.

How, then, do we know properties hold of programs written in high level languages that promise to maintain certain abstractions? Can programs written in other languages break these properties? Many engineers understand languages by their implementation, but not the intended semantics of the language. After all, a language only means something if it executes machine code, right? Well, not really. This line of thinking becomes far too intractable as high level language implementors take more liberties with their implementation strategy. Instead, if we assume a mathematically ideal computational model for the machine code (not really a problem unless you're in outer space or working on an airplane for over 12 hours), then we can employ powerful proof techniques to prove that such translations not only preserve properties, but that programs in the target language can't break them. These techniques require a the model (semantics) for the machine code AND a semantics for the high level language. What is a semantics if not an implementation?

A programming language semantics is a mathematical model of the meaning of a program in that language. Yes, an imperative implementation could be a semantics, but it's often too large and complex of a mathematical object to reason about. Instead, we have different styles for writing programming language semantics (c.f. the Redex book) that are much more amenable to standard proof techniques like induction and subject reduction. I am a personal proponent of languages maintaining a formal semantics in one of these styles. The easiest to learn and possibly the most successful is small-step operational semantics.

Analysis of a language, not a graph

The standard model of data flow analysis is of graphs; basic blocks form nodes and control flow forms edges between nodes. This form of analysis can be proven correct with respect to a concrete semantics, but the path is cumbersome. Furthermore, not all languages have the standard imperative block structure. There is another way to describe a program analysis and have its correctness as an immediate result of the way it's written. This is in the form of Abstracting Abstract Machines (AAM), which shows a straightforward way to phrase your programming language semantics such that abstraction to a finite automaton (and thus computably explored) is dead simple.

This technique does not mark the end of research in analysis. Instead it opens possibilities and offers a new perspective: your analysis techniques should be built into your programming language semantics. That isn't to say your go-to model should be the translated form that makes abstraction simple. No, this means that you should have a repeatable, straightforward process for translating a run-of-the-mill concrete semantics into a ready-for-abstraction concrete semantics. It is illuminating to see what your analysis machinery really means in terms of concrete execution, and can ease your explanation of the idea to someone else; we humans are good with thinking concretely, while abstraction makes it harder.

I more recently developed a simpler translation path from semantics to a pushdown abstraction (rather than a finite one), although it assumes more of the programming model - that is, no more than a bounded amount of the continuation can be captured at once. Languages with full continuations can enjoy enhanced precision of the method with some extra model-specific work, but the pushdown guarantee flies out the window. Though the guarantee is theoretically broken, the use of heavy control operators tends to be controlled enough (hah) to not cause too much leaked precision.

The translation amounts to saving the state of the machine right after control transfers to a function as a key into a table of "local continuations" that also gets updated on function call to the continuation up to the entry of the function. This "up to" notion is easily captured by replacing the continuation on function call with the key into the continuation table. When it comes time to interpret that key, you just continue to all local continuations that were stored at that key. This non-determinism can lead to loops where previously there weren't (you can always choose the older of return continuations), but reachability of states is maintained with the original semantics. Abstraction follows the same strategy of AAM for finitizing the address space, but continuations no longer need to be store-allocated. This presentation immensely helped people in my department understand the work of my recently-graduated colleague, Dimitrios Vardoulakis, on CFA2, the first pushdown abstraction for a higher-order language. One element that was missing from his work was a non-standard concrete semantics, the term given to concrete semantics augmented with different/extra information to compute the answer of the analysis, rather than the answer of evaluation. Indeed, this idea is not new. Rather, the push I and my colleagues are making is to make the translation from standard to non-standard concrete semantics a straightforward and intuitive process. Secondly, all the trickery of the analysis should have a concrete counterpart. Once you abstract, there should not be anything surprising. The full generality of the abstract interpretation framework is something to appreciate, but we must also respect ourselves and our audience by providing an understandable and accessible technical artifact.