Today I want to talk a little bit about what I've been researching over the summer. For a little context, I've finished the first year of the CS Ph.D. program at my university, and I have mainly focused on formal methods/automated reasoning. My passion is for proving properties about software - specifically, I want to make that process easier, more natural and commonplace. Since I did not learn how to prove theorems with transfinite induction (the most common tool for proofs about software) until grad school, even after getting B.S. degrees in pure mathematics and computer science at a very good university, I don't think the "commonplace" goal is coming soon. Perhaps I will discuss this area and the difficulties it faces in a future post.
At this university, the largest research group - and the most famous - is for programming languages (PL). Academia's approach to PL over the past few decades has focused mainly on functional programming languages, as far as I can tell. That was shocking to me, since just about everything I've seen in industry has been imperative stuff. After dealing with proofs, I now understand a reason why PL fancies functional languages - they use recursion and immutable values much more often than while loops and repeated variable assignments. This approach allows you to disregard the global state of your program and focus specifically on one function at a time. Why? If your function depends on a variable that x other functions can change, then you must keep track of how that variable is changed. Most likely your proof of correctness hinges on that variable holding a specific property, such as prime number or normalized vector. Every place that variable changes, you must make sure that property holds; if it doesn't, you have to make sure it does before your function is called. That's a lot to keep track of. It's not all that easy to maintain, especially if you work in a team setting.
So what's cool about functional programming languages other than that fact? Well, I haven't begun to talk about macros. There are ways of abstracting behavior of your program beyond what a function can do. Say you need the same large initialization sequence executed before some block of code in many different places. You can make a new command that will wrap your code with all that boilerplate at compile time with a macro. That's a simple example. There are other examples in which a macro will do long and complicated analyses on input code, and then output optimized or annotated code that should be equivalent in functionality.
Ooh, "equivalent" sounds like a theorem proving word. Can we prove that these analyses find the invariants we need for correctness of the transform? How about after that - actually proving that the macro form expands to an equivalent expression in all contexts? The answer is (you're going to love this) it depends. It depends on what you consider proof. Certainly you can write functions that describe the semantics of the language you're working in (say, with an embedded operational semantics - oh but for a termination argument you have to specifically limit how many evaluation steps are allowed. Yikes.) So wait, you have to embed the language you're working in... in the language you're working in? Why would you want to do that? Gross.
So what if we want our logic to allow talking about macros inside the logic? Say we don't need that embedding, and then we can actually write rewrite rules to simplify macro forms without having to expand them. This is some heavy stuff. Now we are talking about quantification over syntax rather than simply values.
Some vocabulary - when a macro expands, it is a function from representation of syntax to representation of syntax, but when it's complete, that representation is reified to become actual syntax that will be evaluable.
A problem enjoyed (rather, suffered) by many macro systems is the loss of scoping laws. It is not possible that in all contexts your macro will have the same meaning. This is due to free and bound variable capture. A simple textual replacement macro system such as in LISP and C is bound to change the meaning of your program when you decide to use a variable name that the macro redefines. In LISP at least you can check that variables you introduce are not in the support (set of free or bound variables) of the input expressions, and fail to expand if they are. This is ugly. Not only does it reveal internals of a macro (something that module abstraction should give), but it pits the programmer against the macro - "why can't I use the variable names I want?" There is a solution, however it is not the most palatable either (just the best we've got), called hygienic macro expansion. A hygienic macro expander will rename variables as it goes to make sure variables are available only in the scope of where they are intended to be used. I can think of more ambiguous statements, but that one surely is a nuisance to anyone doing formal reasoning.
Hygienic macros have only recently gotten a formal treatment (see Dave Herman's thesis - A Theory of Typed Hygienic Macros), and since it is in the setting of a decidable type theory, the whole story is not delivered. However it does give us hope for a complete meaning. If one is able to completely describe how a macro's input will be used for introducing and consuming bindings in its expansion, then we can show that if two inputs have the same binding structure implies that the macro's expansion on both inputs has the same binding structure (more precisely, both expansions are alpha-equivalent), then the macro expander is hygienic. Nice. That takes care of the all contexts problem. Now how do we express this without long-winded arguments? My idea is to have simple base that requires long-windedness to begin with, but (haha!) we use macros to conceal this detail.
Notice that the hygiene property cannot be defined inside the logic, since any function describing binding structure would be manipulating only a representation of syntax, and not actual syntax for which the logic has reasoning capabilities. Wait wait wait... we are now introducing to the logic a means to have meta-reasoning at the non-meta level. That shouldn't work, right? Gödel did some work on this I think. Well, there are theorem provers out there that allow for meta-reasoning, so let's look at how they do it.
The key idea is that you don't collapse the meta-theory into your theory, but you separate the two and start at the meta level to begin with. This is what is meant by working in a logical framework rather than a specific logic. Thus we do have an "embedding" of the language we're working in, but it's already given as it is defined in the logical framework. A bit of a hack, but logically necessary. Then what encoding does one choose for syntax? If you have any familiarity with the lambda calculus, you should know how beta-reduction is rather involved (in call-by-value semantics) since it must do bound variable renaming. In hygienic macro systems, lexical information about a term must be stored somewhere, and for a practical language, so too must source code correlation information.
Yikes again. Source code correlation information is purely for debugging purposes and thus logically uninteresting. We can abstract over this by instead defining an abstract datatype that satisfies the properties we want of syntax, and then let an arbitrarily complicated implementation prove that it satisfies this specification. One way researchers have been able to describe binding in syntax is by actually using a higher order representation. That is, instead of symbolically representing bindings everywhere like LISP does, actually use a real lambda abstraction. This is called Higher-Order Abstract Syntax (HOAS) and it has been getting lots of attention for that past, oh, 15 years. Why then aren't we sipping on piña coladas while our informal meta-reasoning style proofs whiz through formal systems?
Remember my description of theorem proving in the beginning, and how induction is vital? Well it turns out that in general, inductive proofs over HOAS is... impossible. The reason for this is because its general representation is not inductively definable - variable references can be arbitrarily far down a syntax tree from the binding point; a crippling technicality called "negative occurrence" in type theory. Though IANAH (I am not a historian), the first attempt I know of to restrict syntax for meta-reasoning was in Edinburgh LF (logical framework), actually implemented at CMU as "Twelf", based on (competes with?) a logic called (fold-n) developed by some guys at École Polytechnique. There is additional work in Isabelle/HOL that builds off Twelf's idea, but this building is a big stack of confused mess if you ask me. At the bottom of the stack is Isabelle's logical framework, then classical higher order logic axioms, then some concrete representation of syntax, then the "specification logic," and then the "object logic." These layers are abstractly what needs to exist, but everything is logically defined, so execution isn't given (can't just build on top of Common LISP like ACL2 does). There is quite a bit of research into extracting programs from specifications, but what if I want programs AS specifications? As far as I know, there is no real programming language basis for these systems that would mean executable specifications and reasoning about actual runnable programs, and not their (trusted) translation into some other logic.
Perhaps it's my history with ACL2 - it is a programming language, a logic and a theorem prover - but is it too much to ask for? This is not a solved problem. I'd like to chip out a big piece of this mystery though. Meta-reasoning about (hygienic) macros must be a possibility. Since meta-reasoning itself has not yet been well-executed, there certainly is a lot to do [collection of meta-reasoning problems for mechanized systems to tackle]. There is a chance though, that by attacking the problem from a macro setting might just simplify (and generalize) the problem space.
Thursday, July 15, 2010
Monday, April 12, 2010
SMT solvers hate you
And me, of course. No one is safe from their wrath of awful error messages. Boolector-1.2 was giving me "illformed 'let' or 'flet'" at the very end of my 1200 line file. Very useful. After poking around for 30 minutes or so, I found that it did not like my identifiers with hyphens (they should be underscores). Yices did just fine on the file, by the way. Boolector also does not like (let (?varname (let (... ...) ...)) ...). Nested lets are stupid. Scope is for humans.
On that note, I found that Boolector confuses nested lets with shadowed variables, so if ?x is bound twice, it incorrectly concludes that a nested let was found. Yices' behavior for shadowed variables is in my opinion far worse; it persists the first binding as if the second never happened. You will get models when you do not expect them.
Granted, this format really isn't for humans, but for the SMT solvers themselves. Let researchers munch on some NP-completeness rather than inform users. That said, there is an OCaml implementation of a well-formedness check, but it is admitted to be incomplete. The (faulty) implementations of the language specification are the real specs, so some error handling on the solvers' side is very well justified.
Just a note for some other researchers/engineers out there that run into Boolector giving some really nasty errors. I may edit this post after future interaction with more solvers like CVC3, MathSAT and Z3.
P.S. Yices doesn't care about the hundreds of close parens that are bound to conclude your file while Boolector expects all parens to match (no extra ')'s allowed). The error is the unhelpful "Unexpected EOF" message when they do not match.
On that note, I found that Boolector confuses nested lets with shadowed variables, so if ?x is bound twice, it incorrectly concludes that a nested let was found. Yices' behavior for shadowed variables is in my opinion far worse; it persists the first binding as if the second never happened. You will get models when you do not expect them.
Granted, this format really isn't for humans, but for the SMT solvers themselves. Let researchers munch on some NP-completeness rather than inform users. That said, there is an OCaml implementation of a well-formedness check, but it is admitted to be incomplete. The (faulty) implementations of the language specification are the real specs, so some error handling on the solvers' side is very well justified.
Just a note for some other researchers/engineers out there that run into Boolector giving some really nasty errors. I may edit this post after future interaction with more solvers like CVC3, MathSAT and Z3.
P.S. Yices doesn't care about the hundreds of close parens that are bound to conclude your file while Boolector expects all parens to match (no extra ')'s allowed). The error is the unhelpful "Unexpected EOF" message when they do not match.
Thursday, February 18, 2010
After Almost a Year of Graduate School
A lot has happened in the past year. I graduated UT with degrees in computer science and pure mathematics, I moved to Boston to pursue my Ph.D. in computer science at Northeastern, and I embarked on a research journey. Allow me to give convey some lessons I have learned.
As you may know, Ph.D. students seldom have to pay for tuition or worry about supporting themselves, as do (some) undergraduates and masters students. No, in exchange for your education, you are required to (at first) be a teaching assistant (you are supported by your college), and later a research assistant (you are supported by your advisor). If you are clever, you might manage to get a fellowship and just give your own work attention. Perhaps next year when I have a better idea of what is an interesting research direction I will apply for a fellowship, but for now I am a TA for a freshman logic class.
This class is very different from my own experience with logic in undergrad, where I spent most of my time smearing glaze over my eyes and amassing a large drool stain on my book while a coke-bottle glasses-wearing emeritus professor drones on about the most menial pedantry of set theory, functions and recursion. This class actually uses my beloved interactive theorem prover, ACL2, to teach students how to reason about their programs, and in the mean time surreptitiously teach them rigorous fundamentals of propositional logic, induction and first-order logic.
ACL2 is a tool that has been taking up a majority of my time recently, and is essentially the reason I came to Northeastern. The name stands for A Computational Logic in Applicative Common LISP, so the 2 is not a version number, but more of an ordinal coefficient/concatenation multiplier. My interest in ACL2 arose when a friend of mine told me about research he was doing with its inventor, J Moore. The idea for systems like ACL2 and the reason theorem proving intrigues me so much comes from a comment John McCarthy (inventor of LISP) made in the 60s along the lines of, "rather than debugging, programs should be proven correct, and that proof should be checked by a computer program." I had only seen reasoning about programs for a couple of lectures in my analysis of programs class (really a discrete structures class), presented in Floyd-Hoare logic, with the caveat that proving properties about programs this way does not scale, is a pain in the ass, and no one does it. Thus I had forgotten about proving properties about programs and went back to hacking and debugging as usual.
A computational logic is a logic for reasoning about algorithms - you stay in the realm of computation, instead of careening off into pure mathematics. ACL2 deals only with finite structures; there is no concept of infinity, functions cannot be values, and lists have a natural number length. All functions must be proven to terminate on any possible input (i.e. ACL2 is a total logic). These restrictions keep the logic very simple, but still very expressive. There are definitely theorem proving systems out there that allow for reasoning about infinity, higher-order functions and partially-defined functions, but proofs in these systems are much less easily automated. Because of the restrictions on the logic, some fantastic engineering and brilliant proof search heuristics, ACL2 is the most automated theorem prover in existence.
Have I sold you yet? If yes, you are just like most other people faced with ACL2 for the first time and shown what it can do. Bright eyes. Naive expectations. The reality of the matter is that proofs of interesting theorems still need mathematicians to come up with a proof strategy. Not only that though - in theorem provers, statements of truth that normally can be used the way you wish when proving theorems by hand now have extra semantics attached to them. You see, when you prove something to be true, you are programming the theorem prover to apply rules in a directed way. Typically if you prove A = B, then really what you are telling the prover is, "when you see A, replace it with B." This is called "substituting equals for equals," a special case of term rewriting. If you do not know the gory details of the prover's rewriting strategy, then you can end up proving extra very specific, horribly inelegant lemmas in order to ram your proof through the system.
The reasons for the prover's rewriting strategies being so directed come from years of experience and a lot of empirical data from heavy users of the system. Here is the learning curve: new to ACL2, you write some simple statement that is obvious to anyone and the prover fails - "stupid piece of crap," you say. Next you start reading and toying around with carefully crafted examples that the prover gets automatically, "oh cool, I was just doing it wrong. ACL2 is awesome!" Later you try to actually use ACL2 for arbitrary code, say you are trying to prove a flood fill algorithm makes a flooded region bigger or the same size as before (intuitively obvious), "AHHH MY EYES! MY FACE! I'M ON FIRE!" and after your Ph.D. in computer science with a focus on formal methods, "oh I get it." At least that is what I am promised. Currently I have bandages on my third degree burns, staring at a burning building, considering going back to save my dog.
My current research attempts to distribute large spanners to users of ACL2, so they have access to their fire hydrants - okay enough of this metaphor. ACL2 tries its very hardest to prove a conjecture true by applying different types of rules to rewrite the whole thing to T (true). The order in which rules are applied matters very much in the success of the proof, and ACL2's approach is to try everything, one by one, in the reverse order theorems were proved. In very directed proofs of one particular theorem, this generally works - consider in mathematics when you prove minor theorems, use those to prove stronger theorems, and then use those to prove your punchline theorem. Typically you do not have to appeal to the low level, minor theorems in your main proof.
In software? Not so much. You tend to work in layers of abstraction, and these layers have borders that allow for communication with other layers. It is this communication that destroys the reverse-chronological ordering strategy. Without communication, the prover would skip over entire layers as irrelevant to get to rules that apply - no big deal. With communication, things get hairy, and you would like to reason at the highest level you can before you have to dive deeper to resolve some technical issue. By tricking ACL2 into reasoning in different layers of abstraction, the order in which you prove theorems becomes less relevant, so you can worry about reasoning rather than rewrite strategies, and it structures proofs to progress in a more natural order.
How I trick ACL2 is the topic of a paper I am writing, and the main idea came from my mentor, Pete Manolios (actually got his Ph.D. under J Moore's supervision, so a very relevant source of ideas). His insight into this "reasoning in layers" came from reasoning about hardware - a very layered process goes into their design. In fact, my methods for tricking ACL2 made the proof of a refinement theorem about a pipelined machine go through in 100 seconds, when previously it made my computer crash, and Pete's computer 15 minutes to prove. This result was a marvellous fluke, since I managed to get such drastic results from a flawed implementation of my trickery. When I corrected the flaw, the proof shot up to 48 minutes on my computer. It took months of fiddling, engineering, and adding heuristics to get the corrected version to emulate the behavior of the flawed version in the case of the refinement proof. I now have a rather robust framework for reasoning in layers of abstraction. Now all I need to do is find some more problems to apply it to that give similar results.
As you may know, Ph.D. students seldom have to pay for tuition or worry about supporting themselves, as do (some) undergraduates and masters students. No, in exchange for your education, you are required to (at first) be a teaching assistant (you are supported by your college), and later a research assistant (you are supported by your advisor). If you are clever, you might manage to get a fellowship and just give your own work attention. Perhaps next year when I have a better idea of what is an interesting research direction I will apply for a fellowship, but for now I am a TA for a freshman logic class.
This class is very different from my own experience with logic in undergrad, where I spent most of my time smearing glaze over my eyes and amassing a large drool stain on my book while a coke-bottle glasses-wearing emeritus professor drones on about the most menial pedantry of set theory, functions and recursion. This class actually uses my beloved interactive theorem prover, ACL2, to teach students how to reason about their programs, and in the mean time surreptitiously teach them rigorous fundamentals of propositional logic, induction and first-order logic.
ACL2 is a tool that has been taking up a majority of my time recently, and is essentially the reason I came to Northeastern. The name stands for A Computational Logic in Applicative Common LISP, so the 2 is not a version number, but more of an ordinal coefficient/concatenation multiplier. My interest in ACL2 arose when a friend of mine told me about research he was doing with its inventor, J Moore. The idea for systems like ACL2 and the reason theorem proving intrigues me so much comes from a comment John McCarthy (inventor of LISP) made in the 60s along the lines of, "rather than debugging, programs should be proven correct, and that proof should be checked by a computer program." I had only seen reasoning about programs for a couple of lectures in my analysis of programs class (really a discrete structures class), presented in Floyd-Hoare logic, with the caveat that proving properties about programs this way does not scale, is a pain in the ass, and no one does it. Thus I had forgotten about proving properties about programs and went back to hacking and debugging as usual.
A computational logic is a logic for reasoning about algorithms - you stay in the realm of computation, instead of careening off into pure mathematics. ACL2 deals only with finite structures; there is no concept of infinity, functions cannot be values, and lists have a natural number length. All functions must be proven to terminate on any possible input (i.e. ACL2 is a total logic). These restrictions keep the logic very simple, but still very expressive. There are definitely theorem proving systems out there that allow for reasoning about infinity, higher-order functions and partially-defined functions, but proofs in these systems are much less easily automated. Because of the restrictions on the logic, some fantastic engineering and brilliant proof search heuristics, ACL2 is the most automated theorem prover in existence.
Have I sold you yet? If yes, you are just like most other people faced with ACL2 for the first time and shown what it can do. Bright eyes. Naive expectations. The reality of the matter is that proofs of interesting theorems still need mathematicians to come up with a proof strategy. Not only that though - in theorem provers, statements of truth that normally can be used the way you wish when proving theorems by hand now have extra semantics attached to them. You see, when you prove something to be true, you are programming the theorem prover to apply rules in a directed way. Typically if you prove A = B, then really what you are telling the prover is, "when you see A, replace it with B." This is called "substituting equals for equals," a special case of term rewriting. If you do not know the gory details of the prover's rewriting strategy, then you can end up proving extra very specific, horribly inelegant lemmas in order to ram your proof through the system.
The reasons for the prover's rewriting strategies being so directed come from years of experience and a lot of empirical data from heavy users of the system. Here is the learning curve: new to ACL2, you write some simple statement that is obvious to anyone and the prover fails - "stupid piece of crap," you say. Next you start reading and toying around with carefully crafted examples that the prover gets automatically, "oh cool, I was just doing it wrong. ACL2 is awesome!" Later you try to actually use ACL2 for arbitrary code, say you are trying to prove a flood fill algorithm makes a flooded region bigger or the same size as before (intuitively obvious), "AHHH MY EYES! MY FACE! I'M ON FIRE!" and after your Ph.D. in computer science with a focus on formal methods, "oh I get it." At least that is what I am promised. Currently I have bandages on my third degree burns, staring at a burning building, considering going back to save my dog.
My current research attempts to distribute large spanners to users of ACL2, so they have access to their fire hydrants - okay enough of this metaphor. ACL2 tries its very hardest to prove a conjecture true by applying different types of rules to rewrite the whole thing to T (true). The order in which rules are applied matters very much in the success of the proof, and ACL2's approach is to try everything, one by one, in the reverse order theorems were proved. In very directed proofs of one particular theorem, this generally works - consider in mathematics when you prove minor theorems, use those to prove stronger theorems, and then use those to prove your punchline theorem. Typically you do not have to appeal to the low level, minor theorems in your main proof.
In software? Not so much. You tend to work in layers of abstraction, and these layers have borders that allow for communication with other layers. It is this communication that destroys the reverse-chronological ordering strategy. Without communication, the prover would skip over entire layers as irrelevant to get to rules that apply - no big deal. With communication, things get hairy, and you would like to reason at the highest level you can before you have to dive deeper to resolve some technical issue. By tricking ACL2 into reasoning in different layers of abstraction, the order in which you prove theorems becomes less relevant, so you can worry about reasoning rather than rewrite strategies, and it structures proofs to progress in a more natural order.
How I trick ACL2 is the topic of a paper I am writing, and the main idea came from my mentor, Pete Manolios (actually got his Ph.D. under J Moore's supervision, so a very relevant source of ideas). His insight into this "reasoning in layers" came from reasoning about hardware - a very layered process goes into their design. In fact, my methods for tricking ACL2 made the proof of a refinement theorem about a pipelined machine go through in 100 seconds, when previously it made my computer crash, and Pete's computer 15 minutes to prove. This result was a marvellous fluke, since I managed to get such drastic results from a flawed implementation of my trickery. When I corrected the flaw, the proof shot up to 48 minutes on my computer. It took months of fiddling, engineering, and adding heuristics to get the corrected version to emulate the behavior of the flawed version in the case of the refinement proof. I now have a rather robust framework for reasoning in layers of abstraction. Now all I need to do is find some more problems to apply it to that give similar results.
Labels:
ACL2,
graduate school,
research,
theorem proving
Subscribe to:
Posts (Atom)