Life, the Universe and Math

Abstracting Abstract Machines in Maude

noreply@blogger.com (J. Ian Johnson) — Sat, 08 Jun 2013 18:17:00 +0000

I've been helping a colleague with his research in analyzing languages with higher-order temporal contracts. While doing this, we ditched Promela (the language of explicit-state model checker, SPIN) in favor of the far more expressive rewriting engine, Maude, that just so happens to have a pretty good LTL model checking engine in it.

This was my first time using Maude. It's a quirky little language, far different from PLT Redex, my semantics engineering tool of choice. For one, it is not an embedded DSL. Redex allows me to escape to Racket whenever I need to do some more complicated computation that I don't need to model using reduction relations. Maude is strictly a term rewriter, and thus, certain tasks are tricky. This post is meant to be a tutorial for those of you who want to try Maude for prototyping analyses, considering its performance might justify the loss in expressiveness for you.

My intention is not to teach you all of Maude, because that would be silly. I will point out some pitfalls I fell into while getting AAM with store widening to work. We'll just model the untyped lambda calculus for brevity.

Gotcha #1: if you want to define a type and get a list of that type in the same module, you can't. The reason for this is that you need prove whatever type you want to get a list of fits into the theory the the LIST "functional" module expects, which just happens to be the trivial theory. This proof must be given at the top level using a "view," and not nested within the defining module.

Workaround: Since type declarations and the definitions of their variants don't have to be in the same module, we can have a stub module ToDefine that declares all the types we want to have views of.


mod ToDefine is

 pr STRING .

 sorts VAR Addr Time Expr Value Storable Point State StateStamp Frame Kont HeapDelta .

 subsorts VAR < String .

 op mt : -> Kont [ctor] .

 op kont : Frame Addr -> Kont [ctor] .

 subsorts Kont < Value < Storable .

endm

I've thrown in other types (sorts) that we'll need later. This subsort thing is another funky feature of Maude: you can monkey-patch types by just saying something else is in that type. The reason we order Kont less than Value instead of just both less than Storable is that there should be a least sort for each term, in order to have unambiguous parsing (ya, parsing is type-directed). If you write everything in prefix form, this doesn't matter, but the mix-fix that Maude offers can get confused. A lot of the time, if your program doesn't typecheck, you get an unhelpful error that it has no parse. The op keyword declares an operation, and the ctor attribute means that we should treat it like a constructor of that type. mt is a constant, so we don't have to reference it by mt().

The "pr" thing is short for "protecting," which is like a require, not an include. It's for protecting equations/rules in the module from this monkey-patching stuff.

The boilerplate to show all these sorts are in the TRIV (trivial) theory follows. The names for views is the same as the sort just for notational convenience, so we can say Set{Value} instead of Set{ValueView}.


view VAR from TRIV to ToDefine is sort Elt to VAR . endv

view Addr from TRIV to ToDefine is sort Elt to Addr . endv

view Expr from TRIV to ToDefine is sort Elt to Expr . endv

view Value from TRIV to ToDefine is sort Elt to Value . endv

view Storable from TRIV to ToDefine is sort Elt to Storable . endv

view Point from TRIV to ToDefine is sort Elt to Point . endv

view State from TRIV to ToDefine is sort Elt to State . endv

view StateStamp from TRIV to ToDefine is sort Elt to StateStamp . endv

The way to alias instantiations of types, say a map from variables to addresses, is to define an entirely new module and use some funky import renaming syntax:


mod Env is

 pr ToDefine .

 pr MAP{VAR, Addr} * (sort Map{VAR, Addr} to Env) .

endm

mod AStorable is

 pr ToDefine .

 pr SET{Storable} * (sort Set{Storable} to AStorable) .

endm

And since we want to have a map from addresses to sets of storable entities (Abstract storables, or AStorable), we again have to do the view boilerplate:


view AStorable from TRIV to AStorable is sort Elt to AStorable . endv

Continuing on, we write our AST constructors (expressions are labeled so we can distinguish them for continuations' allocated addresses):


mod Syntax is

 pr Env .

 pr AStorable .

 pr INT .

 pr MAP{Addr, AStorable} * (sort Map{Addr, AStorable} to Heap) .

 op Var : Int VAR -> Expr [ctor] .

 op Lam : Int VAR Expr -> Expr [ctor] .

 op App : Int Expr Expr -> Expr [ctor] .



 op CloV : VAR Expr Env -> Value [ctor] .

endm

The semantics I present a bit differently from the original AAM papers to make store-widening much easier and efficient to do. I separate changing the store into its own step by using a "apply this store change" state. This allows incremental updates to the shared store while stepping all the states in the worklist. Another difference is that I don't nondeterministically step to all values of a variable, since that is wasteful in terms of time and space - the precision gain is non-existent. The meta-functions alloc and tick from the paper are given the entire state to use, which introduces a circularity if you try to parameterize the semantics by a theory containing Addr, Time, alloc and tick, because states are defined by the semantics being instantiated. Oops. I've instead given the functions much less information, but it ends up being enough for commonly used abstract addressing schemes.

Gotcha #2: For a reason I don't fully understand, functional modules are not allowed to define rewrite rules. That means we couldn't parameterize the semantics by the AAM meta-functions and spaces even if we break the circularity. They can, however, define conditional memberships in a sort, so one could define the ReachableStates sort and encode the semantics as a backchaining one instead of a forward chaining one, but that defeats the purpose of a forward analysis. Due to this limitation, I present a very simple allocation and time-stamping scheme: 0CFA.


mod CESK is

 pr Syntax .

 pr AStorable .

 pr SET{Value} * (sort Set{Value} to AVal) .

 pr MAP{Addr, AStorable} * (sort Map{Addr, AStorable} to Heap) .



 op Eval : Expr Env Kont Time -> Point [ctor] .

 op Continue : Kont AVal Time -> Point [ctor] .



 op ar : Expr Env -> Frame [ctor] .

 op fn : AVal -> Frame [ctor] .



 op none : -> HeapDelta [ctor] .

 op binds : Addr AStorable -> HeapDelta [ctor] .



 op wn : Point Heap -> State [ctor] .

 op wd : Point Heap HeapDelta -> State [ctor] .



 op applyΔ : Heap HeapDelta -> Heap .

 op join : Heap Addr AStorable -> Heap .

The first part of the module here defines what states look like: they are composed of a point in the abstract state space, and its associated heap. We add a heap with wn ("wide to narrow"). The heap change constructor (wd) contains the heap itself so that we can apply the change as a separate step. These declarations out of the way, we enter another boilerplate land: meta-variables. Matching equations must have an idea what sort of patterns they're dealing with. Thus a variable, s, standing for a state must either be declared as "always used to match a state" (var s : State) or directly in each matching pattern (s:State). I personally like to have my equations clutter-free, just using naming conventions to know what I'm working with. I like this feature of the language, but I'd appreciate a way to export common meta-variables. Every time I want to match against an Expr in a different module, I have do declare e stands for an Expr. It gets copy/pasty quickly.


 vars ℓ : Int .

 vars e e0 e1 : Expr .

 vars ρ : Env .

 vars σ σ' : Heap .

 vars x : VAR .

 vars t t' : Time .

 vars v : Value .

 vars av av1 : AVal .

 vars ss ss' : AStorable .

 vars f : Frame .

 vars κ : Kont .

 vars a b c : Addr .

 vars δ : HeapDelta .

 vars ς̂ : Point .

I use unicode in my programs. That's right. The last couple steps before giving the rewrite rules that define the semantics are the meanings of join and allocation. First, changing the store. There is a bit of a gotcha here, that you can't update a map in-place; you must first match against the key you wish to change to see if it's already there, and then extend the rest with the key mapping to its new value. It's not too bad with the associative-commutative matching in Maude.


 eq join((σ, a |-> ss'), a, ss) = (σ, a |-> (ss', ss)) .

 eq join(σ, a, ss) = (σ, a |-> ss) [owise] .

 eq applyΔ(σ, none) = σ .

 eq applyΔ(σ, binds(a, ss)) = join(σ, a, ss) .

The owise attribute is nice. We can have a catch-all case if all previously defined cases didn't match - just like functional programming. Now comes allocation. In 0CFA there is no time. It's constant. That's why we add a singleton sort Unit less than Time, where the only constructor is the initial time. Variables are their own addresses, and continuations are allocated to the label of the expression that creates them.


 op vaddr : VAR -> Addr [ctor] .

 op kaddr : Int -> Addr [ctor] .

 sort Unit .

 subsort Unit < Time .

 op t0 : -> Unit [ctor] .



 op allocVar : VAR Time -> Addr .

 op allocKont : Int Time -> Addr .

 op tick : Value AVal Kont Time -> Time .

 eq allocVar(x, t) = vaddr(x) .

 eq allocKont(ℓ, t) = kaddr(ℓ) .

 eq tick(v, av, κ, t) = t .

Finally the semantics. Since the multitude of values at any one address aren't really needed until a function call, we keep all values in superposition until we really need to observe which value we looked up. The associative-commutative matching on the set of values in the fn frame gives us that resolution of non-determinism.


 rl [Var-lookup] : wn(Eval(Var(ℓ,x), ρ, κ, t), σ) => wd(Continue(κ, σ[ρ[x]], t), σ, none) .

 crl [Fn-eval] : wn(Eval(App(ℓ, e0,e1), ρ, κ, t), σ) =>

                 wd(Eval(e0, ρ, kont(ar(e1,ρ), b), t), σ, binds(b, κ))

                 if b := allocKont(ℓ, t) .

 rl [Lam-eval] : wn(Eval(Lam(ℓ, x, e), ρ, κ, t), σ) => wd(Continue(κ, CloV(x,e,ρ), t), σ, none) .

 rl [Arg-eval] : wn(Continue(kont(ar(e,ρ), a), av, t), σ) => wd(Eval(e,ρ, kont(fn(av), a), t), σ, none) .

 crl [Apply] : wn(Continue(kont(fn((CloV(x,e,ρ), av)), a), av1, t), σ) =>

               wd(Eval(e,(ρ, x |-> b), kont(f, c), t'), σ, binds(b, av1))

               if b := allocVar(x, t)

               /\ kont(f, c), ss := σ[a]

               /\ t' := tick(CloV(x,e,ρ), av1, kont(f, c), t) .



 rl [Heap-change] : wd(ς̂, σ, δ) => wn(ς̂, applyΔ(σ, δ)) .

endm

So that's the simple part of Maude. The fun reflective stuff comes next, when we want to widen the store to get something that runs in polynomial time. The scheme here is to represent all states we have seen (at which heap), the states we still need to step (the "frontier"), and the shared heap. Well, the heaps get pretty big, and checking if a state has been seen before would need to compare equality on a large object. Instead, we use Shivers' trick and track how many times we've changed the store, and represent the stores in the seen set as just a number. The store monotonically increases, so this is perfectly sound. This is where the StateStamp type comes in: just a point and an Int. First a couple aliases:


mod Seen is pr ToDefine . pr SET{StateStamp} * (sort Set{StateStamp} to Seen) . endm

mod Frontier is pr ToDefine . pr LIST{Point} * (sort List{Point} to Frontier) . endm

The wide semantics will be reusing the original (or narrow) semantics we previously defined. In order to do that, we need to have an internalized notion of what it means to search for a possible reduct for a term. To do this, we use metaSearch, reflect up the CESK module, inject our state to step with the current shared heap, and let it search for all 1-step reductions. It's pretty cool. Redex doesn't have a notion of meta-reduction/search; I instead have to escape to Racket and use apply-reduction-relation, completely thwarting the debugging tools Redex offers, like traces. But first! Some boilerplate.


mod WIDE is

 pr CESK .

 pr META-LEVEL .

 pr Seen .

 pr Frontier .

 pr SET{Term} * (sort Set{Term} to TermSet,

                op _`,_ to _u_) .



 sort System .

 op sys : Frontier Heap Int Seen -> System [ctor] .

 op wt : Point Int -> StateStamp [ctor] .

 op Fail : -> State [ctor] .



 vars S S' S0 S1 SC SP : Seen .

 var ς̂ : Point .

 vars F F' F0 F1 ς̂s ς̂s' ς̂s'' : Frontier .



 vars name x y : VAR .

 vars e e0 e1 e2 : Expr .

 vars ς ς' : State .

 vars a a0 a1 : Addr .

 vars ss : AStorable .

 var i : Int .

 var b : Bool .

 var v : Value .

 vars ρ ρ0 ρ1 : Env .

 vars σ σ' σ0 σ1 σc σc' σp : Heap .

 vars δ δ' : HeapDelta .

 vars κ κ0 κ1 : Kont .

 vars vs vs' vs'' : AVal .

 vars δ? δ?' : Bool .

The reason we add a Fail state is so that we give the "reflect down" operation (downTerm from META-LEVEL) a fallback in the same expected kind, incase we can't reify a term that metaSearch produces back into a State. This shouldn't happen, but it's there so that downTerm is happy. A change from the previous semantics is that we build up changes to the store instead of apply them one-by-one. Also, we have to keep track if any of these "changes" actually do change the store. We thus have new join and apply operations that also return a Boolean saying if there was a change. We of course have to add a type and constructor for this - call it Heap/Δ?.


 sort Heap/Δ? .

 op hδ : Heap Bool -> Heap/Δ? [ctor] .



 op $join/δ? : Heap Addr AVal Bool -> Heap/Δ? .

 ceq $join/δ?((σ, a |-> vs'), a, vs, δ?) = hδ((σ,a |-> vs''), δ?')

      if vs'' := (vs' , vs)

      /\ δ?' := δ? or-else (vs'' =/= vs') .

 eq $join/δ?(σ, a, vs, δ?) = hδ((σ,a |-> vs), true) [owise] .



 op $applyΔ/δ? : HeapDelta Heap Bool -> Heap/Δ? .

 eq $applyΔ/δ?(none, σ, δ?) = hδ(σ, δ?) .

 eq $applyΔ/δ?(binds(a, ss), σ, δ?) = $join/δ?(σ, a, ss, δ?) .

The magic happens next. The heavy hitter is allOneStepAux, which takes the term to step, which search solution it's on (I'll explain), the pattern to match against (just "a state"), the shared heap, the number of heap changes, whether there has been a change for this step, a partial seen set, the next frontier, and the current seen set. What I mean by "search solution" is (and this tripped me up for a couple hours) which of the many possible search solutions, since this interface doesn't return a set of all solutions. Instead, you have to keep polling for solutions until it returns "failure," which I suppose I could see as a reasonable interface if users don't necessarily need all solutions (but really it's unsettling to me).


 op $step-all : Frontier Heap Heap Int Bool Seen Frontier -> System .



 vars T P : Term .

 vars N σt σt' : Int .



 op CESKMod : -> Module .

 eq CESKMod = upModule('CESK, false) .



 sort StepTuple .

 op stpl : Frontier Heap Bool Seen -> StepTuple .



 op metaTermToStateSet : Term Heap Int Bool Seen Frontier -> StepTuple .

 op allOneStepAux : Term Int Term Heap Int Bool Seen Frontier Seen -> StepTuple .



 eq metaTermToStateSet(T, σ, σt, δ?, S, F) =

    allOneStepAux(T, 0, 'ς1:State, σ, σt, δ?, empty, F, S) .



 op retag : Seen Int -> Seen .

 eq retag(empty, σt) = empty .

 eq retag((wt(ς̂, σt), S), σt') = wt(ς̂, σt'), retag(S, σt') .



 ceq allOneStepAux(T, N, P, σ, σt, δ?, SC, F, SP) =

     stpl(F, σ, δ?, (SP, SC))

     if metaSearch(CESKMod, T, P, nil, '+, 1, N) == failure .

 ceq allOneStepAux(T, N, P, σ, σt, δ?, SC, F, SP) =

     allOneStepAux(T, N + 1, P, σ1, σt, δ?', S', F', SP)

     if wd(ς̂, σ0, δ) := downTerm(getTerm(metaSearch(CESKMod, T, P, nil, '+, 1, N)), Fail)

     /\ hδ(σ1, δ?') := $applyΔ/δ?(δ, σ, δ?)

     /\ S' := if δ? then wt(ς̂, σt + 1), SC

              else if δ?' then retag(SC, σt + 1)

                   else wt(ς̂, σt), SC fi

              fi

     /\ F' := if δ?' or-else not wt(ς̂, σt) in SP then ς̂ F else F fi [owise] .



 eq $step-all(nil, σp, σc, σt, δ?, S, F) = sys(F, σc, (if δ? then (σt + 1) else σt fi), S) .

 ceq $step-all(ς̂ ς̂s, σp, σc, σt, δ?, S, F) = $step-all(ς̂s, σp, σc', σt, δ?', (S, S'), F')

     if stpl(F', σc', δ?', S') := metaTermToStateSet(upTerm(wn(ς̂, σp)), upTerm(ς̂), σc, σt, δ?, S, F).



 op CFA : System -> System [iter] .

 eq CFA(sys(F, σ, σt, S)) = $step-all(F, σ, σ, σt, false, S, nil) .

endm

The allOneStepAux function will consider all seen states to be at the current store's timestamp until it encounters a change, at which point it will do a one-time retagging of all the states just marked as seen. This is why there is a partial seen set, SC, so that we only retag states seen in just this step. The arguments to metaSearch are the module it's searching in, the starting term, the pattern it's matching, any side conditions on the pattern (none), a mode of search (+ is 1 or more), the max depth of search (max 1), and which search solution we're after. *phew*

Apart from the boilerplate, this was a fairly pleasant experience. I haven't scaled up my toying around with it to see if it is competitive with my hand-rolled implementation of an analysis for R4RS Scheme, but I wouldn't be surprised either way. This thing is a beast for its rewriting efficiency, but needing to reflect each term for each step, considering that they don't use direct reflection (as far as I can tell), that has got to be expensive. It's been a fun experiment, though.

Is your analysis concrete?

noreply@blogger.com (J. Ian Johnson) — Mon, 18 Feb 2013 18:27:00 +0000

In the area of program analysis, it is all-too-easy to marry yourself to a model of computation that does not capture the entire behavior of the language you're analyzing before you even begin. The classical data flow analysis algorithms from the Fortran folks have been mostly successful for all the compiler hackers out there, but computer security has become such a real danger that we can't just use the work of hackers (though their work is appreciated). Before I talk more about program analysis, I want to say a few words on reliable computing.

Reliable computing and programming languages

If we are to be confident about any property of a program that we write, not only should it pass a large test suite, but it should also have that property proven either automatically, by an interactive proof process resulting in a machine-checkable proof, or provided directly by the language it is written in.

Herein lies the problem. Computers speak only one language - machine code. In modern computers, machine code is a binary representation of some low-level imperative language. In just under 100% of the time, programs are not written in machine code, so we employ compilers - program translators - to change a nicer language to machine code. Sometimes we even chain these compilers together so that there is a higher level intermediate code that even higher level languages can target, so that different kinds of translation technology can be used at the different levels of abstraction.

How, then, do we know properties hold of programs written in high level languages that promise to maintain certain abstractions? Can programs written in other languages break these properties? Many engineers understand languages by their implementation, but not the intended semantics of the language. After all, a language only means something if it executes machine code, right? Well, not really. This line of thinking becomes far too intractable as high level language implementors take more liberties with their implementation strategy. Instead, if we assume a mathematically ideal computational model for the machine code (not really a problem unless you're in outer space or working on an airplane for over 12 hours), then we can employ powerful proof techniques to prove that such translations not only preserve properties, but that programs in the target language can't break them. These techniques require a the model (semantics) for the machine code AND a semantics for the high level language. What is a semantics if not an implementation?

A programming language semantics is a mathematical model of the meaning of a program in that language. Yes, an imperative implementation could be a semantics, but it's often too large and complex of a mathematical object to reason about. Instead, we have different styles for writing programming language semantics (c.f. the Redex book) that are much more amenable to standard proof techniques like induction and subject reduction. I am a personal proponent of languages maintaining a formal semantics in one of these styles. The easiest to learn and possibly the most successful is small-step operational semantics.

Analysis of a language, not a graph

The standard model of data flow analysis is of graphs; basic blocks form nodes and control flow forms edges between nodes. This form of analysis can be proven correct with respect to a concrete semantics, but the path is cumbersome. Furthermore, not all languages have the standard imperative block structure. There is another way to describe a program analysis and have its correctness as an immediate result of the way it's written. This is in the form of Abstracting Abstract Machines (AAM), which shows a straightforward way to phrase your programming language semantics such that abstraction to a finite automaton (and thus computably explored) is dead simple.

This technique does not mark the end of research in analysis. Instead it opens possibilities and offers a new perspective: your analysis techniques should be built into your programming language semantics. That isn't to say your go-to model should be the translated form that makes abstraction simple. No, this means that you should have a repeatable, straightforward process for translating a run-of-the-mill concrete semantics into a ready-for-abstraction concrete semantics. It is illuminating to see what your analysis machinery really means in terms of concrete execution, and can ease your explanation of the idea to someone else; we humans are good with thinking concretely, while abstraction makes it harder.

I more recently developed a simpler translation path from semantics to a pushdown abstraction (rather than a finite one), although it assumes more of the programming model - that is, no more than a bounded amount of the continuation can be captured at once. Languages with full continuations can enjoy enhanced precision of the method with some extra model-specific work, but the pushdown guarantee flies out the window. Though the guarantee is theoretically broken, the use of heavy control operators tends to be controlled enough (hah) to not cause too much leaked precision.

The translation amounts to saving the state of the machine right after control transfers to a function as a key into a table of "local continuations" that also gets updated on function call to the continuation up to the entry of the function. This "up to" notion is easily captured by replacing the continuation on function call with the key into the continuation table. When it comes time to interpret that key, you just continue to all local continuations that were stored at that key. This non-determinism can lead to loops where previously there weren't (you can always choose the older of return continuations), but reachability of states is maintained with the original semantics. Abstraction follows the same strategy of AAM for finitizing the address space, but continuations no longer need to be store-allocated. This presentation immensely helped people in my department understand the work of my recently-graduated colleague, Dimitrios Vardoulakis, on CFA2, the first pushdown abstraction for a higher-order language. One element that was missing from his work was a non-standard concrete semantics, the term given to concrete semantics augmented with different/extra information to compute the answer of the analysis, rather than the answer of evaluation. Indeed, this idea is not new. Rather, the push I and my colleagues are making is to make the translation from standard to non-standard concrete semantics a straightforward and intuitive process. Secondly, all the trickery of the analysis should have a concrete counterpart. Once you abstract, there should not be anything surprising. The full generality of the abstract interpretation framework is something to appreciate, but we must also respect ourselves and our audience by providing an understandable and accessible technical artifact.

Docs for Racket + emacs

noreply@blogger.com (J. Ian Johnson) — Wed, 24 Oct 2012 15:56:00 +0000

I have been doing all my research prototyping in the "programming language programming language" Racket. For those of you who work in emacs rather than DrRacket, you might find it fairly frustrating to move from emacs to the browser to search on docs.racket-lang.org, or if you're savvy, go to the terminal and use raco doc.
I came up with a script that will take whatever symbol your cursor is currently in to the documentation for it, if any. It requires a function from paredit, but you should have that loaded with scheme/racket files already.


(setq raco-location "raco")
(require 'paredit)

(setq raco-doc-history nil)
(defun call-raco-doc (text)
  (shell-command (concat raco-location " doc -- '" text "' &")))

(defun get-current-word ()
  (paredit-handle-sexp-errors
   (save-excursion
    (progn (backward-sexp)
           (mark-sexp)
           (buffer-substring (mark) (point))))
    (save-excursion
      (progn (mark-sexp)
             (buffer-substring (mark) (point))))))
(defun raco-doc ()
  "call raco doc on selected text"
  (interactive)
  (call-raco-doc (get-current-word)))
(defun raco-doc-prompt ()
  "call raco doc on prompted text"
  (interactive)
  (let ((text (read-string (if (consp raco-doc-history)
                               (concat "raco doc [" (car raco-doc-history) "]: ")
                               "raco doc: ")
                           nil
                           'raco-doc-history
                           ;; default value should be most recent history item
                           (if (consp raco-doc-history)
                               (car raco-doc-history)
                               nil)
                           t))) ;; inherit the current input method
    (call-raco-doc text)))
(defun enable-raco-doc ()
  (define-key scheme-mode-map (kbd "C-c C-r") 'raco-doc)
  (define-key scheme-mode-map (kbd "C-c C-M-r") 'raco-doc-prompt))
(provide 'raco-doc)

Play all mp3s in subtrees

noreply@blogger.com (J. Ian Johnson) — Mon, 02 Jul 2012 20:49:00 +0000

~/bin/playall:

#!/bin/bash
find . -iname '*.mp3' | sort | awk '{ print "\""$0"\"" }' | xargs mplayer

Good for playing all albums of an artist.

But wait! This doesn't give mplayer focus, so you can't press spacebar to pause/unpause. That's extremely annoying. We can fix that by replacing mplayer with mplayer_wrapper:

#!/bin/bash
mplayer "$@" < /dev/tty

Going Meta on the Macro

noreply@blogger.com (J. Ian Johnson) — Thu, 15 Jul 2010 21:20:00 +0000

Today I want to talk a little bit about what I've been researching over the summer. For a little context, I've finished the first year of the CS Ph.D. program at my university, and I have mainly focused on formal methods/automated reasoning. My passion is for proving properties about software - specifically, I want to make that process easier, more natural and commonplace. Since I did not learn how to prove theorems with transfinite induction (the most common tool for proofs about software) until grad school, even after getting B.S. degrees in pure mathematics and computer science at a very good university, I don't think the "commonplace" goal is coming soon. Perhaps I will discuss this area and the difficulties it faces in a future post.
At this university, the largest research group - and the most famous - is for programming languages (PL). Academia's approach to PL over the past few decades has focused mainly on functional programming languages, as far as I can tell. That was shocking to me, since just about everything I've seen in industry has been imperative stuff. After dealing with proofs, I now understand a reason why PL fancies functional languages - they use recursion and immutable values much more often than while loops and repeated variable assignments. This approach allows you to disregard the global state of your program and focus specifically on one function at a time. Why? If your function depends on a variable that x other functions can change, then you must keep track of how that variable is changed. Most likely your proof of correctness hinges on that variable holding a specific property, such as prime number or normalized vector. Every place that variable changes, you must make sure that property holds; if it doesn't, you have to make sure it does before your function is called. That's a lot to keep track of. It's not all that easy to maintain, especially if you work in a team setting.
So what's cool about functional programming languages other than that fact? Well, I haven't begun to talk about macros. There are ways of abstracting behavior of your program beyond what a function can do. Say you need the same large initialization sequence executed before some block of code in many different places. You can make a new command that will wrap your code with all that boilerplate at compile time with a macro. That's a simple example. There are other examples in which a macro will do long and complicated analyses on input code, and then output optimized or annotated code that should be equivalent in functionality.
Ooh, "equivalent" sounds like a theorem proving word. Can we prove that these analyses find the invariants we need for correctness of the transform? How about after that - actually proving that the macro form expands to an equivalent expression in all contexts? The answer is (you're going to love this) it depends. It depends on what you consider proof. Certainly you can write functions that describe the semantics of the language you're working in (say, with an embedded operational semantics - oh but for a termination argument you have to specifically limit how many evaluation steps are allowed. Yikes.) So wait, you have to embed the language you're working in... in the language you're working in? Why would you want to do that? Gross.
So what if we want our logic to allow talking about macros inside the logic? Say we don't need that embedding, and then we can actually write rewrite rules to simplify macro forms without having to expand them. This is some heavy stuff. Now we are talking about quantification over syntax rather than simply values.

Some vocabulary - when a macro expands, it is a function from representation of syntax to representation of syntax, but when it's complete, that representation is reified to become actual syntax that will be evaluable.

A problem enjoyed (rather, suffered) by many macro systems is the loss of scoping laws. It is not possible that in all contexts your macro will have the same meaning. This is due to free and bound variable capture. A simple textual replacement macro system such as in LISP and C is bound to change the meaning of your program when you decide to use a variable name that the macro redefines. In LISP at least you can check that variables you introduce are not in the support (set of free or bound variables) of the input expressions, and fail to expand if they are. This is ugly. Not only does it reveal internals of a macro (something that module abstraction should give), but it pits the programmer against the macro - "why can't I use the variable names I want?" There is a solution, however it is not the most palatable either (just the best we've got), called hygienic macro expansion. A hygienic macro expander will rename variables as it goes to make sure variables are available only in the scope of where they are intended to be used. I can think of more ambiguous statements, but that one surely is a nuisance to anyone doing formal reasoning.

Hygienic macros have only recently gotten a formal treatment (see Dave Herman's thesis - A Theory of Typed Hygienic Macros), and since it is in the setting of a decidable type theory, the whole story is not delivered. However it does give us hope for a complete meaning. If one is able to completely describe how a macro's input will be used for introducing and consuming bindings in its expansion, then we can show that if two inputs have the same binding structure implies that the macro's expansion on both inputs has the same binding structure (more precisely, both expansions are alpha-equivalent), then the macro expander is hygienic. Nice. That takes care of the all contexts problem. Now how do we express this without long-winded arguments? My idea is to have simple base that requires long-windedness to begin with, but (haha!) we use macros to conceal this detail.

Notice that the hygiene property cannot be defined inside the logic, since any function describing binding structure would be manipulating only a representation of syntax, and not actual syntax for which the logic has reasoning capabilities. Wait wait wait... we are now introducing to the logic a means to have meta-reasoning at the non-meta level. That shouldn't work, right? Gödel did some work on this I think. Well, there are theorem provers out there that allow for meta-reasoning, so let's look at how they do it.

The key idea is that you don't collapse the meta-theory into your theory, but you separate the two and start at the meta level to begin with. This is what is meant by working in a logical framework rather than a specific logic. Thus we do have an "embedding" of the language we're working in, but it's already given as it is defined in the logical framework. A bit of a hack, but logically necessary. Then what encoding does one choose for syntax? If you have any familiarity with the lambda calculus, you should know how beta-reduction is rather involved (in call-by-value semantics) since it must do bound variable renaming. In hygienic macro systems, lexical information about a term must be stored somewhere, and for a practical language, so too must source code correlation information.

Yikes again. Source code correlation information is purely for debugging purposes and thus logically uninteresting. We can abstract over this by instead defining an abstract datatype that satisfies the properties we want of syntax, and then let an arbitrarily complicated implementation prove that it satisfies this specification. One way researchers have been able to describe binding in syntax is by actually using a higher order representation. That is, instead of symbolically representing bindings everywhere like LISP does, actually use a real lambda abstraction. This is called Higher-Order Abstract Syntax (HOAS) and it has been getting lots of attention for that past, oh, 15 years. Why then aren't we sipping on piña coladas while our informal meta-reasoning style proofs whiz through formal systems?

Remember my description of theorem proving in the beginning, and how induction is vital? Well it turns out that in general, inductive proofs over HOAS is... impossible. The reason for this is because its general representation is not inductively definable - variable references can be arbitrarily far down a syntax tree from the binding point; a crippling technicality called "negative occurrence" in type theory. Though IANAH (I am not a historian), the first attempt I know of to restrict syntax for meta-reasoning was in Edinburgh LF (logical framework), actually implemented at CMU as "Twelf", based on (competes with?) a logic called (fold-n) developed by some guys at École Polytechnique. There is additional work in Isabelle/HOL that builds off Twelf's idea, but this building is a big stack of confused mess if you ask me. At the bottom of the stack is Isabelle's logical framework, then classical higher order logic axioms, then some concrete representation of syntax, then the "specification logic," and then the "object logic." These layers are abstractly what needs to exist, but everything is logically defined, so execution isn't given (can't just build on top of Common LISP like ACL2 does). There is quite a bit of research into extracting programs from specifications, but what if I want programs AS specifications? As far as I know, there is no real programming language basis for these systems that would mean executable specifications and reasoning about actual runnable programs, and not their (trusted) translation into some other logic.

Perhaps it's my history with ACL2 - it is a programming language, a logic and a theorem prover - but is it too much to ask for? This is not a solved problem. I'd like to chip out a big piece of this mystery though. Meta-reasoning about (hygienic) macros must be a possibility. Since meta-reasoning itself has not yet been well-executed, there certainly is a lot to do [collection of meta-reasoning problems for mechanized systems to tackle]. There is a chance though, that by attacking the problem from a macro setting might just simplify (and generalize) the problem space.

SMT solvers hate you

noreply@blogger.com (J. Ian Johnson) — Mon, 12 Apr 2010 18:17:00 +0000

And me, of course. No one is safe from their wrath of awful error messages. Boolector-1.2 was giving me "illformed 'let' or 'flet'" at the very end of my 1200 line file. Very useful. After poking around for 30 minutes or so, I found that it did not like my identifiers with hyphens (they should be underscores). Yices did just fine on the file, by the way. Boolector also does not like (let (?varname (let (... ...) ...)) ...). Nested lets are stupid. Scope is for humans.

On that note, I found that Boolector confuses nested lets with shadowed variables, so if ?x is bound twice, it incorrectly concludes that a nested let was found. Yices' behavior for shadowed variables is in my opinion far worse; it persists the first binding as if the second never happened. You will get models when you do not expect them.

Granted, this format really isn't for humans, but for the SMT solvers themselves. Let researchers munch on some NP-completeness rather than inform users. That said, there is an OCaml implementation of a well-formedness check, but it is admitted to be incomplete. The (faulty) implementations of the language specification are the real specs, so some error handling on the solvers' side is very well justified.

Just a note for some other researchers/engineers out there that run into Boolector giving some really nasty errors. I may edit this post after future interaction with more solvers like CVC3, MathSAT and Z3.

P.S. Yices doesn't care about the hundreds of close parens that are bound to conclude your file while Boolector expects all parens to match (no extra ')'s allowed). The error is the unhelpful "Unexpected EOF" message when they do not match.

After Almost a Year of Graduate School

noreply@blogger.com (J. Ian Johnson) — Thu, 18 Feb 2010 17:54:00 +0000

A lot has happened in the past year. I graduated UT with degrees in computer science and pure mathematics, I moved to Boston to pursue my Ph.D. in computer science at Northeastern, and I embarked on a research journey. Allow me to give convey some lessons I have learned.

As you may know, Ph.D. students seldom have to pay for tuition or worry about supporting themselves, as do (some) undergraduates and masters students. No, in exchange for your education, you are required to (at first) be a teaching assistant (you are supported by your college), and later a research assistant (you are supported by your advisor). If you are clever, you might manage to get a fellowship and just give your own work attention. Perhaps next year when I have a better idea of what is an interesting research direction I will apply for a fellowship, but for now I am a TA for a freshman logic class.

This class is very different from my own experience with logic in undergrad, where I spent most of my time smearing glaze over my eyes and amassing a large drool stain on my book while a coke-bottle glasses-wearing emeritus professor drones on about the most menial pedantry of set theory, functions and recursion. This class actually uses my beloved interactive theorem prover, ACL2, to teach students how to reason about their programs, and in the mean time surreptitiously teach them rigorous fundamentals of propositional logic, induction and first-order logic.

ACL2 is a tool that has been taking up a majority of my time recently, and is essentially the reason I came to Northeastern. The name stands for A Computational Logic in Applicative Common LISP, so the 2 is not a version number, but more of an ordinal coefficient/concatenation multiplier. My interest in ACL2 arose when a friend of mine told me about research he was doing with its inventor, J Moore. The idea for systems like ACL2 and the reason theorem proving intrigues me so much comes from a comment John McCarthy (inventor of LISP) made in the 60s along the lines of, "rather than debugging, programs should be proven correct, and that proof should be checked by a computer program." I had only seen reasoning about programs for a couple of lectures in my analysis of programs class (really a discrete structures class), presented in Floyd-Hoare logic, with the caveat that proving properties about programs this way does not scale, is a pain in the ass, and no one does it. Thus I had forgotten about proving properties about programs and went back to hacking and debugging as usual.

A computational logic is a logic for reasoning about algorithms - you stay in the realm of computation, instead of careening off into pure mathematics. ACL2 deals only with finite structures; there is no concept of infinity, functions cannot be values, and lists have a natural number length. All functions must be proven to terminate on any possible input (i.e. ACL2 is a total logic). These restrictions keep the logic very simple, but still very expressive. There are definitely theorem proving systems out there that allow for reasoning about infinity, higher-order functions and partially-defined functions, but proofs in these systems are much less easily automated. Because of the restrictions on the logic, some fantastic engineering and brilliant proof search heuristics, ACL2 is the most automated theorem prover in existence.

Have I sold you yet? If yes, you are just like most other people faced with ACL2 for the first time and shown what it can do. Bright eyes. Naive expectations. The reality of the matter is that proofs of interesting theorems still need mathematicians to come up with a proof strategy. Not only that though - in theorem provers, statements of truth that normally can be used the way you wish when proving theorems by hand now have extra semantics attached to them. You see, when you prove something to be true, you are programming the theorem prover to apply rules in a directed way. Typically if you prove A = B, then really what you are telling the prover is, "when you see A, replace it with B." This is called "substituting equals for equals," a special case of term rewriting. If you do not know the gory details of the prover's rewriting strategy, then you can end up proving extra very specific, horribly inelegant lemmas in order to ram your proof through the system.

The reasons for the prover's rewriting strategies being so directed come from years of experience and a lot of empirical data from heavy users of the system. Here is the learning curve: new to ACL2, you write some simple statement that is obvious to anyone and the prover fails - "stupid piece of crap," you say. Next you start reading and toying around with carefully crafted examples that the prover gets automatically, "oh cool, I was just doing it wrong. ACL2 is awesome!" Later you try to actually use ACL2 for arbitrary code, say you are trying to prove a flood fill algorithm makes a flooded region bigger or the same size as before (intuitively obvious), "AHHH MY EYES! MY FACE! I'M ON FIRE!" and after your Ph.D. in computer science with a focus on formal methods, "oh I get it." At least that is what I am promised. Currently I have bandages on my third degree burns, staring at a burning building, considering going back to save my dog.

My current research attempts to distribute large spanners to users of ACL2, so they have access to their fire hydrants - okay enough of this metaphor. ACL2 tries its very hardest to prove a conjecture true by applying different types of rules to rewrite the whole thing to T (true). The order in which rules are applied matters very much in the success of the proof, and ACL2's approach is to try everything, one by one, in the reverse order theorems were proved. In very directed proofs of one particular theorem, this generally works - consider in mathematics when you prove minor theorems, use those to prove stronger theorems, and then use those to prove your punchline theorem. Typically you do not have to appeal to the low level, minor theorems in your main proof.

In software? Not so much. You tend to work in layers of abstraction, and these layers have borders that allow for communication with other layers. It is this communication that destroys the reverse-chronological ordering strategy. Without communication, the prover would skip over entire layers as irrelevant to get to rules that apply - no big deal. With communication, things get hairy, and you would like to reason at the highest level you can before you have to dive deeper to resolve some technical issue. By tricking ACL2 into reasoning in different layers of abstraction, the order in which you prove theorems becomes less relevant, so you can worry about reasoning rather than rewrite strategies, and it structures proofs to progress in a more natural order.

How I trick ACL2 is the topic of a paper I am writing, and the main idea came from my mentor, Pete Manolios (actually got his Ph.D. under J Moore's supervision, so a very relevant source of ideas). His insight into this "reasoning in layers" came from reasoning about hardware - a very layered process goes into their design. In fact, my methods for tricking ACL2 made the proof of a refinement theorem about a pipelined machine go through in 100 seconds, when previously it made my computer crash, and Pete's computer 15 minutes to prove. This result was a marvellous fluke, since I managed to get such drastic results from a flawed implementation of my trickery. When I corrected the flaw, the proof shot up to 48 minutes on my computer. It took months of fiddling, engineering, and adding heuristics to get the corrected version to emulate the behavior of the flawed version in the case of the refinement proof. I now have a rather robust framework for reasoning in layers of abstraction. Now all I need to do is find some more problems to apply it to that give similar results.

Strange Consequences of Simple Definitions (I)

noreply@blogger.com (J. Ian Johnson) — Wed, 08 Apr 2009 20:27:00 +0000

PDF of this post
I decided that upon finding Thomae's function, of which today's topic is about, that a good subsection of this journal of mathematical musings should be about what strange things can be mathematically correct, but not at all intended from a given definition. We look into the realm of real analysis today, a topic I am rather fond of, seeing as the real numbers are really more bizarre than one might originally think. Good old rational numbers, the collection of integer over natural (coprime), can approximate any real number to any arbitrary precision, yet the two fields are vastly different in mathematical properties.

That aside, which I might talk about in a later entry, I turn to the core discussion about continuous functions. How are they defined, and what's something really weird I can construct that is still considered continuous?

A function from a subset to is said to be continuous at a point if for every , there exists a such that if and , then . Let us step back and get an intuitive understanding of this definition. If we look at and all its neighbouring points, then we can get a neighbourhood in around that all map into the neighbourhood about . Visualising the shrinking and the function still staying nicely mapped into the smaller and smaller neighbourhoods, this gives us a nice image about what it means to be continuous. Notice that this definition is for single points. If is continuous at all points , then we say that the function is simply continuous. However, what if we construct a function that is continuous just about everywhere, and discontinuous everywhere else (which is also... everywhere?). Yes, I go back to my point about the rationals being able to approximate the reals to any arbitrary precision - in other words, dense in . Thomae's function is continuous at every irrational point, and discontinuous at every rational point. Thomae's function is defined as if is irrational, if , and if we say . Let us first prove that is discontinuous at each rational point. Let be arbitrary, let , and let be arbitrary. Since the irrationals are dense in , there exists an such that . Since and , . Thus since was arbitrary, for all there exists an such that implies . By the definition of discontinuous, is discontinuous at . Since was arbitrary, is discontinuous at every rational point.

Next we show the more shocking result that is continuous on an arbitrary irrational . Let be arbitrary. By the Archimedean property, there exists an such that , so . Call the set of rationals with denominator at most within set . We know that is finite from the following combinatorial argument. Since and , the number of rationals with denominator in the desired range is less than or equal , and so forth for natural numbers , so is finite. Let . There are two cases for such that , where is irrational and thus , or where and . In this case, cannot be less than since is defined to be small enough to skip each point in with a smaller denominator than . Thus , so . Therefore is continuous at . For more information and an image of what looks like, take a look at the Wikipedia page.

Fellow blogger.com mathematicians...

noreply@blogger.com (J. Ian Johnson) — Sat, 31 Jan 2009 05:28:00 +0000

If you stumble upon this page in the hopes of finding a way to add LaTeX capabilities to your blogger/blogspot posts, today is your lucky day. Well, your lucky day could have been back in March 2007 when wolverineX02 wrote his "LaTeX for blogger" script, but I have added the ability to UNDO the translated LaTeX code. Now if you wish to revise a formula, you don't have to wade through a lot of URL translation codes/rewrite your whole formula.

This script requires Firefox and the greasemonkey extension, both of which I highly recommend. http://userscripts.org/scripts/show/41481
The only thing different from normal is that math must be enclosed in two dollars signs on each side instead of the one you're used to in your TeX compilers. $$\LaTeX$$ becomes :)

Interesting occurrence with 7th root of unity

noreply@blogger.com (J. Ian Johnson) — Fri, 30 Jan 2009 16:08:00 +0000

PDF of this post
I've come across a property of that is rather interesting. Letting , we let , we will see that .

Instead of working directly with this difference, we square the expression for something more easily manipulated. . Using the equalities , etc., we can simplify this down to . Since the sum of the 7th roots of unity equals 0 and the sum only neglects the root , we see that . Thus . Amazing stuff, right?

A bash script I wrote and love

noreply@blogger.com (J. Ian Johnson) — Thu, 08 Jan 2009 22:07:00 +0000

I, like many Linux geeks before me, love to use the terminal for just about everything. However when I need to use programs with GUIs (Firefox, amarok, okular, etc), I wanted to execute them silently and to get them to return focus to the terminal. Normally if you type

$ firefox

You get a bunch of X and Gtk warnings during execution of Firefox, and you can't use that shell anymore. You have to open another tab in konsole. So, this script I wrote is called "sex" for "silent execute." Mainly for the brevity and the humor of having

$ sex dolphin

be a valid command. This script also allows parameters to be sent correctly to an executable, so

$ sex okular A\ document\ with\ spaces.pdf

opens "A document with spaces.pdf" in okular correctly.

#!/bin/sh
CONCAT="${1}"; shift
for ARG in "$@"; do
  CONCAT="${CONCAT} \"${ARG}\""
done
eval "$CONCAT >> /dev/null 2>&1 &"

Simply put this in /usr/bin/sex (need root) and do a chmod +x /usr/bin/sex. I recommend adding

alias ff='sex firefox'

to your ~/.bashrc or your /etc/bash.bashrc too. It's a nice shortcut.

Divide by 0 != universe explode

noreply@blogger.com (J. Ian Johnson) — Fri, 05 Dec 2008 18:57:00 +0000

Post inspired by: The Last Denominator

I just want to let people know that there is no way that any (true) theorem about the field of real numbers (or any field) has a division by zero. In order to get to the concluding statement would require an element that is 1/0 (which I will call the inverse of zero) to be summoned. Since this element does not exist in a field other than the zero ring (in which case 0 = 1), it cannot be used. The step taken is invalid.

Now that I have shown that the usage of zero's inverse is invalid, I will show that no true statement can imply a theorem that involves zero's inverse. By the axioms of inference that logicians, philosophers and mathematicians have developed, A implies B is equivalent to (NOT A) OR B. Therefore since B = conclusion with zero's inverse, if A is true, then the implication is false. And so we conclude that only false statements can imply the existence of zero's inverse (with the exception of the extremely uninteresting case of the zero ring).

To those of you who do not know to what I refer to as a ring and field, here is the break down (first we need the concept of a group):
A group G is a set of elements with a very limited amount of structure imposed on them under one operation (call it *):
1) There exists an identity element, 1, such that 1*x = x for each element x in G.
2) There exists an inverse element 1/x such that 1/x * x = 1 for each x in G.
3) For two elements in G, x and y, x*y is also in G.
4) For three elements in G, x,y and z, (x*y)*z = x*(y*z). In other words, * is an associative operation.

A ring R is simply a set of elements that have a certain structure under two operations defined among the elements (call them + and *):
1) R is an abelian group under + (abelian just means that for x,y in R, x + y = y + x. In other words, + is commutative).
2) For two elements x, y in R, x*y is in R.
3) The distributive laws hold: x,y,z in R imply x(y + z) = xy + xz, and (x + y)z = xz + yz.

A field F is a special kind of ring that really has a lot of great qualities that allow us to prove a lot of cool things:
1) F is a ring.
2) All elements of F except the identity of + form an abelian group under *.

The "except the identity of +" is the key term. Since we commonly call the identity of +, 0, this means that 0 is not required to have an inverse (with regards to *) in F. We know that the real numbers form a field because every element other than 0 has a multiplicative inverse (namely 1/x), and every element has an additive inverse (-x), and you can't add or multiply two numbers and get an answer that isn't a real number. Since 0*x = 0 for any x in the reals, 0 does not have an inverse z such that 0*z = 1. Thus 0 does not have an inverse in the real numbers.

Okay? Now you can be sure that if you get a division by 0 that either the number is too small for your calculator to keep the correct precision and it just rounded to 0, you started off with a false statement, or you had an incorrect deductive step.

The intersection inference problem is NP-complete

noreply@blogger.com (J. Ian Johnson) — Wed, 26 Nov 2008 00:25:00 +0000

PDF for this post
First let me start off by defining the intersection inference problem.

Given a set , some finite number () subsets , and constants , is there a subset such that for ranging all values?

First we prove that intersection inference is in NP. Suppose we have a solution . In order to verify that is indeed a solution, we perform the intersections of with and confirm that . Set intersection can be performed (naively) in time, so this process is indeed polynomial and clearly yields the correct answer. Thus intersection inference is in NP.

To prove intersection inference is NP-complete, we will show a reduction from 3-dimensional mapping to the intersection inference problem. With an arbitrary instance of the 3-dimensional mapping problem defined as sets such that , and . To reduce this to the intersection inference problem, let , and define to be the subsets of triples that contain respectively. Define in a similar manner from to for and from to for . Finally define . The constraints are and for . We can see that to create these subsets takes time in the rather straightforward manner of iterating over 's elements to divvy up the elements into arrays representing .

Through this construction we can see that exactly triples must match, and only one triple can contain each specific element in . Thus if there exists a solution to this intersection inference problem, the solution is exactly the triples that satisfy the 3-dimensional matching problem. If there does not exist a solution, then at least one constraint was not satisfied, meaning at least one element in was not covered, or all elements were covered with some redundancies. Thus the reduction is correct and intersection inference is NP-complete because a known NP-complete problem was poly-time reducible to the intersection inference problem which is in NP.

Windows installer and bad design

noreply@blogger.com (J. Ian Johnson) — Fri, 08 Aug 2008 06:29:00 +0000

So I have a hard drive that I use to share data between my Linux and Windows OSes. That drive is in EXT2 and thus does not support the "hidden" file attribute. I decided I wanted to install Visual Studio Express 2008, but behold... the installer decides to use my EXT2 drive to extract to and then copy from into my temp directory on the other hard drive. Ridiculous? I haven't even started yet.

Setup decided to create a file called $shtdwn$.req with the hidden bit on. During the copy process, instead of saying, "these are the files I need. Copy them," it just gets the directory contents and tries to copy them all. This would work on any file system that supports the hidden attribute. The kicker? Well, first of all I have no idea why it decided to use P: instead of C:, but more importantly I could not discover any way to reconfigure this setting. No command-line argument help prompt in order to discover said functionality... I even tried using the cmd prompt to copy at elevated privileges to no avail.

But to my relief, the gurus of Linux prevail (yes, not a windows dev... hah) and give me a workaround.
set _SFX_CAB_SHUTDOWN_REQUEST=$shtdwn$.req
before running vcssetup.exe and it will skip that file. I'm pretty sure it just has to contain a value and not $shtdwn$.req specifically, but I didn't bother to test.

This post is for people like me in the future. For those of you that stumble upon this page, I hope this is what you were looking for.

BrowserOS

noreply@blogger.com (J. Ian Johnson) — Thu, 24 Jul 2008 02:38:00 +0000

This somewhat unorganized post is about my vision of desktops merging with the internet. A day's brainstorm session, let's say.

Abstract
An amazing amount of our time is spent online. So much so that browsing the web has become more natural to us than browsing our computer. My vision of computers in the future (apart from Minority Report-style holographic and surface computing) involves intertwining the browser and the rest of the computer. What I see involves probably the biggest risk toward this approach, and that is getting rid of hierarchical directories to store files, but rather allowing a user to search for documents via a common search engine query.

The File System (BrowserFS)
Files can be tagged by users and also by a background data mining process for relevancy determining algorithms. The way to best implement this is to have one's hard drive be totally a one- to two-level database system.
What is a two-level database system? Well, in my mind there are three types of files. User-created/owned, executable, and application data. Application data would be stored in the second level of the database. The second level is a database as a blob entry in the first level of the database in order to minimize clutter and narrow query results to relevant areas. The motivation for this, or rather my inspiration for this comes from my observation of two classes of applications. One class uses many flat files in complex directory hierarchies, and the other uses very few "packed" files which are basically virtual file systems that have all those many flat files organized as described of the previous class. Obviously one approach is cleaner than the other, and there only exists one level of indirection between the two. If the OS API has a boolean in fopen, say, that specifies if the data is first or second level, the resulting query could be limited to the application database.
"But wait!" you say, "Applications don't need to search for their data files, since they know exactly what they need. Don't introduce this overhead." I agree 100%, which brings me to my next idea.
The development tools for this operating system would allow order to be introduced to files residing on the build system. A tool that could create directory hierarchies that references files residing on the system would output a necessary build file for dependent data files. We impose directory hierarchies because hey, we know that's how we organize without ad hoc search queries. The most important reason however is because creating a URI to the file would allow direct access without queries. Since before shipping an application, one generally works in debug mode, debug applications would use this hierarchy file to open the referenced files on the first level database. Ship applications however would compile the second level database BLOB and work as described.
Applications can still access all other files; first or second level. The second level database is just another file in the first level database, only access to it is optimized more in the OS kernel. Thus if a user specifies he wants to use that application database file that a search has given him, he can, and then browse though that database as well. This is all if he has the permissions to do so, of course.
So where are the first level files, and how do applications access them once search spits them out? Well they all have URIs too, and querying the database in standard fashion can output all files on your computer if one so desires. The main difference is that first-level file URIs are assigned by the OS, and the second-level file URIs are assigned by the developer.

File Name and Extension Makeover
File names and file extensions are no longer what you might think either. A file extension is an attribute of a file record in the database that will either reference an application (applications register extensions with the OS by simply being applications on the system) that will open it, or are generic (ex. type EXT) and do not reference an application. So developers can still change a file's type, and it's still by changing what filename.end ends with, but by changing the extension, a disambiguation text extension would ask if it's for common applications A, B, C, Generic, or search. With machine learning, the OS can determine the behavior it should take by default for changing certain file names to others.
Keep in mind file names are no longer file names. They're optional titles that are also part of the user tags (weighted differently though) that are displayed on search results, along with snippets, if enabled.

Relevant Material Preview
This brings me to another thought. File snippets can mean loads of different things for different files. For this reason, applications can define ways to render snippets. Rendering can be in the form of text, rich text, images or even other applications' output. Why other applications' output? It's lame if the OS is in charge of things such as all the video playback with codecs galore ever changing. No, that makes the OS monolithic and breaks precious programming patterns. Of course one would aim on shipping the OS with a default video player, but I digress. By letting other applications render things for you, this enables third parties to customize snippet behavior without modifying the application itself. Say you have a text document that normally renders the text around what's found as relevant in its own way. Say the default behavior is the output the rich text, but you want it to output normal text. Assuming the application doesn't allow you to specify this yourself, there is a third party app that does this, and only this. Assign it to render that file's type and presto! Customization is at your fingertips.
Now I'm getting into just features the search application would provide. Granted there are craploads of ways to present information, but I won't go into those details now.

Replacing Explorer
What I want to talk about now is the user interface of the desktop manager. I've never really liked the taskbar and the start menu. The start menu has basically a directory hierarchy with links to files or applications that installed apps have deemed themselves worthy of adding to. This is clutter that becomes unnecessary with search. Screw the start menu. It's gone.
What do I LIKE about the start menu? Recent programs and access to control panel, along with links to the mostly used folders. But... what makes those folders so important? What criteria is used for determining which recently used apps are shown? This seems like customizable queries should be allowed. I actually don't like how recent programs messes up a lot and adds things I really don't want shown. This is another application where machine learning or AI could really show its shine. As for folders, ya... folders no longer exist. Only tags and relevancy by content. So what do we do? We show some default set of "popular tags'" query results along with some other logic we deem adds visibility weight. Obviously we can't/don't want to show all files tagged documents on the computer. No, we want to add importance like date made, times opened, etc. Really one could use tags to make a pseudo folder/file hierarchy, as I'm sure you can imagine, but that defeats the purpose of search, and isn't how people will think to organize. Say... you're loading pictures off your digital camera and normally would make a new folder in "Pictures\Jojo's family\Life in Madagascar" this time for "Summer 2008". This time, instead of navigating to Life in Madagascar and then making the file to import to, you're prompted to simply add tags. So "jojo's family Madagascar summer 2008" and pow, searches for jojo and more restricted searches will yield these pictures. For those who hate to type, search can be extended to finding relevant tags as well, so you can click through results and add tags to the list.
This need for metadata is the current major risk of the imaginary BrowserOS. I can see in the near future though, using facial recognition and other image processing, more indexing techniques can be added to the system to build tags to improve relevancy without the need for user input.

Feeds, Feeds, Feeds
Back to the desktop. Query results can be rendered in any control that is deemed capable of rendering data, even if not in the most verbose way (say of the search application). Controls can make up your desktop. Multiple desktops are also possible (think Spaces if you're a Mac fanboy, or Gnome/KDE if you're awesome and use linux) that are all customized different ways. When I visualize this, I think of iGoogle, and all the different feeds and widgets I can have everywhere. I also think of computers of tomorrow all on static IPs and server oriented, so you could set up queries to render RSS feeds and server iGoogle if you so wish. My aim is really to merge the internet and the desktop experience into one seemless vision.
Search could even show results from searching the local computer and with a search engine of choice. All these different modes would be easily discoverable in the search application.

Window Management
So yes, search result rendering, no more start menu, quick launch can be simple controls set up on the desktop that just fetch from a URI, but what of running applications? Surely those must be like windows and KDE and Gnome and Mac OSX somehow...? Nope. Running apps are just entries in the database as well. Granted, apps that start and stop all the time without showing themselves wouldn't be performing all these transactions in realtime. Applications will have execution tags that the desktop manager will be able to hook into. User apps, namely separate windows that normally show in the taskbar will have certain tags, background processes have other tags, but aren't added to the database until queried, or not at all. This information will likely just reside in memory. I imagine most likely that I would have alt-tab functionality only as a UI add-on that uses these queries. To actually show the gamut of applications running... I say default query in a feed somewhere. If an application needs to get your attention, the side of the screen could glow and once hovered over would show a feed of applications that are tagged to get your attention (think orange flashing tabs for a new AIM message for this attention-grabbing idea). Of course all this would be customizable as behavioral UI addons. Hell, an addon could actually show you the windows taskbar, if you really wanted it. In my opinion though, this is cluttered UI and won't be long missed once users get addicted to search.

Unstructured Thoughts on Structures

noreply@blogger.com (J. Ian Johnson) — Sun, 20 Jul 2008 07:02:00 +0000

As an undergraduate in mathematics, I know more math than you, but still less math than epsilon. It is for this reason that most all my opinions and thoughts on mathematics are much more romantic than they are useful. I have written papers on different areas that interested me, and I have gone to many different talks and read several papers. Could I come up with an hypothesis and prove it? Probably not, but given my current position in university, I am sure I could find guidance toward starting such an endeavor.
My goal with this blog is to increase readers' interest in mathematics and programming. I plan on talking about many random topics (but not truly random... a very fun investigation to do right there) that come to catch my interest, and I also aim to post solutions to different programming problems I come across. As long as my solutions to obscure problems are archived on the net, search engines will be able to index them and help a programmer in distress in the future.
I plan not to touch such topics as politics, philosophy or my personal life, but as an opinionated person without the motivation to maintain more than one blog, I make no promises about the future.

Back to the topic of this post, "Unstructured Thoughts on Structures." I am sure by now you have observed the unstructured thoughts, but what of these structures? Mathematics is all about creating abstractions to quantitatively and generally describe a problem. Structures, namely algebraic structures, are currently my focus of interest. Group theory, ring theory, field theory and Galois theory may ring bells for some of you. This is what I am talking about. This area of mathematics is so vastly different from anything else one studies before it that even mathematicians like to joke about it not being, "real math." A quote comes to mind from a source I cannot currently recall that is along the lines of, "Now we can solve this problem without any math at all; just group theory!" With this tidbit about structures to pique your interest, I will save more depth descriptions and discussions for later posts.

Welcome to my blog, friends of mathematics and computers.