Logic, Types, and Probability

Probability is defined in the most rigorous of settings as a sample space, a sigma algebra, and a function from that sigma algebra into [0,1].  However, in probability classes, it is often introduced from a logical standpoint.  Event e has probability p, event not e has probability 1-p, and event e1 and e2 has probability e1 * probability e2 (assuming independence).  Using this we can continue defining probabilities, probability of e1 or e2 is probability of not (not e1 and not e2) so 1-(1-p(e1))*(1-p(e2)), and probability of e1 -> e2 is probability of not e1 or e2, is probability of not (not not e1 and not e2), is probability of 1-p(e1)*(1-p(e2)).

We can try to extend this via the Curry-Howard Isomorphism this into the simply typed lambda calculus, providing probabilities to certain types.  The probability of type A -> B becomes 1-p(A)*(1-p(B)).

Unfortunately, this natural progression from probabilities of events to probabilities of types doesn’t seem to work for what synthesis would use probabilities for.  In synthesis, probabilities would more be creating a sigma algebra over the space of refinement trees, and traversing those trees in order of increasing probability.  This I feel is a productive area of thought.  For example, consider derivations in the simply typed lambda calculus (with unit) of Unit.  there’s a simple one, unit : Unit.  But there’s also a more complex one, (\x . x) unit, and also, (\x . unit) unit.  It would be best for each sub-deduction to have the same sigma algebra.  If somehow there is a deduction which has the same context and goal type as the original, it should have the same probabilities as the original.

How do humans know which library functions to use?

A programmer hears that she needs to create a function that adds up all the numbers in a list of numbers.  The programmer identifies that this is a good situation for using the built in library function List.fold_left, and so does this.

However, where does she get this knowledge that a fold is the correct function to call.  She hears the problem and recognizes that this problem has the pattern of accumulating the data into an accumulator, knows that fold often fits the pattern of accumulating data, and fills in the data so that it typechecks and performs the correct action; 0 for empty list and (+) for accumulation.

How could we inform the computer that it needs to perform that accumulation action, and know that fold is appropriate.

We first need to have formalized what the notion of accumulate is.  Certainly the definition of fold formalizes it.  The implementation of fold is shown below.

let rec fold_left (f:'a->'b->'a) (acc:'a) (l : 'b list) : 'a =
  begin match l with
  | [] -> acc
  | h::t -> fold_left f (f acc x) xs

This feels too specific for an specification for something that accumulates.  I can’t however find anything more general.  This is a difficulty in trying to encode semantics, in that the semantics seem to become as complicated as the program.  Perhaps not a complete specification is correct, but to give invariants that imply, or nearly imply, the complete semantics of the program is sufficient.  For example, instead of writing out exactly what we want the accumulator to handle, we want to know that sum (l:’a list) has properties that sum [] = 0 and sum (h::t) = h+sum(t).  This is certainly more specific than merely int list -> int, but not yet a full implementation.  However, it seems to give enough information to provide a full implementation, and there is a natural way to extend the specification to an implementation.  Perhaps allowing for a stronger type system will enable more instances where the type informs the implementation in immediate ways.