Automated Static Verification of Higher-order Functional Programs
Explore the automated static verification of higher-order functional programs, focusing on Haskell code. Learn about denotational semantics, program assertions, and tools like GHC for verification. Discover the emphasis on simplicity and structural properties for easy proofs. Consider utilizing existing technology such as SMT solvers and model finders for efficient verification processes.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
1 HASKELL TO LOGIC THROUGH DENOTATIONAL SEMANTICS Dimitrios Vytiniotis, Koen Claessen, Simon Peyton Jones, Dan Ros n POPL 2013, January 2013
2 Real programs contain assertions dimitris@artemis:~/GHC/ghc-head/ghc/compiler/typecheck$ grep -i ASSERT ./*hs ./FamInst.lhs: = ASSERT( isAlgTyCon tycon ) ./Inst.lhs: ; wrap <- ASSERT( null rest && isSingleton theta ) ./TcCanonical.lhs: = ASSERT( tyConArity tc <= length tys ) -- Type functions are saturated ./TcCanonical.lhs: = ASSERT( not (isKind t1) && not (isKind t2) ) ./TcClassDcl.lhs: = ASSERT( ok_first_pred ) local_meth_ty ./TcClassDcl.lhs: rho_ty = ASSERT( length sel_tyvars == length inst_tys ) ./TcDeriv.lhs: ASSERT( null sigs ) ./TcDeriv.lhs: = ASSERT2( equalLength rep_tc_tvs all_rep_tc_args, ppr cls <+> ppr rep_tc ) ./TcDeriv.lhs: arg_ty <- ASSERT( isVanillaDataCon data_con ) ./TcEnv.lhs: -> ASSERT( lvl == lvl1 ) id ./TcEnv.lhs: TopLevel -> ASSERT2( isEmptyVarSet id_tvs, ppr id $$ ppr (idType id) ) ./TcErrors.lhs: = ASSERT( isEmptyBag insols ) ./TcErrors.lhs: = ASSERT( not (null matches) ) ./TcErrors.lhs: = ASSERT( length matches > 1 ) ./TcEvidence.lhs: | otherwise = ASSERT( arity < n_tys ) ./TcEvidence.lhs:mkTcForAllCo tv (TcRefl ty) = ASSERT( isTyVar tv ) TcRefl (mkForAllTy tv ty) ./TcEvidence.lhs:mkTcForAllCo tv co = ASSERT( isTyVar tv ) TcForAllCo tv co ./TcEvidence.lhs:mkTcForAllCos tvs (TcRefl ty) = ASSERT( all isTyVar tvs ) TcRefl (mkForAllTys tvs ty) ./TcEvidence.lhs:mkTcForAllCos tvs co = ASSERT( all isTyVar tvs ) foldr TcForAllCo co tvs ./TcEvidence.lhs: = ASSERT (tc `hasKey` eqTyConKey) ./TcEvidence.lhs: = ASSERT( equalLength tvs cos ) ./TcExpr.lhs: = ASSERT( not (isSigmaTy res_ty) ) ./TcExpr.lhs: = ASSERT( notNull upd_fld_names ) (from the GHC type checker)
3 This work Automated static verification of higher- order functional programs www.github.com/danr/contracts Tool works on subset of Haskell, uses GHC as frontend
4 Our setting Verify Haskell code: higher-order, lazy but pure Don t aim for high expressiveness, go for simple, easy-to-prove (e.g. structural) properties Automatically discharge all tedious but simple goals that a programmer has to manually and repeatedly check No best solution yet. Our choice for this work Re-use existing technology: Automated theorem provers (e.g. SMT solvers), model finders ACL2? Boogie? Prolog? Property-directed reachability? [Bjorner et al]
5 Programs and properties 1. 2. can risers crash? non-empty input non-empty result? risers [] = [] risers [x] = [[x]] risers (x:y:ys) = case risers (y:ys) of [] -> error urk (s:ss) -> if x <= y then (x:s):ss else [x]:s:ss risers ? CF && {xs | not (null xs)} -> CF && {ys | not (null ys)} Syntax of contracts ( refinements more appropriate): C ::= {x | p} | (x:C1)->C2 | C1 && C2 | CF Just an ordinary Haskell expression of type Bool crash-free
6 Design module Foo f x y = g x = First Order Logic Formulae ???? HALO translation to First Order Logic ???? g C -- Prelude data [a] = [] | a : as data Bool = True | False Functions over these ? Haskell Source Theorem Prover Z3/Equinox/E/ Vampire/Paradox Satisfiable Probably contract doesn t hold but who knows Unsatisfiable Contract holds! <loop> Can t tell anything
7 Key idea: let denotational semantics guide us ? = ?1?1= ?1 ????= ?? ? = ???? ? ?? ? ? | ? ? = ? ? ? ? ? ? ??? A /case-lifted language Standard ? construction ? ?1? + + ??? + ? ? + 1??? Lifting Continuous function space One product of ?? cpos for each constructor ?? of arity ?? Distinguished one-element cpo
8 and use ? itself as FOL structure Logical language: ? = ? ???? ? ? ? ? ?????? ??? ?1,?2 ??? ??? A translation of expressions to logical terms: ? ? = ? ? ?1 ?2 = ??? ? ?1,? ?2 ? ? ? = ? ? ? ? ? = ???? ? ??? = ??? apply (1???,_) = 1??? apply ( ,_) = apply(fun(d),d ) = d(d ) apply(_,_) = Interpreted as Interpreted as the apply combinator in ? Interpreted as injection into the appropriate product
9 Function definitions become FOL axioms head (Cons x xs) = x head _ = error ? ??. ??? ???? ?,?? ??? ??? = ??? ??? ??? = ??? ?.? = ??? ? = ??? ? = ???? ???????0? ,???????1? ??? ? = ??? = ? Theory ???? NB: ??? ???? = ??? A Good Thing! Theorem: ? ????
10 Axiomatize (some) true facts about ? data List a = Cons a (List a)| Nil ? ??.???? ?,?? ??? ??? ??? ? ??.???????0???? ?,?? ? ??.???????1???? ?,?? = ? = ?? Theory ? Theorem: ? ?
11 Higher-order functions head (Cons x xs) = x head _ = error ? ??.???( ??????,?) = ???(?) Interpreted as the apply(.,.) combinator in ? double f x = f (f x) ? ?.??? ??? ?????????,? ,? = ?????? ?,? ? ?.?????? ?,? = ???(?,??? ?,? )
12 Refinements denotationally and logically Denotationally Logically ? ? ?} ? = ??? ? ? = ??? ? ? = ???? ? ?:?1 ?2 ?. ? ?1 ? ? ?2
13 Soundness via denotational semantics Assume that: ?????(? ???? ? ? ) Then: ? ???? (? ?) By previous theorems: ? ? ???? hence: ? ? ? which is equivalent to: ? ?
14 Automating induction Currently support fixpoint induction add Z y = y add (S x) y = S (add x y) add ? CF -> CF -> CF Assume contract holds for uninterpreted function add_rec add Z y = y add (S x) y = S (add_rec x y) add_rec CF -> CF -> CF --------------------------- add ? CF -> CF -> CF NB: A sound thing to do by admissibility of contracts
15 Admissibility and why it matters In Haskell, data types are not inductive. Hence your familiar induction principle is simply unsound! ones = 1 : ones f Z = [] f (S x) = 1 : f x Lemma: forall x. f x ones Proof: Holds for UNR Holds for Z Assume holds for x; then holds for (S x) Right? Logical inequality, not admissible! WRONG! Let: u = S u Then: f u = ones
16 Admissibility and induction Admissibility = If P is true for all elements of a chain, then true for the limit. Not all predicates are admissible ?????????? ? ?(UNR) ? ?(? ) ?(??? ?) [FixInd] Theorem: All predicates ? ? are admissible. Comes for-free! Base contracts are Haskell functions, and those are continuous!
17 Happily implemented on top of GHC API Z3 rocks for provable properties! Disclaimer: 40-80 FOL axioms/problem Use of fixpoint induction
18 More features (and non-features) More features: Incremental verification Prove spec for g , use either the spec or definition of g , or both to prove other specifications Some support for lemmas Mutual (automatic) fixpoint induction Primitive arithmetic constraints via SMT2 (in Z3) Experimental features: logical equality, finite unfoldings Not there: Pre/post inference, strengthening of IH Support for counterexamples (see next slide)
19 What s next: counterexamples Unprovable contracts (because they re false or we re incomplete) Paradox Equinox Z3 Vampire E-prover AnyMorphism.big_sat_app_any_morphism_fail_step P:---- X:---- Z:---- V:---- E:---- Loop.sat_id_loop_pred P:0.00 X:0.01 Z:0.01 V:---- E:0.01 Loop.sat_id_recursive_true P:---- X:---- Z:---- V:---- E:0.01 PredLog.sat_concatMap_cf_missing_step P:---- X:---- Z:---- V:---- E:---- PredLog.sat_concatMap_retains_missing_step P:---- X:---- Z:---- V:---- E:---- PredLog.sat_flattenAnd_cf_missing_step P:---- X:---- Z:---- V:---- E:---- PredLog.sat_flattenAnd_retains_missing_step P:---- X:---- Z:---- V:---- E:---- ... Recursion.sat_qfac_cf_broken_step P:---- X:---- Z:---- V:---- E:---- Recursion.sat_rev_cf_broken_step P:---- X:---- Z:---- V:---- E:---- Risers.big_sat_risersBy_nonEmpty_broken2_step P:---- X:---- Z:---- V:---- E:---- Risers.big_sat_risersBy_nonEmpty_broken_step P:---- X:---- Z:---- V:---- E:---- Risers.sat_risers_broken2_step P:---- X:---- Z:---- V:---- E:---- Risers.sat_risers_broken3_step P:---- X:---- Z:---- V:---- E:---- Risers.sat_risers_broken_step P:---- X:---- Z:---- V:---- E:---- Risers.sat_risers_missing_le_step P:---- X:---- Z:---- V:---- E:---- Shrink.big_sat_shrink_lazy_step P:---- X:---- Z:---- V:---- E:---- Timeouts We now know why, and how to address this: stay tuned
20 What s next: usability Proving is reasonably fast, now explore: Automatic strengthening of induction hypotheses Pretty printing models as counterexamples More induction principles Testing in larger scale Interfacing with theorem provers for manual proofs? Lots of man-hours needed, come help please!
Related work ESC/Haskell [Xu et al] Contracts are programs Symbolic execution/inlining Zeno [Sonnex et al] Automated equality proofs Clever heuristics Strict semantics Catch [Mitchell] Pattern match errors Via dataflow analysis Liquid Types [Jhala et al] Predicate abstraction Inference Quantifiers driven by type system Dafny & Boogie [Leino et al], ACL2 Leon [Suter et al] Specialized decision procedure for FP Good for first-order F7/F* [Swamy et al] Hoare logic for FP [Regis-Gianas & Pottier] HO logics CBV *really* helps HO model checking, MoChi [Kobayashi et al] Specialized decision procedures Lots of techniques stacked Good for inference, good for counterexamples HOLCF-based verification [Huffman] Reasoning in a very rich logic that contains formalization of a domain theory More sophisticated axiomatization, ability to reason about parametricity and monad laws Symbolic execution-based [Tobin- Hochstadt and Van Horn][Xu] Abstraction, lots of smaller queries to theorem prover
22 What we did and what I learnt We ve given a semantic basis for the verification of Haskell programs We demonstrated that it is implementable We can verify FP in a simple and robust way: For this particular case a simple solution seems to do the job. It appears affordable to use a very precise abstraction of your program and trust your 2013 theorem proving technology Thank you for your attention!