Show HN: Talos – Open-source WASM interpreter for Lean

(github.com)

81 points | by mfornet a day ago

21 comments

keithwinstein an hour ago
Looks very interesting! We have done a lot with WasmCert-Isabelle (and there's also WasmRef-Isabelle and their 2023 paper, and the earlier WasmCert-Coq); other than being in Lean instead of Isabelle/HOL or Coq, how would you compare the approach you used? E.g. are you able to do an "in-place Store" like WasmRef-Isabelle, and can you represent memories and tables as plain vectors of bytes/refs in memory, can you grow them in-place, etc.? Or any other optimizations/lessons learned?
I'm also curious -- are you just implementing the Wasm binary and text parsing, validation algorithm, and execution semantics in Lean from scratch by reading the English prose in the spec document, and then checking it against the spec tests and the SpecTec description? Or do you have some sort of automated (classical or LLMy) transformation happening? (One could imagine directly transforming the SpecTec, or the OCaml reference interpreter, into Lean... but it sounds like you're not doing that? I think one needs to be a little careful here because e.g. at this point some of the English prose and reference interpreter implementation, and I think maybe some of the tests, are autogenerated from the SpecTec.) Which parts (if any) are outside the scope of the formalization? E.g. for WasmCert-Isabelle, I believe the binary and definitely the text parsing, and I think some of the arithmetic ops, are not covered.
How are you modeling the explicit sources of nondeterminism in the Wasm execution semantics? E.g. NaN representation, {memory., table.}grow, host calls, stack exhaustion, relaxed SIMD instructions, etc., and that's all before we get to the threads proposal? Because if the goal is to prove programs correct, one risk is that I prove my program correct against your Wasm interpreter (which maybe makes certain choices that aren't determined by the spec), and then I run it against another fully-conforming interpreter in the wild and it behaves incorrectly.
himata4113 13 hours ago
Talos is already in use by https://github.com/siderolabs/talos. I was confused for a second when I saw Talos and Wasm together, and got excited about native Wasm pod support.
- jazzyjackson 11 hours ago
  Also collides with the Power9 desktop system https://www.raptorcs.com/TALOSII/
jacobjwalters 7 hours ago
What is the program logic used here? The num_integer verification example seems to be hardcoding addresses in the spec; what if I want to reason about larger programs that dynamically allocate, where the addresses may not be known statically? How can I make sure these do not overlap? And since this is a shallow embedding into lean, what’s the approach for verifying properties of non-terminating programs?
- mfornet 5 hours ago
  > what if I want to reason about larger programs that dynamically allocate, where the addresses may not be known statically? How can I make sure these do not overlap?
  We are actively working on this, as it is a pre-condition :P to reason about the simplest of useful programs. The idea is to develop an API around separation logic that allows you to reason about code that manipulates non-overlapping regions of memory.
  It won't matter if addresses are not known statically, since the API theorems will be parametrized over irrelevant constants such as addresses, function indices, etc.
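  To make "parametrized over addresses" concrete, here is a toy Lean 4 sketch. It is not the Talos API: `Mem`, `write` and `read_write_other` are made-up names, and memory is modeled as a plain function from addresses to bytes. The point is only that the lemma holds for any two distinct addresses `p` and `q`, so no constant is hardcoded.

  ```lean
  -- Hypothetical toy memory model: a total map from addresses to bytes.
  abbrev Mem := Nat → UInt8

  -- Writing one byte returns a new memory that differs only at `addr`.
  def write (m : Mem) (addr : Nat) (v : UInt8) : Mem :=
    fun a => if a = addr then v else m a

  -- Frame-style lemma: a write at `p` leaves every other address `q` untouched.
  -- `p` and `q` are universally quantified, not fixed constants.
  theorem read_write_other (m : Mem) (p q : Nat) (v : UInt8) (h : p ≠ q) :
      write m p v q = m q := by
    show (if q = p then v else m q) = m q
    exact if_neg (Ne.symm h)
  ```

  A separation-logic API would presumably lift this from single addresses to whole disjoint regions, but that part is left out here.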
  > And since this is a shallow embedding into lean, what’s the approach for verifying properties of non-terminating programs?
  To use the interpreter there is the concept of fuel, which we explicitly hide from the reasoning layer. Using fuel you can write statements of the form "this function runs out of fuel for any amount of fuel passed to the interpreter", which is equivalent to proving that your program doesn't terminate.
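  A minimal Lean 4 sketch of that shape of statement, assuming a toy step function instead of real Wasm semantics. `Outcome`, `run`, `spin` and `spin_diverges` are illustrative names, not Talos definitions.

  ```lean
  -- Hypothetical toy model: a run either finishes with a value or exhausts its fuel.
  inductive Outcome where
    | done (value : Nat)
    | outOfFuel

  -- Fuel-bounded interpreter over a Nat state; `step` returning `none` means "halt".
  -- Recursion is structural on the fuel argument, so Lean accepts it as total.
  def run (step : Nat → Option Nat) : Nat → Nat → Outcome
    | 0, _ => Outcome.outOfFuel
    | fuel + 1, s =>
      match step s with
      | none => Outcome.done s
      | some s' => run step fuel s'

  -- A program that never halts: it always moves to the successor state.
  def spin (s : Nat) : Option Nat := some (s + 1)

  -- Non-termination, phrased as "out of fuel for every amount of fuel".
  theorem spin_diverges : ∀ (fuel s : Nat), run spin fuel s = Outcome.outOfFuel := by
    intro fuel
    induction fuel with
    | zero => intro s; rfl
    | succ n ih => intro s; exact ih (s + 1)
  ```

  The induction hypothesis has to quantify over the state `s`, because each step changes it; that is why `s` is introduced inside each case rather than before the induction.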
  - jacobjwalters 2 minutes ago
    Is the plan to build a new separation logic framework, or use e.g. iris-lean or splean as a base?
    And even if fuel isn’t exposed in the program logic, I’d imagine you’d still want step indexing to allow reasoning around cyclic heap structures. My experience with getting Claude (even fable) to do step indexed logical relations/SL proofs autonomously is that it struggles hard even on toy examples, particularly around the index shuffling. Do you have a setup in mind that can scale to larger programs?
    Fun project in any case! I look forward to seeing how it develops :)
quietusmuris a day ago
Interesting. Do I have to write specs in Lean against the Wasm semantics or can you annotate Rust directly?
- mfornet 5 hours ago
  Both.
  You can "annotate" your Rust code using asserts. On the Wasm side, asserts are converted to trap instructions, so the Lean spec will simply be: for every input, this code never traps.
  Part of our focus is making sure that specs are both easy to write and easy to read, since they are human-facing. Eventually you could imagine that writing code will mostly mean writing specs, with both the code and the proofs handled by AI agents. In that scenario it is very important that humans can easily audit and modify the specs.
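  As a rough illustration of what a "never traps" spec could look like, here is a toy Lean 4 sketch. It does not use the Talos interpreter: `CallResult`, `scaled` and `scaled_never_traps` are invented names, and the function is a hand-written stand-in for compiled Rust along the lines of `let d = x % 7 + 1; assert!(d != 0); 100 / d`.

  ```lean
  -- Hypothetical toy model: a call either returns a value or traps.
  inductive CallResult where
    | ok (value : Nat)
    | trap

  -- Stand-in for the compiled function: the failed-assert path becomes a trap.
  def scaled (x : Nat) : CallResult :=
    if x % 7 + 1 = 0 then CallResult.trap
    else CallResult.ok (100 / (x % 7 + 1))

  -- The spec: for every input, the call does not trap.
  theorem scaled_never_traps : ∀ x : Nat, scaled x ≠ CallResult.trap := by
    intro x h
    unfold scaled at h
    split at h
    · omega    -- assert-failure branch: `x % 7 + 1 = 0` is impossible
    · cases h  -- normal branch: `ok _ = trap` is impossible
  ```

  The spec itself is the one-line theorem statement; everything after `:= by` is proof, which is the part a reader auditing the spec would not need to inspect.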
  - quietusmuris 3 hours ago
    Doesn't that put the Rust compiler (and its assert lowering) in the trusted base? How do you know the asserts you wrote are the traps you're reasoning about?
  - IshKebab 2 hours ago
    How do you actually prove it though? I understand if it's fully automated SMT-style proof, but doesn't Lean require tediously explicit proofs? If it doesn't prove automatically do you have to write out Lean helper proofs about the compiled WASM?
lukerj00 a day ago
I’m on the Cajal team - not OP, but happy to answer questions.
The core bet is that Wasm is a good verification target (close to compiled artifacts, many languages target it), and Lean is the right place to do verification.
Super interested in hearing from people working with Lean, compilers or other Wasm verification frameworks (eg Iris-Wasm).
- jsmorph an hour ago
  Cool. I've been working on a compiler for a subset of Lean that targets WASM. The compiler is implemented in Lean.
  https://github.com/jsmorph/leanexe
  I think I managed to use Talos to prove the WAT generated from an example LeanExe program is correct. ?
  https://gist.github.com/jsmorph/275a15dc21af037e1d02a1b433be...
  Fun.
- kdavis 11 hours ago
  What other verification targets did you consider?
  - mfornet 5 hours ago
    Initially we considered formalizing Rust code; Aeneas is a very promising project that would unlock a lot of features right away by transpiling to Lean. However, we didn't want to lock ourselves into Rust, so we decided to use a lower-level target so that we could verify code from "any" language.
    We considered LLVM IR and RISC-V.
    Ultimately Wasm felt like the right decision. More importantly, the Wasm spec is very well done in its details, it was written with formal verification in mind early on, and there are plans to include Lean as one of the targets for generating the spec automatically from SpecTec. Once this exists, we will formalize that our interpreter is correct with respect to the definition generated from the official Wasm-Lean-Spec, so that we can remove it from the trusted base going forward.
  - lukerj00 2 hours ago
    More on this - LLVM-IR has no official formal semantics and it's riddled with UB. RISC-V has a formal model in Sail, but it's an ISA so you throw away the structured control flow and types which we want for proving.
    Wasm has different levels we can validate against: starting with the W3C test suite, then later full verification against the SpecTec-generated Lean semantics, so that we can drop our interpreter from the trusted base.
BobbyTables2 an hour ago
Only thing left is to make a Kanban out of it…
oulipo2 5 hours ago
Interesting, have you also looked at other formal methods, like Abstract Interpretation?
CurryFurry 8 hours ago
For "Lean"? LeaRn? Lean Manufacturing? Stupid one-word techbro product names.
- johnsonjo 6 hours ago
  Lean is a programming language [1]
  > Lean is an open-source programming language and proof assistant that enables correct, maintainable, and formally verified code
  [1]: https://lean-lang.org/
sohex 5 hours ago
Do people just not even search their proposed name anymore?
- IshKebab 5 hours ago
  It's ok for multiple different things to use the same name.
  https://en.wikipedia.org/wiki/Lynx_(disambiguation)
  I wish these comments were banned. They come up every time someone names a project with a name that was also used by one guy for his forgotten Lisp dialect in the 70s.