Zulip Chat Archive

Stream: general

Topic: Discussion about using lean as smart contract language


ZHAO Jiecheng (Nov 02 2023 at 10:01):

I have a preliminary idea I'd like to discuss. Could Lean be the next-generation language for smart contracts and get a significant share of users?

A smart contract essentially calculates the next state of the blockchain based on a given transaction. Here's a simple breakdown: t represents the transaction, f is the smart contract, and s denotes the state. So, the new state becomes s' := f(t, s) after the transaction. A significant challenge is ensuring that no malicious transactions exists, which might alter the state in harmful ways, such as rug pulls or hacks. While some have tried to address this using the formal semantics of the runtime, such methods offer limited assurance. As a result, the bulk of contract verification heavily relies on third-party audits.

This is where I see the potential for Lean to revolutionize the process. Developers could publish their smart contracts alongside proofs that clearly define and guarantee the contract's behavior. For instance, let P represent the claim that "everyone can withdraw their money." A proof such as ∀ s t, P f(t, s) s would ensure there's no possibility of a rug pull.

The beauty of this approach is its accessibility. Anyone could write these proofs, reducing our dependence on highly costly third-party audits that only a few institutions offer. In blockchain communities, many feel that such institutional requirements contradict the foundational principle of permissionless access that blockchain should uphold.

If we integrate Lean with blockchain, it can open doors to something more interesting. For instance, when a developer submits code, they can also declare certain properties and provide a corresponding proof to the chain. The chain then validates this proof. If everything aligns, the code gets deployed. Moreover, if a developer later submits an updated code with a proof ensuring the same property, they can seamlessly upgrade the initial code. Such flexibility contrasts with our present situation, where smart contracts are either non-upgradable or can be modified without restrictions, which is quite frustrating.

Furthermore, developers can upload their code and its desired properties, setting a bounty as an incentive. If someone from the community successfully provides a proof that validates the stated properties of the code, they earn the reward. This not only accelerates the proof-generation process but also fosters collaboration and trust within the community.

I would like to write a toy system first. I'd appreciate any ideas, suggestions, questions or concerns you might have.

Martin Harrison (Nov 02 2023 at 10:16):

ZHAO Jiecheng said:

I have a preliminary idea I'd like to discuss. Could Lean be the next-generation language for smart contracts and get a significant share of users?

A smart contract essentially calculates the next state of the blockchain based on a given transaction. Here's a simple breakdown: t represents the transaction, f is the smart contract, and s denotes the state. So, the new state becomes s' := f(t, s) after the transaction. A significant challenge is ensuring that no malicious transactions exists, which might alter the state in harmful ways, such as rug pulls or hacks. While some have tried to address this using the formal semantics of the runtime, such methods offer limited assurance. As a result, the bulk of contract verification heavily relies on third-party audits.

This is where I see the potential for Lean to revolutionize the process. Developers could publish their smart contracts alongside proofs that clearly define and guarantee the contract's behavior. For instance, let P represent the claim that "everyone can withdraw their money." A proof such as ∀ s t, P f(t, s) s would ensure there's no possibility of a rug pull.

The beauty of this approach is its accessibility. Anyone could write these proofs, reducing our dependence on highly costly third-party audits that only a few institutions offer. In blockchain communities, many feel that such institutional requirements contradict the foundational principle of permissionless access that blockchain should uphold.

If we integrate Lean with blockchain, it can open doors to something more interesting. For instance, when a developer submits code, they can also declare certain properties and provide a corresponding proof to the chain. The chain then validates this proof. If everything aligns, the code gets deployed. Moreover, if a developer later submits an updated code with a proof ensuring the same property, they can seamlessly upgrade the initial code. Such flexibility contrasts with our present situation, where smart contracts are either non-upgradable or can be modified without restrictions, which is quite frustrating.

Furthermore, developers can upload their code and its desired properties, setting a bounty as an incentive. If someone from the community successfully provides a proof that validates the stated properties of the code, they earn the reward. This not only accelerates the proof-generation process but also fosters collaboration and trust within the community.

I would like to write a toy system first. I'd appreciate any ideas, suggestions, questions or concerns you might have.

Which blockchain? Are you proposing to build a new network in Lean?

ZHAO Jiecheng (Nov 02 2023 at 10:33):

No idea yet. Maybe sub-chain in other eco? Just talk about the idea first. But, yes, integration needs a chain.

Yaël Dillies (Nov 02 2023 at 11:03):

Do you know about Cairo?

Eric Wieser (Nov 02 2023 at 11:08):

Lean is in some sense "perfectly trustworthy" for mathematics, but this isn't the case for executable programs; there are plenty of ways to perform a rug pull where you prove your program does one thing in the kernel, but when run in the real world it does something different.

Eric Wieser (Nov 02 2023 at 11:09):

So as usual, the way to tell that no rugs are being pulled is to carefully read the source code

ZHAO Jiecheng (Nov 02 2023 at 11:37):

Yaël Dillies said:

Do you know about Cairo?

Yes. But I think STARKs is not related to the idea.

ZHAO Jiecheng (Nov 02 2023 at 12:24):

Eric Wieser said:

So as usual, the way to tell that no rugs are being pulled is to carefully read the source code

While this holds true for most development scenarios, it's the reason functional programming languages like Haskell aren't as popular as people like them. However, what I'd like to discuss is whether this remains the case in the context of smart contracts.

  1. Upgrading a smart contract is very challenging, as many are designed to be immutable. Developing smart contracts is more akin to chip manufacturing than traditional software development. (It's not 'soft' at all.)
  2. Once an attack starts, halting the chain or the program becomes difficult.
  3. The financial loss can be substantial.
  4. Unlike in traditional software development where spotting bugs is primarily the developer's responsibility, in crypto, users need to be familiar with the code to safeguard their assets. Most users can't understand the code and are dependent on auditing reports. For those capable of understanding the code, reading through thousands of lines just to interact with the contract safely is time-consuming and low-efficiency. I think here proof would be helpful.
  5. Only institutions with a high reputation can provide trustworthy auditing reports. This naturally leads to a monopoly, driving up costs for developers and users.

While "no rug" isn't a precisely defined term, it's still good to show that a contract adheres to certain well-established properties.

After witnessing numerous hacking incidents, I'm convinced that employing a language with a strong type system and few vulnerabilities is essential. Presently, even audits that cost millions of dollars can overlook potential risks. Solidity and the EVM lack the robustness to effectively prevent bugs. The EVM can be viewed as an early implementation of Vitalik's idea, introduced prematurely and leaving many issues unresolved. There inevitably will be a successor to address these shortcomings.

ZHAO Jiecheng (Nov 02 2023 at 12:26):

Eric Wieser said:

where you prove your program does one thing in the kernel, but when run in the real world it does something different.

How that would be, could you give me some examples. Like there are bugs in the kernel or ...

Eric Wieser (Nov 02 2023 at 12:34):

Sure, here's an example:

def one := 1

@[implemented_by one]
def two := 2

-- in the kernel, two is still two
theorem two_is_two : two = 2 := rfl

-- but we told the compiler / interpreter that it can use `one` to execute `two`
#eval two -- prints `1`

Eric Wieser (Nov 02 2023 at 12:35):

Like there are bugs in the kernel

No, but the kernel is not the same thing as the VM, and you are implicitly trusting that the type-theoretic algorithms are correctly compiled into C / handled by the interpreter.

ZHAO Jiecheng (Nov 02 2023 at 12:53):

Eric Wieser said:

Sure, here's an example:

def one := 1

@[implemented_by one]
def two := 2

-- in the kernel, two is still two
theorem two_is_two : two = 2 := rfl

-- but we told the compiler / interpreter that it can use `one` to execute `two`
#eval two -- prints `1`

It is interesting and helpful. Thank you very much. Is there any way to avoid this. Would an external checker or a dialect help?

ZHAO Jiecheng (Nov 02 2023 at 12:58):

Eric Wieser said:

Like there are bugs in the kernel

No, but the kernel is not the same thing as the VM, and you are implicitly trusting that the type-theoretic algorithms are correctly compiled into C / handled by the interpreter.

Does what you mean here the kernel of the compiler or type checker? I think we always have to trust some basic infrastructure. But maybe the less 'axioms' the better.

Shreyas Srinivas (Nov 02 2023 at 13:00):

ZHAO Jiecheng said:

I would like to write a toy system first. I'd appreciate any ideas, suggestions, questions or concerns you might have.

The problem of using lean itself as a language has been explained by Eric. But even if you cannot get code extraction, just having an EDSL for reasoning about smart contracts is good to have. You can restrict the kind of constructions allowed there. Lean is nice for implementing EDSLs

ZHAO Jiecheng (Nov 02 2023 at 13:16):

Shreyas Srinivas said:

ZHAO Jiecheng said:

I would like to write a toy system first. I'd appreciate any ideas, suggestions, questions or concerns you might have.

The problem of using lean itself as a language has been explained by Eric. But even if you cannot get code extraction, just having an EDSL for reasoning about smart contracts is good to have. You can restrict the kind of constructions allowed there. Lean is nice for implementing EDSLs

I may not catch your idea well. But do you mean something like this? https://github.com/ivan71kmayshan27/coq-evm or https://github.com/runtimeverification/evm-semantics ?

ZHAO Jiecheng (Nov 02 2023 at 13:17):

(deleted)

Jannis Limperg (Nov 02 2023 at 13:17):

There's a bunch of prior work in this area. Look through recent ITP/ICFP/POPL/PLDI proceedings and you should find some papers about verifying smart contracts. Smart contracts are indeed an attractive verification target, but as always the devil is in the details.

ZHAO Jiecheng (Nov 02 2023 at 13:23):

Jannis Limperg said:

There's a bunch of prior work in this area. Look through recent ITP/ICFP/POPL/PLDI proceedings and you should find some papers about verifying smart contracts. Smart contracts are indeed an attractive verification target, but as always the devil is in the details.

I saw some of them. What I am interested in is would it be easy if we write the contract in Lean and verify it in Lean.

Jannis Limperg (Nov 02 2023 at 13:26):

Wrt using Lean for verified software, when you run a verified program in Lean, you trust that

  • the Lean compiler has no bugs;
  • all @[implemented_by] and @[extern] annotations are correct;
  • the Lean kernel (or leanchecker or lean4lean) has no bugs.

Other systems have much smaller trustbases, particularly HOL Light with the CakeML/Candle project.

However, for a smart contracts project you likely wouldn't use the Lean compiler anyway (because you need to output your blockchain language, not C), so then the first two points are not a concern. As Shreyas points out, you likely don't want to compile all of Lean either (because that would be way too inefficient); rather you would work with a deep embedding of the blockchain language, or perhaps of a custom language that compiles to the blockchain language.

Jannis Limperg (Nov 02 2023 at 13:28):

ZHAO Jiecheng said:

I saw some of them. What I am interested in is would it be easy if we write the contract in Lean and verify it in Lean.

If you write the contract in Lean, you have to compile it to an efficient blockchain program. This is highly nontrivial because blockchains have such severe resource limits that the compiler would have to optimise near-perfectly.

ZHAO Jiecheng (Nov 02 2023 at 13:30):

Jannis Limperg said:

Wrt using Lean for verified software, when you run a verified program in Lean, you trust that

  • the Lean compiler has no bugs;
  • all @[implemented_by] and @[extern] annotations are correct;
  • the Lean kernel (or leanchecker or lean4lean) has no bugs.

Other systems have much smaller trustbases, particularly HOL Light with the CakeML/Candle project.

However, for a smart contracts project you likely wouldn't use the Lean compiler anyway (because you need to output your blockchain language, not C), so then the first two points are not a concern. As Shreyas points out, you likely don't want to compile all of Lean either (because that would be way too inefficient); rather you would work with a deep embedding of the blockchain language, or perhaps of a custom language that compiles to the blockchain language.

Understood, thank you.

Eric Wieser (Nov 02 2023 at 13:32):

Perhaps worth noting that of those bullet points, mathematicians only really care about the last one (which is also the one that needs the least trust)

ZHAO Jiecheng (Nov 02 2023 at 13:33):

Jannis Limperg said:

ZHAO Jiecheng said:

I saw some of them. What I am interested in is would it be easy if we write the contract in Lean and verify it in Lean.

If you write the contract in Lean, you have to compile it to an efficient blockchain program. This is highly nontrivial because blockchains have such severe resource limits that the compiler would have to optimise near-perfectly.

Some block-chains support wasm. I saw some one is porting Lean to wasm. Does that help?

ZHAO Jiecheng (Nov 02 2023 at 13:34):

Eric Wieser said:

Perhaps worth noting that of those bullet points, mathematicians only really care about the last one (which is also the one that needs the least trust)

Just out of curiosity, why?

Jannis Limperg (Nov 02 2023 at 13:54):

ZHAO Jiecheng said:

Some block-chains support wasm. I saw some one is porting Lean to wasm. Does that help?

A bit, but you'd still have to ensure that your Lean smart contracts compile to extremely efficient wasm.

Shreyas Srinivas (Nov 02 2023 at 15:28):

Jannis Limperg said:

ZHAO Jiecheng said:

Some block-chains support wasm. I saw some one is porting Lean to wasm. Does that help?

A bit, but you'd still have to ensure that your Lean smart contracts compile to extremely efficient wasm.

You also probably don't want to reason about your contracts at the wasm level anyway. What you need is a high level DSL where you can reason about smart contracts at a high level (symbolically) and then have verified translations to some IR up to some semantics, and so on until you reach wasm.

Shreyas Srinivas (Nov 02 2023 at 15:29):

As long as you stick to DSLs, you don't have to trust the full lean compiler.

Shreyas Srinivas (Nov 02 2023 at 15:31):

Ofc, this may not be a fully satisfactory answer, because I am carefully asking you to avoid contact with the real world, but these simplification and abstraction steps restrict your means of interaction with the real world, and prove prpperties up to that point, which is the best you can achieve anyway with FV

Shreyas Srinivas (Nov 02 2023 at 15:39):

Reasoning about a full programming language for distributed systems is simply overkill.

ZHAO Jiecheng (Nov 03 2023 at 00:12):

Thank you all for the information. I changed my mind. It may not be a good idea. Thank you~

Adomas Baliuka (Nov 03 2023 at 03:46):

Eric Wieser said:

Perhaps worth noting that of those bullet points, mathematicians only really care about the last one (which is also the one that needs the least trust)

I guess these "soundness questions" will keep coming up. Perhaps there should be a page to link to that explains the situation and the relevant point of view in Lean and Mathlib?

That bullet list is not complete (from what I've seen mentioned as examples). To illustrate: How hard would it be (for a group of experts) to set up a CI system where anyone can upload code and someone proving False while passing appropriate additional checks indicates a kernel bug (or type system inconsistency)? My impression from what people have said is "very hard", while the list being complete might be interpreted as saying that just "banning the words implemented_by and extern" would do it. (I'm not suggesting anyone should try this)

I'm beginning to understand why mathematicians (who care about formalization) don't care about this but I'm not quite there yet. For example I could see that coming up when using automation, especially language models. Eventually some system might find one of these loopholes and it may be hard to detect due to it happening inside macros and tactics. D. J. Bernstein has also mentioned this possibility in his recent paper.

Scott Morrison (Nov 03 2023 at 03:55):

To be honest I think that Mathlib CI is a pretty close approximation of what you've asked for: we compile your code with Lean, then recheck the oleans with lean4checker (and soon lean4lean?).

If your proof of False passes that, you've either found a boring bug in our CI setup, or something interesting.

Scott Morrison (Nov 03 2023 at 03:55):

e.g. I don't know how to pass that.

Scott Morrison (Nov 03 2023 at 03:56):

(Okay, if my PR disables CI, or changes the version of lean4checker, etc...)

Adomas Baliuka (Nov 03 2023 at 04:21):

Scott Morrison said:

e.g. I don't know how to pass that.

Will it prevent me from adding axioms? What I'm trying to understand is what exactly the "pretty close" means in "pretty close approximation". (I do realize that I probably never will understand this exactly. It's still interesting to try.)

Mario Carneiro (Nov 03 2023 at 04:26):

not as currently configured

Scott Morrison (Nov 03 2023 at 05:27):

We should add this somewhere in Mathlib CI, I guess. Any takers? Even just import Mathlib and then scan the constants in the environment for unfamiliar axioms?

ZHAO Jiecheng (Nov 03 2023 at 06:08):

Scott Morrison said:

To be honest I think that Mathlib CI is a pretty close approximation of what you've asked for: we compile your code with Lean, then recheck the oleans with lean4checker (and soon lean4lean?).

If your proof of False passes that, you've either found a boring bug in our CI setup, or something interesting.

Do you mean these checks?
Screenshot-from-2023-11-03-14-06-21.png

Bolton Bailey (Nov 03 2023 at 06:25):

Readers of this thread might be interested in this job posting from a couple months ago, it seems Nethermind might be interested in verifying smart contracts as well.

Scott Morrison (Nov 03 2023 at 11:22):

@ZHAO Jiecheng, yes, if you look at what CI does it runs lean4checker.

Eric Wieser (Nov 03 2023 at 11:37):

Adomas Baliuka said:

Eric Wieser said:

Perhaps worth noting that of those bullet points, mathematicians only really care about the last one (which is also the one that needs the least trust)

I guess these "soundness questions" will keep coming up. Perhaps there should be a page to link to that explains the situation and the relevant point of view in Lean and Mathlib?

I think this would be a very good idea. @Scott Morrison, do you think this belongs in the Lean docs, or the community website?

Scott Morrison (Nov 04 2023 at 04:23):

Both would be good, I think. The Lean docs should just point to the available tools and explain the situation. The Mathlib docs could be more opinionated about "what mathematicians should care about".


Last updated: Dec 20 2023 at 11:08 UTC