Zulip Chat Archive

Stream: general

Topic: Making Lean more accessible with Sophize


view this post on Zulip Abhishek Chugh (Sep 30 2020 at 16:21):

Hello! I would like to introduce the Lean community to a Mathematics Library that I have been working on for over a year.
Sophize is a non-profit online mathematics library and academic discussion platform. We are trying to create a deeply connected network of mathematical knowledge from a variety of sources - including formal systems such as lean.

As a part of this project, we are incorporating knowledge from formal systems and making them more accessible. The following short video gives an overview of our work with the Metamath language.
https://youtu.be/XNJ2w71Gi50

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 16:27):

We would like to incorporate the wealth of knowledge you have created into our platform and extend our services to the Lean community. We can help make Lean more accessible to the broader mathematics community. Our broader goals for mathematics are outlined here:
https://www.youtube.com/watch?v=Wb1JbW9Otek&list=PLdfnRS1Iea0219QBYU_jF9D8F9VPUIq1z

I am looking for feedback from the community to help focus on what the community finds most valuable

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 16:31):

Would you folks be generous to fill-in this short form so that we can better understand the community's views?
https://forms.gle/WSYc44wCadxRUyPe8

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 16:54):

Some people (e.g., me) prefer reading text to watching videos.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 16:55):

Ah, I see. Thanks for letting me know.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 16:57):

So the basic idea - for Lean - would be to incorporate all the proofs and theorems that have been developed by the community. We provide very simple and intuutive to look up the meanings of every term/proposition in the proof.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 16:58):

Secondly, we organize the proofs in the form of a graph (built of individual arguments) where you can keep browsing the proof as much as you like (till you reach the axioms)

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 16:59):

Have you already incorporated theorems from some other library of formal proofs? If yes, could you please post a link to your presentation of the library?

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 17:00):

Do you want to deal with the proof tree only, or with the tactic mode source?

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:00):

Thirdly, we allow you to run proof -generating code on our servers that whose output gets displayed on the fly on the website.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 17:01):

In either case, do you have some way of separating "important" dependencies from inevitable low-level ones? E.g., every rw or simp injects lots of boilerplate code in the proof tree.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:02):

Yury G. Kudryashov said:

Have you already incorporated theorems from some other library of formal proofs? If yes, could you please post a link to your presentation of the library?

Yes, we have done so for the Metamath language. The video (<4min) shows what we have done. The link is at https://sophize.org/s/metamath

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:03):

Yury G. Kudryashov said:

In either case, do you have some way of separating "important" dependencies from inevitable low-level ones? E.g., every rw or simp injects lots of boilerplate code in the proof tree.

Since we have full control over the entire process, we can do such things easily. Each language needs special support and we will handle such things

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:05):

Yury G. Kudryashov said:

Do you want to deal with the proof tree only, or with the tactic mode source?

I was thinking about the same and it doesn't look like things are very useful without the tactic mode support. So we are thinking of focusing more on that rather than the unwrapped tree.

view this post on Zulip Johan Commelin (Sep 30 2020 at 17:06):

But it will be a lot harder to get the data you want, in that case.

view this post on Zulip Johan Commelin (Sep 30 2020 at 17:07):

You can ask lean for the proof tree, but not for the tactic script that generated it. So you will need to parse the lean source code. That's a non-trivial thing to do.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 17:07):

Or you can ask the lean server info at each symbol.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:08):

I am assuming that there Lean already has a parser - in C/C++, am I right?

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:08):

I went through the same process with metamath (although it was probably an order of magnitude simpler I suppose)

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 17:08):

AFAIR, someone posted highlighted sources obtained by asking the lean server about each symbol.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:09):

That would be pretty great. Would simplify a lot of things

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:15):

So my main focus for this thread is to understand these sorts of things - which are the biggest pain-points/opportunities in terms of

  • presentation of formal Lean proofs
  • sharing your proofs with descriptions - with the world to explain what you have done
  • making people understand how rigorous your work is
  • other opportunities to make formal mathematics popular/accessible

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:21):

Yury G. Kudryashov said:

AFAIR, someone posted highlighted sources obtained by asking the lean server about each symbol.

Can you give me a hint about where I could find this?

view this post on Zulip Eric Wieser (Sep 30 2020 at 17:26):

You may be interested in https://www.imo.universite-paris-saclay.fr/~pmassot/lean/, which looks a bit like one of your "smart articles", and is generated from source code with comments

Edit: this is generated with https://leanprover-community.github.io/format_lean/, which is not actively maintained.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:27):

This is great. They must have gone through a very similar process to create this. Thanks Eric

view this post on Zulip Pedro Minicz (Sep 30 2020 at 17:29):

Cool. I've stumbled upon your channel yesterday while searching about Metamath.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:30):

Oh. Great! I guess you could be our first 'organic' viewer

view this post on Zulip Pedro Minicz (Sep 30 2020 at 17:32):

Abhishek Chugh said:

I went through the same process with metamath (although it was probably an order of magnitude simpler I suppose)

I believe @Mario Carneiro is going to be the best person to access how hard that is going to be. But from what I've seen it seems that the process will be more than an order of magnitude harder in Lean.

view this post on Zulip Mario Carneiro (Sep 30 2020 at 17:33):

Yep, it's basically never been done before

view this post on Zulip Mario Carneiro (Sep 30 2020 at 17:33):

The only practical way to get information about parsed lean source is to hack lean directly

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:34):

Hey Mario, great to see you here.

view this post on Zulip Mario Carneiro (Sep 30 2020 at 17:34):

or find a way to get what you need through the lean API

view this post on Zulip Mario Carneiro (Sep 30 2020 at 17:34):

also hello again

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:35):

I see. Can you give me a hint why the lean parser would not have the information I am looking for?

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:37):

Is it different from the regular language parsers (some complex stages etc.) or is it that the parsed data is going to be very hard to make sense of?

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:48):

So, what I am getting from here that displaying a tactic based proof - which hides (or reduces visibility) of some of the most repeated tactics would be the best way to present proofs.
And that hacking into the parser would be my best bet to get this information.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 17:51):

could I get some feedback on whether actually doing this would be helpful to the community. Or reading/sharing proofs is quite easy for you already?

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 18:11):

Reading proofs is relatively easy for experienced users but it is not that easy for new users.

view this post on Zulip Pedro Minicz (Sep 30 2020 at 18:12):

I'd also argue that some (many?) proofs on mathlib were not made to be read.

view this post on Zulip Pedro Minicz (Sep 30 2020 at 18:12):

At least I find small term proofs harder to read than bigger (and slower to check) tactic mode proofs.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 18:15):

Usually a short term mode proof proves something trivial.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:15):

I see.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 18:16):

And you can reconstruct the proof in your mind from the list of theorems used in this line.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 18:16):

(says a person who often writes proofs of "trivial" results as term mode one-liners)

view this post on Zulip Bryan Gin-ge Chen (Sep 30 2020 at 18:17):

It really depends what you mean by "read". I don't think it's a good use of time for humans to try to type-check Lean proofs, but certainly it's not bad practice and not too difficult with experience to skim over a proof that Lean accepts and get a sense of whether it's an efficient / "good" proof.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:20):

Okay. So most of the time is spent creating proofs. Folks don't come back often to look important proofs to admire/learn something from them?

view this post on Zulip Pedro Minicz (Sep 30 2020 at 18:20):

Well, at least I do sometimes. Following a Lean proof is as convenient, if not more so something, as reading a demonstration from a textbook.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 18:22):

Sometimes we refactor old proofs, e.g., to get an intermediate lemma out of it, or to slightly generalize the statement, or to make the proof more readable.

view this post on Zulip Yury G. Kudryashov (Sep 30 2020 at 18:22):

Or if we want to change the definition.

view this post on Zulip Pedro Minicz (Sep 30 2020 at 18:22):

I agree that most term-mode one-liners are "trivial." But it would be nice to have mathlib's source itself be a tool for learning.

view this post on Zulip Pedro Minicz (Sep 30 2020 at 18:23):

I believe it already kind of is, it just clearly wasn't made of it.

view this post on Zulip Mario Carneiro (Sep 30 2020 at 18:37):

Abhishek Chugh said:

Is it different from the regular language parsers (some complex stages etc.) or is it that the parsed data is going to be very hard to make sense of?

The lean parser is very extensible (some might say too extensible), and lean 4 will be even more so. You can't parse lean code without also interpreting lean code, which requires that you also know the whole environment, have a VM implementation, and a typechecker - that is, you must be lean

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:37):

Thanks for the help guys. I think I am getting a feel of things here. One last thing I would like to know:
Has anyone made automatic proof-generators for even simple input statement like "6*9+7=61" or "d (3 sin x)/dx = 3 cos x"?
Seems to me like having a formally verified proofs for things like this could be useful in critical applications (verification/security). And I suppose strongly verified easily accessible knowledge might be one of the goals of projects like Lean?

view this post on Zulip Mario Carneiro (Sep 30 2020 at 18:39):

For 6*9+7=61, yes, this is the norm_num tactic (which incidentally is almost the same tactic as the one I wrote for metamath)

view this post on Zulip Mario Carneiro (Sep 30 2020 at 18:39):

although for something that small you can probably just use rfl too

view this post on Zulip Sebastian Reichelt (Sep 30 2020 at 18:40):

This might be too much of a tangent, but one thing I would really like to see is a service that properly connects different libraries and also informal references. Like OEIS but with a much broader scope, or https://www.cs.ru.nl/~freek/100/ extended to all definitions and theorems. I know there are some projects with this goal, but they tend to get lost trying to solve interesting research problems around translating between provers or between formal and informal math. I'm imagining something much simpler.
(Like OEIS, in the long term it might actually work better in the reverse direction: There is a central service which essentially provides unique IDs, and others - in this case theorem prover libraries - reference these IDs.)

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:40):

Mario Carneiro said:

Abhishek Chugh said:

Is it different from the regular language parsers (some complex stages etc.) or is it that the parsed data is going to be very hard to make sense of?

The lean parser is very extensible (some might say too extensible), and lean 4 will be even more so. You can't parse lean code without also interpreting lean code, which requires that you also know the whole environment, have a VM implementation, and a typechecker - that is, you must be lean

Thanks, Mario. I will start digging into the parser/API/LeanCrawler to understand this.

view this post on Zulip Mario Carneiro (Sep 30 2020 at 18:41):

import tactic.norm_num

example : 6*9+7=61 := by norm_num
example : 6*9+7=61 := rfl

view this post on Zulip Mario Carneiro (Sep 30 2020 at 18:41):

For taking derivatives, I think there is a tactic but I haven't been involved in that area. @Yury G. Kudryashov might know

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:43):

That's pretty convenient!

view this post on Zulip Mario Carneiro (Sep 30 2020 at 18:44):

@Sebastian Reichelt This sounds a lot like what Michael Kohlhase's work on OMDoc and such. I think they want URIs for everything, dunno how not-vaporware it all is though

view this post on Zulip Bryan Gin-ge Chen (Sep 30 2020 at 18:45):

simp is able to do a bunch of derivatives:

import tactic
import analysis.special_functions.trigonometric
open real
#simp [exp_ne_zero] : (λ x, deriv (λ x, (sin x) / (exp x)) x)
-- λ (x : ℝ), (cos x * exp x - sin x * exp x) / exp x ^ 2

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:49):

Sebastian Reichelt said:

This might be too much of a tangent, but one thing I would really like to see is a service that properly connects different libraries and also informal references. Like OEIS but with a much broader scope, or https://www.cs.ru.nl/~freek/100/ extended to all definitions and theorems. I know there are some projects with this goal, but they tend to get lost trying to solve interesting research problems around translating between provers or between formal and informal math. I'm imagining something much simpler.
(Like OEIS, in the long term it might actually work better in the reverse direction: There is a central service which essentially provides unique IDs, and others - in this case theorem prover libraries - reference these IDs.)

I have thought about this for some time and I came to very similar conclusions as yours. Mathematicians are quite interested in finding the underlying translation mechanisms where theories can be transformed (semi/fully) automatically. But they seem to forget that having simple relationships - even if not perfect - at first would be pretty useful. We have a system that can do exactly that - we have unique IDs for each term/propositions/proofs across various datasets/systems.

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:52):

We have a lot of pretty scalable foundations setup - using technologies that power things like Instagram and stuff. But I am finding it pretty hard to convince people to realize that a well-engineered software platform for Mathematics is worth investing into :neutral:

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:55):

Even though math researchers themselves have published reports saying that there is a need for it (https://arxiv.org/pdf/1404.1905.pdf), the incentive structures in universities make it pretty hard

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 18:58):

Bryan Gin-ge Chen said:

simp is able to do a bunch of derivatives:

import tactic
import analysis.special_functions.trigonometric
open real
#simp [exp_ne_zero] : (λ x, deriv (λ x, (sin x) / (exp x)) x)
-- λ (x : ℝ), (cos x * exp x - sin x * exp x) / exp x ^ 2

Lean makes this so simple. Getting this in metamath needed so many lines of code.

view this post on Zulip Mario Carneiro (Sep 30 2020 at 19:06):

that's because the lean library includes tactics (proof producing programs) and annotations to help those programs work effectively, unlike metamath, which only collects a library of formal theorems and leaves automation to the user

view this post on Zulip Abhishek Chugh (Sep 30 2020 at 19:07):

MM1 is gonna solve that :)

view this post on Zulip Mario Carneiro (Sep 30 2020 at 19:07):

I have done some minor work to add annotations to set.mm to help mmj2 in some respects, but there is essentially no support for this sort of thing out of the box. MM1 is indeed trying to tackle this problem head on, with a lean inspired tactic framework on a metamath like foundation


Last updated: May 08 2021 at 19:11 UTC