Zulip Chat Archive

Stream: lean4

Topic: LeanMySQL feedback


Arthur Paulino (Dec 01 2021 at 04:12):

@Mac any tip/idea on how to make it a bit simpler/straightforward than having to write colProp% everytime?

Arthur Paulino (Dec 01 2021 at 04:33):

Actually wait, I need to think of the use cases better

Mario Carneiro (Dec 01 2021 at 04:36):

you could make filter a macro which takes a colProp and produces a term

Sebastian Ullrich (Dec 01 2021 at 08:44):

Might as well go full LINQ at this point

Sebastian Ullrich (Dec 01 2021 at 08:49):

Arthur Paulino said:

Is there a way to hide things from the API user? Or this is not a thing in Lean?

private is as much a thing in Lean programming as in other languages. While it's true that it can be counterproductive for proving, it is extremely unlikely that code that was not designed to be reasoned about from the very beginning will ever be reasoned about. So you should not worry about it.

Mac (Dec 01 2021 at 13:20):

@Sebastian Ullrich incidental reasoning still appears (e.g., for things like Array.get). Also, like @Mario Carneiro , I think private in general is (in most cases) a bad idea.

It makes sense in purely compiled languages where the compiler can use that information to do smarter optimizations because it knows the symbols won't be used outside the current target. For a highly reflective language like Lean, though, where it is still perfectly possible circumvent private with tricks, it just makes for bad UX.

For example, there are a number of private definitions in the Lean core that I have had to copy paste into various projects because I needed to use them. One good example that has been brought up in the doc-gen4 discussion is Name.isBlacklisted.

Henrik Böving (Dec 01 2021 at 13:41):

Is there any use for the private keyword at all in your opinion then?

Henrik Böving (Dec 01 2021 at 13:41):

@Mac

Mac (Dec 01 2021 at 13:51):

@Henrik Böving I think it works fine for auxiliary functions to partial definitions (though where clauses are an alternative there) and for some tricks with runtime unsafe code like Macro.MethodsRefPointed where an end user fiddling with the definition directly can break invariants and be quite bad.

But, in general, no, I don't think private should be used (an opinion I believe is shared by @Mario Carneiro).

Mario Carneiro (Dec 01 2021 at 13:52):

Auxiliary definitions in general can't be private, since that's often where you want to start when proving properties about an operation. If it's a partial definition you are already hosed though, so I agree with @Mac 's assessment

Mario Carneiro (Dec 01 2021 at 13:54):

I think of private as the keyword to use when you want someone to curse your name 2 years later

Mac (Dec 01 2021 at 14:00):

I like to think of the proper use of private in program languages as "if an user interacts with this directly, the computer is likely to halt and catch fire". In imperative languages, private is thus most apt for black magic implementation details. In a declarative, functional language like Lean, though, such black magic should be vanishingly rare (and thus the use of private as well).

Arthur Paulino (Dec 01 2021 at 14:13):

private can also be seen as a means to simplify APIs, so the user can focus on learning how to make use of the lib more easily

Arthur Paulino (Dec 01 2021 at 14:13):

"Look at this HUGE lib, but you can make full usage of its potential with these 4 functions"

Mario Carneiro (Dec 01 2021 at 14:14):

I think another mechanism should be used for that, analogous to rust's #[doc(hidden)]

Mario Carneiro (Dec 01 2021 at 14:15):

we've talked about a mechanized version of the "main definitions" section in mathlib files before using some kind of @[main_definition] attribute

Mac (Dec 01 2021 at 14:15):

Arthur Paulino said:

private can also be seen as a means to simplify APIs, so the user can focus on learning how to make use of the lib more easily

The problem is that it also prevents more advanced users from adapting it to suit their needs.

Mario Carneiro (Dec 01 2021 at 14:19):

I wonder whether it would be effective to just count mathlib-internal lemma uses to get a pretty accurate usefulness metric for theorems in a mathlib file. (This could also apply to anything generated by doc-gen, but mathlib probably has enough internal uses that it is not misleading except for leaf theorems)

Mario Carneiro (Dec 01 2021 at 14:21):

Re-sorting lemmas is sometimes problematic though, because order in the source is also a useful signal for relevance

Mac (Dec 01 2021 at 14:24):

@Mario Carneiro instead of re-sorting the lemma, one could just stick a Popular definitions: foo, bar, etc. string with a links to said definitions at the bottom of the module introduction.

Henrik Böving (Dec 01 2021 at 14:40):

Mac said:

Mario Carneiro instead of re-sorting the lemma, one could just stick a Popular definitions: foo, bar, etc. string with a links to said definitions at the bottom of the module introduction.

I think something machine parsable would be nicer for this, that way one could for example also have these lemmas rank higher in auto complete suggestions.

Mac (Dec 01 2021 at 14:41):

@Henrik Böving ah, sorry, I meant at the bottom of the introduction in the generated module documentation, not in the module docstring (i.e., the popular lemma would be automatically deduced from usage data as Mario was suggesting).

Henrik Böving (Dec 01 2021 at 14:42):

Ahh, that certainly sounds nice yes, although I'm still far from that with my doc-gen4 stuff takes note

Tomas Skrivan (Dec 01 2021 at 15:39):

Arthur Paulino said:

private can also be seen as a means to simplify APIs, so the user can focus on learning how to make use of the lib more easily

In C++ this is often done by hiding implementation details into internal or detail namespace. The same can be done in Lean and does not prevent user from reasoning about your code.

Mario Carneiro (Dec 01 2021 at 15:51):

indeed, you will find lots of definitions in mathlib with _aux in their names, used to evoke exactly this sense of internalness. For example docs#list.permutations_aux is called that because you probably don't want to be using it unless you know what you are doing, but it nevertheless has a bunch of theorems about it, like docs#list.mem_permutations_aux_of_perm (naturally, in order to prove theorems about docs#list.permutations)

Arthur Paulino (Dec 01 2021 at 15:57):

Thanks for such a rich set of ideas!

Kyle Miller (Dec 01 2021 at 16:31):

@Mac @Mario Carneiro I wonder if private would be more useful if it were like protected but put its definition in a namespace called private and opened the private namespace. For example:

namespace baz

private def frobnicate_whatsits : IO () := ...

def foo : IO () := frobnicate_whatsits

end baz

would be like

namespace baz

namespace private

def frobnicate_whatsits : IO () := ...

end private
open private

def foo : IO () := frobnicate_whatsits

end baz

Then outside the baz namespace, you'd have to go out of your way to refer to baz.private.frobnicate_whatsits, but it's still referenceable.

Mac (Dec 01 2021 at 16:35):

@Kyle Miller that would make private extremely trivial syntactic sugar: private def foo would just be short for def private.foo

Kyle Miller (Dec 01 2021 at 16:36):

Don't forget that I'm suggesting it also be extremely trivial syntactic sugar for open private so that you can refer to it without private.

Kyle Miller (Dec 01 2021 at 16:38):

At least in Lean 3, defining something as def private.foo can be different from def foo because in a recursive function it seems you need to use the full name.

Mac (Dec 01 2021 at 16:39):

Kyle Miller said:

At least in Lean 3, defining something as def private.foo can be different from def foo because in a recursive function it seems you need to use the full name.

that is not true in Lean 4 anymore, def foo.bar is literally just syntactic sugar for namespace foo; def bar; end foo.

Kyle Miller (Dec 01 2021 at 16:41):

I can't tell, is it being trivial syntactic sugar an objection?

Mac (Dec 01 2021 at 16:43):

@Kyle Miller yeah, while I don't think the current private should be used often, there are cases were it is appropriate and it does do something more complex than what you are proposing (which doesn't seem worth the sugar in my honest opinion).

Kyle Miller (Dec 01 2021 at 16:52):

While the current private might sometimes be appropriate, is it necessary? The reasoning behind this proposal is that it's sort of the minimal functionality to make it feel like Lean has a private modifier, but it still makes the definitions available enough for proofs.

A more complicated proposal is that anything that's private becomes noncomputable outside where it should be used (or some other mechanism to prevent you from evaluating private definitions).

Arthur Paulino (Dec 01 2021 at 16:55):

Three words that have been around my head: private, secret and internal. Maybe having different behaviors available can help

Arthur Paulino (Dec 02 2021 at 01:47):

I've finished the first "stable" version of LeanMySQL :tada:
https://github.com/arthurpaulino/LeanMySQL
I've dropped some problematic approaches and decided to keep this initial version simple.
Feedback is more than welcome!

And thanks everyone for the support :pray:

Arthur Paulino (Mar 04 2022 at 19:30):

@Chris Lovett and @Yuri de Wit I read what you said here so I brought the conversation to this more specific thread.

The LeanMySQL package has evolved since the last time I shared news about it and the next step is precisely providing a better API for queries. I've been thinking about a simple SQL DSL.

I don't think I've fully understood the ideas you guys mentioned. Are you talking about a way to make lean query the database at compile time (while writing Lean code) like this?

Arthur Paulino (Mar 04 2022 at 19:33):

I vaguely remember @Kyle Miller providing a link to that F# page too

Chris Lovett (Mar 04 2022 at 19:38):

I was thinking only of the language integrated query aspects of this kind of integration, being able to query the database live for metadata to provide intellisense fricken cool too. If we could do both that would rock. Is Lean4 InfoTree extensible enough to do that?

Arthur Paulino (Mar 04 2022 at 19:41):

I suspect it's possible, but I still haven't been able to make custom C code computations available via FFI like that. I've always been reliant on using the computation of C code inside the IO monad on built Lean code.

On your question about the InfoTree, I don't know the InfoTree is

Arthur Paulino (Mar 04 2022 at 19:47):

Chris Lovett said:

I was thinking only of the language integrated query aspects of this kind of integration

Can you explain this in more detail? I couldn't grasp the idea

Yuri de Wit (Mar 04 2022 at 19:48):

@Arthur Paulino Yes, that is exactly it. Really useful when integrating common external models such as SQL but could be anything including C headers, etc. However, it is way less important than a proper SQL DSL (LINQ or otherwise).

Chris Lovett (Mar 04 2022 at 19:52):

Language integrated query to me is just how they added "from", "select", "groupby", and "join" as keywords to C# and F#? so that you can write SQL queries in the language in a first class way rather than embedding the SQL query in an opaque string literal which was the old fashioned way of doing things before LINQ came along...

Chris Lovett (Mar 04 2022 at 19:53):

Lean4 has extensible syntax so this part should be easy to do in Lean4...

Arthur Paulino (Mar 04 2022 at 19:54):

I see, thanks. I get it now. I'm going for it!

Henrik Böving (Mar 04 2022 at 20:04):

You should basically be able to declare a bunch of syntax + a custom TermElab (I suspect this might be too hard for just a macro_rules to get right) to elaborate some sort of built-in query language to something one can pass to the SQL server/whatever

Arthur Paulino (Mar 04 2022 at 20:16):

@Yuri de Wit I can see it working if I declare a representation of the schemes in Lean by hand, but I don't know how to make Lean query the schemes dynamically like that

Yuri de Wit (Mar 05 2022 at 00:02):

I am not entirely sure, but I am afraid it would require some support from the compiler.

Arthur Paulino (Mar 05 2022 at 12:50):

I was able to get a minimal DSL for simple "select from where"-like queries that's able to express things like this.

But I have no idea how to build a syntax for it. I would like to be able to write

def q : SQLQuery := SELECT name, age FROM person WHERE age >= 20

Help is very much appreciated :pray:
By "help" I mean any help. Examples, documentation etc

Henrik Böving (Mar 05 2022 at 16:24):

You can look at Mac's lean4-alloy which defines quite an extensive amount of syntax, there is also lots of syntax declarations to be viewed in the compiler you can grep for and a few examples on how to custom syntax in the docs folder.

Florian Würmseer (Mar 05 2022 at 16:27):

Hi, speaking of SQL and feedback, I'm trying to get my feet wet with Lean (and functional programming). I investigated a little bit about @Henrik Böving s approach of using custom syntax and TermElab. I was able to get simple SELECT FROM queries working, but it felt more tedious than it should be. I'd be more than happy to receive some feedback (Especially before I move to more complex syntax like Where and Having) on the project to make sure this is the right way to do custom syntax and other things. Although this is for postgres, the dialects don't differ that much, and I'm hoping we can generalize queries like syntax in the future.

Arthur Paulino (Mar 05 2022 at 16:57):

Nice pointers! alloy is indeed a rich example.

@Florian Würmseer, I definitely think we can merge efforts with more granular packages. I'm really happy to know that you're doing something similar :octopus:

Siddharth Bhat (Mar 05 2022 at 17:03):

@Arthur Paulino If you like, there's more examples over at opencompl/lean-mlir/EDSL.lean, which follows exactly the grammar defined over at MLIR/LangRef.md. This maybe useful to see a 1:1 translation :)

Henrik Böving (Mar 05 2022 at 17:05):

Well sadly we are in SQL land so there is no "the grammar" there is a grammar for every dialect :/

Arthur Paulino (Mar 06 2022 at 02:35):

Florian Würmseer said:

I'd be more than happy to receive some feedback

I'm unsure about accepting a list of tables on the FROM clause (I'm checking your code)

Arthur Paulino (Mar 06 2022 at 09:23):

Also I don't think we're supposed to perform matches against those autogenerated syntax terms like `Query.col_AS_. I am (slowly) figuring it out by looking at all sources provided (including your project) and also based on some limited experience writing a few tactics in Lean 4. Will let you know as soon as I achieve something more concrete

Jannis Limperg (Mar 07 2022 at 14:43):

Arthur Paulino said:

Also I don't think we're supposed to perform matches against those autogenerated syntax terms like `Query.col_AS_.

Yeah this is not the way. @Florian Würmseer , I've created a PR for your project that makes the elaboration code more idiomatic. For the syntax implemented so far, you could also get away with macros, but maybe you'll eventually need more power.

Florian Würmseer (Mar 07 2022 at 18:49):

Thank you @Arthur Paulino and @Jannis Limperg for the feedback! Very much appreciated. And your pull request really helped me to understand the Syntax and original intention of termElabs.

Yuri de Wit (Mar 07 2022 at 19:12):

Interesting ... looking a the PR created by @Jannis Limperg , I was also :eyes: over it as it is a bit different than what I read in another document (I think the official one?).

I would also have reached out to declare_syntax_cat and company, but probably just for a single category for the whole DSL.

Jannis Limperg (Mar 07 2022 at 19:38):

My pleasure! No claims that this code is in any way optimal btw, it still doesn't feel particularly elegant to me.

Arthur Paulino (Mar 07 2022 at 19:56):

I'm needing some help too. I'm seeing myself writing repeated code because of parentheses. For example, on these two lines, I've written match cases that are very similar and have the same consequences.

Is there a way to avoid this duplication hell? It's gonna get worse once I write the match cases for sqlProp :scared:

Mario Carneiro (Mar 07 2022 at 19:58):

Why do you have non-nestable parentheses like that?

Mario Carneiro (Mar 07 2022 at 19:58):

You could always call yourself recursively in the paren case

Arthur Paulino (Mar 07 2022 at 20:00):

I see what you mean. I'll do some work

Arthur Paulino (Mar 07 2022 at 20:00):

Thanks!

Mario Carneiro (Mar 07 2022 at 20:00):

I'm not sure why the sqlProp case would be particularly bad

Mario Carneiro (Mar 07 2022 at 20:00):

you should have a function for "get value from ident" or whatever you need to make the six relational operators work

Mario Carneiro (Mar 07 2022 at 20:01):

in any case you can make it six calls to a function for relational operators

Arthur Paulino (Mar 07 2022 at 20:02):

It would be bad for the sqlProp case because each side of the operator can be inside parens too

Arthur Paulino (Mar 07 2022 at 20:03):

But I can kill it all with the recursive call. I'll try that

Mario Carneiro (Mar 07 2022 at 20:04):

Normally you would handle that by having a grammar construction like syntax "(" sqlProp ")" : sqlProp and then it only has to be handled once

Arthur Paulino (Mar 07 2022 at 20:07):

For the first case it's a bit trickier because the parens don't always cover the entire syntax:

declare_syntax_cat                  selectField
syntax ident                      : selectField
syntax ident " AS " ident         : selectField
syntax "(" ident ")"              : selectField
syntax "(" ident ")" " AS " ident : selectField

Arthur Paulino (Mar 07 2022 at 20:07):

but it's definitely manageable with a new declare_syntax_cat

Leonardo de Moura (Mar 07 2022 at 20:46):

@Mac I make heavy use of private, and I think it is very useful when maintaining the code. It makes it clear the function should not be used in other modules. It also helps accidental misuse when the function makes many assumptions about the input arguments. It also makes it clear to Lean users they should not assume this function will exist forever.

For a highly reflective language like Lean, though, where it is still perfectly possible circumvent private with tricks, it just makes for bad UX.

Sure, you can circumvent with tricks, but you cannot complain when things break if you use this kind of trick.

In imperative languages, private is thus most apt for black magic implementation details.

This is not true . private (static) is heavily used in many projects to hide implementation details from others even when there is no black magic.

Arthur Paulino (Mar 07 2022 at 20:50):

(sorry for the little mess btw. I wanted to rename the thread I created but for some reason Zulip split it in two)

Mac (Mar 07 2022 at 21:10):

Leonardo de Moura Since this thread is quite old, I am not going to resurrect it. However, I do appreciate hearing your thoughts on the topic, even if I (and @Mario Carneiro ) disagree.

Leonardo de Moura (Mar 07 2022 at 21:14):

@Mac I thought it was a new thread since it was on the top today. It seems it accidentally moved to the top when Arthur renamed some thread.


Last updated: Dec 20 2023 at 11:08 UTC