Zulip Chat Archive

Stream: lean4

Topic: Pre-computing declarations in macros


Matej Penciak (Feb 01 2023 at 18:48):

My hope is to be able define a macro that does some expensive computations at the time a data structure is declared, involving some complicated data types (in particular, the outputs are not Nats or Strings, and so on) and expands to declarations with the results of the calculations. A simple MWE of a strategy that doesn't work is this:

inductive Bit | zero | one -- Stand-in for some complicated data type
deriving Repr

open Nat in
def getBits (a : Nat) : Array Bit := -- Stand-in for some expensive calculation
  let rec loop (m : Nat) (acc : Array Bit) : Array Bit :=
    match h : m with
    | 0 => acc.push .zero
    | 1 => acc.push .one
    | n + 1 =>
      have : m / 2 < m := bitwise_rec_lemma (h  succ_ne_zero _)
      loop (m / 2) (acc.push <| if m % 2 == 0 then .zero else .one)
  loop a #[]

open Lean in
macro "new_data" id:ident "with data: " p:num
  : command => do
    let bits  `(getBits $p)
    `(
      namespace $id
      def $(mkIdent `bits) : Array Bit := $bits
      end $id
    )

If I understand things correctly: right now if I declare new_data blah with data: 10 then the macro basically expands to

namespace blah
def bits : Array Bit := getBits 10
end blah

So if I ever call blah.bits it has to re-compute the getBits function which is not something I want to have repeated every time it's called.

My hope was that I would be able to replace the macro with something like:

open Lean in
macro "new_data" id:ident "with data: " p:num
  : command => do
    let pNat : Nat := p.getNat
    let bits : Array Bit := getBits pNat
    let bitsSyntax : TSyntax `term := sorry-- Do something with `bits` to extract its syntax
    `(
      namespace $id

      def $(mkIdent `bits) : Array Bit := $bitsSyntax

      end $id
    )

I know roughly that such a thing should be possible, because if the computations return Nats I can use Lean.Syntax.mkNumLit, and I get the behavior that I desired in those cases.

Alternatively, if I have some fundamental misunderstanding of how Lean works and there's actually an easier way of doing this I'd be happy to hear any suggestions.

Gabriel Ebner (Feb 01 2023 at 18:55):

So if I ever call blah.bits it has to re-compute the getBits function which is not something I want to have repeated every time it's called.

This is not what's happening though. Top-level declarations are only evaluated once:

def bits : Array Nat := dbg_trace "expensive"; #[0,1,1,0,1,0,0,1]
@[noinline] def bigFunctionUsingBits := bits
#eval bits ++ bigFunctionUsingBits -- prints expensive once

(You actually need to be really careful with this feature. If you compile the Lean code into a native executable, then bits will be evaluated at initialization time, i.e., before main. So make sure bits is not too expensive either.)

Matej Penciak (Feb 01 2023 at 19:11):

Ok, this is great to know! I will see if I run into anything else with this, but I think I now have a sense of what to do going forward!

Gabriel Ebner (Feb 01 2023 at 19:17):

If you actually want to evaluate at elaboration time, you should look into the Quote (α → Syntax, but has incorrect Quote Syntax instance!!!) and the ToExpr (α → Expr) classes.

James Gallicchio (Feb 01 2023 at 19:59):

Gabriel Ebner said:

(You actually need to be really careful with this feature. If you compile the Lean code into a native executable, then bits will be evaluated at initialization time, i.e., before main. So make sure bits is not too expensive either.)

is it possible to eliminate dead top-level decls from the initialization sequence? :thinking:

Mario Carneiro (Feb 01 2023 at 20:05):

that is difficult because they are separately compiled

Mario Carneiro (Feb 01 2023 at 20:05):

each module produces an initialize_Foo() function which calls all the initializers in the file

Mario Carneiro (Feb 01 2023 at 20:05):

and at that point you don't know which of them will be used in the final executable

Arthur Paulino (Feb 01 2023 at 20:10):

Btw, if you have a dangling def foo : Nat := sorry that's not called, it will make your code panic everytime its compiled binary is executed.

I understand why it happens, but the error message is not very insightful :thinking:
IIRC, it doesn't say where the panic is coming from

Henrik Böving (Feb 01 2023 at 20:13):

Yeah that makes sense its a closed term so it gets extractedand closed terms are evaluated at the beginning

Henrik Böving (Feb 01 2023 at 20:21):

And generally you can find the panics by using GDB and backtracing if you want to, I guess we could also just check if a certain decl has sorry (we already track sorryDep in the compiler) and just...not extract it if that is the case?

Arthur Paulino (Feb 01 2023 at 20:23):

Henrik Böving said:

I guess we could also just check if a certain decl has sorry (we already track sorryDep in the compiler) and just...not extract it if that is the case?

That's what I was about to suggest but having a sorry doesn't mean that such sorry will be hit

Gabriel Ebner (Feb 01 2023 at 20:24):

You should also be aware of lean4#1965, which is another unexpected way that code can be run on initialization.

Arthur Paulino (Feb 01 2023 at 20:28):

Another idea: catch panics from sorry during initialization and just skip that declaration. There might be an error later that will look like unknown identifier blah, but I think it's better than the current message.

Is there some blocker that doesn't allow the initialization routine to see where that panic is coming from? (file name and line)


Last updated: Dec 20 2023 at 11:08 UTC