Zulip Chat Archive

Stream: mathlib4

Topic: Lean-based "changed names" script?


Yury G. Kudryashov (Mar 15 2025 at 19:35):

WDYT about running something like

import Lean.Environment

open Lean

def main (_ : List String) : IO Unit := do
  initSearchPath ( findSysroot)
  unsafe withImportModules #[{module := `Mathlib}] {} (trustLevel := 1024) fun env => do
  for (c, _) in env.constants do
    if !c.hasMacroScopes && !c.isInternal then
      IO.println c

(TODO: sort alphabetically), upload these files for master somewhere, then diffing master to HEAD in the CI?

Yury G. Kudryashov (Mar 15 2025 at 19:35):

(instead of the current text search in git diff)

Yury G. Kudryashov (Mar 15 2025 at 19:37):

This will catch stuff like "simps gnerates different set of lemmas now" or "to_additive chose a different name for the additive version".

Ruben Van de Velde (Mar 15 2025 at 19:37):

Or " I accidentally changed namespaces"

Yury G. Kudryashov (Mar 15 2025 at 19:38):

BTW, what's the right way to sort it in Lean?

Yury G. Kudryashov (Mar 15 2025 at 19:41):

For #22461, I used lake exe print_decls | sort

Yury G. Kudryashov (Mar 15 2025 at 19:42):

See also #lean4 > What's the right way to print lots of lines? for a thread where I asked why my script was slow (answer: because I was running it inside #eval).

Michael Rothgang (Mar 15 2025 at 20:29):

@Damiano Testa is working on a rewrite of the "declarations diff" script in Lean - and if I understood correctly, this would also catch such versions, right?

Damiano Testa (Mar 15 2025 at 20:32):

The relevant PR is #22464, but I have not checked how different that is from what Yury wants.

Damiano Testa (Mar 15 2025 at 20:33):

The file is not actually persisted: the script creates one for master and one for the PR, sorts them and then lets diff do the rest.

Damiano Testa (Mar 15 2025 at 20:35):

By the way, the script that I use in that PR is indeed slow: if there is anything in there that is useful to you, feel free to take it or push over it!

Yury G. Kudryashov (Mar 15 2025 at 21:11):

I'll try to speed it up tonight.

Yury G. Kudryashov (Mar 16 2025 at 02:19):

This executable runs in 4-5s:

import Lean

open Lean

/-- A copy of `isAutoDecl` from Batteries that takes `env` as an argument
instead of using `getEnv`.

TODO: upstream. -/
def isAutoDecl (env : Environment) (decl : Name) : Bool := Id.run do
  if decl.hasMacroScopes then return true
  if decl.isInternal then return true
  if isReservedName env decl then return true
  if let Name.str n s := decl then
    if s.startsWith "proof_" || s.startsWith "match_" || s.startsWith "unsafe_" then return true
    if env.isConstructor n && ["injEq", "inj", "sizeOf_spec"].any (· == s) then
      return true
    if let ConstantInfo.inductInfo _ := env.find? n then
      if s.startsWith "brecOn_" || s.startsWith "below_" || s.startsWith "binductionOn_"
        || s.startsWith "ibelow_" then return true
      if [casesOnSuffix, recOnSuffix, brecOnSuffix, binductionOnSuffix, belowSuffix, "ibelow",
          "ndrec", "ndrecOn", "noConfusionType", "noConfusion", "ofNat", "toCtorIdx"
        ].any (· == s) then
        return true
      if let some _ := isSubobjectField? env n (.mkSimple s) then
        return true
  pure false

unsafe def main (_ : List String) : IO Unit := do
  initSearchPath ( findSysroot)
  withImportModules #[{module := `Mathlib}] {} (trustLevel := 1024) fun env => do
    let names : Array String := (env.constants.fold · #[]) fun ar c _ 
      if isAutoDecl env c then ar else ar.push c.toString
    for name in names.qsortOrd do
      IO.println name

Yury G. Kudryashov (Mar 16 2025 at 02:21):

I'm tired after the flight (it got delayed for ~2h30m), so I can't test if it speeds up your bash script (git checkout + getting cache takes time too) tonight.

Yury G. Kudryashov (Mar 16 2025 at 03:49):

Your Lean script takes ~50s on my laptop, probably because of the insertion sort.

Yury G. Kudryashov (Mar 16 2025 at 03:57):

@Mario Carneiro WDYT about adding a version of isAutoDecl that takes an environment as an argument instead of relying on getEnv?

Mario Carneiro (Mar 16 2025 at 03:59):

what's the motivation? Can't you just pass in a monad reader state containing the env of your choice?

Yury G. Kudryashov (Mar 16 2025 at 04:05):

My motivation is "I don't know how to do it". Probably, it's a wrong motivation.

Yury G. Kudryashov (Mar 16 2025 at 04:14):

Another solution (for me) would be adding instructions on how to run a chunk of code in a context with ← getEnv returning a given env to the docstring of getEnv.

Yury G. Kudryashov (Mar 16 2025 at 04:19):

Is it docs#Lean.withEnv ?

Mario Carneiro (Mar 16 2025 at 04:20):

yes, that will do it

Yury G. Kudryashov (Mar 16 2025 at 04:23):

One more question: how do run CoreM code from inside a pure function?

Yury G. Kudryashov (Mar 16 2025 at 04:26):

Found docs#Lean.Core.CoreM.toIO and docs#CoreM.withImportModules

Yury G. Kudryashov (Mar 16 2025 at 04:39):

This prints nothing :(

import Mathlib.Lean.CoreM
import Batteries.Tactic.Lint.Basic

open Lean


unsafe def main (_ : List String) : IO Unit := do
  initSearchPath ( findSysroot)
  let names : Array String  CoreM.withImportModules #[`Mathlib] (trustLevel := 1024) do
    flip ( getEnv).constants.foldM #[] fun ar name _  do
      if !( Batteries.Tactic.Lint.isAutoDecl name) then
        return ar.push name.toString
      else
        return ar
  for name in names.qsortOrd do
    IO.println name

... and I don't know how to debug meta code.

Christian Merten (Mar 16 2025 at 10:07):

I think the reason is in the docstring of docs#Lean.withImportModules: "No live references to the environment object or imported objects may exist after act finishes." In your case, the reference to the environment object is in ar.push name.toString. Replacing that with ar.push name.toString.toName.toString makes it work, I am guessing that s.toName.toString is implemented sufficiently not-in-place, that it creates a new memory reference.

Yury G. Kudryashov (Mar 16 2025 at 14:13):

(note: I still don't understand why making a pure version of isAutoDecl is a bad idea; e.g., this will make it clear that the function doesn't depend on other parts of the context, only on env)


Last updated: May 02 2025 at 03:31 UTC