Zulip Chat Archive

Stream: lean4

Topic: find simp lemma uses


James Gallicchio (Aug 15 2023 at 15:27):

is there an easy way to find all uses of a simp lemma in mathlib? should I just crawl the fully elaborated body of every declaration in the environment looking for that lemma?

hoping for something that's faster than removing the simp attr and recompiling

James Gallicchio (Aug 15 2023 at 21:32):

Wrote a simple metaprogram to traverse everything and find uses of the lemma. Putting my code here for my own future reference!

import Lean

def defsToFind := [
  `List.countp_eq_zero,
  `List.countp_eq_length,
  `List.countp_filter,
  `List.one_le_count_iff_mem,
  `List.count_pos,
  `List.count_eq_zero,
  `List.count_eq_length
]

open Lean Meta

def prog := show MetaM Unit from do
  let e  getEnv
  e.constants.forM (fun name c => do
    match c.value? with
    | none => pure ()
    | some v => do
      for d in defsToFind do
        if v.find? (·.constName? = some d) |>.isSome then
          dbg_trace (d, name)
  )
  return ()

#eval prog

Alex J. Best (Aug 15 2023 at 21:57):

There is already a function docs#Lean.Expr.containsConst

Alex J. Best (Aug 15 2023 at 21:57):

I worry that this won't find when dsimp used a particular lemma if it is a rfl lemma, but that should be fairly uncommon

Scott Morrison (Aug 16 2023 at 00:09):

@James Gallicchio, no promises about correctness, usability, etc, as it's new and just an experiment, but https://github.com/semorrison/lean-extract might also give you what you want with no metaprogramming, just grep.

James Gallicchio (Aug 16 2023 at 01:26):

how do you extract this information? in particular, how do you identify whether lemmas were used by simp or not?

Scott Morrison (Aug 16 2023 at 02:23):

This code only reads the proof terms, so it is a bit of a hack to identify the lemmas used by simp, and very much implementation dependent (and possibly wrong in corner cases!). We look for the auxiliary definitions that are created by the simplifier, and unfold those to discover the lemmas actually used.

allUsedConstants takes a predicate describing constants which should be unfolded, and separately returns constants that appear directly in the proof term, and constants that appear behind (one step of) unfolding. If we specify that predicate to match the constants used when the simplifier generates auxiliary declarations, this gives us the lemmas used by simp! As I said, a hack. :-) Bug reports about the outputs welcome.

I have some separate code that instead reads the InfoTrees from a compiled module, and extracts all the tactic invocations from that, which will give us an independent way of identifying lemmas used by simp (although will miss lemmas used by simp when simp is called internally from another tactic). Hopefully I'll put this up soon.

Scott Morrison (Aug 16 2023 at 02:23):

If it would be useful I can add some CI that generates a downloadable .gz of the output of these two scripts.

Jannis Limperg (Aug 16 2023 at 11:43):

simp can log which lemmas it used (this is used to implement simp?). You could modify Lean to log this information, then use this modified Lean to compile the libraries of interest. This would probably give you more accurate data, at the cost of some more work.

James Gallicchio (Aug 22 2023 at 05:33):

It would be nice if core simp had a debug flag that causes every simp call to insert metadata nodes around every proof term it generates, so the information is clear from the expressions.

Does that sound like a good idea? I could put together a quick RFC but I have no notion of how complicated this would be to implement.

Scott Morrison (Aug 22 2023 at 05:46):

No, I don't think this would be good as an RFC. Core doesn't want things like this at the moment.

Between the proof terms and the InfoTrees all this information is there.


Last updated: Dec 20 2023 at 11:08 UTC