Zulip Chat Archive

Stream: mathlib4

Topic: Refactoring framework (and generalizations)

Thomas Murrills (Aug 26 2025 at 18:22):

Recently, I've been working on ways to record and write refactors in a syntax- and elaboration-aware manner.

There are often two situtations in which we want to be able to construct a refactor:

within the elaboration of something whose elaboration is under our control (as an example, I was able to reformat the docstrings of to_additive to use /-- -/ instead of " " by modifying the elaboration of the to_additive attribute itself to record the refactor in oleans)
while traversing all syntax during linting

The second situation is a bit tricky, though, since linters are not allowed to modify the environment. My workaround is this:

have the linters store the edits in a ref (e.g. an IO.Ref or a Mutex, but the act of actually storing an edit is quick and usually relatively infrequent, so IO.Ref should be sufficient)
add elab_rules for the eoi command token at the end of every (ordinary) file which read the data out of said IO.Refs and ferry it into a persistent env extension.

Ideally, there would be controlled ways to modify the environment in linters. But being able to register "cleanup" operations that run at the end of every file is, I think, potentially more generally useful, and worth having anyway; e.g. we might like to run certain inexpensive env_linters at the end of every file during editing.

We also might want to record other data besides refactors from within linters, such as arbitrary statistics on a library. For example, this framework could allow the tactic analysis framework to persist profiling data into reports which could then be collected and consolidated from the oleans (or from any other file we might like to write during the cleanup phase).

(These two points are the "and generalizations" part of this topic.)

There are subtleties here, of course: we need to have an appropriate data structure for storing things in the ref during interactive elaboration, where the file may change and then get re-elaborated/re-linted; we need to wait for everything to finish before running a cleanup function; etc. This—plus figuring out exactly where the "joins" should be in this framework so that it is flexible but conveniently modular, and does not lock you into premature abstractions—is the bulk of what I'm working on.

I'd like to mention that this is not intended as an alternative to leanup (which faces more difficult challenges) but rather for controlled, same-version refactors.

One advantage of doing things this way is that it's hopefully more maintainable than an approach like lake exe refactor (from which I took inspiration—and a little code, thank you Mario :) ), which effectively recreates the lean frontend. Here, we only need to interact with hopefully-pretty-stable env extension and ref APIs.

It also allows us to run the "hard part" of refactors nonlocally: push a commit which activates your refactoring linter, wait for CI to finish, then pull the olean cache, and run a cheaper exe locally to read the recorded edits from oleans and write them.

(Initially, I did want to recreate the lean frontend so that we could intervene at any point during the process. But I ultimately felt that this would be too sensitive to relatively rapidly-changing Lean internals, and thus not maintainable. Maybe one day, though!)

Thomas Murrills (Aug 26 2025 at 18:23):

I'll share the code shortly, once I clean it up a little, but I can anticipate that perhaps more important than the underlying infrastructure will be the utilities that make refactorings easy. So, I'm asking here for potential large-scale refactors to stress-test this and provoke the development of basic utilities! :)

Damiano Testa (Aug 26 2025 at 18:27):

I have written various "one-use" refactor tools and another trick that I used is to have the linter rewrite the file when parsing eoi: linters are not allowed to modify the environment "after the fact", but they can still run writeFile... :upside_down:

Thomas Murrills (Aug 26 2025 at 18:45):

Ah, yes, I forgot to mention—I’ve seen eoi put to good use in your linters! :)

For thinking about a non-one-off (many-off?) framework, though, I really wanted to try to persist data nonlocally for large refactors which demanded whole-library builds. I also would worry a bit about rewriting the file while it built (albeit at the end), then relying on its oleans for future files… Maybe it’s not strictly necessary, but for a variety of reasons I like the notion of separating the phase of recording all edits from the phase of writing all edits here—it at least feels more robust. :)

Damiano Testa (Aug 26 2025 at 18:49):

I completely agree that having some framework where linters are, one way or another, allowed to make persistent environment changes is great! I always wish that linters came with a Boolean field allowPersistent that decides whether or not the environment should be reset after their code ran...

Last updated: Feb 28 2026 at 14:05 UTC