Zulip Chat Archive

Stream: mathlib4

Topic: policy proposal: mention AI use in PR description?


Michael Rothgang (Sep 14 2025 at 20:42):

I think we should make a policy that use of AI should be disclosed in the PR description. I will review a PR differently, depending on how it was made (because humans and AI have different failure modes).

I vaguely remember this being discussed already: was this already a consensus and just nobody updated the webpage (and the new PR template on github) yet?

Michael Rothgang (Sep 14 2025 at 20:43):

In practice, my impression is that many PRs already do this --- so it would be merely documenting de facto practice, but in a manner new contributors could actually see and follow. (Reading all of Zulip beforehand may happen, but it is too high a bar.)

Yakov Pechersky (Sep 14 2025 at 21:00):

I use vscode copilot to help me with refactors, like renames or typo fixes. Not via chat/agent mode, but inline intelligent suggestions. Does that count? I don't think it should.

So I would suggest that, at base, if AI was used to make a new statement (a def or lemma declaration), one should indicate that, since that is the most critical point of departure.

Thomas Murrills (Sep 14 2025 at 21:14):

Perhaps a policy of “say what you used it for” could be most helpful?

To ease the burden on contributors, maybe there could be checkboxes in the template that covered common use cases, including minor ones like this.

I think it might still be useful for reviewers to know where to anticipate and scan for “the kind of thing AI might get wrong”, e.g. if copilot inserted words that flow well but actually change the meaning when completing a docstring.

Alfredo Moreira-Rosa (Sep 14 2025 at 22:19):

I don't think it's relevant. Sooner or later, everybody will use AI as a helper to write anything.
In my field, around 80% of software is already AI-assisted.
My only concern is about using AI without human supervision, like Claude agents, where the AI does everything from beginning to end.
And it's a concern only because of the quality produced and the flooding, not because it's AI.
When someone submits a PR you don't ask if they got help from a friend; you judge by quality, and that should always be the target.
If someone floods with low-quality PRs, I think detecting the behaviour with a tag and a ban should be enough.

Michael Rothgang (Sep 14 2025 at 22:23):

This is not a discussion "should we use AI". (I have my opinions, but these don't matter here.) It's a discussion "how can we best review PRs that have AI input", as the kind of input the user got changes what I should look for.

Alfredo Moreira-Rosa (Sep 14 2025 at 22:24):

And I am answering that tagging use of AI is irrelevant, for the reasons I mentioned.

Bryan Gin-ge Chen (Sep 14 2025 at 22:36):

Speaking as a maintainer (but not on behalf of the maintainers!), I also think it'd be useful to know if AI was used to a large extent in writing a PR. I also hope that if someone gets a significant amount of help from a friend on a mathlib PR, they acknowledge their friend's help as well. And the same goes for PRs that are put together by other automated means; it's useful background to include in the PR description.

Not only does that information change what we focus on while reviewing (as Michael said above), it also changes the kind of feedback that would be useful to provide to the author.

Niels Voss (Sep 14 2025 at 22:57):

I think we should be careful not to in any way disadvantage PRs based solely on their usage of AI, because otherwise we risk incentivizing people to lie and not disclose AI usage.

Chris Henson (Sep 14 2025 at 23:52):

One piece of transparency I have appreciated is when I see that a commit is marked as being coauthored by Claude (not sure if other models do this). I agree that it shouldn't inherently prejudice reviewers against the PR, though I can't help but notice correlations with quality.

Julian Berman (Sep 15 2025 at 00:12):

Just in case it's useful for comparison, https://lwn.net/Articles/1032612/ was a recent article about the kernel's back-and-forth about this topic.

Harald Husum (Sep 15 2025 at 09:57):

I think there is a difficult-to-define cutoff here.

Some base level use of AI tools in shaping a PR has become so ubiquitous that it hardly needs mentioning. But, if AI tooling can take more credit than the human author for substantive portions of a PR, it makes sense to mention. How does one judge whether the latter applies? Perhaps it is fine to leave this to the judgement of the person making the PR.

Ruben Van de Velde (Sep 15 2025 at 09:57):

It makes more sense to leave this to the judgement of the reviewer

Artie Khovanov (Sep 16 2025 at 19:57):

Agree that the most salient issue here is the error distribution.
For this reason I think it should be policy that substantive AI use be disclosed in the PR description.
Otherwise I worry about accruing hidden technical debt.

Violeta Hernández (Sep 16 2025 at 21:04):

From what I've seen, the biggest issue with using AI in any capacity is that it will make decisions you might not even be conscious of. And that's something any human reviewer needs to be aware of.

Violeta Hernández (Sep 16 2025 at 21:06):

I think it'd be easier policy-wise to require AI disclosure uniformly, though of course minor usages can have a similarly minor mention.


Last updated: Dec 20 2025 at 21:32 UTC