Zulip Chat Archive
Stream: mathlib4
Topic: Adèle and étale
Kevin Buzzard (Jun 30 2024 at 10:50):
I've been working with FiniteAdeleRing
recently, and much of the basic theory was written by Maria Ines, who lives in a country where the language has accents. The word "adèle" is consistently written with the accent in mathlib, although the corresponding declarations have no accent. For me this is slightly troublesome because I cannot by default search (or write!) mathlib docs for this word, as my keyboard has no è key.
On Friday me and some other people were adding comments and docstrings to some alg geom files and I added a theorem docstring /-- The localization of a formally etale map is formally etale. -/
. The PR is #14239 and Kim made the suggestion that I should be writing étale instead. In fact during the session when I was writing the docstrings I raised the issue of not being able to type accents, and Calle (whose first language contains accents) told me about a hack involving Alt Gr, which seems to be enabled by default on Ubuntu; I've written this message using Alt-Gr ; e and Alt-Gr # e to get the accents and my personal problems are now solved with this trick. But should we be sacrificing grammatical accuracy to make doc search easier for people who don't know the hack? And if not, should the declaration be FiniteAdèleRing
? Oh, apparently not, because this is not a valid token! OK so the remaining question is whether we should have accents in docstrings and comments. I'm pretty sure I've seen both etale and étale in the English mathematical literature so I'm not sure what to do here.
Michael Stoll (Jun 30 2024 at 10:55):
As a rule, I would prefer to keep accents, umlauts etc. in names (of people and of notions like "étale") in the docs.
(@Kevin Buzzard you should be able to define a "compose" key so that, e.g., <compose>"o gives you ö and <ompose>´ e gives you é, <compose>,c gives you ç etc.)
Kevin Buzzard (Jun 30 2024 at 10:57):
Indeed Alt-Gr is already my compose key, I just thought it was a random button on my keyboard and I had no idea what it did, I'm not sure I'd ever presed it until Friday!
Markus Schmaus (Jun 30 2024 at 11:01):
<compose> is great for typing all kinds of special characters including math symbols, for example <compose>*a should give you α and it works everywhere not just in vscode. I've been using wincompose to get the same functionality on windows.
Jeremy Tan (Jun 30 2024 at 11:02):
@Kevin Buzzard: on Ubuntu you can type absolutely any character using Ctrl-Shift-U + the codepoint, as I explained in the reply to your recent comment
Kevin Buzzard (Jun 30 2024 at 11:03):
I know -- this was the hack I knew beforehand, but doesn't this involve me having to remember a bunch of 4-digit numbers? So it is not actually useful in practice.
Jeremy Tan (Jun 30 2024 at 11:03):
I use this all the time to get proper – (U+2013), … (U+2026) and all the rest
Kevin Buzzard (Jun 30 2024 at 11:03):
Unfortunately I am too old to learn and remember something like this.
Kevin Buzzard (Jun 30 2024 at 11:04):
However I am capable of pressing Alt-Gr and then trying a random key to the right of L and then pressing e to see what kind of accent I just made on the e :-)
Jeremy Tan (Jun 30 2024 at 11:04):
I believe I learned this trick when I was studying French in university and had to type the accented characters on a US-Dvorak keyboard
Kevin Buzzard (Jun 30 2024 at 11:05):
Yes, in my experience the trick with learning and retaining any information is to either learn it when you're young (FLT), or to learn it when you're old and do it every day (Lean).
Joël Riou (Jun 30 2024 at 11:08):
I am on a Qwerty keyboard, and if I do setxkbmap -option compose:rwin
(Linux/Unix), the right "Windows" key becomes a "compose" key, and then pressing compose + "'" + "e" produces "é".
Jeremy Tan (Jun 30 2024 at 11:08):
Indeed the shortcuts for some emoji on Discord are named after their Unicode codepoints: :japanese_vacancy_button: is :u7a7a:
Kim Morrison (Jun 30 2024 at 11:32):
Michael Stoll said:
As a rule, I would prefer to keep accents, umlauts etc. in names (of people and of notions like "étale") in the docs.
(Kevin Buzzard you should be able to define a "compose" key so that, e.g., <compose>"o gives you ö and <ompose>´ e gives you é, <compose>,c gives you ç etc.)
I think that for any word that in the English literature usually has its accents dropped, we should go ahead and permit that in the docs! I think it's pretty rare in English to see "adèle"? But more common than not to see "étale"?
Kevin Buzzard (Jun 30 2024 at 11:34):
It might be author-dependent.
Yaël Dillies (Jun 30 2024 at 12:16):
Can we simply use the ## Tags
section in the docs for the unaccented versions?
Yaël Dillies (Jun 30 2024 at 12:17):
... or is the goal to find individual declarations within a file?
Kevin Buzzard (Jun 30 2024 at 12:30):
In my case it was simply searching for "finite adele" to find the relevant file and getting 0 results.
Kevin Buzzard (Jun 30 2024 at 12:31):
Given that my understanding is that we don't really know what we're using tags for, this could be a great use of tags!
Mario Carneiro (Jun 30 2024 at 16:01):
I remember getting hit by this issue frequently when I want to find the Bezout lemma, because the mathlib name is the un-mnemonic gcd_eq_gcd_ab
, and the docs on the theorem only say Bézout's lemma, so grep would turn up empty for "bezout". (This has since been somewhat improved because there is now a file named Bezout.lean
, and the tags on Mathlib.Data.Int.GCD
include the unaccented version as Yaël suggests.)
Mario Carneiro (Jun 30 2024 at 16:03):
It's too bad there is no vscode find in files setting for "match up to accents"
Mario Carneiro (Jun 30 2024 at 16:06):
Kevin Buzzard said:
For me this is slightly troublesome because I cannot by default search (or write!) mathlib docs for this word, as my keyboard has no è key.
Doesn't lean have a robust unicode text input system already? The editor tells me you can input this using \`e
(although amusingly this breaks the markdown formatting so it's not immediately obvious that's what it is trying to say)
Junyan Xu (Jun 30 2024 at 16:29):
\"o
doesn't work for me; in fact when I type \"
in VS Code a pair of "
will appear, and maybe that breaks things (is it due to some other extension I installed?).
I think we should make online docs search able to match up to accents (and subscripts as well), e.g. it's hard to search lemmas about Polynomial.eval₂ and Ideal.Quotient.mkₐ etc. presently.
Kevin Buzzard (Jun 30 2024 at 16:47):
Whether or not this works when writing a file, does it work in the search box?
Patrick Massot (Jun 30 2024 at 23:05):
I don't think we should compensate for people who don't want to learn how to use a keyboard. The question would be different for names using a completely different writing system that is hard to recognize. But accents are really easy for people who already know the Latin alphabet.
Thomas Murrills (Jun 30 2024 at 23:25):
Patrick Massot said:
accents are really easy for people who already know the Latin alphabet.
I don’t think so? “Hard to recognize” is relative. If you’re a native English speaker in the U.S., you probably only occasionally encounter “é” and almost never “è”. Especially in the U.S., “é” is pretty much only ever used to accent “e”s at or near the end of words (maybe the middle?); “étale” and “Bézout” make no sense in English orthography and are very unexpected appearances of an accent in that context. Also, I’ve actually only seen “adele” in mathematical usage spelled without an accent!
I would guess that “learning how to use a keyboard” does not include learning how to add diacritics for most native English speakers in the U.S. (not sure what things are like e.g. in the U.K. or in Canada), and definitely doesn’t tell you where they go, as different languages use them differently.
(I say “would guess” because although I am a native English speaker from the U.S., I wouldn’t expect other people to share my interest in learning keyboard shortcuts just for the fun of it… :grinning_face_with_smiling_eyes:)
Thomas Murrills (Jun 30 2024 at 23:32):
(I should clarify I’m not necessarily advocating for changing anything, but I am arguing against that reason for not changing anything. :) )
Gareth Ma (Jul 01 2024 at 00:09):
I don't think we should compensate for people who don't want to learn how to use a keyboard.
I think Mathlib (or just things in 2024 in general) should try to be as user-friendly as possible as long as it doesn't require too much extra effort. Here adding a few more tags takes 5 seconds for the author and makes it easier for the users, so I think we should actively do it (not just not not do it)
Peter Nelson (Jul 01 2024 at 00:44):
Patrick Massot said:
I don't think we should compensate for people who don't want to learn how to use a keyboard. The question would be different for names using a completely different writing system that is hard to recognize. But accents are really easy for people who already know the Latin alphabet.
I also think that this is a bit much. As part of an anglophone education, I was never taught how to use a keyboard to add accents to latin characters, even though I studied languages that use them for years. One could reasonably argue that the world shouldn't be this way, but I don't think putting up extra barriers for mathlib users is the way to set things right, even if the barriers are small.
Patrick Massot (Jul 01 2024 at 00:49):
Did your anglophone education include using a proof assistant?
Patrick Massot (Jul 01 2024 at 00:51):
Gareth Ma said:
I don't think we should compensate for people who don't want to learn how to use a keyboard.
I think Mathlib (or just things in 2024 in general) should try to be as user-friendly as possible as long as it doesn't require too much extra effort. Here adding a few more tags takes 5 seconds for the author and makes it easier for the users, so I think we should actively do it (not just not not do it)
Oh sure, we can add tags, and this is good.
Patrick Massot (Jul 01 2024 at 00:53):
I am only advocating against systematically misspelling things simply because they do not come from the dominant language, especially names of persons such as Bézout.
Peter Nelson (Jul 01 2024 at 00:57):
I agree the names should appear spelt correctly in theorem names, docs, etc. But it should be painless to search for them and autocomplete to them without having to type accents. Like millions of others, I would have to either remind myself with google or copy-paste from messages above this to add an acute/grave accent to an e.
Patrick Massot (Jul 01 2024 at 00:57):
Thomas Murrills said:
Patrick Massot said:
accents are really easy for people who already know the Latin alphabet.
I don’t think so? “Hard to recognize” is relative. If you’re a native English speaker in the U.S., you probably only occasionally encounter “é” and almost never “è”.
It is still comparatively easy to recognize, even if you don’t see them often. I was comparing things to the time it takes to distinguish, say 未 and 末 if you don’t know the Chinese language so that surroundings characters don’t help.
Mario Carneiro (Jul 01 2024 at 02:07):
sure but recognition is not the operative skill when entering a search query, recall is
Chris Henson (Jul 01 2024 at 02:35):
Just throwing in the fact that there is some search support for ignoring diacritics. As an example, in vim you could use [[=e=]]
which will match any of e, è, é, etc. (I'm not sure what the emacs or VSCode equivalent would be)
Not perfect because you still have to know where there might be a diacritic, but maybe you could hack together something more automatic.
Siddhartha Gadgil (Jul 01 2024 at 05:29):
How hard will it be in practice to make search work in spite of this category of spelling mistakes? In terms of equality this is simply "flattening" accents, so the question is what is exposed in the search API.
Last updated: May 02 2025 at 03:31 UTC