Zulip Chat Archive

Stream: general

Topic: pygments


Andrew Ashworth (Feb 26 2018 at 16:23):

my understanding was zulip uses an old version of pygments, and that the lean pygments update was never merged to begin with

Patrick Massot (Feb 26 2018 at 16:24):

we can still do

meta def f := by sorry

Patrick Massot (Feb 26 2018 at 16:24):

which is a bit disappointing

Patrick Massot (Feb 26 2018 at 16:24):

I had better luck with earlier tests

Sebastian Ullrich (Feb 26 2018 at 16:25):

I don't think there is such an option

Patrick Massot (Feb 26 2018 at 16:25):

let me try again:

example : 1 + 1 = 2 := by norm_num

Patrick Massot (Feb 26 2018 at 16:26):

that one is nicer

Patrick Massot (Feb 26 2018 at 16:26):

pygments doesn't like meta...

Andrew Ashworth (Feb 26 2018 at 16:30):

i think pygments for lean is still on highlighting based on lean 2's syntax, iirc

Patrick Massot (Feb 26 2018 at 16:31):

I guess that could explain a lot

Andrew Ashworth (Feb 26 2018 at 16:32):

yeah, first i think we'll need to submit a support ticket to zulip and see if they will fork pygments for us, and then we can update the pygment's lexer file for lean

Patrick Massot (Feb 26 2018 at 16:33):

Yes, I guess @Reid Barton will do that for us

Reid Barton (Feb 26 2018 at 16:33):

I've already checked with the Zulip devs and they would be happy to fork pygments if there is a request to do so.

Reid Barton (Feb 26 2018 at 16:34):

If someone here can produce a patch then I can take care of getting it upstreamed

Patrick Massot (Feb 26 2018 at 16:35):

Great, @Sebastian Ullrich already has everything we need

Patrick Massot (Feb 26 2018 at 16:36):

I'm requested elsewhere. See you

Sebastian Ullrich (Feb 26 2018 at 16:48):

Our pygments fork is at https://bitbucket.org/gebner/pygments-main

Reid Barton (Jan 15 2019 at 11:54):

I ran all of mathlib and a couple of my projects through Gabriel's pygments fork, using the following command:

find . -name '*.lean' \( -exec ~/.local/bin/pygmentize -l lean -F raiseonerror '{}' ';' -o -exec echo $'\npygmentize failed:' '{}' ';' -quit \)

There was just one lexer error, it doesn't know about the escape sequence "\t". @Gabriel Ebner should I submit a PR, or would it be easier for you to just fix it yourself?

Reid Barton (Jan 15 2019 at 11:55):

(Minimum failing example: example : string := "\t")

Reid Barton (Jan 15 2019 at 11:58):

It also doesn't know about "\x12", "\u1234", or "\'" (I checked the Lean source to see what escape sequences exist)

Gabriel Ebner (Jan 15 2019 at 12:01):

Please file a PR.

Reid Barton (Jan 15 2019 at 12:02):

OK

Kevin Buzzard (Jan 15 2019 at 12:05):

I think you can get pwned with -exec if someone writes a malicious mathlib PR, although I've never understood the details (I read this in the find man page)

Reid Barton (Jan 15 2019 at 13:02):

Please file a PR.

I think I managed to do this.

Gabriel Ebner (Jan 15 2019 at 13:08):

Merged. I've also given you write access if you want.

Reid Barton (Jan 15 2019 at 13:23):

I know @Patrick Massot was also having LaTeX problems with some specific characters but I don't know whether that's something that would be fixed within pygments or outside it

Patrick Massot (Jan 15 2019 at 13:25):

Reid, you only need to try using pygment with LaTeX to see it's all broken, and needs fixing from pygment

Patrick Massot (Jan 15 2019 at 13:38):

Now I'm confused: I can't reproduce the problem

Patrick Massot (Jan 15 2019 at 13:46):

The following works:

\documentclass{article}

\usepackage{newunicodechar}

\newunicodechar{α}{\ensuremath{\alpha}}
\newunicodechar{β}{\ensuremath{\beta}}
\newunicodechar{ι}{\ensuremath{\iota}}
\newunicodechar{𝓤}{\ensuremath{\mathcal{U}}}
\newunicodechar{⁻}{\ensuremath{^-}}
\newunicodechar{¹}{\ensuremath{^1}}
\newunicodechar{∈}{\ensuremath{\in}}
\newunicodechar{∘}{\ensuremath{\circ}}
\newunicodechar{∀}{\ensuremath{\forall}}
\newunicodechar{∃}{\ensuremath{\exists}}
\newunicodechar{𝓝}{\ensuremath{\mathcal{N}}}
\newunicodechar{⨯}{\ensuremath{\mathcal{\times}}}
\newunicodechar{⟨}{\ensuremath{\langle}}
\newunicodechar{⟩}{\ensuremath{\rangle}}

\usepackage{minted}
\begin{document}
\begin{minted}{lean}
example  {f : α → β} (hf : uniform_continuous' f) : continuous f :=
suffices ∀ a, ∀ V ∈ 𝓝 (f a), f ⁻¹' V ∈ 𝓝 a,
  from continuous_iff_nhds_sets.2 this,
assume a V Vin,
suffices ∃ U ∈ 𝓤 α, f ⁻¹' V = ι a ⁻¹' U,  by rwa mem_nhds_uniformity',
have exW : ∃ W ∈ 𝓤 β, V = ι (f a) ⁻¹' W, by rwa mem_nhds_uniformity' at Vin,
let ⟨W, W_in, hWV⟩ := exW in
⟨(f ⨯ f) ⁻¹' W, hf W W_in,  -- let's use U = (f ⨯ f) ⁻¹' W
  calc
    f ⁻¹' V = f ⁻¹' (ι (f a) ⁻¹' W) : by rw hWV
        ... = (ι (f a) ∘ f) ⁻¹' W   : rfl
        ... = (f⨯f ∘ ι a) ⁻¹' W     : by rw show ι (f a) ∘ f = (f ⨯ f) ∘ ι a,
                                            by ext ; refl
        ... = ι a ⁻¹' (f⨯f ⁻¹' W)   : rfl⟩
\end{minted}
\end{document}

compiled using xelatex -shell-escape test.tex after installing Gabriel's pygment fork

Patrick Massot (Jan 15 2019 at 13:47):

of course the newunicodechar lines are ugly, and you can't even do it once and for all because of TeX super small memory capacity. But this is not related to pygment

Sebastian Ullrich (Jan 15 2019 at 13:58):

AFAIR, we at some point declared a "fallback" token class in the Lean pygments tokenizer so that it should never fail on an unknown notation now

Sebastien Gouezel (Jan 15 2019 at 14:15):

If you use a font with enough symbols (such as the Isabelle fonts I sent you the other day), everything works out of the box without any need for \newunicodechar.
Unrelated (and out of topic): should I PR https://github.com/sgouezel/mathlib/tree/essai6 at once, or split it in smaller bits (and maybe wait for the refactoring)? :)

Patrick Massot (Jan 15 2019 at 14:26):

Oh, I forgot this email! I'll try to find it back

Patrick Massot (Jan 15 2019 at 14:27):

About the PR, here is a rule of thumb: each time GitHub says "Large diffs are not rendered by default" you may consider splitting in smaller bits

Johan Commelin (Jan 15 2019 at 14:30):

I think if you want reasonably fast merging you should aim for <200 lines per PR. If you are adding a new file, I guess you can make them a bit larger. But everything >500 lines seems to just rot away in the queue.

Patrick Massot (Jan 15 2019 at 14:30):

We are talking about " 3,734 additions and 264 deletions. " in one commit here

Johan Commelin (Jan 15 2019 at 14:30):

I do understand that this means you will have to create 10-20 PRs...

Patrick Massot (Jan 15 2019 at 14:30):

Let's wait for the reorganization maybe

Sebastien Gouezel (Jan 15 2019 at 14:37):

My question was mainly a joke, of course I won't PR it in one go. And yes, I will wait for the refactoring before I submit anything. By the way, I am really happy to have done something in Lean (define the space of all nonempty compact metric spaces, with the Gromov-Hausdorff distance) that does not even make sense in Isabelle.

Patrick Massot (Jan 15 2019 at 14:38):

Does it directly make sense in DTT, or did you still have to cheat a bit?

Sebastien Gouezel (Jan 15 2019 at 16:38):

It makes sense to consider all compact subsets of (N)\ell^\infty(\mathbb{N}) modulo isometry (and it also makes sense in Isabelle). Call the quotient GH_space. It is easy to check that any compact metric space can be isometrically embedded in (N)\ell^\infty(\mathbb{N}), which gives a canonical map associating to any compact metric type a GH_space element. This is the part that does not make sense in Isabelle: you can not define a map that will take a type (with some additional typeclass properties) and yield an element of another type.

Patrick Massot (Jan 15 2019 at 16:48):

I understand that. But my distant memory of GH distance between X and Y involve sending X and Y into any other metric space. Is such a definition already problematic (and only heuristic) in set theory?

Sebastien Gouezel (Jan 15 2019 at 17:21):

But my distant memory of GH distance between X and Y involve sending X and Y into any other metric space. Is such a definition already problematic (and only heuristic) in set theory?

Textbooks do not really go to such details, but they should: if you say "all compact metric spaces up to isometry", this is not well defined (but if you notice that the cardinal of such a space is bounded, then everything is easy to solve). In the same way, if you say "send X and Y to all metric spaces", this is not well defined, but in fact all that matters is the union of the images of X and Y, which is compact and therefore has bounded cardinal. This is essentially what I do, saying that it is enough to consider embeddings in (N)\ell^\infty(\mathbb{N}) to say something about all metric spaces.

Patrick Massot (Jan 15 2019 at 17:25):

Still, do you think everybody working with GH distance will be happy with your definition?

Patrick Massot (Jan 15 2019 at 17:25):

Should we ask Gromov? :wink:

Sebastien Gouezel (Jan 15 2019 at 17:26):

Yes, because I prove that it gives what you expect: for any embeddings of X and Y in metric spaces, the Gromov-Hausdorff distance between X and Y is bounded above by the Hausdorff distance of the images. And there is one such embedding that realizes the Gromov-Hausdorff distance.

Sebastien Gouezel (Jan 15 2019 at 17:28):

In fact, the API is designed so that, in the end, you should completely forget that it was defined using (N)\ell^\infty(\mathbb{N}): this is just a handy technical tool in the construction.

Johan Commelin (May 10 2019 at 13:57):

What is the current status of pygments?

Johan Commelin (May 10 2019 at 13:57):

Where is the most up to date version hosted? I couldn't find a repo on @Gabriel Ebner's github account with pygm in the name...

Johan Commelin (May 10 2019 at 13:59):

Aah, I now found https://bitbucket.org/leanprover/pygments-main. Is that the version I should use?

Sebastian Ullrich (May 10 2019 at 14:00):

https://bitbucket.org/gebner/pygments-main

Patrick Massot (May 12 2019 at 10:22):

We had to come back to this pygment issue for the perfectoid website. Of course we use Gabriel's pygments. But I think we should really try to PR it upstream. Very unfortunately, I see on https://pypi.org/project/Pygments/#history that we have missed a release. @Gabriel Ebner don't you want to open a PR? Clearly they are currently very active. It would be very nice to get rid of these red rectangles we get everywhere on Zulip. Maybe we should also introduce a more specific language code like "lean3" to prepare for future mess in the Lean4 transition

Gabriel Ebner (May 12 2019 at 14:20):

Ok, I'll look into making a PR. My experience with bitbucket and mercurial is very lacking though, so it might take a bit. I'm not sure if we should have different syntaxes for lean3 and lean4. Right now the differences (for syntax highlighting) are very minimal (just a few new keywords).

Patrick Massot (May 12 2019 at 14:22):

Their use of bitbucket and mercurial is indeed very painful...

Bryan Gin-ge Chen (Mar 11 2020 at 19:40):

I happened to look at pygments again today because of the mathlib doc-gen stuff and it looks like pygments development now occurs on github (and there's even been a commit that updated the Lean syntax highlighting).

@Gabriel Ebner it might be easier now to make a PR. If you'll be busy for a while with other stuff I can also try to tackle it sometime next week.

Gabriel Ebner (Mar 11 2020 at 19:40):

@Bryan Gin-ge Chen Please do it!

Johan Commelin (Mar 11 2020 at 19:43):

Wow... would that mean that the dear red boxes on Zulip might disappear in the near future?

Bryan Gin-ge Chen (Mar 11 2020 at 19:50):

I hope so. It looks like Zulip is a few versions behind on pygments but presumably we can ping them to bump their dependency after we get pygments fixed.

Patrick Massot (Mar 11 2020 at 21:17):

Getting rid of the red boxes would be so awesome!

Bryan Gin-ge Chen (Mar 14 2020 at 19:43):

PR here: https://github.com/pygments/pygments/pull/1415

Alex Peattie (Sep 12 2020 at 16:20):

Looks like Pygments just got a new release :smile: - https://github.com/pygments/pygments/releases/tag/2.7.0

Bryan Gin-ge Chen (Sep 12 2020 at 16:22):

I've just pinged Zulip about it: https://github.com/zulip/zulip/issues/16335 :pray:

Bryan Gin-ge Chen (Sep 23 2020 at 06:42):

Syntax highlighting is now fixed! Compare this old message and this one:

-- works
example (t : R) (a b : ) (h : a  b) : t * a  t * b :=
by nlinarith [norm_nonneg t]

(Edit: scrolling back, it looks like it was changed sometime around 24 hours ago.)

Jeremy Avigad (Jun 12 2021 at 14:10):

I am using sphinx with pygments for syntax highlighting. Lean code gets highlighted, but "def" is not recognized as a keyword. Before I do some digging on my own, does anyone know offhand whether there is a better version of pygments that I should be using somehow?

Sebastian Ullrich (Jun 12 2021 at 15:03):

What I usually do is to copy pygment's theorem.py and then include the modified file using

\newmintinline[lean]{theorem.py:LeanLexer -x}{bgcolor=white}
\newminted[leancode]{theorem.py:LeanLexer -x}{fontsize=\footnotesize}

Current theorem.py from our course: https://gist.github.com/Kha/ba248a9cd2d245d741c85bd8efb5442d

Sebastian Ullrich (Jun 12 2021 at 15:03):

This should of course be upstreamed... eventually

Sebastian Ullrich (Jun 12 2021 at 15:04):

What I haven't figured out yet is how to highlight #check but not check

Jeremy Avigad (Jun 12 2021 at 15:45):

Thanks! Sphinx also uses Pygments to generate the html docs, but I should be able to specify the new lexer in the conf.py configuration file. I'll try it later and report back.

Eric Wieser (Jun 12 2021 at 15:51):

Which version of pygments are you using?

Eric Wieser (Jun 12 2021 at 15:51):

The latest version has much better support than the one from about a year ago thanks to Bryan

Eric Wieser (Jun 12 2021 at 15:53):

Xref https://github.com/zulip/zulip/issues/16335#issue-700291108

Jeremy Avigad (Jun 12 2021 at 16:09):

I installed the current version of Sphinx, which has Pygments built in. So I used whatever came with it. I am guessing that it is just an older version.

Eric Wieser (Jun 12 2021 at 16:15):

Does sphinx have an all in one installer? I assumed you install it through pip, in which case it will use any pygments version already on your system, and only install the latest version if it can't find one

Eric Wieser (Jun 12 2021 at 16:16):

Your python version is likely also relevant, as the python bundled with many distros is far older than pygments still releases for.

Bryan Gin-ge Chen (Jun 12 2021 at 16:19):

Eric Wieser said:

The latest version has much better support than the one from about a year ago thanks to Bryan

(I can't take too much credit! All the actual grammar changes were actually due to other folks (I think Floris, Gabriel and Reid?))

Jeremy Avigad (Jun 12 2021 at 17:33):

Brilliant! I did pip install --upgrade pygments, and now def is highlighted correctly. @Eric Wieser, many thanks. That was exactly the advice I needed.

Heather Macbeth (Dec 20 2022 at 19:01):

I'm writing a Sphinx document with some Lean code and am curious about whether there's a better Pygments style for Lean than the default one.

  • Does anyone know which Pygments style Zulip uses?
  • I think I found the Pygments style which is the Alectryon default?

I'm also a bit confused about the wide world of syntax highlighting. What is the Lean 4 manual using, is that also Pygments? Is it possible to take a VS Code theme that you're happy with and turn it into a Pygments style?

Eric Wieser (Dec 20 2022 at 19:03):

The lean4 manual is using highlightjs; I found out by using "inspect element" in the browser and spotting the hljs class names.

Eric Wieser (Dec 20 2022 at 19:04):

Arguably using Pygments is "cleaner" because it does the highlighting up-front when you generate the pages with sphinx, rather than on the fly on the user's device on each page load

Eric Wieser (Dec 20 2022 at 19:05):

I think the pygments grammar is better than the hljs one for Lean 3 too; although I likely have some patches I could contribute back for hljs as it's what I use in presentations

Eric Wieser (Dec 20 2022 at 19:06):

#tpil uses pygments; do you not like the style there?

Heather Macbeth (Dec 20 2022 at 19:09):

Indeed, it looks like #tpil is using the default Pygments style. I think this is a nice style, but for teaching might prefer to use a style which closely mimics the VS Code theme in my setup.

Heather Macbeth (Dec 20 2022 at 19:09):

And I was also curious about what styles Zulip and Alectryon are using because I assume somebody already put a lot of thought into that?

Eric Wieser (Dec 20 2022 at 19:15):

This looks to be the Zulip style

Eric Wieser (Dec 20 2022 at 19:17):

The dark zulip style is here

Eric Wieser (Dec 20 2022 at 19:18):

(They were in the same file until this commit)

Heather Macbeth (Dec 21 2022 at 13:48):

It looks like @Niklas Bülow has written a custom Lean 4 Pygments lexer for Alectryon:
https://github.com/leanprover/alectryon/blob/master/alectryon/pygments_lexer.py
which differs from the default Pygments lexer
https://github.com/pygments/pygments/blob/master/pygments/lexers/theorem.py
written, apparently, by @Gabriel Ebner , @Sebastian Ullrich and @Reid Barton.

Would it be possible to adopt any improvements Niklas has made into the default lexer and then eliminate the duplication? I think the discrepancy will get annoying in a document which mixes Alectryon blocks with plain code blocks; their output is noticeably different.

Heather Macbeth (Dec 21 2022 at 13:48):

I had a quick look and here are some differences I noticed:

  • Niklas' lexer categorizes as Keyword.Namespace some tokens which in the standard one are marked as Keyword.Declaration, including theorem, noncomputable, ...
  • Niklas' lexer categorizes as Keyword.Namespace some tokens which in the standard one are marked as Keyword, including #eval, #check,...
  • I think Niklas' lexer has a shorter keyword list overall but because of formatting differences I'm not sure.
  • Niklas' lexer doesn't treat docstrings differently from comments
  • Niklas' lexer has a list of symbols including , , , ... which are categorized as Name.Builtin.Pseudo; these have no special handling in the standard lexer

Heather Macbeth (Dec 21 2022 at 13:49):

Note: there is the question of whether we need separate Lean 3 and Lean 4 lexers; Niklas' is specifically intended for Lean 4. The languages may be close enough that we don't.

Sebastian Ullrich (Dec 21 2022 at 13:57):

I think Niklas' lexer is just a copy of https://github.com/leanprover/lean4/blob/master/doc/latex/lean4.py. We did not contribute changes back because for the longest time upstream Pygments was unmaintained.

Heather Macbeth (Dec 21 2022 at 14:27):

Thanks! Does that mean that (now that upstream Pygments is maintained again) the default Pygments lexer is strictly better?

Eric Wieser (Dec 21 2022 at 16:04):

The default pygments lexer is for Lean3 AFAIK

Eric Wieser (Dec 21 2022 at 16:05):

Looking at the file history the last update was by me two years ago, and I was definitely thinking of only Lean 3.

Eric Wieser (Dec 21 2022 at 16:07):

Note: there is the question of whether we need separate Lean 3 and Lean 4 lexers;

I think this is a good idea, if we don't then we have to worry about breaking Lean3 highlighting each time we add support for new Lean4 syntaxes. The languages have different grammars so they should expose different Lexers.

We can of course still share implementations between them

Heather Macbeth (Dec 21 2022 at 17:44):

One tricky point is that we'll have to choose which one is our Zulip default (probably Lean 4, to look forward). And then make a mass change backwards to edit most earlier messages with unlabelled code blocks to have a Lean 3 label.

Patrick Massot (Dec 21 2022 at 17:49):

I think that even admins can't do a mass message edit.

Patrick Massot (Dec 21 2022 at 17:49):

If I remember correctly I already asked that question.

Eric Wieser (Dec 21 2022 at 17:51):

Old messages contain the output of the syntax highlighting

Eric Wieser (Dec 21 2022 at 17:51):

You can see that in really old messages where the highlighting is broken because the highlighter used at the time had bugs

Eric Wieser (Dec 21 2022 at 17:51):

So there is no issue here with changing the default

Eric Wieser (Dec 21 2022 at 17:52):

Maybe we can ask for having different per-stream defaults while the port is still happening?

Johan Commelin (Dec 21 2022 at 17:52):

If people edit an old message then they might want to change ``` into ```lean3 . But that's not really a problem.

Yaël Dillies (Dec 21 2022 at 20:40):

Another day, another reason to love :zulip:

Eric Rodriguez (Dec 21 2022 at 20:48):

zulip's team always listens too, even to crazy bugs like not being able to search the word "unit"


Last updated: Dec 20 2023 at 11:08 UTC