Zulip Chat Archive
Stream: lean4
Topic: Percent-encoded URLs mess up docgen redirect
Mario Carneiro (Mar 18 2025 at 01:39):
It's a bit difficult to demonstrate this bug because most browsers automatically do percent-decoding, but the following link does not work if you go to it directly, e.g. via wget
: https://leanprover-community.github.io/mathlib4_docs/find/?pattern=CategoryTheory.nerve%E2%82%82Adj#doc . I encountered this issue because links inside PDF documents do not do this decoding process, and so the string is sent unchanged as the find/
query string, which ends up redirecting to https://leanprover-community.github.io/mathlib4_docs/404.html#CategoryTheory.nerve%E2%82%82Adj
. This is apparently per spec - PDF does not let you put unicode in URLs at all, so AFAICT this makes it impossible to link to declarations using non-ASCII using the equivalent of docs#CategoryTheory.nerve₂Adj .
Mario Carneiro (Mar 18 2025 at 01:40):
...and of course the linkifier breaks too. Pretend I wrote docs#CategoryTheory.nerve₂Adj above. (But I will leave it unedited because it also seems to make a point, although I'm not sure what the takeaway should be, besides "non-ASCII in identifiers is still non-portable".)
Mario Carneiro (Mar 18 2025 at 01:57):
This seems to only be an issue with find/
, if one links directly to the docgen page https://leanprover-community.github.io/mathlib4_docs/Mathlib/AlgebraicTopology/SimplicialSet/NerveAdjunction.html#CategoryTheory.nerve%E2%82%82Adj
this works as intended (because the browser is interpreting the escapes). I think the source of the bug is the fact that find/
bypasses the usual browser handling to be more "lean-friendly"
Mario Carneiro (Mar 18 2025 at 02:13):
actually never mind, it was user error (the URL was double percent encoded). find.js is correctly decoding the URL fragment.
Last updated: May 02 2025 at 03:31 UTC