Zulip Chat Archive

Stream: new members

Topic: substring


Julian Berman (Sep 26 2020 at 01:03):

I can't seem to find the syntax for checking whether a string is a substring of another. Does that exist somewhere I'm not finding? It seems to exist on list as maybe is_infix, and I see I can convert strings to lists... is that what I should be doing?

Julian Berman (Sep 26 2020 at 01:05):

Oh, maybe I should try #print prefix which I saw mentioned somewhere...

Julian Berman (Sep 26 2020 at 01:13):

(No, nothing on string looks applicable to me, unless I'm missing it.)

Scott Morrison (Sep 26 2020 at 01:41):

The string API is embarrassingly sparse. Don't expect to find much useful there. Instead, either convert strings to lists of characters, or start writing some API. :-)

Julian Berman (Sep 26 2020 at 02:09):

Is there perhaps an API for getting the path Lean is installed to? Or am I out of luck there too

Julian Berman (Sep 26 2020 at 02:10):

Aha! I found some mathlib code doing essentially what I'm trying to do (and yeah doing it hackishly). So I guess "no".

Bryan Gin-ge Chen (Sep 26 2020 at 02:16):

Out of curiosity, what is it you're trying to do?

Julian Berman (Sep 26 2020 at 02:22):

I'm trying to generalize export_json that I found in doc-gen

Julian Berman (Sep 26 2020 at 02:22):

By making it only export decls that it finds in the "current project", regardless of if that's mathlib or not

Julian Berman (Sep 26 2020 at 02:23):

I can do that afterwards by post processing the filenames I see, but I was hoping not to waste time since it looks like running that takes like a minute to run over mathlib

Julian Berman (Sep 26 2020 at 02:23):

(And then more broadly what I'm trying to do is "generate docs for my thing")

Julian Berman (Sep 26 2020 at 02:28):

I'm sure that "teach that script to take an argument" is a suitably simple solution as well but I didn't/don't know how to do that argparsing so I was futzing with other parts of the code instead. But yeah I'm likely to give up and do something else soon, probably the post processing

Bryan Gin-ge Chen (Sep 26 2020 at 06:35):

Ah, cool! That sounds like a worthy task. Certainly generalizing doc-gen to work on non-mathlib projects is an eventual goal; it's just that we haven't had the time for it.

Let's loop in @Rob Lewis and @Gabriel Ebner who wrote export_json.lean.

Rob Lewis (Sep 26 2020 at 11:55):

I'd love to see doc-gen generalized! But indeed I haven't found the time to work on it myself. Ideally you'd be able to choose what projects/dependencies get exported. But there are subtleties here. We don't store old versions of the docs, so if you generate docs for your project that depends on an old version of mathlib, you can't reliably link mathlib declarations to the community docs.

Rob Lewis (Sep 26 2020 at 11:57):

There are a handful of other places in the code and UI where the assumption it's running on mathlib is pretty much baked in. Generalizing isn't a trivial job.

Rob Lewis (Sep 26 2020 at 11:58):

But identifying which project a declaration is from shouldn't be too hard. Core Lean filepaths are identifiable using the hack in doc-gen now. Dependency filepaths (e.g. mathlib, if you're in a project that depends on mathlib) are identifiable by the _target/deps/projectname prefix.

Rob Lewis (Sep 26 2020 at 11:59):

Unless your project is using local dependencies, in which case you might be out of luck, I think the user will have to provide this info.

Julian Berman (Sep 26 2020 at 12:41):

Thanks for the info. And yeah so I was going to try something different to what's in doc-gen to be honest

Julian Berman (Sep 26 2020 at 12:42):

If I can get decls out in some easy to process way (which is why I was looking at export_json) I was going to try to throw together a Lean sphinx domain

Julian Berman (Sep 26 2020 at 12:43):

At which point I'd just be able to have sphinx docs for the decls and ignore all the UI code in doc-gen which did indeed look quite mathlib-specific

Julian Berman (Sep 26 2020 at 12:48):

I tried indeed to basically do what you mentioned within process_decl, namely adding e.g. if "_target" \mem filename.split_on '/' then return none else do ... which does filter out mathlib but still leaves stuff defined in core lean (hence wanting to know how to get that path)

Julian Berman (Sep 26 2020 at 12:49):

So I tried instead using if e.is_prefix_of_file ... decl_name on the `all import as I think the mathlib code does but I was finding it hard to debug, I'm new to the notorious IO-monad world so I couldn't figure out what value wasn't what I expected in that big filter operation

Rob Lewis (Sep 26 2020 at 13:03):

You can mostly-reliably get the path to core by getting the location of some declaration you know lives in some particular file in core. E.g.

open tactic

run_cmd do
e  get_env,
os  e.decl_olean `nat,
trace os

Rob Lewis (Sep 26 2020 at 13:04):

Assuming nat lives at .../library/init/core.lean, you can strip that from the end of os and get the path to the root of core.

Rob Lewis (Sep 26 2020 at 13:06):

Julian Berman said:

So I tried instead using if e.is_prefix_of_file ... decl_name on the `all import as I think the mathlib code does but I was finding it hard to debug, I'm new to the notorious IO-monad world so I couldn't figure out what value wasn't what I expected in that big filter operation

I'm not quite sure what you mean here, could you elaborate?

Julian Berman (Sep 26 2020 at 13:24):

Sorry I mean something quite simplistic -- basically this is my entire diff at the minute using the "whitelist"-kind of approach rather than filtering out mathlib+core:

diff --git a/src/export_json.lean b/src/export_json.lean
index f476297..315bb2d 100644
--- a/src/export_json.lean
+++ b/src/export_json.lean
@@ -301,6 +301,8 @@ do ff ← d.in_current_file | return none,
    let decl_name := d.to_name,
    if decl_name.is_internal ∨ d.is_auto_generated e then return none else do
    some filename ← return (e.decl_olean decl_name) | return none,
+   some project_dir ← return (e.decl_olean `all) | return none,
+   if ¬(e.is_prefix_of_file (project_dir.popn_back 8) decl_name) then return none else do
    some ⟨line, _⟩ ← return (e.decl_pos decl_name) | return none,
    doc_string ← (some <$> doc_string decl_name) <|> return none,
    (args, type) ← get_args_and_type d.type,

Julian Berman (Sep 26 2020 at 13:24):

But running that produces no decls. How can I debug why that condition isn't matching?

Rob Lewis (Sep 26 2020 at 13:30):

docs#environment.decl_olean takes the name of a declaration, not a file. So that line will only work if there's a declaration named all in some file in project_dir.

Julian Berman (Sep 26 2020 at 13:32):

aha. and imports don't create some sort of runtime object?

Julian Berman (Sep 26 2020 at 13:34):

(i.e. I guess what I thought that'd do is that that file has import all in it, so I thought it'd give me the path that that import ended up finding)

Rob Lewis (Sep 26 2020 at 13:34):

No, AFAIK there's no convenient way to test whether one olean file imports another from within meta-Lean. You can do it externally.

Julian Berman (Sep 26 2020 at 13:35):

Got it, ok. Will go back to the other way then (of trying to filter out core + mathlib)

Rob Lewis (Sep 26 2020 at 13:35):

The semantics of decl_olean n are, "if n is the name of a declaration defined in some imported file, give me the path to that file, otherwise return none`"

Rob Lewis (Sep 26 2020 at 13:36):

If you can assume that lean is being run in your project root, you can get project_dir another way. tactic.unsafe_run_io io.env.get_cwd

Julian Berman (Sep 26 2020 at 13:46):

Is https://leanprover-community.github.io/mathlib_docs/core/init/meta/module_info.html#module_info.module_id another option?

Julian Berman (Sep 26 2020 at 13:46):

(About to see if I can figure out how to use it)

Julian Berman (Sep 26 2020 at 13:49):

Oh. It only works on the current module. OK. Nevermind

Rob Lewis (Sep 26 2020 at 13:51):

Oh, I forgot about those. I think you could make them work (of_module_name and the like) but I don't know enough about it to give you any real advice.

Julian Berman (Sep 26 2020 at 13:52):

Cool, will keep poking. Appreciate the help.

Patrick Massot (Sep 26 2020 at 15:02):

Did you at what leanproject does here? I'm away from my computer and I forgot

Julian Berman (Sep 26 2020 at 19:08):

Hm, for which part? (FWIW I got this working fairly easily, it was:

meta def get_project_dir : string :=
(string.popn_back (module_info.of_module_name "all").id "all.lean".length)

along with

   some filename  return (e.decl_olean decl_name) | return none,
   if ¬project_dir.is_prefix_of filename then return none else do

etc. but I have to propagate that change to the other keys from export_json, since that only handles decls, so yeah working through each one.

Patrick Massot (Sep 26 2020 at 20:05):

You can have a look at the output of leanproject decls

Patrick Massot (Sep 26 2020 at 20:05):

(the ouput is in decls.yaml)

Julian Berman (Sep 26 2020 at 20:07):

Yeah sorry I went back and looked after and figured you meant to point me at e.g. https://github.com/leanprover-community/mathlib-tools/blob/0c5b88b911c2aae1bfd441e7e2e07deb5084f4c2/mathlibtools/decls.lean#L37-L43


Last updated: Dec 20 2023 at 11:08 UTC