Zulip Chat Archive

Stream: new members

Topic: algorithmic information theory


Rudi Cilibrasi (Nov 09 2021 at 16:54):

greetings my new Friends! i like graph theory and automated theorem proving, looking forward to learning lean. Previously made a first order predicate logic theorem prover using linear resolution with paramodulation in the 1990's. Trying to understand lean better since last week i learned that the main theoretical math I help create twenty years ago was finally used in a proof using lean in "A parametrized family of Tversky metrics connecting the Jaccard distance to an analogue of the normalized information distance"

Eric Rodriguez (Nov 09 2021 at 17:01):

Paramodulation! We were having an interesting time with that a while ago trying to figure out the proof of things like McCune's axiom. Welcome, you may want to check out [stream#graph theory]

Rudi Cilibrasi (Nov 09 2021 at 17:33):

thank you @Eric Rodriguez i am in the stream and appreciate the pointer!

Alex J. Best (Nov 09 2021 at 17:52):

There has also been some discussion / PRs about Jaccard distance here before, hopefully you can find it by searching (I'm on mobile so unfortunately can't right now)

Arthur Paulino (Nov 09 2021 at 19:06):

Hey @Rudi Cilibrasi I am new to lean and I like graph theory too. If you'd want to team up, I'm a bit stuck on this branch. We're trying to formalize graph coloring as an homomorphism to a complete graph. You can check src/combinatorics/simple_graph/coloring.lean

Rudi Cilibrasi (Nov 09 2021 at 20:04):

thank you Arthur! that would be awesome if i can reach that level but first i will need to do the basic tutorial and environment install. thank you for this invitation it seems like a perfect introduction unless i discover it's really too hard in which case i am still happy for the practice learning

Kevin Buzzard (Nov 10 2021 at 15:11):

Several people have learnt a working knowledge of Lean by working on graph theory -- I think it's quite a nice entry point.

Bjørn Kjos-Hanssen (Nov 10 2021 at 20:12):

Hi @Rudi Cilibrasi ! That's my paper - https://arxiv.org/abs/2111.02498 (notice the "Code &Data" link) I got a lot of help from people here on the Lean part, see https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/Jaccard.20distance The paper is about an analogue of the NID (normalized information distance). Formalizing the NID itself would require first formalizing Kolmogorov complexity which I don't think has been done(?).

Rudi Cilibrasi (Nov 10 2021 at 20:13):

Wow thanks for this great detail Bjorn!

Rudi Cilibrasi (Nov 10 2021 at 20:13):

Have we got a Turing machine in math lib yet by any chance?

Bjørn Kjos-Hanssen (Nov 10 2021 at 20:16):

https://leanprover-community.github.io/mathlib_docs/computability/turing_machine.html

Bjørn Kjos-Hanssen (Nov 10 2021 at 20:25):

I don't see anything about existence of a universal Turing machine there, though?

Mario Carneiro (Nov 10 2021 at 20:27):

There is a universal partial recursive function, and some work on connecting turing machines to the partial recursive functions that is mostly complete but not all the library lemmas are there

Mario Carneiro (Nov 10 2021 at 20:28):

See https://leanprover-community.github.io/mathlib_docs/computability/halting.html for stuff about universal machines

Bjørn Kjos-Hanssen (Nov 10 2021 at 20:28):

@Mario Carneiro great, the universal partial recursive function could be used to define Kolmogorov complexity I suppose

Mario Carneiro (Nov 10 2021 at 20:29):

yes, I think that would work fine

Mario Carneiro (Nov 10 2021 at 20:29):

There is the docs#nat.partrec.code type for expressing partial recursive functions as elements of a countable type

Rudi Cilibrasi (Nov 10 2021 at 20:31):

is a countable type like peano integers?

Mario Carneiro (Nov 10 2021 at 20:32):

more precisely, docs#denumerable

Rudi Cilibrasi (Nov 10 2021 at 20:32):

i see zero and succ which seem familiar to me from that in that code

Mario Carneiro (Nov 10 2021 at 20:32):

yes, nat is the canonical denumerable type

Mario Carneiro (Nov 10 2021 at 20:33):

a denumerable type is one that is isomorphic to nat

Rudi Cilibrasi (Nov 10 2021 at 20:33):

great makes total sense

Mario Carneiro (Nov 10 2021 at 20:34):

in particular, this matters for encoding because the isomorphism gives you a way to assign bit sequences or so to individual programs

Mario Carneiro (Nov 10 2021 at 20:34):

so you can ask what is the 5th program

Mario Carneiro (Nov 10 2021 at 20:34):

and more complicated programs get bigger numbers

Rudi Cilibrasi (Nov 10 2021 at 20:35):

yes that seems pretty close to what would be required to define K and C kolmogorov complexity

Arthur Paulino (Nov 10 2021 at 20:41):

Btw I was reading turing_machine.lean today (haven't finished it yet) and it inspired me :pray:

Mario Carneiro (Nov 10 2021 at 20:41):

incidentally, the 5th program is prec zero zero which is another way of writing the zero function:

import computability.halting

open nat.partrec (code)
attribute [derive has_reflect] code
#eval let r := denumerable.of_nat code 5 in pp (reflect r) -- prec zero zero

#eval ((denumerable.of_nat code 5).eval 42).get undefined -- f 42 = 0

Reid Barton (Nov 10 2021 at 20:43):

What about the 37th program

Reid Barton (Nov 10 2021 at 20:43):

that seems more useful

Mario Carneiro (Nov 10 2021 at 20:45):

that's prec left left apparently, which is 0 for 0 and (nat.unpair (n-1)).1 otherwise

Rudi Cilibrasi (Nov 10 2021 at 20:48):

wow these are great programs!

Rudi Cilibrasi (Nov 10 2021 at 20:49):

i shouldn't let my imagine wander too far but i wonder how big the program must be to be turing complete

Rudi Cilibrasi (Nov 10 2021 at 20:50):

my friend john tromp is super into that question and has enjoyed writing up some pretty small turing machines on the order of dozens of bytes
... the rules aren't totally clear to that game though in the literature imo. which is why finding an answer with lean would be super interesting.

Mario Carneiro (Nov 10 2021 at 20:53):

I really enjoyed watching the discussion around Aaronson & Yedidia's search for the smallest unprovably halting TM, which resulted in an optimizing compiler for turing machines

Mario Carneiro (Nov 10 2021 at 20:58):

In principle, you should be able to extract the proof of computability in docs#nat.partrec.code.eval_part and turn the witness into a number, but I have reasons to believe that the result is very large and lean will probably not be able to handle it

Kyle Miller (Nov 10 2021 at 21:00):

While it's not a Turing machine, the binary lambda calculus is kind of neat. There's a 206 bit self-interpreter at https://tromp.github.io/cl/cl.html (more details: https://tromp.github.io/cl/Binary_lambda_calculus.html)

Rudi Cilibrasi (Nov 10 2021 at 21:01):

i installed lean just now and the vscode extension to try to reproduce Mario first example above for program 5. i cannot import computability.halting should i use elan to fix it or what?

Rudi Cilibrasi (Nov 10 2021 at 21:02):

yup thats my john too kyle he is super into it

Mario Carneiro (Nov 10 2021 at 21:03):

You need mathlib as well. See https://leanprover-community.github.io/get_started.html

Kyle Miller (Nov 10 2021 at 21:04):

Rudi Cilibrasi said:

yup thats my john too kyle he is super into it

(Oh, I completely missed that you said his last name earlier!)

Rudi Cilibrasi (Nov 10 2021 at 21:16):

how do i install computability.halting i am having trouble with the command
i think i have mathlib in there but i guess i can't be sure

Rudi Cilibrasi (Nov 10 2021 at 21:16):

i tried on two different linux

Rudi Cilibrasi (Nov 10 2021 at 21:17):

Otherwise pipx is ready to go! :sparkles: :glowing_star: :sparkles:
installed package mathlibtools 1.1.0, Python 3.8.10
These apps are now globally available

- leanproject

done! :sparkles: :glowing_star: :sparkles:
rudi@rsys:~$ cd ~/tmp

Rudi Cilibrasi (Nov 10 2021 at 21:18):

oh maybe pipx?

Rudi Cilibrasi (Nov 10 2021 at 21:21):

probably i will have to run a tutorial to figure out this new package management tool and determine if i really have mathlib or not. at least the eval 1+1 part did appear to work so something is working

Alex J. Best (Nov 10 2021 at 21:26):

Are you following the instructions at https://leanprover-community.github.io/install/debian.html ? And then on the linked at the bottom for lean projects?

Alex J. Best (Nov 10 2021 at 21:27):

Or https://leanprover-community.github.io/install/linux.html if not debian/ubntu?

Rudi Cilibrasi (Nov 10 2021 at 21:27):

yes i did follow that page... the wget way twice and also went through the long step way a bit too.. i am on 20.04 ubu
but i did not do lean projects at the bottom yet i guess that could be my problem

Rudi Cilibrasi (Nov 10 2021 at 21:28):

oh i see now i must need a project to add deps. cool! ty Alex

Alex J. Best (Nov 10 2021 at 21:28):

Yes mathlibtools is a helpful python package, it won't download any actual mathlib olean files for you to import things unless you give it a reason to, either you are working on your own local fork of mathlib, or a project that depends on it.

Rudi Cilibrasi (Nov 10 2021 at 21:55):

it works now i verified mario's first result. ty for help friends i have tried lean a few times before in earlier years and version 2 but i never got far and now in only one afternoon i feel like i am already equipped to make independent slow progress.

Rudi Cilibrasi (Nov 10 2021 at 22:39):

i am able to read and understand the succ and zero definitions in nat.partrec.code quite easily but the id, pair, left, and right are confusing me. i feel like i am not getting it. i am familiar with tuples, cartesian pairs, currying and element-extraction functions in other languages and high school math. i reviewed the equations definition in partrec_code.html for these and still feel like i must be missing some code. i'm familiar with how to draw a spiral in cartesian integers to show equivalence of the lattice to the line and back but i don't quite see it here yet. just another newbie question when somebody has time. meanwhile i will start to work more tutorial

Mario Carneiro (Nov 10 2021 at 22:45):

Note that nat.partrec.code doesn't explain what the functions actually do, it is only the raw structure of the program. A better place to look to understand what all the constructors do is docs#nat.partrec, which has the exact same constructors but it is talking about actual nat ->. nat partial functions

Mario Carneiro (Nov 10 2021 at 22:47):

In particular, to understand what left and right are doing you should look at docs#nat.unpair - the short answer is that it is the inverse to the cantor pairing function (or rather a slight variant on the cantor pairing function)

Mario Carneiro (Nov 10 2021 at 22:49):

We use these functions so that we can pack multiple variables into a single nat, using pair to put nats together and left/right to pull out the pieces

Rudi Cilibrasi (Nov 11 2021 at 02:36):

ok i've invited some of my science friends paul and joseph and soon i think i will invite john
lean is so much cooler now with the zulipchat :)

Rudi Cilibrasi (Nov 11 2021 at 02:36):

i want to get this K or C kolmogorov complexity show on the road

Rudi Cilibrasi (Nov 11 2021 at 13:16):

usually i think the most important weird math function necessary when talking about K and C is log. I see there is a nat/log in mathlib would that be appropriate to use for proofs bounding K and C?

Bjørn Kjos-Hanssen (Nov 11 2021 at 17:16):

Looks like nat.clog would be more useful when the log is in the upper bound

Bjørn Kjos-Hanssen (Nov 11 2021 at 17:17):

And nat.log when it's in a lower bound of K or C

Rudi Cilibrasi (Nov 11 2021 at 17:18):

excellent hint ty Bjorn!

Rudi Cilibrasi (Nov 11 2021 at 21:41):

Hmm I think maybe our aforementioned @John Tromp could be awake …

John Tromp (Nov 11 2021 at 21:41):

hi, Rudi

John Tromp (Nov 11 2021 at 21:42):

and lean fans

Rudi Cilibrasi (Nov 11 2021 at 21:43):

We have been talking about formalizing K and C in here as shown above largely based on Mario work already done with partial recursive universal fn

John Tromp (Nov 11 2021 at 21:44):

i don't know much about lean, but happy to answer any questions about binary lambda calculus

Rudi Cilibrasi (Nov 11 2021 at 21:44):

We also noticed lean mathlib already has nat log and clog suitable for basic theorems about K and C

John Tromp (Nov 11 2021 at 21:45):

i guess lean can prove any reduction in the untyped lambda calculus?

Rudi Cilibrasi (Nov 11 2021 at 21:46):

Oh I would assume so. It was @Kyle Miller who mentioned the binary lambda calc

Mario Carneiro (Nov 11 2021 at 21:47):

what is K and C? Does that stand for something?

Mario Carneiro (Nov 11 2021 at 21:48):

Lambda calculus is not defined in mathlib, but it could be

John Tromp (Nov 11 2021 at 21:48):

i use KS and KP in my paper instead of C and K

Mario Carneiro (Nov 11 2021 at 21:48):

are these combinators?

John Tromp (Nov 11 2021 at 21:49):

KS/C is plain complexity, with delimited programs

Mario Carneiro (Nov 11 2021 at 21:49):

oh, that

John Tromp (Nov 11 2021 at 21:49):

KP is prefix complexity, with self-delimited programs

Mario Carneiro (Nov 11 2021 at 21:50):

How precisely do you want to get the kolmogorov complexity definition? As I understand it you can get within a constant additive factor without trouble but the exact definition is language dependent

Mario Carneiro (Nov 11 2021 at 21:50):

so that would mean formalizing e.g. binary lambda calculus

Mario Carneiro (Nov 11 2021 at 21:51):

rather than anything with equivalent computational strength

John Tromp (Nov 11 2021 at 22:00):

yes, for theorem proving you want precise definitions. such as the TMs counted in https://oeis.org/A141475

John Tromp (Nov 11 2021 at 22:04):

or the KP/KS in my paper

Kyle Miller (Nov 11 2021 at 22:12):

@John Tromp There's a variation on binary lambda calculus that I've played with a little, but I haven't tried making a competitive self-interpreter with it yet (and I don't really expect it to be able to make terser programs). The idea is you let the De Bruijn indexing's successor (that's usually only for indices) apply to entire expressions. The interpretation is that everything inside the successor of an expression forgets the current innermost binding. This lets you factor out taking successors in applications if both variables in the application are nonzero indices.

inductive lambda
| get : lambda
| drop : lambda  lambda
| app : lambda  lambda  lambda
| abs : lambda  lambda

For the binary encoding, it's still a prefix code, but the codes are now just all four two-bit sequences.

John Tromp (Nov 11 2021 at 22:21):

so drop E is E with all de bruijin indices of free variables increased by 1?

John Tromp (Nov 11 2021 at 22:24):

and then get is equivalent to var 0

John Tromp (Nov 11 2021 at 22:24):

interesting variation!

Kyle Miller (Nov 11 2021 at 22:25):

Yep, or if you're thinking about it in terms of an evaluator that has a list of values for each de Bruijn binding, drop drops the first entry of the list, and get gets the first entry.

John Tromp (Nov 11 2021 at 22:26):

for instance S = \ \ \ 2 0 (1 0) becomes \ \ \ (+ + 0) 0 ((+ 0) 0)

John Tromp (Nov 11 2021 at 22:27):

in which case you get no savings, and get a longer program

John Tromp (Nov 11 2021 at 22:30):

hard to say if the interpreter would get longer or shorter though. without actually writing it out

John Tromp (Nov 11 2021 at 22:51):

i'll try it out tomorrow. close to bedtime now:)

John Tromp (Nov 12 2021 at 17:35):

the interpreter itself becomes simpler, but its encoding becomes larger: 222 instead of 206 bits

John Tromp (Nov 12 2021 at 17:36):

so for the goal of minimizing program sizes, i should stick with the current variable encoding

John Tromp (Nov 12 2021 at 17:37):

btw, there is only one occurrence in the interpreter where a drop can apply to an expression larger than one variable

Kyle Miller (Nov 12 2021 at 17:47):

Interesting -- thanks for looking into it. The reason I'd thought of it in the first place was to simplify case dispatch in the main loop, since there's now only a single inductive type rather than two.

Do you think it could be improved by using a base-4 stream with the encoding λλλλi\lambda\lambda\lambda\lambda i for i=0,1,2,3i=0,1,2,3? (Each of these would still count as two bits of course.)

Rudi Cilibrasi (Nov 12 2021 at 18:30):

Hey is @Joseph Dimos here yet?

Joseph Dimos (Nov 12 2021 at 18:39):

Greetings everyone! I like argumentation theory and its alignment with ATP. I often venture into applied category theory problems that interlace with satisfiability problems. Previously, I defined a formal chaining inside linear temporal logic and defined a dualistic theme for doxastic and epistemic logics according to belief probability. I'm also involved in some quantum gravity related problems that interlace 6j symbols with fusion categories in gauge theory. I'm quite interested in Lean, but am still learning its many facets.

Rudi Cilibrasi (Nov 12 2021 at 18:45):

Glad to see you here Joseph!

John Tromp (Nov 12 2021 at 20:41):

no, the interpreter makes very good use of the 2-bit encoding. moving to base 4 would certainly expand it significantly

Yaël Dillies (Nov 12 2021 at 22:28):

Rudi Cilibrasi said:

Glad to see you here Joseph!

Wow, you can wish people in existence!

John Tromp (Nov 14 2021 at 09:58):

Btw, there's a language called "Real Fast Nora's Hair Salon 3: Shear Disaster Download" (yes, really!) that looks similar to your proposal, Kyle.

John Tromp (Nov 14 2021 at 09:59):

https://esolangs.org/wiki/Real_Fast_Nora%27s_Hair_Salon_3:_Shear_Disaster_Download

John Tromp (Nov 14 2021 at 10:00):

Except they forgot to generalize to applying the ONE MORE THAN operator to arbitrary terms (while changing its encoding to 11).

Eric Rodriguez (Nov 14 2021 at 10:08):

Wow, the guy who made that is called Nathan van Doorn, but they are not related to Floris

Kevin Buzzard (Nov 14 2021 at 12:02):

@Floris van Doorn just to check you see this!

Floris van Doorn (Nov 14 2021 at 12:47):

Haha, that's great. I've contacted him.

Floris van Doorn (Nov 14 2021 at 12:48):

Thanks for the ping.

Kyle Miller (Nov 14 2021 at 16:45):

@John Tromp I guess my variant must be "Real Fast Nora's Hair Salon 4: A Little off the Top"


Last updated: Dec 20 2023 at 11:08 UTC