Zulip Chat Archive

Stream: Machine Learning for Theorem Proving

Topic: Internet Explorer: Targeted Repr Learning on the Open Web


Junyan Xu (Mar 04 2023 at 02:07):

we propose dynamically utilizing the Internet to quickly train a small-scale model that does extremely well on the task at hand. Our approach, called Internet Explorer, explores the web in a self-supervised manner to progressively find relevant examples that improve performance on a desired target dataset. It cycles between searching for images on the Internet with text queries, self-supervised training on downloaded images, determining which images were useful, and prioritizing what to search for next.

https://twitter.com/_akhaliq/status/1630413027227959296

Anyone up for replacing image search with literature search and the vision model with a language (or multimodal?) model, using mathlib as a starting point?
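
For anyone who wants to play with the idea, the cycle in the quoted abstract (search with a text query, train, decide which results were useful, prioritize the next query) can be sketched as a toy loop. This is only a sketch under loud assumptions, not the paper's method: the query selection here is plain epsilon-greedy, the training step is a placeholder comment, and `explore`, `toy_search`, `toy_score`, `CENTERS`, and `TARGET` are all hypothetical names with numbers standing in for documents. In a literature-search variant, `search` would presumably call a real search backend and `score` would compare results against something derived from mathlib.

```python
import random


def explore(concepts, search, score, rounds=20, per_query=4,
            keep_above=-1.5, epsilon=0.2, seed=0):
    """Toy explore/exploit loop over text queries (every name here is hypothetical)."""
    rng = random.Random(seed)
    estimate = {c: 0.0 for c in concepts}  # running mean usefulness per query
    pulls = {c: 0 for c in concepts}       # how often each query was issued
    kept = []                              # results judged useful so far

    for _ in range(rounds):
        # Prioritize: try every query once, then mostly exploit the best one.
        untried = [c for c in concepts if pulls[c] == 0]
        if untried:
            query = untried[0]
        elif rng.random() < epsilon:
            query = rng.choice(concepts)
        else:
            query = max(concepts, key=estimate.get)

        # Search: fetch a small batch of results for the chosen query.
        results = search(query, per_query, rng)

        # Placeholder: a real system would run self-supervised training
        # on `results` here; this sketch skips that step entirely.

        # Determine usefulness: score each result and keep the good ones.
        scores = [score(r) for r in results]
        kept.extend(r for r, s in zip(results, scores) if s > keep_above)

        # Update the running mean reward for this query.
        pulls[query] += 1
        batch_mean = sum(scores) / len(scores)
        estimate[query] += (batch_mean - estimate[query]) / pulls[query]

    return estimate, kept


# Toy stand-ins: each "query" returns numbers scattered around a center,
# and usefulness is closeness to a target value.
CENTERS = {"cats": 0.0, "topology": 5.0, "algebra": 9.0}
TARGET = 9.0


def toy_search(query, k, rng):
    return [CENTERS[query] + rng.uniform(-1.0, 1.0) for _ in range(k)]


def toy_score(item):
    return -abs(item - TARGET)


estimate, kept = explore(list(CENTERS), toy_search, toy_score)
best = max(estimate, key=estimate.get)
print(best, len(kept))
```

With these toy stand-ins the loop ends up favoring the query whose results sit closest to the target, and only results from that query pass the `keep_above` threshold. Swapping epsilon-greedy for a smarter prioritization (for example, one that shares reward estimates between related queries) would be the obvious next step.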


Last updated: Dec 20 2023 at 11:08 UTC