Zulip Chat Archive

Stream: job postings

Topic: Hiring Lean 4 experts to create benchmark for AI models


Nikil Ravi (Oct 06 2025 at 00:54):

Vals AI builds benchmarks to rigorously evaluate frontier AI models. We work with leading model providers and have developed industry-recognized benchmarks in domains such as law and finance. We are building a formal math benchmark for assessing how well AI systems can assist in mathematical research.

We’re looking for individuals with strong Lean 4 formalization experience and a background in mathematics to help design and implement a challenging formal mathematics benchmark.

Responsibilities

  • Formalize graduate-level mathematics problems in Lean 4, ensuring formalizations are precise, well-structured, and verifiable
  • Collaborate on the design and methodology for evaluating AI performance on these formalized tasks

Qualifications

  • Strong background in pure mathematics (at the undergraduate level or higher).
  • Familiarity with Lean 4 and strong experience with formalization and theorem proving
  • Interest in the intersection of AI and formal mathematics, and the ability to work independently while communicating clearly

Benefits

  • Compensation: Up to $150/hour, commensurate with experience
  • Authorship: Opportunity for co-authorship on resulting research publications and benchmark reports

To apply: Send your CV and prior experience with formal math to nikil@vals.ai

Pontus Sundqvist (Oct 28 2025 at 17:33):

is it still possible to apply to this?


Last updated: Dec 20 2025 at 21:32 UTC