Zulip Chat Archive

Stream: job postings

Topic: Hiring Lean 4 experts to create benchmark for AI models

Nikil Ravi (Oct 06 2025 at 00:54):

Vals AI builds benchmarks to rigorously evaluate frontier AI models. We work with leading model providers and have developed industry-recognized benchmarks in domains such as law and finance. We are building a formal math benchmark for assessing how well AI systems can assist in mathematical research.

We’re looking for individuals with strong Lean 4 formalization experience and a background in mathematics to help design and implement a challenging formal mathematics benchmark.

Responsibilities

Formalize graduate-level mathematics problems in Lean 4, ensuring formalizations are precise, well-structured, and verifiable
Collaborate on the design and methodology for evaluating AI performance on these formalized tasks

Qualifications

Strong background in pure mathematics (at the undergraduate level or higher).
Familiarity with Lean 4 and strong experience with formalization and theorem proving
Interest in the intersection of AI and formal mathematics, and the ability to work independently while communicating clearly

Benefits

Compensation: Up to $150/hour, commensurate with experience
Authorship: Opportunity for co-authorship on resulting research publications and benchmark reports

To apply: Send your CV and prior experience with formal math to nikil@vals.ai

Pontus Sundqvist (Oct 28 2025 at 17:33):

is it still possible to apply to this?

Last updated: Feb 28 2026 at 14:05 UTC