Zulip Chat Archive

Stream: PhysLean

Topic: Long term goal: APIs to build

Joseph Tooby-Smith (Jan 05 2026 at 05:55):

Inspired by this #PhysLean > PhysLean in 2026 @ 💬 message by @Shlok Vaibhav, I wanted to set more out more definitely some goals for PhysLean.

I personally see one of the aims of the projects being to build good API around key objects used in Physics. To this end, I have created this GitHub project:

https://github.com/orgs/HEPLean/projects/6

which tracks the building of important APIs within PhysLean (each API being one goal for the project). Each of the items (APIs to build) in the project are connected to a GitHub issue, e.g.

PhysLean#846

which specifies (or will specify):

the key data structure of the API
the need for the API
the requirements of the API
the corresponding file system within PhysLean

Some of these APIs are almost completely built, but the vast majority are have not been started or are in the very early stages.

I see a number of ways people can help here:

Helping build the APIs by defining the key data structures, and building the API with the stated requirements.
Helping add requirements and other details to the GitHub issues, expanding their details.
Suggesting other APIs we could add to the GitHub project.
Coming up with nice ways we could present, or track the building of these APIs.

Joseph Tooby-Smith (Jan 05 2026 at 13:22):

Here is one quick way I came up with to track the progress of this:

https://physlean.com/APITracker

(This just uses the underlying GitHub issues, and the information stored therein)

Screenshot 2026-01-05 at 13.21.45.png

Daniel Morrison (Jan 05 2026 at 22:27):

This is helpful! I'd also say that adding good references to go off of would be nice as well. For me I'd like to contribute more but my training in physics is at the undergrad level where things are not organized in a good way for formalization, so having a more advanced reference would be extremely helpful. Alternatively some idea of how we want to design some of this API (e.g. what level of generality, what kind of lemmas do we need, what kinds of mathematical objects are we using to model the physics object) would be useful.

Joseph Tooby-Smith (Jan 06 2026 at 05:23):

Alternatively some idea of how we want to design some of this API (e.g. what level of generality, what kind of lemmas do we need, what kinds of mathematical objects are we using to model the physics object) would be useful.

My idea was to put this sort of thing in the GitHub Issue connected to the API, e.g. PhysLean#846. Do you think it would be good if more detail was put into these then is already present?

Daniel Morrison (Jan 06 2026 at 05:52):

I guess my thought was more along the lines of a reference that goes through the kind of reasoning we want in the API in a plain text format. That's probably overkill for some basic stuff but would be great for larger or more complex steps.

Daniel Morrison (Jan 06 2026 at 05:53):

Perhaps not a requirement for all APIs but something worth including if it's known

Joseph Tooby-Smith (Jan 06 2026 at 06:10):

Understood, and I agree. Where I can I will try and find good references which can be used as a guide to building the APIs.

Daniel Morrison (Jan 06 2026 at 06:24):

I will admit that a decent part of my problem is overthinking my contributions, which is not a PhysLean issue. :upside_down:

Joseph Tooby-Smith (Jan 06 2026 at 06:37):

I know it's hard, but please try not to overthink it :slight_smile:. There is so much to do in the project, that doing something is rarely better than doing nothing. Even adding a single definition or a single lemma to the project or helping plan part of the formalization is extremely valuable - and in some cases can be more valuable than trying to do too much in one go.

Joseph Tooby-Smith (Jan 06 2026 at 14:13):

For writing the requirements part of the API in the GitHub issues, this note seems useful. Although maybe there are other standard conventions in the software engineering world that I am not privy too.

Rein Zustand (Jan 06 2026 at 15:02):

This API designing reminds me of this passage from Feynman's "The Character of Physical Law", chapter 2 "The Relation of Mathematics to Physics":

Some day, when physics is complete and we know all the laws, we may be able to start with some axioms, and no doubt somebody will figure out a particular way of doing it so that everything else can be deduced. But while we do not know all the laws, we can use some to make guesses at theorems which extend beyond the proof. In order to understand physics one must always have a neat balance, and contain in one's head all of the various propositions and their interrelationships, because the laws often extend beyond the range of their deductions. This will only have no importance when all the laws are known.

Another thing, a very strange one, that is interesting in the relation of mathematics to physics is the fact that by mathematical arguments you can show that it is possible to start from many apparently different starting points, and yet come to the same thing. That is pretty clear. If you have axioms, you can instead use some of the theorems; but actually the physical laws are so delicately constructed that the different but equivalent statements of them have such qualitatively different characters, and this makes them very interesting. To illustrate this I am going to state the law of gravitation in three different ways, all of which are exactly equivalent but sound completely different.

It seems to me that, while an API helps with deduplication of effort and standing on the shoulder of the API, it also restricts the way the physical models can be digitalized. Would that mean there could be variants of the PhysLean API where it is optimized for ease of swapping between finite dimension and continuous dimension? Or that where it is easy to swap between Poisson bracket and quantum commutator?

Joseph Tooby-Smith (Jan 06 2026 at 15:33):

It seems to me that, while an API helps with deduplication of effort and standing on the shoulder of the API, it also restricts the way the physical models can be digitalized. Would that mean there could be variants of the PhysLean API where it is optimized for ease of swapping between finite dimension and continuous dimension? Or that where it is easy to swap between Poisson bracket and quantum commutator?

I think the following should answer your question, but let me know if not:

The way I have presented it above, there appears to be a single API for a given topic e.g. "Finite dimensional quantum mechanics". But actually, I'm fully open to more than one API for the same topic, with connections built between them. This is especially true for something like axiomizations of special relativity, where there are many different axiomizations, and we would likely want an API for each of these.

Joseph Tooby-Smith (Jan 06 2026 at 15:37):

In other words, there may be situations where one needs different APIs for the same thing to optimise ease of proof, or to match with convention, or as you say swap between QM and classical mechanics. I'm totally happy with these being either put as requirements to existing APIs or as (if needed) as separate APIs entirely (with appropriate connections between APIs).

Rein Zustand (Jan 06 2026 at 18:24):

In software development in general, I find overlapping API's to be confusing to make sense of. Some of the API's may be modular, but they are not explicitly described as such in the code base. In mathematical models, one has to infer which theorems/API are modular of each other, based on their dependency graph alone.

But I wonder if it is just a human cognitive capability limitation. In the programming language analog, there is Perl "there is more than one way to do it" vs Python "there should be one-- and preferably only one --obvious way to do it", with the latter being a much more used programming language.

Joseph Tooby-Smith (Jan 07 2026 at 05:21):

I'm a bit confused by this now. Do you have an explicit example from physics where you:

Think there would need to be two separate APIs, i.e. there needs to be two separate key data structures, with one not simply defined as an equivalence to the other.
The use of two APIs would be confusing.

Joseph Tooby-Smith (Jan 07 2026 at 05:25):

(My confusion might be down to a difference in how we are using the word API here - so, to be explicit, by API I really mean 1) A key definition/data structure and 2) Lots of definitions/lemmas based of that data structure to make it easier to work with. So I see nothing wrong/confusing with having an API each for two different axiomizations of special relativity - and them eventually joined. Though I note that I think we should in these cases refrain from having two copies of the high-level special relativity results - the axiomizations should be joined before then).

Rein Zustand (Jan 07 2026 at 09:11):

An example would be Maxwell's equations:

The Heaviside form that separates time from space
The manifestly covariant form $\partial_\mu F^{\mu,\nu}$
The differential form d*F = J

You can ultimately bridge between the data structures, but they have different type signatures nonetheless. I can derive/connect to relativity, or show gauge invariance, with 2 of the methods. Which data structure should I use? If I use both 2 and 3, then I have to specify the resulting theorems/lemmas in 2 different data structures. This example happens to be a heavily studied foundational structure that a lot of people know, but the same scenario could happen in more obscure cases.

Additionally, how should the folders/files be structured? I consider the folders/files to be an API as well, but for namespace. For specific systems, why should the structure be "ClassicalMechanics/HarmonicOscillator", "QuantumMechanics/HarmonicOscillator" instead of a "HarmonicOscillator" namespace that contains "Quantum", "Classical", "Field"? Integrable systems vs non-integrable systems, fermions/bosons/anyons, and so on.

Joseph Tooby-Smith (Jan 07 2026 at 09:45):

The example of Maxwell's equations is one that we actually already have in PhysLean, and we have both implementations 1 & 2.

But in this case the API is not based around 1 or 2 it is based around the data structure which is the electromagnetic potential (as a function or as a distribution). This is the input data from this model. From this one can get $F^{\mu \nu}$ and the electric and magnetic field (which separates time from space), one can write down 1 and 2 and show that they are equivalent.

Eventually we will also want to use 3, this will involve defining a 2-form from our data structure, but this is totally possible and fits in with the API around the electromagnetic potential.

I'm guessing you are imagining we have a different data structure for the electric and magnetic field, $F^{\mu \nu}$ and the differential-form approach. This would be an approach, and if one followed it, then part of the API for each of these data structures would be conversions into the other data structures, allowing you to move between the APIs in different scenarios.

I should also say here, that I'm not intending PhysLean to be split strictly into APIs like the above, more they are a guide to things which can be built, what sorts of data structures we do actually want and what are the requirements we want those data structures to have.

Additionally, how should the folders/files be structured? I consider the folders/files to be an API as well, but for namespace. For specific systems, why should the structure be "ClassicalMechanics/HarmonicOscillator", "QuantumMechanics/HarmonicOscillator" instead of a "HarmonicOscillator" namespace that contains "Quantum", "Classical", "Field"? Integrable systems vs non-integrable systems, fermions/bosons/anyons, and so on.

This is a very good question. IMO (happy to be told this is wrong) there should be a data structure for the Harmonic Oscillator, and a namespace for this. Both the "Quantum", "Classical" and "Field" versions should use this same data structure. Maybe there is a nice parent directory we could use for things like HarmonicOscillator which are physical systems and can be treated both in terms of quantum mechanics or classical mechanics. Or we choose one convention to use for this and stick with it.

Joseph Tooby-Smith (Jan 07 2026 at 10:36):

I've added a GitHub issue template so that it is easier for others to add details of APIs to build. Such issues are automatically added to the graph I linked to above.

Rein Zustand (Jan 07 2026 at 20:44):

But in this case the API is not based around 1 or 2 it is based around the data structure which is the electromagnetic potential

I'm guessing you are imagining we have a different data structure for the electric and magnetic field, $F^{\mu \nu}$ and the differential-form approach.

It's clearer to me now, thank you. I saw that they are in ElectroMagnetism/Kinematics. Their location being inside the same folder makes sense. I'd appreciate if there is an API style guide description of deriving all the alternatives from one preferred data structure (in this case, $A_\mu$ ). I don't know if this was already obvious to people coming from Mathlib4, but certainly wasn't clear to me earlier.

I have no strong opinion on whether the top level name space should be the specific systems, being the "nouns" (simple harmonic oscillator, point charge, point mass, Lennard-Jones potential, van der Waals force, Ising model, ...), then the subsequent child namespaces being the adjectives (quantum/classical, relativistic/Galilean, few/many body, thermalized/non-thermalized). Still thinking about it.

Both the "Quantum", "Classical" and "Field" versions should use this same data structure.

If we want to use the same approach as the EM one above, then the simple harmonic oscillator needs to be described by its Lagrangian as the preferred data structure / technology. And then the Hooke's law and Hamiltonian version are secondary, which I found this to be the case in the existing ClassicalMechanics/HarmonicOscillator/Basic.lean. But the quantum version is instead first described with the Hamiltonian formulation. This makes sense historically/pedagogically, but is not consistent.

Joseph Tooby-Smith (Jan 08 2026 at 07:03):

If we want to use the same approach as the EM one above, then the simple harmonic oscillator needs to be described by its Lagrangian as the preferred data structure / technology

I understand what you mean here, and I think I agree. But we need to be a bit careful: Here the Lagrangian isn't a data structure. A data structure needs to be a type not a term of a type (which the Lagrangian is). To me in the case of the Harmonic oscillator, the data structure is the specification of the Harmonic oscillator system i.e. mass, and spring constant. I agree that working from a key technology, like the Lagrangian, is good though.

Joseph Tooby-Smith (Jan 08 2026 at 07:06):

I should add that hopefully a lot of these things will come out naturally as we build, refactor and refine.

Last updated: Feb 28 2026 at 14:05 UTC