Zulip Chat Archive

Stream: general

Topic: does mathlib have a DOI?


Kevin Buzzard (Feb 09 2022 at 19:18):

I'm doing some administrative task on Maria's grant, and I'm being asked whether there have been any contributions to research datasets. I cautiously hit "yes" and I was asked for the DOI, the title, and the year the dataset was published. Is mathlib a dataset? If so, does it have a DOI?

Bolton Bailey (Feb 09 2022 at 19:21):

This is the mathlib paper, and it lists a DOI, (edit: which links to this non-arxiv version). Is this what you're looking for?

Alex J. Best (Feb 09 2022 at 19:30):

We could add mathlib to Zenodo (https://zenodo.org/), a repository for research data and software that issues DOIs (e.g. https://github.com/alexjbest/cluster-pictures/#cluster-pictures)

Kevin Buzzard (Feb 09 2022 at 19:31):

I did type mathlib into the search box on the website which I was using to do this administrative task, and it didn't find it (and it certainly searched something).

Bolton Bailey (Feb 09 2022 at 19:33):

It would certainly be nice to be able to link to something which shows up-to-date stats on mathlib, so that when whoever is intended to look into these things looks into them, they can be impressed.

Kevin Buzzard (Feb 09 2022 at 19:59):

Bolton Bailey said:

This is the mathlib paper, and it lists a DOI, (edit: which links to this non-arxiv version). Is this what you're looking for?

I don't know. I never clicked this box before because none of my government-funded research ever had anything to do with databases. Oh -- I see that the results of my search explicitly list Zenodo as "publisher" of much of the information.

Johan Commelin (Feb 09 2022 at 20:05):

I don't think the mathlib paper should be added in this particular example.

Johan Commelin (Feb 09 2022 at 20:05):

It would be better if mathlib itself got some sort of DOI, but I don't know if that is possible.

Jason Rute (Feb 09 2022 at 20:14):

https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content

Jason Rute (Feb 09 2022 at 20:17):

It seems GitHub recommends Zenodo as the way to assign DOIs to repos.

Julian Berman (Feb 09 2022 at 20:33):

Zenodo requires version numbers though

Julian Berman (Feb 09 2022 at 20:34):

At least as far as I know

Julian Berman (Feb 09 2022 at 20:34):

So mathlib would need to version itself. If it did, though, getting Zenodo working is trivial in my experience too.

Eric Wieser (Feb 09 2022 at 20:41):

Yes, zenodo puts a lot of emphasis on version numbers; although it does assign a single "all versions" DOI too

Yury G. Kudryashov (Feb 09 2022 at 23:46):

We can have monthly version bumps

Yury G. Kudryashov (Feb 09 2022 at 23:47):

If they want version numbers

Eric Rodriguez (Feb 09 2022 at 23:53):

time it with the blogpost? that could be cool

Johan Commelin (Feb 10 2022 at 06:47):

Who can distinguish major.minor.bug from year.month.day anyway?

Mario Carneiro (Feb 10 2022 at 06:49):

lots of software projects use date versioning anyway

Mario Carneiro (Feb 10 2022 at 06:49):

I think 2022.01 is a fine version number

Julian Berman (Feb 10 2022 at 13:20):

(https://calver.org/)
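
For illustration, a date-based version string like the "2022.01" suggested above could be generated as in the minimal Python sketch below. The function name `calver_tag` and the optional day component are assumptions made for this example, not anything mathlib has adopted.

```python
from datetime import date


def calver_tag(day: date, with_day: bool = False) -> str:
    """Return a date-based version string such as '2022.01' or '2022.01.15'.

    Hypothetical helper: the YYYY.MM(.DD) layout is one possible CalVer
    scheme, chosen here only to match the example given in the thread.
    """
    tag = f"{day.year}.{day.month:02d}"  # zero-pad the month: 1 -> "01"
    if with_day:
        tag += f".{day.day:02d}"  # optionally append a zero-padded day
    return tag


if __name__ == "__main__":
    # A monthly bump made on any day in January 2022 gets the same tag.
    print(calver_tag(date(2022, 1, 15)))
```

A monthly scheme keeps tags stable within a month, while the optional day component would allow more frequent snapshots if that were ever wanted.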

Jannis Limperg (Feb 10 2022 at 13:43):

I'd suggest that Kevin just uploads a snapshot of mathlib to Zenodo as a tar/zip archive. This way, mathlib doesn't need to create arbitrary versions.

Matthew Ballard (Feb 10 2022 at 13:52):

Some CI options might be possible

Eric Wieser (Feb 10 2022 at 13:54):

I think all CI needs to do is create a tag, and zenodo can detect that automatically

Eric Wieser (Feb 10 2022 at 13:57):

Relatedly; do we have any recommendations on how to cite mathlib? If so, we should add a CITATION.md file to the github repository which will generate a "how to cite" link in the sidebar.

Rob Lewis (Feb 10 2022 at 14:34):

Eric Wieser said:

Relatedly; do we have any recommendations on how to cite mathlib? If so, we should add a CITATION.md file to the github repository which will generate a "how to cite" link in the sidebar.

Reading this as a different question than "does mathlib have a DOI?," the answer might be to cite the mathlib paper. At least this was one of our intended uses when we wrote it

Bolton Bailey (Feb 10 2022 at 18:37):

Eric Wieser said:

Relatedly; do we have any recommendations on how to cite mathlib? If so, we should add a CITATION.md file to the github repository which will generate a "how to cite" link in the sidebar.

We can alternatively make a CITATION.cff file, which github will parse to provide citations in APA and BibTeX formats.
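
As a rough sketch of what such a file might contain, the Python snippet below assembles a minimal CITATION.cff as text. The keys used (`cff-version`, `message`, `title`, `authors`, `repository-code`) are part of the Citation File Format; the helper name `citation_cff`, the title, and the repository URL passed in the usage line are assumptions for illustration, not an agreed citation for mathlib.

```python
def citation_cff(title: str, community: str, repo_url: str) -> str:
    """Build the text of a minimal CITATION.cff file.

    Hypothetical helper for illustration. Values are wrapped in double
    quotes without escaping, so they must not themselves contain quotes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
        # An entity author (a group rather than a person) uses the "name" key.
        f'  - name: "{community}"',
        f'repository-code: "{repo_url}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The title and URL below are assumed values, labeled as such.
    print(citation_cff(
        "mathlib",
        "The mathlib Community",
        "https://github.com/leanprover-community/mathlib",
    ))
```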

Kevin Buzzard (Feb 10 2022 at 18:38):

Right -- my original question was exactly "although we have solved the problem of citing the library in papers, how do I report to the UK government that money spent by them (on Maria) has resulted in an enhancement of a database, when the web page is clearly expecting that database to be listed in Zenodo". Of course my current solution is "just don't bother reporting it this year", but this time next year when I'm asked again she might well have contributed several thousand lines of code.

Eric Wieser (Feb 10 2022 at 18:55):

Bolton Bailey said:

We can alternatively make a CITATION.cff file, which github will parse to provide citations in APA and BibTeX formats.

I found this to be quite limiting, and it's easier just to link to a markdown file where you can write whatever you like. My guess is that everyone who cares about formalizing mathematics is probably not planning to use APA, so a BibTeX snippet would suffice.

Bolton Bailey (Feb 10 2022 at 18:59):

As a grad student, I have some questions for the professors in the room.
Is it really worth going through the overhead of making a zenodo listing and versioning the repo to make a DOI when we could just use the DOI of the mathlib paper instead?
Do the funding agencies really care about the distinction between the repository and the paper (which you might say acts as a kind of landing page for the repository)?
Will a funding agency just look at the number of LOC in the repo and say "looks pretty good", or do they want us to say how much we personally contribute so they know specifically what they're paying for?
As I write up my paper that uses mathlib to verify cryptography, should I cite the mathlib paper? The repository as a URL? Both?

It seems to me like it's often better to cite the paper, since in terms of someone reading my paper who wants to know what mathlib is, the mathlib paper is more descriptive than a link to the repo. If it's an issue of credit, the paper describes its author as "The mathlib Community", which I guess is accurate enough (although you see the maintainers when you visit the main page of the repo, and I think it's good that they can have some extra visibility for all the work they do).

Alex J. Best (Feb 10 2022 at 19:04):

Indeed, my suggestion of adding a Zenodo entry was specifically in response to Kevin's original question of how to cite mathlib as a dataset. Version-numbering decisions aside, there is very little overhead in adding something to Zenodo, in my experience.
As you say, if you want to cite mathlib in a paper, using the published description of the project seems way more useful.

Kevin Buzzard (Feb 11 2022 at 07:34):

And indeed the govt website I was using was specifically searching Zenodo when it was asking me for the names of the databases which their money had gone into expanding


Last updated: Dec 20 2023 at 11:08 UTC