Zulip Chat Archive

Stream: iris-lean

Topic: Data structure for sets


Zongyuan Liu (Aug 03 2025 at 09:43):

Hi,
I'm new to Lean. I'm learning it by implementing the select pattern of the proof mode and need to choose a data structure for sets with efficient insert and lookup.
Which one should I use? There seems to be many set implementations.
Thanks.

Shreyas Srinivas (Aug 03 2025 at 11:26):

docs#Std.HashSet has a decent api

Oliver Soeser (Aug 04 2025 at 06:31):

What are you using sets for in selection patterns?

Zongyuan Liu (Aug 04 2025 at 07:08):

Oliver Soeser said:

What are you using sets for in selection patterns?

I'm thinking of using a set to avoid duplications of selected hypotheses. The Rocq implementation does this by removing each selected hypothesis from the context when elaborating the pattern (https://gitlab.mpi-sws.org/iris/iris/-/blob/master/iris/proofmode/ltac_tactics.v#L155), which I think is less efficient and not straightforward to port because of iris-lean's distinct representation of the context. What do you think?

Zongyuan Liu (Aug 04 2025 at 07:12):

Shreyas Srinivas said:

docs#Std.HashSet has a decent api

Thanks. I actually want to have a set for Names, and I found NameSet: https://leanprover-community.github.io/mathlib4_docs/Lean/Data/NameMap.html#Lean.NameSet. I guess I should just use this ?

Oliver Soeser (Aug 04 2025 at 07:18):

Sounds good, I've got a very simplified implementation of specialization patterns that I'm using for implementing iapply right now that's based on lists. As long as the underlying set structure is abstracted away under helper functions that shouldn't be a problem though

Shreyas Srinivas (Aug 04 2025 at 07:25):

NameSet is essentially a red black tree of names.

Shreyas Srinivas (Aug 04 2025 at 07:26):

Unless we are expecting gigantic contexts, red black trees are perfectly fine, but asymptotically slower

Shreyas Srinivas (Aug 04 2025 at 07:27):

I don’t immediately see a useful ordering of names you can maintain

Zongyuan Liu (Aug 04 2025 at 08:17):

Shreyas Srinivas said:

I don’t immediately see a useful ordering of names you can maintain

Thanks for the info. I think the default NameSet is fine for the first implementation. I'll make a layer of abstraction, and we can optimise this later.

Markus de Medeiros (Aug 04 2025 at 12:51):

Interesting! @Zongyuan Liu FYI, we do not have a Mathlib dependency so if you want to use NameSet you would have to reimplement something similar to it.

Am I understanding correctly that the size of the set you're constructing is roughly the same as the size of the selection pattern typed by the programmer?

Shreyas Srinivas (Aug 04 2025 at 13:00):

NameSet is not in mathlib

Shreyas Srinivas (Aug 04 2025 at 13:01):

It's in Lean's core

Markus de Medeiros (Aug 04 2025 at 13:01):

Oh, lol you are right. Great!

Zongyuan Liu (Aug 04 2025 at 13:48):

Markus de Medeiros said:

Am I understanding correctly that the size of the set you're constructing is roughly the same as the size of the selection pattern typed by the programmer?

No, the upper bound is the number of the hypotheses in the context, because # and * can select multiple.

Zongyuan Liu (Aug 04 2025 at 13:51):

The first version that works with iclear is in https://github.com/leanprover-community/iris-lean/pull/82. Feedback is more than welcome!


Last updated: Dec 20 2025 at 21:32 UTC