Zulip Chat Archive
Stream: lean4
Topic: lean vs awk performance
Damiano Testa (Feb 08 2023 at 08:02):
Dear All,
first of all, I hope that this is on-topic!
While trying to learn some programming in Lean4, I started to play around with Advent of Code 2019, Day 6. The task is simple: you are essentially given the list of edges (A, B)
in a rooted tree and you want to find the sum of the lengths of all the paths that start somewhere in the tree and end in the root of the tree.
I coded this in both Lean4 and using awk in a very naive way in both cases. The performance of awk is, of course, amazing. However, my Lean4 implementation is clearly terrible.
How would you implement this in a more performant way?
Thanks!
My Lean4 implementation (working gist)
def List.sum [Add α] [Inhabited α] : List α → α
| [] => default
| [a] => a
| a::as => a + as.sum
def data : String := "a very long list of edges looking like
A)B
C)D
E)F
..."
#eval
let dat : List (String × String) := (data.splitOn "\n").map fun x => let y := x.splitOn ")"; (y[1]!, y[0]!)
let f (S : String) : Nat := Id.run do
let mut (n, T) := (0, S)
while T != "COM" do
n := n + 1
T := (dat.lookup T).get!
return n
(dat.map fun x => f x.1).sum
My awk implementation (I stored the input in a file called i6
)
awk -F")" '{ par[$2]=$1 }
END { for (tgt in par)
{ while (tgt != "COM")
{ con++
tgt=par[tgt] } }
print con }' i6
Damiano Testa (Feb 08 2023 at 08:04):
(Ideally, I would prefer to avoid changing the structure of the program too much, since I am trying to learn how to optimize the various steps, however, I welcome any form of feedback! I do have a quicker implementation that just computes the distances from the leaves of the tree and that gets the answer more quickly.)
Henrik Böving (Feb 08 2023 at 09:04):
Without profiling your code I'd say an obvious thing to do would be to use a single fold operation for what you are trying to achieve. Since it would be tail recursive and you safe iterating the list once (afaik we dont have list fusion yet)
Also maybe you'll get better performance if you use a better data structure compared to a list? At least an array should make it better ad well I guess. And in general if you want to know what makes things slow you can always profile.
Also you want to run the compiled version f lean with a main function instead of eval for performance our interpreter is not optimized for perf
Damiano Testa (Feb 08 2023 at 11:58):
Henrik, thank you very much for your comments: they are already very helpful to guide me into what could be improved! When you say "profiling" is that something that I can get Lean to do for me, to some extent? I know about the set_option profiler true
command, but, when I am lucky, it gives me one or two times and not much more insight...
Henrik Böving (Feb 08 2023 at 12:03):
You can use perf + e.g. hotspot to profile a lean binary. There is an explanation for how to do that in the lean 4 guide.
Gabriel Ebner (Feb 08 2023 at 17:52):
My first suggestion would be to use a hashmap for dat.
Gabriel Ebner (Feb 08 2023 at 17:53):
List.sum can and should be made tail recursive, but I don't think that's the biggest issue here.
Gabriel Ebner (Feb 08 2023 at 17:57):
I know you don't want any suggestions on algorithmic improvements, but that's going to be the biggest win here. If you make f recursive and memoize the output, you should be able to make the whole thing run in linear time.
Damiano Testa (Feb 08 2023 at 21:23):
Thank you both for your comments!
I realize that this is a very specific question, but I am learning from your replies!
Damiano Testa (Feb 08 2023 at 21:23):
Gabriel, when you say that List.sum
should be made tail recursive is this simply changing the order of a + as.sum
to as.sum + a
?
Damiano Testa (Feb 08 2023 at 21:24):
Also, I tried
partial def f (dat : List (String × String)) : String → Nat → Nat
| "COM", n => n
| T, n => f dat (dat.lookup T).get! (n + 1)
#eval
let dat := (data.splitOn "\n").map fun x => let y := String.splitOn x ")"; (y[1]!, y[0]!)
(dat.map fun x => f dat (Prod.fst x) 0).sum
which I think makes f
recursive, although I could not notice an actual performance gain.
Damiano Testa (Feb 08 2023 at 21:26):
What mostly prompted my question is that my Lean implementation takes approx 20-30 seconds, while I get
time awk ...
real 0m0.030s
user 0m0.027s
sys 0m0.000s
Of course I realize that this is not a fair comparison. Anyway, as I said, this has been helpful for my understanding!
Henrik Böving (Feb 08 2023 at 21:50):
No you'll have to make an accumulator argument to make it tail recursive. That's why I said a fold should help since those are in fact tail recursive.
I also wouldnt expect any performance gain from the thing you are doing above since again you are using the interpreter which to my knowledge is not smart enough to exploit things like tail recursion. Again if you want it to be fast, compile it.
Gabriel Ebner (Feb 08 2023 at 22:02):
interpreter which to my knowledge is not smart enough to exploit things like tail recursion.
The interpreter does tail-recursion just fine. :smile:
Henrik Böving (Feb 08 2023 at 22:02):
Oh really? That's pretty cool then.
Gabriel Ebner (Feb 08 2023 at 22:04):
which I think makes f recursive, although I could not notice an actual performance gain.
The important part is to store the result as well, so that f
becomes if cache contains T then return cached value else return f dat[T] + 1
.
Gabriel Ebner (Feb 08 2023 at 22:04):
But your awk version doesn't do that either so let me reiterate my first suggestion: use a hashmap for dat.
Gabriel Ebner (Feb 08 2023 at 22:05):
Oh really? That's pretty cool then.
Yes, TCO is a critical optimization. Without it, you can easily get stack overflows.
Damiano Testa (Feb 08 2023 at 22:06):
I have been trying to use docs4#Std.HashMap, but I must be doing it wrong. Here is my attempt.
Damiano Testa (Feb 08 2023 at 22:07):
partial def h (dat : Std.HashMap String String) : String → Nat → Nat
| "COM", n => n
| T, n => h dat (dat.find! T) (n + 1)
#eval do
let dat := Std.HashMap.ofList ((((← IO.FS.readFile input).dropRight 1).splitOn "\n").map fun x => let y := String.splitOn x ")"; (y[1]!, y[0]!))
return dat.fold (fun _ s t => h dat s 0) 0
However, this seems to have entered an unending loop.
Damiano Testa (Feb 08 2023 at 22:08):
(To be honest, I do not really understand the fold
, but I tried it, since I could not find a map
, nor a for
for HashMap
... :man_shrugging:)
Damiano Testa (Feb 08 2023 at 22:12):
Also, I realize that this is something that you are probably very used to, but to me, the fact that using Lists, Arrays or HashMaps can have significant impact on performance is eye-opening. This is why I wanted to avoid changing the structure of the algorithm, to get a better understanding of what can be achieved by not figuring out a quicker way to get to the answer, but simply using better the available tools.
Damiano Testa (Feb 08 2023 at 22:14):
With a different algorithm, I can get to the answer in approx 7secs, so this is not the point (for me). Besides, the fastest answer so far is
#eval 253104
:upside_down:
Gabriel Ebner (Feb 08 2023 at 22:22):
the fact that using Lists, Arrays or HashMaps can have significant impact on performance is eye-opening.
Indeed, I didn't even realize how obvious this was to me. The average runtime of List.lookup
is O(n)
where n
is the number of entries in dat
, while HashMap.find!
is O(1)
. So you're essentially speeding up your program by the number of entries if you switch to a hashmap.
Gabriel Ebner (Feb 08 2023 at 22:23):
(The performance of List.lookup
is even worse since you need to check string equality for half of the entries (on average), while the hashmap just has to compare the hashes. I don't know how long T
is in practice, but that might make a significant difference as well.)
Damiano Testa (Feb 08 2023 at 22:29):
In this example, the strings T
are all 3 character long, so I do not expect this to be too significant. The longest distance from a leaf to the root is 320, I think and there are approx 140 leaves apparently randomly distant from the root.
Damiano Testa (Feb 08 2023 at 22:30):
Anyway, I am really glad that I asked the question and that it has not been derailed to suggestions to change the algorithm, but has been focused on "use this data structure instead"!
Damiano Testa (Feb 08 2023 at 22:35):
0 | COM
1 | 9JW
2 | 7X5
3 | 7P8
4 | 457
5 | VX9
6 | HT3
7 | VC8
8 | WL3
9 | 2RL
10 | X5D
11 | T9T
12 | XH2
13 | 6MC
14 | PYF
15 | 7TH
16 | T88
17 | 31H
18 | DBZ
19 | 9TV SVG
20 | XMM 4Q8
21 | 1X8 KST
22 | 2Z3 KYJ
23 | QV7 428
24 | S9K 57B
25 | 2S2 PKN
26 | LPB
27 | 3GV
28 | J2S
29 | QM6
30 | GXD
31 | SX1
32 | VLD
33 | DWP
34 | TBM RGD
35 | RQZ G87
36 | L6C N4Z
37 | HPZ 3H9 Q72 C1W
38 | 41Z 8XK 8LJ CPW
39 | 7TL YB3 66X CMD S2Z
40 | L65 NHG HLF F31 N5Y F2J 5MP J3X
41 | H6Y Z2R Y91 YLQ 5V2 QSH BNN HPK BCY LN2
42 | DZW GTJ GYJ V8J 6QF TKH B8G 95G 7V4
43 | TBV HZ1 D88 RFH VCF R8M LBM PXF B3V HVX
44 | TLY KYL TGY WL9 VQX JDL 39X WNN SPM J82
45 | R1J 1XD LD5 JNX 1R2 5TJ CYM R4D KGB XTM 6ND
46 | H9J 1J7 T4F BCD CH2 NX9 PZQ XX9
47 | 8SC 1WP 24B BDF TYD 3RP 21X 594
48 | YV5 Z9V NB8 9CF X81 MMK V93 NBJ
49 | VT7 6L6 DY5 HBJ 65D ZGB W31 L2R V9K TS3 PS1
50 | 7M2 HK4 BBY GBM B1X N6Z G1G 6XM Q2X Z1D J6C
51 | F95 7TT 4TN H8P 9GH Z22 FFY CBG T6G 27G SJR
52 | FFZ TSG SKL 22K G6S VLG 95K LGZ 44D NZK
53 | HJ7 FF8 BSR W1W 6G5 N1C 1T3 P4Q XV1
54 | 513 FZS 9X7 7JW 2QZ
55 | RFZ DVD Q21 GXW R7H SDS LR5
56 | YC9 5TT QYJ GMR T3V J2D 47Y
57 | NSQ Q38 Y27 7L1 2G3 6PH 7SZ
58 | 8B3 QZK 8RX ZS6 5R5 RR8 WKK D6Q 5PT
59 | T4R 82V ZC7 769 D8K LXW 8LQ CJZ
60 | S1T J7T ZDZ 86H 2CQ D19 HBZ 7TN
61 | 2QG 2FT PBP P93 KYY 1X5 QX4 DYP
62 | KMM TLG RZS 47P 2NX CX3 QKM JZQ P2Y
63 | X31 KZJ 9HN H96 P2H QHC 81D SR1 CVW
64 | BHR XCF 31S TZ2 NRL 4DZ BX3 DWR 41B
65 | VK9 LJG KPQ NY4 D4Z 3KW 6X3 JSD
66 | GNW B3G 7H4 QHN BXW VC5 2R9 DQL 4SQ
67 | XW1 Q5T Z3G CCY Q2W 3KM LZK C15 25M
68 | 9QL XT3 SW5 9JL 3R1 7G4 Y9Q YYD
69 | C96 DGC M18 S8H GV7 K94
70 | NFV 9VB 37V F5C C7P 87N
71 | 12Z 7Q6 Z4H PG1 JB4 B2L
72 | BNX 5ZM DB9 SVH FCB
73 | 9SH W82 ZK9 W35
74 | Q8T W4H HLX ZGD
75 | 8WB 9T7 48R S6X
76 | 4BG 84H 79S
77 | DHN VLW YB7
78 | QYN DLR P2F
79 | 84R YH9
80 | BHH CQR PKL
81 | CDR 6YP 7CF NLM
82 | MNK 6LF K77 JD4
83 | 471 S8Q C56 1TN
84 | S2G 7VW JWD 9V7 TZH S16
85 | 51H 5TG CYC V5N RHS XJP Q3S
86 | 7BP XSS ZP9 3J8 3JX 676 4D4
87 | ZMG ZFV M5Y BXP T62 1NS V1H
88 | 3MW P9T 9P3 X8K 83F LN5 1F2 MQY
89 | VS1 YPT LB7 X83 1WG 5KZ CT5 DLD
90 | Y6C FBX HZZ 72J 4JM X5Z GXK
91 | X4V 61P YVX 3JT 3FH QG5 MFZ 3QN
92 | CVP P63 JMT WNX 717 DJ3 FJV YKD
93 | QGR DPW 28Y PX5 58W SDF 154 6ZY
94 | XJK RN3 CDX GY9 GW4 CTC
95 | SXQ TT2 WR7 DXW PJX 7XD D9Q
96 | J75 MG2 9YJ ZVJ KWM YFJ
97 | PWF N91 Z48 QTQ 9DQ 7C1 HGG
98 | SN6 SPY XGT JGB YQ2
99 | WR6 HR7 GGZ NPX 1W8
100 | FG3 LP8 P4K 79V KNY
101 | FFL CGH BGZ 4MM SH5
102 | CHG YCG 21Z 5MJ 16R PCG
103 | WVC KK5 RYC NK4 M6W
104 | BMT HNV FYM L7W B8H
105 | S2W 2MC MYP DQX 5SH W2W
106 | BHQ 6QN 6M8 RT5 221 H1D YV2 RSC
107 | WFL LNQ M48 N6X 2GL GFR VDK G2R 3W1
108 | ZN2 DF7 GS2 D55 NL7 SZ4 KHL 1P5 W2K B6B
109 | 995 XBP FTN 4DV M1P MR6 BBQ 52H FRM
110 | 825 1G1 ZYG C8G GPP TY4 CMG BXY 54Q FHK 6DV
111 | 7GK VSV PBM FZD Q8R D39 6XB 2RF 57R D59
112 | CSL 9M1 1KG 5RQ JFR BFF KQH 4PV 9SF
113 | KJJ 5TP Q4H 26S JQL HW1 586 FN6 SXK CYR
114 | MCG QP7 QYP 5W2 G8T C88 BYL 1QK Z4D 8BN QGP
115 | TDJ 3VS RTV D66 J8V HFR GJN Z1J 41X FLP SPH 4B2
116 | C7B 1JP V4T JX8 5PQ CMZ GL2 YK5 BL6 8Y5
117 | BYZ JFH 847 DGD X85 JDJ V1X KKF WZW
118 | DYC TH6 N4J GY1 RR4 ZLR YKP GF1 MD5
119 | NY8 B2W 2JC TPY FKC NYP XL8
120 | CLS CXC 689 2DW SPL JDM Z5M
121 | M7M BSQ ZTS 95D
122 | 389 H1L XQS X5W
123 | Q7Z Z8R
124 | BKJ JCR
125 | LWH 14W
126 | SRN YD1
127 | GHM YXC
128 | PY1 WDF 112
129 | FRB 815 12R
130 | R3T 6BL BM6 32K
131 | 6M6 NSY GXQ V4J
132 | 784 86C Q6G XNJ
133 | LZ9 3F5 FFX B63
134 | K1G BB4 Q1Q SBL RG8 C4S
135 | DLL 6D6 8D3 BPL VP3 WY2
136 | DW3 KKQ SYC 51T 9NJ JRR DMK
137 | V2V W2D P1C CJ1 G89 2T4 8F8
138 | BL1 BT5 M34 62X G9Y 5KG 1TH NB7
139 | RGL 3P3 1ZD K5H BSB 1CV 7KP
140 | BYG KBV 7NP GFP BFB
141 | G3H 9PT VZP F3G XGR MSL 297
142 | 64B 7PC PV5 9GR 8HK WPB 4DB QM2
143 | KRM Q8N BPN R14 CWZ
144 | TNG DGS HYK SWH 3TR
145 | 6ZN BGG 56V WV8 693
146 | VPJ 5TX X24 SGK 23R
147 | 55L N1L J44 QJD XNB
148 | DCQ FCH 7DV KM2 GK6
149 | SNG MXQ VDT F4L RTN
150 | 4L5 T5B QGD T4H
151 | FRL HQS PZ9 GJB
152 | 153 C7H 7VT T7T XFK
153 | YDL X3Y VQT G48 3TB
154 | KX6 C67 XNP NV6
155 | 52Y XGM SNZ
156 | LT3 KD8 7J4
157 | BPY 7YX DQP S3H
158 | FQC 6DY 2VD BHS
159 | 7X1 2YM MJ4
160 | FPG NY9 LHT
161 | XXQ 2MW TJJ YN5 KV5
162 | 4JW NJD 9KD D5L L9K
163 | 29H 3LT 22H 1F1 3MQ Q7S
164 | G1L BDB QBS 3ND 9NN 2FF
165 | VRF 2WS VKT G1Q LGL MRF
166 | 9MX JNH 5V6 3BL RLJ C19
167 | 611 Q64 PND KMC T5D
168 | P4D CHR YMP SP9 WKN
169 | MFL 6SN 5LR
170 | T1M 6BC RY3
171 | HP6 LW7 6HV BZ6
172 | KGV HLC PG2 997 NBZ
173 | K8D R1Y L9Y G6K 93W
174 | Q2Z ZL7 HTH WQ4 JQN
175 | TLH NL4 R8D DSP KKL 1ZC
176 | XJ7 XQ9 MC7 XTF TMT
177 | RPY 3HN VPF 99C 4VN
178 | PD7 TKY 9YR JVW ZX4
179 | DVV NYB 1YV DFH 4RG CZ8
180 | 8RC TM2 QBB 2T5
181 | VNQ RTT ZR6 9J8
182 | RDQ HTJ JM9 2B4 39B
183 | HH4 SK2 NGV C3T 898 5YB
184 | NQQ HD6 71S NKR DDD 3J7
185 | 431 SQ4 C4K SRP PF1 NND
186 | MHM L1T 171 3FX
187 | TK6 D6Z VXY HSN
188 | QHD JZM
189 | R91 BV4
190 | 6Q9 FY7
191 | YJ5 KZ6 YBC
192 | FPM K8S L2G
193 | YKV BLT 3XB
194 | B6Q NB1 X46
195 | 8PP GL4 DGY
196 | YXN HZQ K4N 4GK
197 | JC5 57G 46D KLT
198 | CFQ DZ6 5YG TBQ BBP
199 | RKP 8HT H1W XS2 LM4 WXC
200 | XCD PQY HTF CWY MZZ
201 | JLR 11Y HJS KYV 24Y
202 | 3PM 7HW 6Z7 WMS
203 | 6GT 512 4T3
204 | 651 VD7 B92 9PN
205 | VG4 M8Z C8R NVM
206 | 7RS LYX FDT H3C TSL
207 | SB2 X59 ZCD MBK
208 | 8HD 77J 9WH LVB
209 | 3Q5 DHJ SVY 9YL CXN
210 | MWN ZMD SGX 3LV R8S
211 | C5H XYG N3D 2TV 8N7 WL2
212 | QRV PZ7 9LN 9B1 FQ4 YLM 91X
213 | C68 GK9 8MT BG9 CTY RLZ 4JY
214 | X5M 8N9 S66 2D4 DXP
215 | LSV VKS K4X L3F 79F
216 | QNJ LPH CB5 Z2L
217 | VD9 6LG QSP JBV
218 | LJX V2H 82N VGM BJK
219 | P1V H4K WS3 KF6 HZW
220 | GNX WDM 24J LL7
221 | YC3 8M1 4G9
222 | FZL JWR P7J
223 | VFV G4B YGJ 572
224 | 8XR QCS XVY D8N HZ9
225 | LWX VHQ 19W MCH 82S
226 | KCT FSW CZ2 45G
227 | RRC YQC XPD B6J
228 | DKM 459 2MN R4F J31
229 | 2T3 HSX XQQ KS2
230 | 5YR BR6 J6G V5M
231 | S47 CQB B1Y WXW
232 | CRB YT7 NZJ 6LD L48
233 | 2MH XVV QNF DXM JNZ
234 | C32 Q8J 17D VVK YK2
235 | 4NJ YCJ 2SH S8C S2T JTV
236 | Y51 DYJ SFW F1B FP1 H9N
237 | 3LW NL3 CNY 23G GXP
238 | 9WT KPG QFN 3MM BHY
239 | GHV QC5 GPB 4QN PW9
240 | D2G LPK BCB MH8 5R4 TGW
241 | QWF BD5 4GQ 347 FS4 9M4
242 | FNL L7K RJZ RXJ 5C7 9K2
243 | DSW LPP 37Y 3X3
244 | PB1 N3V 5SS 79D QYB
245 | SC5 XJ1 CC7 M1S TG7
246 | QXQ 7WG 38W TT3 M1N
247 | YVZ K98 GLX SGG VDP
248 | H1M FQG 13Y F3D YG4
249 | 1S2 K9D JPY SWG MYL
250 | 1QD 2TD 5YC JXV QXK BSZ
251 | TKQ 2CS VJ1 FTX 2YW ZTR
252 | WS6 MXB 3X1 91Q ZPV
253 | R78 V9W NSJ QC2 DN6
254 | PRG DHQ ZWJ D3Y YP8 37R
255 | Q1Z ZXG XLJ P5J BXS CGG
256 | HFD SMR T31 TRZ ZPX 6ZD 25Y
257 | 25X JR9 82X 42V 16Z 3FD
258 | 15G H2M C3N 8XD MXN 5X1
259 | G6B L2Y F86 8NS 95C 4Y8 Z8Z RSB
260 | 36Y KXQ RDR LX1 KBP 19J X2N RF6 CQ9
261 | LQ9 NGG 8DG ZQK 47G BF4 R4P 4CH 27B
262 | J4J H1J GNH VWB HSQ BNY 4P2 V95 G2D
263 | 4NH J7W 7YL 9F6 BJS 412 B12 H6R
264 | 2GN V97 7LZ 8B1 PP5 ZX5 LGJ C65 YXR 11K MDP
265 | CDM 3K4 7DF BCQ 4C9 SPQ 9FN 9FS BTL
266 | MZ9 P3M 1N7 ZH5 R4Z 55P HJZ HY3 8PC
267 | 4FK FZF 878 H7Y M3D P8G 9NB C26 2RD Y3R
268 | SCH 6TB HD2 KRF 7XB S4P 91H HR2 ZZ5
269 | CBH 55K HTS Q8K KWZ S2R L5B C6F W7X 299
270 | GZK QS2 YR9 MY8 LWV ZPD BW7 KDQ 6QV KTW
271 | TBG J62 N81 BPT JKX 7W7 QCN ZTH NMC CXV
272 | LG3 PK7 NKQ 14C VH6 88Q YL8 PQ1 C89 VND
273 | C2D 3DT 76W CPS PQD BFK PL1 396
274 | SZR Y93 DVM FLH JXH YQQ
275 | RRZ GH3 1DC BJV HSJ FLB SW6
276 | 54Y KR8 29T NZS F2L
277 | TZZ PDD 7MR 12G Y4V
278 | C6X DVX 6C3
279 | NMX 3ZC YXV
280 | J5X ZXW 86J
281 | C9N HKT Z4W
282 | 2Z7 VF6 XN7
283 | 2X8 PHB
284 | WFH FHY
285 | P7V YKX
286 | 52T TYX
287 | 14K 75R
288 | 5HD MV6
289 | PGC Y5R
290 | ZXS 5MC
291 | M2Z ZZ4
292 | 38C DYB
293 | N84 4W6
294 | CV9 79N
295 | NRZ RTQ
296 | 4RH W13
297 | 2T6 483 MLV
298 | R9B 85S KSX
299 | RSP 5D9 L1Y GW2
300 | 8MK KWF JP2 VCM D4D NVR
301 | GBL J4M VBW CW3 1KX J4G QFX
302 | H2Q ZLB 661 8PB WBF 7GP NHT 2WM
303 | JHP 1S4 XB6 K87 Y1T SAN 82H 3HJ
304 | FRT 5LC 8VQ 5H1 T1J RY7
305 | 645 VX1 X7K MKS VCQ C28
306 | 7ZY DJH W96 KN8 1C7 N1D
307 | CV2 ZFL WTN WPW R66
308 | DLZ QYX Z6F
309 | TFB VT2 272
310 | YOU 711 8XM
311 | Q5L JBB 8S9
312 | 843 BVZ VZD
313 | HCL W4D
314 | ZHY
315 | 5CZ
316 | 9RS
this is how the tree more or less looks like: the first column tells you the distance from the root, the rest of each row are all the vertices at that distance. There is a little part of information missing, which is what branches, but the "image" above is enough to answer the question about distances.
Damiano Testa (Feb 08 2023 at 22:35):
(again, the picture is a courtesy of awk.)
Patrick Massot (Feb 08 2023 at 22:35):
Damiano Testa said:
Also, I realize that this is something that you are probably very used to, but to me, the fact that using Lists, Arrays or HashMaps can have significant impact on performance is eye-opening.
We should really have a couple of mandatory computer science courses for all mathematicians. Quick quizz is: what did you think was the difference between List and Array?
Damiano Testa (Feb 08 2023 at 22:36):
Patrick: clearly the fact that one requires #
to type, so it is slower... :face_palm:
Damiano Testa (Feb 08 2023 at 22:37):
Also, I still do not know what the difference is, except that I now think that for performance I should use Array. I know that I cannot pattern-match on Array, though, so I avoid it...
Damiano Testa (Feb 08 2023 at 22:39):
On a more serious note, I strongly agree with your assertion, Patrick. I have been trying to learn some computer science, but, as is obvious from this discussion, I have not learned much yet...
Patrick Massot (Feb 08 2023 at 22:42):
The issue is there way too many things that clearly everybody should learn. For instance I know there is a minimal amount of probability theory that every mathematician should know, but I'm very far below that minimum unfortunately.
Damiano Testa (Feb 08 2023 at 22:44):
And I also find that one of the hardest things is to realize that some knowledge is missing. In my code, I have been going back and forth between Lists and Arrays all the time and I now suspect that this makes my code very inefficient...
Damiano Testa (Feb 08 2023 at 22:45):
To me, the biggest difference between List and Array is that the defs/lemmas have different namespaces.
Henrik Böving (Feb 08 2023 at 22:45):
In general if you are doing lots of accesses to an Array
thing you will see much much better results than when compared to a linked list. The reason for this is that a List
has some start node and then that node points to some other node somewhere else in memory and that node point somewhere else etc. until you reach nil
. So if you access a List
by index you have two issues:
- You have to traverse the list to get your element
- In computer science we have the principle of locality. What this means is "if you access things that are close to each other in memory these accesses will be faster" due to a thing called the cache.
An array on the other hand gets rid off both of these disadvantages. it does this the following way: Unlike a linked list an array just starts somewhere in memory and then the elements are right next to it. This means that:
- Access is constant in time because you can just add n * the size of your elements to get to the nth element in memory
- If you have accesses that are close to each other or frequently access the same element you can expect this to be much much faster due to the cache.
There are a couple of other details that you have to keep in mind but this is the basics
In general what you can do is just: Try to remember the complexities (in computer science they are denoted with the big O notation) of common operations on your datastructures such as:
- insert
- lookup (by a key or an index)
- removal
And when you have a concrete problem try to think of what operations you are doing most often and then which datastructure can do this best for you. So for example in this case we know: Lookup by key in List O(n), lookup by key in Array also O(n) (it also has to traverse but: will still be faster due to locality principle) but hashmap lookup by key is O(1) so hashmap might be a pretty good idea here.
Henrik Böving (Feb 08 2023 at 22:46):
Note that, while i am only presenting disadvantages of lists above there are a couple of advantages to them as well, for example inserting in the front of a list is O(1) while it is O(n) for an Array.
Gabriel Ebner (Feb 08 2023 at 22:48):
And then there's Leany stuff like: inserting at the end of an array is either O(n) or O(1) depending on what else you do with the array.
Damiano Testa (Feb 08 2023 at 22:49):
Henrik, thank you so much! To me, the beginning and the end of an Array/List were simply a matter of visual representation. I had no idea that there is a higher cost in accessing the last element, as opposed to the first or the middle...
Damiano Testa (Feb 08 2023 at 22:51):
Of course, now that I know this, it is completely clear that you should design algorithms to work with the strengths of each different data structure that you use.
Damiano Testa (Feb 08 2023 at 22:52):
I simply lacked the concept that data structure had a cost involved for using them and that these costs were structure-specific. To me, they were simply names for the concept behind them, so Lists and Array were completely interchangeable. HashMaps are still very hazy.
Henrik Böving (Feb 08 2023 at 22:59):
What exactly a HashMap does is a little harder to explain but as a basic understanding it basically suffices to know: It has both insertion and lookup of key value pairs in O(1) so if that is your main operations they are one of the best choices you can have. However you give up properties for this, for example things are not at all ordered in the way they are inserted, they are basically randomly ordered unlike say in an Array where you can determine the ordering precisely yourself.
If you still want fast lookup but you want your elements to be stored in a certain order you might want to consider a so called RbMap
(Rb is for red black because the underlying data structure is called "red black tree" due to the way it is implemented) instead which has pretty good times still, O(log n) for insert and lookup (which, if you graph it comes pretty close to "constant" as long as your maps dont grow crazy big) and has the nice advantage that your elements are stored in a certain order that you can define by providing an ordering function that compares two elements. (There is also one other reason to use an RbMap or things that are based on an RbMap instead of a HashMap but that's a functional specific thing and a little more complex to explain)
Henrik Böving (Feb 08 2023 at 23:02):
Maybe someone should just write up a guide on which datastructures to use (and especially for things like Arrays with their special quirks in what way) that are available in Lean at some point :D
James Gallicchio (Feb 08 2023 at 23:02):
(I would argue that most people shouldn't need to understand underlying datastructures to get ~good performance, but I know that is a controversial opinion, and there's still a lot of work to get Lean to that point :)
Gabriel Ebner (Feb 08 2023 at 23:03):
for example things are not at all ordered in the way they are inserted, they are basically randomly ordered
While this is is true for the implementation of hashmaps in Lean, this is absolutely not a general property. For example in Python, insertion order is preserved (intentionally).
Damiano Testa (Feb 08 2023 at 23:03):
James, to some extent, what you say is what prompted my question: I wrote the "same" program in awk and in lean and got a performance difference of several orders of magnitude. This is why I asked the question.
Henrik Böving (Feb 08 2023 at 23:04):
James Gallicchio said:
(I would argue that most people shouldn't need to understand underlying datastructures to get ~good performance, but I know that is a controversial opinion, and there's still a lot of work to get Lean to that point :)
Oh I agree fully, I believe that for the majority of users of data structures it is perfectly acceptable that they just know complexities + properties like ordering, locality principle, sharing etc. You do definitely not need an inside view to use them well.
James Gallicchio (Feb 08 2023 at 23:05):
some day... some day i will finish my radix balanced tree implementation..............
Damiano Testa (Feb 08 2023 at 23:06):
Ok, thank you all for your help! My initial take-away is to pay more attention to data structures and slowly develop an intuition for what the strengths and weaknesses of them are. So far, it seems that I should try to use Arrays whenever I was using Lists.
Henrik Böving (Feb 08 2023 at 23:07):
The arrays instead of lists take is one that will get you quite far but be prepared that eventually you'll meet cases where you do end up wanting a list instead. :sweat_smile:
Damiano Testa (Feb 08 2023 at 23:10):
Thanks for the heads up! I certainly meant my assertion as a rough principle, not as a rule written in stone! :smile:
Damiano Testa (Feb 08 2023 at 23:12):
Also, I am now satisfied with the state of this thread: my next homework is to learn about Lists, Arrays and HashMaps and try to understand what each of them is good for.
Slavomir Kaslev (Feb 09 2023 at 08:49):
James Gallicchio said:
(I would argue that most people shouldn't need to understand underlying datastructures to get ~good performance, but I know that is a controversial opinion, and there's still a lot of work to get Lean to that point :)
Theoretically there could be a compiler transform that translates inefficient data-structure like List code to equivalent code using machine friendly data structure like Array. Doesn't something similar happen with nat and gnu mp?
All (finitary) data structures are isomorphic to tuples of size, shape identifier and an array (finitary version of the polynomial functor):
def ogf (c : ℕ → ℕ) (X) :=
Σ n:ℕ, fin (c n) × (fin n → X)
I wonder if those isomorphisms can be used to transform code using say binary trees of nat
s to code using ogf catalan nat
?
Mario Carneiro (Feb 09 2023 at 09:01):
Doesn't something similar happen with nat and gnu mp?
Not really, Nat
is specified to have this behavior, it is not a compiler transform
Mario Carneiro (Feb 09 2023 at 09:01):
Similarly Array
is specified to have the performance characteristics of a dynamic array, even though the model implementation is just a wrapper around a List
Mario Carneiro (Feb 09 2023 at 09:02):
The reason both types exist is because it's not like the performance characteristics of one strictly dominates the other. List.cons
is a lot slower on arrays than lists
Mario Carneiro (Feb 09 2023 at 09:03):
and with arrays you have to be very careful to preserve linearity (which the current type system does not help much with) while lists are very good as a persistent data structure
Mario Carneiro (Feb 09 2023 at 09:06):
your ogf
type doesn't cover
inductive Tree
| leaf : Nat -> Tree
| node : Tree -> Tree -> Tree
which I think is pretty finitary
Mario Carneiro (Feb 09 2023 at 09:07):
oh, you want c
to be catalan
here? That's a pretty crazy rewrite
Mario Carneiro (Feb 09 2023 at 09:09):
And that would be a terrible idea if you want predictable performance. Who knows how expensive it is to evaluate those generating functions
Slavomir Kaslev (Feb 09 2023 at 09:21):
Ideally those generating functions and isomorphisms should be evaluated compile time and never used during runtime in an efficient implementation. I agree that would be a crazy rewrite.
The obvious implementation that applies the isomorphism before and the inverse after each function, is also inefficient.
Mario Carneiro (Feb 09 2023 at 09:23):
what would be the point of the isomorphism then?
Slavomir Kaslev (Feb 09 2023 at 09:43):
To pin with which type, one wants the resulting code to work with.
I guess the original question I should have asked is given an isomorphism between A and B, is there a way to translate code using A to code using B at compile time without calling the isomorphism runtime.
Siddhartha Gadgil (Feb 09 2023 at 10:52):
If it is a constant, I guess one could write an elaborator using mkAppM
and reduce
so that you apply a function and simplify the result. Though what you may want for an isomorphism is to conjugate using two such functions, to transform, work with B, and transform back.
Siddhartha Gadgil (Feb 09 2023 at 10:57):
Concretely, here is code making a list into an array.
import Lean
open Lean Meta Elab Term
elab "listAsArray" t:term : term => do
let e ← elabTerm t none
let e ← mkAppM ``List.toArray #[e]
reduce e
#eval listAsArray [1,2,3,4,5] -- #[1, 2, 3, 4, 5]
Siddhartha Gadgil (Feb 09 2023 at 11:02):
So just for fun, here is the rest of the code to make a List an Array, calculate with the Array and return. I don't know if there is a case where it helps though.
elab "arrayAsList" t:term : term => do
let e ← elabTerm t none
let e ← mkAppM ``Array.toList #[e]
reduce e
macro t:term"mapViaArray" f:term : term => do
`(arrayAsList ((listAsArray $t).map $f))
#eval [1,2,3,4,5] mapViaArray (fun x => x + 1) -- [2, 3, 4, 5, 6]
Siddhartha Gadgil (Feb 09 2023 at 11:33):
One more example, of a kind that may actually be useful @Slavomir Kaslev
def mapViaArrayEg (n: Nat)(f : Nat → Nat) : List Nat :=
[n, n + 1, 2 * n] mapViaArray f
#eval mapViaArrayEg 3 (fun x => x + 1) -- [4, 5, 7]
Slavomir Kaslev (Feb 09 2023 at 14:08):
Siddhartha Gadgil said:
One more example, of a kind that may actually be useful Slavomir Kaslev
def mapViaArrayEg (n: Nat)(f : Nat → Nat) : List Nat := [n, n + 1, 2 * n] mapViaArray f #eval mapViaArrayEg 3 (fun x => x + 1) -- [4, 5, 7]
That would be the inefficient "obvious" implementation that applies and an isomorphism to and back. An efficient implementation would rewrite the list code to array code compile time without runtime conversions. That's probably hard in general for arbitrary isomorphism though.
James Gallicchio (Feb 09 2023 at 15:26):
+1 to what Mario said -- I meant that we can add more advanced data structures that give "all around good performance" without users needing to really understand how/why. There's loads of very cool functional data structures that we can use to get predictably good performance for a majority of use cases :)
Yuri de Wit (Feb 09 2023 at 20:00):
James Gallicchio said:
some day... some day i will finish my radix balanced tree implementation..............
@James Gallicchio , is this the same or similar to C-tree's (and video)?
This seemed quite interesting for Lean when I came across it in Jan. This algo seems to also manually perform drop-reuse, which, I guess, could be handled directly by Perceus (?)
James Gallicchio (Feb 09 2023 at 20:08):
That actually does sound pretty similar, yeah -- this is what I was referring to, which seems to have independently gotten to a similar design
James Gallicchio (Feb 09 2023 at 20:11):
lean's in place updates would definitely help with performance. that's part of why I'm interested in implementing them, to see whether I can get similar performance to ephemeral data structures but without sacrificing performance in the persistent case :D
Last updated: Dec 20 2023 at 11:08 UTC