Zulip Chat Archive

Stream: lean4

Topic: implemented_by type error


Parth Shastri (Oct 27 2022 at 23:20):

@[implemented_by Nat.add]
noncomputable def add (x : Nat) : Nat  Nat :=
  @Nat.rec _ x λ _ => .succ

This code results in the following error:

invalid 'implemented_by' argument 'Nat.add', 'Nat.add' has type
  Nat  Nat  Nat
but 'add' has type
  Nat  Nat  Nat

James Gallicchio (Oct 27 2022 at 23:22):

If you set pp.all it might show more info

Parth Shastri (Oct 27 2022 at 23:36):

pp.all makes no difference. However, running the following reveals that Nat.add's type is ([mdata borrowed:1 Nat]) -> ([mdata borrowed:1 Nat]) -> Nat. I think the solution would be to either relax the equality check or change the error to reflect the special type.

open Lean in
#eval show CoreM _ from do
  dbg_trace ( getConstInfo `Nat.add).type

Graham Leach-Krouse (Jun 28 2023 at 15:44):

I seem to be experiencing a similar error:

@[implemented_by String.toUTF8]
def charsToByteArray (s : String) : ByteArray:=
  String.foldl (λacc char => acc.push char.toNat.toUInt8) (ByteArray.mkEmpty s.length) s

produces

Diagnostics:
invalid 'implemented_by' argument 'String.toUTF8', 'String.toUTF8' has type
  String  ByteArray
but 'charsToByteArray' has type
  String  ByteArray

If I'm doing something silly here, a nudge in the right direction would be appreciated.

Graham Leach-Krouse (Jun 28 2023 at 15:52):

(silly things other than going from Unicode code points to UTF8 bytes incorrectly, I mean)

Henrik Böving (Jun 28 2023 at 16:05):

Do you have all types in scope and are sure byte array is not an auto implicit

Graham Leach-Krouse (Jun 28 2023 at 16:05):

Not sure - how could I check?

Graham Leach-Krouse (Jun 28 2023 at 17:21):

The issue seems to arise when the snippet is included alone in a file, with no imports - is it reproducible?

I don't think there are any auto bound implicit arguments involved here, and the types (ByteArray, String, UInt8, Char, and Nat) are all from Init.lean, I think.

Kevin Buzzard (Jun 28 2023 at 18:52):

set_option autoImplicit false to switch off autoimplicit footguns

Graham Leach-Krouse (Jun 28 2023 at 19:13):

No effect.

set_option autoImplicit false

@[implemented_by String.toUTF8]
def charsToByteArray (s : String) : ByteArray:=
  String.foldl (λacc char => acc.push char.toNat.toUInt8) (ByteArray.mkEmpty s.length) s

still throws

Diagnostics:
invalid 'implemented_by' argument 'String.toUTF8', 'String.toUTF8' has type
  String  ByteArray
but 'charsToByteArray' has type
  String  ByteArray

on the implemented_by line.

Scott Morrison (Jun 28 2023 at 23:41):

set_option pp.raw true reveals the problem:

invalid 'implemented_by' argument 'String.toUTF8', 'String.toUTF8' has type
  ([mdata borrowed:1 String]) -> ByteArray
but 'charsToByteArray' has type
  String -> ByteArray

Scott Morrison (Jun 28 2023 at 23:48):

There's Expr.consumeMData to remove the top-level meta data. Does core have a function that traverses an expression removing all MData?

Mario Carneiro (Jun 28 2023 at 23:52):

you could just put @& String on the def

Scott Morrison (Jun 28 2023 at 23:57):

Indeed.

Scott Morrison (Jun 28 2023 at 23:57):

It would be nice to give people some clue, however :-)

Scott Morrison (Jun 28 2023 at 23:58):

Isn't there some function which, given expressions e1 and e2, will try to find a pp setting that shows they are different?

Mac Malone (Jun 30 2023 at 04:52):

either relax the equality check or change the error to reflect the special type.

It would definitely have to be the later. Since implemented_by swaps out a definitions implementation for another, accepting a mismatch like this would lead to possible memory leaks.

Mario Carneiro (Jun 30 2023 at 05:32):

I'm not sure that is actually the case. I think it would just use the borrowing annotations of the replacement function, because the inc/dec insertion happens relatively late, after implemented_by replacements have occurred

Mac Malone (Jun 30 2023 at 05:36):

Oh, I was under the impression implemented_by was the last step and no further compilation occurred, but I have never checked, so I could easily be wrong.

Mario Carneiro (Jun 30 2023 at 06:35):

well you want to do inlining and stuff after that

Mario Carneiro (Jun 30 2023 at 06:35):

you can't safely inline a function which is implemented_by

Mario Carneiro (Jun 30 2023 at 06:35):

and quite a lot of compiler optimizations are triggered after inlining

Mario Carneiro (Jun 30 2023 at 06:36):

so yeah, it's pretty early

Mac Malone (Jun 30 2023 at 07:15):

Mario Carneiro said:

well you want to do inlining and stuff after that

But functions are independently compiled. Inlining is an optimization that occurs in the inliner function after the implemented_by function has long has its IR replaced with the generated IR of the implementing function. Since inc/decs are part of the IRs, the question is about the order of the optimizations when generating the implementing function (and replacing the implemented_by function), not things that happen afterwards.

Mac Malone (Jun 30 2023 at 07:19):

And since inc/decs are part of the IR and my understanding is that implemented_by just substitutes out the compiled body for the the compiled body of the implementing function, I believe memory leaks could result.

Mario Carneiro (Jun 30 2023 at 07:20):

hm, that's definitely not true for the new compiler, not sure about the old

Mac Malone (Jun 30 2023 at 07:20):

Well, I was definitely talking about the old compiler, as that is the one currently producing executed code.

Mario Carneiro (Jun 30 2023 at 07:22):

looks like implemented_by uses the cstage1 result, not stage2

Mac Malone (Jun 30 2023 at 07:24):

I will admit have absolute no clue what that means. My knowledge of the old compiler is limited to the output of trace.compiler.ir.result and the FFI/C interface (and thus the use of things like borrowing annotations).

Mario Carneiro (Jun 30 2023 at 07:24):

and indeed, stage1 is only halfway through the process, long before the IR stuff has happened and it is still an expr

Mario Carneiro (Jun 30 2023 at 07:24):

me neither, but set_option trace.compiler.stage1 true should show you what it looks like

Mac Malone (Jun 30 2023 at 07:26):

In that case, why do implemented_by functions have no stage2 trace?

Mac Malone (Jun 30 2023 at 07:27):

Or IR result?

Mario Carneiro (Jun 30 2023 at 07:28):

MWE?

Mario Carneiro (Jun 30 2023 at 07:29):

Oh I see

set_option trace.compiler.stage2 true
def bar (s : @& String) : String := s
@[implemented_by bar]
def foo (s : @& String) : String := s

Mario Carneiro (Jun 30 2023 at 07:30):

this seems reasonable, foo doesn't need to be compiled at all past stage1 since that is when it is replaced by bar

Mac Malone (Jun 30 2023 at 07:30):

Also, these lines in the C compiler code suggest compilation is skipped if a definition is implemented_by.

Mario Carneiro (Jun 30 2023 at 07:31):

also reasonable, if foo is treated as an opaque constant up until it is replaced

Mac Malone (Jun 30 2023 at 07:31):

Mario Carneiro said:

this seems reasonable, foo doesn't need to be compiled at all past stage1 since that is when it is replaced by bar

Oh, so you mean that it does just use the stage2 of bar. Thus memory leaks can result?

Mario Carneiro (Jun 30 2023 at 07:31):

no, at stage1 the constant for foo is replaced by the stage1 compiled version of bar

Mario Carneiro (Jun 30 2023 at 07:32):

the stage2 version of bar is not used, at least at this point

Mac Malone (Jun 30 2023 at 07:32):

So where does the stage 2 (result) of foo come from?

Mario Carneiro (Jun 30 2023 at 07:33):

I thought you just said the trace is skipped

Mario Carneiro (Jun 30 2023 at 07:34):

foo is not compiled

Mac Malone (Jun 30 2023 at 07:35):

Well, foo needs to get code from somewhere, so where does it get its code if it is not compiled (as the trace and C code suggest) and does not use bar's later result (as you said it only uses stage1)?

Mario Carneiro (Jun 30 2023 at 07:36):

Occurrences of foo in downstream code are replaced by bar around stage1 of the compiler pipeline

Mario Carneiro (Jun 30 2023 at 07:37):

foo itself has no code generated for it

Mac Malone (Jun 30 2023 at 07:38):

Were do you see the replacement of foo by bar in the code?

Mario Carneiro (Jun 30 2023 at 07:38):

https://github.com/leanprover/lean4/blob/ec42581d1f3a88c2f62ab93437bc3cbf843c1ee9/src/library/compiler/erase_irrelevant.cpp#L81-L84

Mario Carneiro (Jun 30 2023 at 07:38):

note the mk_cstage1 and get_implemented_by_attribute

Mario Carneiro (Jun 30 2023 at 07:41):

in the main compiler pipeline function you linked, you can see erase_irrelevant somewhere after stage1 defs are saved and much before inc/dec instructions are inserted

Mac Malone (Jun 30 2023 at 07:41):

That seems o only apply if the n in question is marked @[inline].

Mac Malone (Jun 30 2023 at 07:42):

However, you do appear to be correct none-the-less:

set_option trace.compiler.ir.result true
@[noinline] def foo1 := 1
@[implemented_by foo1, noinline] def foo2 := 2
def foo := foo2
/-
def foo : obj :=
  let x_1 : obj := foo1;
  ret x_1
-/

Mario Carneiro (Jun 30 2023 at 07:42):

in the else case it visits the name itself, which is to say it made the replacement but did not inline it

Mario Carneiro (Jun 30 2023 at 07:43):

the mk_cstage1 part is indicative of what kind of IR we are working with here

Mac Malone (Jun 30 2023 at 07:43):

Ah

Mac Malone (Jun 30 2023 at 07:44):

So this means the old compiler never actually uses the implemented_by definition at all.

Mario Carneiro (Jun 30 2023 at 07:44):

no, as we would expect

Mario Carneiro (Jun 30 2023 at 07:45):

it can even be marked noncomputable and it still works

Mac Malone (Jun 30 2023 at 07:46):

Sorry, I meant that the implemented_by definition never actually gets own symbol that is used (no that the body of the definition is compiled).

Mario Carneiro (Jun 30 2023 at 07:46):

this is all to say, as long as the replacement is lean-type-correct I think it is safe to do and ABI concerns are not an issue

Mario Carneiro (Jun 30 2023 at 07:47):

or at least stage1-type-correct

Mac Malone (Jun 30 2023 at 07:47):

I was thinking that the compiled symbol for foo2 would have foo1's code, but it seems like foo2 just does not exist.

Mario Carneiro (Jun 30 2023 at 07:47):

that's kind of a waste of symbols

Mac Malone (Jun 30 2023 at 07:47):

This makes me wonder what happens if implemented_by and export are used together.

Mario Carneiro (Jun 30 2023 at 07:49):

I would guess nothing, it's not compiled so the export would have no effect

Mario Carneiro (Jun 30 2023 at 07:49):

it might be nice to give an error about that though

Mac Malone (Jun 30 2023 at 07:52):

You are correct. It produces nothing.

Mac Malone (Jun 30 2023 at 07:53):

Mario Carneiro said:

that's kind of a waste of symbols

I mean, if, for example, some external ABI is expecting a specific symbol then duplication is useful.

Mac Malone (Jun 30 2023 at 07:54):

Hence why most low-level compilers and linkers do not skip producing symbols unless they are told to.

Mac Malone (Jun 30 2023 at 07:55):

But, in Lean's case, I imagine that is considered largely undefined implementation detail.


Last updated: Dec 20 2023 at 11:08 UTC