This proposal adds function references that are typed and can be called directly. Unlike funcref
and the existing call_indirect
instruction, typed function references do not need to be stored in a table to be called (though they can). A typed function reference can be formed from any function index.
The proposal distinguished regular and nullable function reference. The former cannot be null, and a call through them does not require any runtime check.
The proposal has instructions for producing and consuming (calling) function references. It also includes instruction for testing and converting between regular and nullable references.
Typed references have no canonical default value, because they cannot be null. To enable storing them in locals, which so far depend on default values for initialization, the proposal also tracks the initialization status of locals during validation.
-
Enable efficient indirect function calls without runtime checks
-
Represent first-class function pointers without the need for tables
-
Easier and more efficient exchange of function references between modules and with host environment
-
Separate independently useful features from GC proposal
-
This proposal is based on the reference types proposal
-
Add a new form of typed reference type
ref $t
and a nullable variant(ref null $t)
, where$t
is a type index; can be used as both a value type or an element type for tables -
Add an instruction
ref.as_non_null
that converts a nullable reference to a non-nullable one or traps if null -
Add an instruction
br_on_null
that converts a nullable reference to a non-nullable one or branches if null, as well as the invertedbr_on_non_null
-
Add an instruction
call_ref
for calling a function through aref $t
-
Refine the instruction
ref.func $f
to return a typed function reference -
Track initialisation status of locals during validation and only allow
local.get
after alocal.set/tee
in the same or a surrounding block. -
Add an optional initializer expression to table definitions, for element types that do not have an implicit default value.
The function $hof
takes a function pointer as parameter, and is invoked by $caller
, passing $inc
as argument:
(type $i32-i32 (func (param i32) (result i32)))
(elem declare funcref (ref.func $inc))
(func $hof (param $f (ref $i32-i32)) (result i32)
(i32.add (i32.const 10) (call_ref $i32-i32 (i32.const 42) (local.get $f)))
)
(func $inc (param $i i32) (result i32)
(i32.add (local.get $i) (i32.const 1))
)
(func $caller (result i32)
(call $hof (ref.func $inc))
)
It is also possible to create a typed function table:
(table 1 (ref $i32-i32) (ref.func $inc))
Such a table can neither contain null
entries nor functions of another type. Because entries can't be null
, tables of concrete type are required to be declared with an explicit initialisation value. Any use of call_indirect
on this table does hence avoid all runtime checks beyond the basic bounds check. By using multiple tables, each one can be given a homogeneous type. The table can be initialized with an initializer or by growing it.
Typed function references are a subtype of funcref
, so they can also be used as untyped references. All previous uses of ref.func
remain valid:
(func $f (param i32))
(func $g)
(func $h (result i64))
(table 10 funcref)
(func $init
(table.set (i32.const 0) (ref.func $f))
(table.set (i32.const 1) (ref.func $g))
(table.set (i32.const 2) (ref.func $h))
)
Based on reference types proposal, which introduces types funcref
and externref
.
A heap type denotes a user-defined or pre-defined data type that is not a primitive scalar:
-
heaptype ::= <typeidx> | func | extern
$t ok
iff$t
is defined in the contextfunc ok
andextern ok
, always
-
In the binary encoding,
- the
<typeidx>
type is encoded as a (positive) signed LEB - the others use the same (negative) opcodes as the existing
funcref
,externref
, respectively
- the
A reference type denotes the type of a reference to some data. It may either include or exclude null:
-
(ref null? <heaptype>)
is a new form of reference typereftype ::= ref null? <heaptype>
ref null? <heaptype> ok
iff<heaptype> ok
-
Reference types now all take the form
ref null? <heaptype>
funcref
andexternref
are reinterpreted as abbreviations (in both binary and text format) for(ref null func)
and(ref null extern)
, respectively- Note: this refactoring allows using
func
andextern
as heap types, which is relevant for future extensions such as type imports
-
In the binary encoding,
- null and non-null variant are distinguished by two new (negative) type opcodes
- the opcodes for
funcref
andexternref
continue to exist as shorthands as described above
- Type definitions are validated in sequence and without allowing recursion
functype* ok
- iff
functype* = epsilon
- or
functype* = functype'* functype''
andfunctype'* ok
andfunctype'' ok
using only type indices up to|functype'*|-1
- iff
The following rules, now defined in terms of heap types, replace and extend the rules for basic reference types.
-
Reference types are covariant in the referenced heap type
(ref null <heaptype1>) <: (ref null <heaptype2>)
- iff
<heaptype1> <: <heaptype2>
- iff
(ref <heaptype1>) <: (ref <heaptype2>)
- iff
<heaptype1> <: <heaptype2>
- iff
-
Non-null types are subtypes of possibly-null types
(ref <heaptype1>) <: (ref null <heaptype2>)
- iff
<heaptype1> <: <heaptype2>
- iff
- Any function type is a subtype of
func
$t <: func
- iff
$t = <functype>
- iff
- Type indices are subtypes only if they define equivalent types
$t <: $t'
- iff
$t = <functype>
and$t' = <functype'>
and<functype> == <functype'>
- iff
- Note: Function types are invariant for now. This may be relaxed in future extensions.
-
Any numeric value type is defaultable (to 0)
-
A reference value type is defaultable (to
null
) iff it is of the formref null $t
-
Function-level locals must have a type that is defaultable.
-
Table definitions with a type that is not defaultable must have an initializer value. (Imports are not affected.)
- Locals are now recorded in the context with types of the following form:
localtype ::= set? <valtype>
- the flag
set
records that the local has been initialized - all locals with non-defaultable type start out unset
-
Instructions and instruction sequences are now typed with types of the following form:
instrtype ::= <functype> <localidx>*
- the local indices record which locals have been set by the instructions
- most typing rules except those for locals and for instruction sequences remain unchanged, since the index list is empty
-
There is a natural notion of subtyping on instruction types:
[t1*] -> [t2*] x1* <: [t3*] -> [t4*] x2*
- iff
t1* = t0* t1'*
- and
t2* = t0* t2'*
- and
[t1'*] -> [t2'*] <: [t3*] -> [t4*]*
- and
{x2*} subset {x1*}
- iff
-
Block types are instruction types with empty index set.
Note: Extending block types with index sets to allow initialization status to last beyond a block's end is a possible extension.
-
ref.func
creates a function reference from a function indexref.func $f : [] -> [(ref $t)]
- iff
$f : $t
- iff
- this is a constant instruction
-
call_ref <typeidx>
calls a function through a referencecall_ref $t : [t1* (ref null $t)] -> [t2*]
- iff
$t = [t1*] -> [t2*]
- iff
- traps on
null
-
return_call_ref
tail-calls a function through a referencereturn_call_ref $t : [t1* (ref null $t)] -> [t2*]
- iff
$t = [t1*] -> [t2*]
- and
t2* <: C.result
- iff
- traps on
null
-
ref.null <heaptype>
is generalised to take a<heaptype>
immediateref.null ht: [] -> [(ref null ht)]
- iff
ht ok
- iff
-
ref.as_non_null
converts a nullable reference to a non-null oneref.as_non_null : [(ref null ht)] -> [(ref ht)]
- iff
ht ok
- iff
- traps on
null
-
br_on_null <labelidx>
checks for null and branches if presentbr_on_null $l : [t* (ref null ht)] -> [t* (ref ht)]
- iff
$l : [t*]
- and
ht ok
- iff
- branches to
$l
onnull
, otherwise returns operand as non-null
-
br_on_non_null <labelidx>
checks for null and branches if not presentbr_on_non_null $l : [t* (ref null ht)] -> [t*]
- iff
$l : [t* (ref ht)]
- and
ht ok
- iff
- branches to
$l
if operand is notnull
, passing the operand itself under non-null type (along with potential additional operands)
-
Note:
ref.is_null
already exists via the reference types proposal
Typing of local instructions is updated to account for the initialization status of locals.
-
local.get <localidx>
local.get $x : [] -> [t]
- iff
$x : set t
- iff
-
local.set <localidx>
local.set $x : [t] -> [] $x
- iff
$x : set? t
- iff
-
local.tee <localidx>
local.tee $x : [t] -> [t] $x
- iff
$x : set? t
- iff
Note: These typing rules do not try to exclude indices for locals that have already been set, but an implementation could.
Typing of instruction sequences is updated to account for initialization of locals.
instr*
instr1 instr* : [t1*] -> [t3*] x1* x2*
- iff
instr1 : [t1*] -> [t2*] x1*
- and
instr* : [t2*] -> [t3*] x2*
under a context wherex1*
are changed toset
- iff
epsilon : [] -> [] epsilon
Note: These typing rules do not try to eliminate duplicate indices, but an implementation could.
A subsumption rule allows to go to a supertype for any instruction:
instr
instr : [t1*] -> [t2*] x*
- iff
instr : [t1'*] -> [t2'*] x'*
- and
[t1'*] -> [t2'*] x'* <: [t1*] -> [t2*] x*
- iff
Table definitions have an initializer value:
-
(table <tabletype> <constexpr>)
is an extended form of table definition(table <limits> <reftype> <constexpr>) ok
iff<limits> <reftype> ok
and<constexpr> : <reftype>
-
(table <tabletype>)
is shorthand for(table <tabletype> (ref.null <heaptype>))
, where<heaptype>
is the element heap type contained in<tabletype>
- note: the typing rule above implies that this only validates if the table's reference type is nullable
Opcode | Type | Parameters |
---|---|---|
-0x10 | funcref |
|
-0x11 | externref |
|
-0x1c | (ref ht) |
$t : heaptype |
-0x1d | (ref null ht) |
$t : heaptype |
The opcode for heap types is encoded as an s33
.
Opcode | Type | Parameters |
---|---|---|
i >= 0 | i | |
-0x10 | func |
|
-0x11 | extern |
Opcode | Instruction | Immediates |
---|---|---|
0x14 | call_ref $t |
$t : u32 |
0x15 | return_call_ref $t |
$t : u32 |
0xd4 | ref.as_non_null |
|
0xd5 | br_on_null $l |
$l : u32 |
0xd6 | br_on_non_null $l |
$l : u32 |
Entries to the table section are extended as follows:
Table Definition | Note |
---|---|
tabletype | null-initialized table (as before) |
0x40 0x00 tabletype constexpr | explicitly initialized table |
The encoding of a table type starts with the encoding of a reference type, which cannot be 0x40 (since this is a pseudo type code otherwise only used in block types). Consequently, both forms can be distinguished by the first byte. The second byte is reserved for possible future extensions.
The group decided to go with the "no-frills" approach for the JS API for the time being.
In the context of this proposal, that means that conversions (or type reflection) at concrete reference type throws a TypeError
exception.
In addition to the rules for basic reference types:
-
If the target type of a ToWebAssemblyValue is a concrete reference type, then throw
TypeError
. -
If the source type of a ToJSValue is a concrete reference type, then throw
TypeError
.
TypeError
is produced if theGlobal
constructor is invoked without a value argument but a type that is not defaultable.
-
The
Table
constructor gets an additional optional argumentinit
that is used to initialize the table slots. It defaults tonull
. ATypeError
is produced if the argument is omitted and the table's element type is not defaultable. -
The
Table
methodgrow
gets an additional optional argumentinit
that is used to initialize the new table slots. It defaults tonull
. ATypeError
is produced if the argument is omitted and the table's element type is not defaultable.
In the presence of the JS type reflection proposal:
-
Global.type
throwsTypeError
when encountering a concrete reference type. -
Table.type
throwsTypeError
when encountering a concrete reference type as element type. -
Function.type
throwsTypeError
when encountering a concrete reference type as parameter or result type. -
Module.imports
throwsTypeError
when encountering a concrete reference type in any of the global, table or function types of imports. -
Module.exports
throwsTypeError
when encountering a concrete reference type in any of the global, table or function types of exports.
Note: The GC proposal is expected to at least add anyref
as a recognised RefType
, but possibly throw on other abstract reference types.
Post-MVP, the type reflection abilities of the JS API could be refined based on the JS type reflection proposal.
-
A
RefType
can be described by an object of the form{ref: HeapType, ...}
type RefType = ... | {ref: HeapType, nullable: bool}
-
A
HeapType
can be described by a suitable union typetype HeapType = "func" | "extern" | DefType
Note: The GC proposal adds additional heap types.
In addition to the rules for basic reference types:
-
Any function that is an instance of
WebAssembly.Function
with type<functype>
is allowed asref <functype>
orref null <functype>
. -
Any non-null external reference is allowed as
ref extern
. -
The
null
value is allowed asref null ht
.
Note: The GC proposal is expected to allow additional conversions for anyref
.
In the future (especially with the GC proposal) it will be desirable to allow the usual subtyping rules on function types. This is relevant, for example, to encode method tables.
While doing so, the semantics of static and dynamic type checks need to remain coherent. That is, both static and dynamic type checking (as well as link-time checks) must use the same subtype relation. Otherwise, values of a subtype would not always be substitutable for values of a supertype, thereby breaking a fundemental property of subtyping and various transformations and optimisations relying on substitutability.
At the same time, care has to be taken to not damage the performance of existing instructions.
In particular, call_indirect
performs a runtime type check over (structural) function types that is expected to be no more expensive than a single pointer comparison.
That effectively rules out allowing any form of subtyping to kick in for this check.
This tension with call_indirect
can be resolved by distinguishing function types that have further subtypes and those that don't.
The latter are sometimes called exact types.
Exact types might come in handy in a few other circumstances, so we could distinguish the two forms in a generic manner by enriching heap types with a flag as follows:
heaptype ::= exact? <typeidx> | func | extern
Exact types are themselves subtypes of corresponding non-exact types, but the crucial difference is that they do not have further subtypes themselves. That is, the following subtype rules would be defined on heap types:
-
exact? $t <: $t'
- iff
$t = <functype>
and$t' = <functype'>
- and
<functype> <: <functype'>
- iff
-
exact $t <: exact $t'
- iff
$t = <functype>
and$t' = <functype'>
- and
<functype> == <functype'>
- iff
in combination with the canonical rules for function subtyping itself:
[t1*]->[t2*] <: [t1'*]->[t2'*]
- iff
(t1' <: t1)*
- and
(t2 <: t2')*
- iff
(and appropriate rules for other types that may be added to the type section in the future).
With this:
-
The type of each function definition is interpreted as its exact type.
-
The type annotation on
call_indirect
is interpreted as an exact type. (It would also be possible to extendcall_indirect
to allow either choice, leaving it to to the producer to make the trade-off between performance and flexibility.) -
For imports and references, every module can decide individually where it requires an exact type.
For any import or reference in a module's interface that flows into a function table on which it invokes call_indirect
, the module can thereby enforce exact types that are guaranteed to succeed the signature check.
Consider:
(module $A
(type $proc (func))
(func (export "f") (result funcref) ...)
(func (export "g") (result (ref $proc)) ...)
)
(module $B
(type $th (func (result funcref)))
(func $h (import "..." "h") (type exact $th))
(table $tab funcref (elem $h))
(func $call
...
(call_indirect $th (i32.const 0))
...
)
(func $update (param $f (ref exact $th))
(table.set (local.get $f) (i32.const 0))
)
)
In module $B
, the imported function $h
requires an exact type match.
That ensures that a client cannot supply a function of a subtype,
such as A.g
,
which would break the use of call_indirect
in $call
.
An attempt to do so will fail at link time.
Passing A.f
would be accepted, however.
Similarly, $update
can only be passed a reference with exact type,
again ensuring that $call
will not fail unexpectedly subsequently.
If a function import (or reference) never interferes with call_indirect
, however, then the import can have a more flexible type:
(module $C
(type $th (func (result funcref)))
(func $h (import "..." "h") (type $th))
(func $k (result (ref $th))
(call $h)
)
)
Here, the laxer import type is okay: for $h
, any function that returns a subtype of funcref
is fine, since all choices will satisfy the result type (ref $th)
of function $k
performing the call.
More generally, the following imports are well-typed:
(module
(type $proc (func))
(type $tf (func (result funcref)))
(type $tg (func (result (ref $proc))))
(func $h1 (import "A" "g") (type $tg))
(func $h2 (import "A" "g") (type exact $tg))
(func $h3 (import "A" "g") (type $tf))
;; (func $h4 (import "A" "g") (type exact $tf)) ;; fails to link!
)
Note that the use of exact types for imports or references flowing into a function table is not enforced by the type system. It is the producer's responsibility to protect its interface with suitably narrow types.
For exposition, we assumed above that exact types are a new construct, denoted by an explicit flag. However, such a design would still change the meaning of existing type expressions (by suddenly allowing subtyping) and would hence not be backwards compatible, for the reasons stated above.
To avoid breaking existing modules when operating in new environments that support subtyping, we must preserve the current subtype-less, i.e. exact, meaning of the pre-existing function types -- at least in the binary format. Subtypable function references/imports must be introduced as the new construct.
It's a slightly different question what to do for the text format, however.
To maintain full backwards compatibility, the flag would likewise need to be inverted:
instead of an optional exact
flag we'd have an optional sub
flag with the opposite meaning:
heaptype ::= sub? <typeidx> | func | extern
In the binary format it doesn't really matter which way both alternatives are encoded.
But for the text format, this inversion will be rather annoying:
the normal use case is to allow subtyping;
but to express that, every import or reference type will have to explicitly state sub
.
One possibility might be to keep the exact
syntax above for the text format,
and effectively change the interpretation of the existing syntax.
That means that existing text format programs using e.g. function imports would change their meaning, and running wasm/wat converters would produce different results.
In particular, rebuilding a binary from a pre-existing text file may produce a binary with weaker interface requirements.
Is the risk involved in that worth taking? How much of a problem do we think this would be?