Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closures, why do we need func and funcC variants? #341

Open
adam-antonik opened this issue Apr 14, 2020 · 3 comments
Open

Closures, why do we need func and funcC variants? #341

adam-antonik opened this issue Apr 14, 2020 · 3 comments
Assignees

Comments

@adam-antonik
Copy link
Contributor

I'm afraid find the way closures get created and treated by the type-system an annoying aspect of hobbes. From an immediate practical point of view, why are the std lib functions not written uniformly using apply? For instance if I rewrite filter as

wherePs :: (f, [a], long, bitvec) -> bitvec
wherePs p xs i bv =
  if (i == length(xs)) then
    bv  
  else
    (let _ = bvSet(bv, i, apply(p, xs[i])) in wherePs(p, xs, i+1, bv))
 
whereP :: (f, [a]) -> bitvec
whereP p xs = wherePs(p, xs, 0L, newBitvec(length(xs)))
 
filter :: (f, [a]) -> [a] 
filter p xs = selectB(xs, whereP(p, xs))

Then filter seems to work with both functions and closures. Is there any downside to doing this?

@kthielen kthielen self-assigned this Apr 14, 2020
@kthielen
Copy link
Contributor

This is a good issue. I hope you don't mind if I lay out my background thoughts.

Should we distinguish function types?

This awkwardness comes from the fact that closures and "contextless" functions (i.e.: C-style function pointer types) are distinguished. The reason that they're distinguished is that it's much cheaper to call a contextless function than a closure -- for a contextless function, you just jump to it, whereas for a closure you have to dereference the closure record to load the address of its supporting contextless function and pass it the environment captured in the closure.

Most ML variants just lift all function types into closures (e.g. if you don't have context then just capture the unit environment). But then all function calls are as expensive as closure calls, and we've lost the ability to distinguish cases.

If we accept that function types can be distinguished, can it be made less awkward to ignore their differences in cases where we don't care?

At a basic level it should be, because our interpretation of type classes and overloading has no runtime cost (we just rewrite terms) then we should be able to leave the function kind abstract and let usage select an implementation for each kind it needs. It's a little awkward (you lose the immediate obvious structure of a -> b) but this can be done now as you do above.

Maybe syntax can make this less awkward?

It should be pretty easy to say that if you parse a type signature like (int -> bool) -> int then you can translate this behind the scenes to Function f int bool => f -> int so that you get this overloading for free. And likewise any internal call f(x) could be rewritten to apply(f,x) with no harm done.

OTOH, maybe you couldn't do that to the right of the -> because e.g. int -> (int -> bool) says that you will just produce a contextless function, which (if you do) won't be general enough to satisfy Function f int bool => int -> f. So maybe we'd have to be a little careful about this in the front end.

Is it actually useful to overload the definition of a function in this way?

Beyond distinguishing contextless functions and closures, there are reasonable interpretations of many other types as functions. An array can be considered as a (partial) function from index to value, a btree as a (partial) function from key to value, etc. And there are useful parametric definitions like that a pair of functions can be considered a function on pairs, or a sum of functions as a function on sums. APL programmers love this kind of overloading, and we can outdo them this way. :)

Where else could this kind of thing be going on?

This is actually similar to what we've seen with other types.

For example, we have a baked-in variable-length array type [a] which is laid out in memory like {n:long, values:a[n]} (ie: a length followed by that many a values). When you type [1,2,3,4,5] you get a value with this structure (which can be fed into functions like map, filter, etc).

But this isn't the only possible memory layout for an array. In our structured data files, we wind up storing sequences as "ropes" or linked lists of arrays. Although they don't have exactly the memory layout of the default array, they do have a length and can be indexed.

So we wound up making an Array type class to abstract away different representations of arrays for functions that don't care about their representations (which is most functions).

Here again we could have a front end rewriting of any function like [int] -> int to Array a int => a -> int and again we'd have to make sure it was blocked to the right of -> because int -> [int] is probably what you mean by \x.[x,x,x] and not Array a int => int -> a.

Likewise for record types there's a similar overloading that we use (so that non-records can have synthetic properties), and variants. So that's established overloading for just about every kind of type.

I think that it might be worth doing this kind of implicit abstraction in the front end, since it definitely saves you some awkwardness. At some level you'd probably want to disable it, at least e.g. where you're defining new instances of the Function type class. I'm not sure what the best way would be to manage that, maybe some kind of "language level selection" syntax that defaults you to this mode of rewriting.

@adam-antonik
Copy link
Contributor Author

Thanks. I wanted to make this exact sort of suggestion, but this clearly involves some significant changes.
However, I'm more interested here I suppose in current form of the standard library. There are a number of functions that come in separate '->' and closure forms. Does writing a single version using apply from the Function type-class introduce overhead? If so, why does "each" in farray use Function (and why does "eachC" exist when "each" can take a closure?).
Tangentially, why have lfilter and filter as separate names, and not instances of a Filter type-class?

@kthielen
Copy link
Contributor

No, there's no overhead to using one function with a Function constraint rather than two explicitly specialized functions. The way that overload resolution works, the function with a Function constraint will just get specialized into two (so the output code should be the same, just saves you from having to write the boilerplate code yourself).

The only reason that the standard library here is in the state it's in is just history -- I didn't have all of the pieces in place when I needed some of these functions. Internally we've built up a different standard library, which I'm hoping to split out at some point but maybe it'd be easier to just update these bits here, or break it out entirely so it's easier for users to swap in their own standard library.

re: filter, again just history (though lfilter is the filter for lists -- at some level you need one of those, even if it's just to give to a Filter class for rewriting user programs). We probably don't need an explicit type class for filter if it's implicitly overloaded on function and array types (though all polymorphic/qualified identifiers get a hidden type class anyway).

Safe to say, it's worth at least a full pass through the standard lib here to freshen things up and reduce them where possible. Maybe worth also or optionally factoring it out so that it can be updated independently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants