[Feature request] User-defined functions #201

Open
aaaaaa123456789 opened this issue Sep 2, 2017 · 17 comments
Labels
enhancement (Typically new features; lesser priority than bugs), rgbasm (This affects RGBASM)

Comments

@aaaaaa123456789
Member

aaaaaa123456789 commented Sep 2, 2017

I'm sure that this has been mentioned before, but I've been coming up with a full feature proposal and I'd like to know how feasible it is. What follows is a proposed way of implementing functions, at least in a way I'd consider usable.

The idea is to allow the user to define functions that would be evaluated inline when called, similar to the various built-ins. Functions can have one or more arguments, which define variables that are local to the function; they are substituted into the expression. An example of the syntax I have in mind would be:

;definition
addnums(first, second) = ({first}) + ({second})

;usage
  ld hl, wTwoPlusTwo
  ld [hl], addnums(2, 2)

Functions can be defined with the same name as long as they have a different number of arguments; the proper version of the function would be used when called. For simplicity, all arguments are required.

average(one) = {one}
average(one, two) = (({one}) + ({two})) / 2
average(one, two, three) = (({one}) + ({two}) + ({three})) / 3

Declaring functions with the same name and number of arguments is not allowed; such declarations would either replace previous ones or cause an error (either works).

Calling built-in functions should obviously be allowed from user-defined functions:

linearaddress(label) = (BANK({label}) - 1) * $4000 + ({label})

In order to simplify functions like the average one above, varargs functions could be defined. Such functions would be used when there is no suitable non-varargs function (i.e., with the correct number of arguments) to call. For instance:

countargs() = 0
countargs(arg, ...) = 1 + countargs({...})

sum() = 0
sum(arg, ...) = ({arg}) + sum({...})

average(...) = sum({...}) / countargs({...})
roundedavg(...) = (2 * sum({...}) + countargs({...})) / (2 * countargs({...}))

The special symbol {...} is replaced by the list of variable arguments. Defining multiple varargs functions with the same name is not allowed; however, they can coexist with non-varargs functions.
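To make the recursion in countargs, sum and average concrete, here is a rough Python model of the same definitions. It is only a sketch of the proposed semantics, not how RGBASM would implement them: Python's `*args` stands in for `{...}`, the name `sum_args` is made up to avoid shadowing Python's built-in `sum`, and integer division with non-negative operands is assumed.

```python
def countargs(*args):
    # countargs() = 0
    if not args:
        return 0
    # countargs(arg, ...) = 1 + countargs({...})
    _first, *rest = args
    return 1 + countargs(*rest)


def sum_args(*args):
    # sum() = 0
    if not args:
        return 0
    # sum(arg, ...) = ({arg}) + sum({...})
    first, *rest = args
    return first + sum_args(*rest)


def average(*args):
    # average(...) = sum({...}) / countargs({...})
    # Assumes integer division and at least one argument.
    return sum_args(*args) // countargs(*args)


three_count = countargs(7, 8, 9)
six_total = sum_args(1, 2, 3)
mean = average(3, 4, 8)
```

Each recursive call peels off one argument, so the zero-argument versions act as the base case.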

As a silly example just to show how they would coexist:

fn(a) = {a}
fn(a, b, c) = {c}
fn(a, b, ...) = {b}

  db fn(4) ;4
  db fn(5, 9) ;9
  db fn(2, 6, 3) ;3
  db fn(1, 8, 0, 7) ;8
  ;db fn() is an error because there's no version that takes 0 arguments
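The lookup rule described above (exact argument count first, varargs version as a fallback) could be modelled roughly like this in Python. The `FunctionTable` class and its method names are invented for illustration, and the sketch picks "error" for duplicate definitions even though the proposal allows either behaviour.

```python
class FunctionTable:
    """Hypothetical registry of user-defined functions, keyed by name and arity."""

    def __init__(self):
        self.fixed = {}     # (name, argc) -> callable
        self.variadic = {}  # name -> (minimum argc, callable)

    def define(self, name, func, argc, varargs=False):
        if varargs:
            # Only one varargs version per name is allowed.
            if name in self.variadic:
                raise ValueError(f"{name}: varargs version already defined")
            self.variadic[name] = (argc, func)
        else:
            # Assumption: a duplicate (name, argc) pair is an error.
            if (name, argc) in self.fixed:
                raise ValueError(f"{name}/{argc} already defined")
            self.fixed[(name, argc)] = func

    def call(self, name, *args):
        # 1. Prefer the version with exactly this many arguments.
        exact = self.fixed.get((name, len(args)))
        if exact is not None:
            return exact(*args)
        # 2. Otherwise fall back to the varargs version, if it fits.
        if name in self.variadic:
            min_argc, func = self.variadic[name]
            if len(args) >= min_argc:
                return func(*args)
        raise TypeError(f"no version of {name} takes {len(args)} argument(s)")


table = FunctionTable()
table.define("fn", lambda a: a, 1)                          # fn(a) = {a}
table.define("fn", lambda a, b, c: c, 3)                    # fn(a, b, c) = {c}
table.define("fn", lambda a, b, *rest: b, 2, varargs=True)  # fn(a, b, ...) = {b}

results = [
    table.call("fn", 4),
    table.call("fn", 5, 9),
    table.call("fn", 2, 6, 3),
    table.call("fn", 1, 8, 0, 7),
]
```

With these definitions `results` matches the values in the comments of the example, and `table.call("fn")` raises because no version takes zero arguments.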

I hope that this is doable, as it would certainly reduce some repetition and extremely macro-heavy code in some cases. As a simple realistic example, take this macro from Prism (might be slightly different because I'm typing it from memory):

;definition
coord: MACRO
  if _NARG > 3
    ld \1, (\4) + (\2) + ((\3) * SCREEN_WIDTH)
  else
    ld \1, wTilemap + (\2) + ((\3) * SCREEN_WIDTH)
  endc
ENDM

hlcoord EQUS "coord hl, "

;usage
  hlcoord 2, 5 ;sets hl to the corresponding tilemap position

Which using functions would be a lot cleaner:

;definition
coord(x, y, base) = ({base}) + ({x}) + ({y}) * SCREEN_WIDTH
coord(x, y) = coord({x}, {y}, wTilemap)

;usage
  ld hl, coord(2, 5)

I await your responses; thanks for reading.

@BenHetherington
Contributor

Interesting! So, if I understand correctly, the main things that distinguish this from existing macros are:

  • The ability to use them as part of an expression (like EQU/EQUS/SETs all can be)
  • Named parameters
  • Easier overloading (based on number of parameters)

I do like that you're going for a functional-style feel, but it bothers me that it's a very different syntax for something that's intuitively similar to macros.

Since I think the latter two features would also be good for macros to have, we could potentially unify their syntaxes, perhaps in a manner similar to this:

MyFunction: FUNCTION(foo, bar)   ; Reflects usage as MyFunction(foo, bar)
    \foo + \bar
    ENDF
MyMacro: MACRO foo, bar          ; Reflects usage as MyMacro foo, bar
    ld a, \foo + \bar            ; Similar to \1, \2
    ENDM

The syntax to use them would probably still have to differ, due to the existing difference between macros and built-in functions.

This approach makes functions look procedural, so I don't know whether we should embrace that (adding a RETURN statement, possibly implicit for the last line) or only allow a single expression (which would make this rather verbose for a one-liner).


Alternatively, you could think of it as something more similar to EQU or SET. Then the MyFunction() = ... syntax could be a nicer way of writing MyFunction() FUNC ... (similar to SET).

But, in order to add a little consistency, I'd consider using \foo to access parameters, since it's closer to how you already access parameters in macros:

MyFunction(foo, bar) = \foo + \bar

You can probably tell that I'm just thinking around the issue, and haven't really made any decisions on what I like best. However we implement it though, I do think this would be a useful feature.

Thoughts?

@AntonioND
Member

AntonioND commented Sep 4, 2017

I prefer the {} version because it is less ambiguous: "\foo1" vs "{foo}1" (see #63).

In principle it doesn't sound bad. In practice, this is a lot of work. The code that handles macros is quite annoying because it simply cannot be "just parsed", you have to inject the expanded code of the macro...

EDIT: Also, adding more things like macros, EQUS, etc. would make this even worse: #64

@aaaaaa123456789
Member Author

I'd say the functional style is simpler, although I can imagine a few cases in which procedural would be more powerful. That being said, procedural-style functions would require you to actually execute them, so that might increase the burden on your side.

I'm also assuming lazy evaluation here (see the linearaddress example above, which wouldn't work without it because BANK($4444) is invalid), which might be harder to achieve in a procedural way.
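To illustrate why the linearaddress example depends on lazy evaluation, here is a small Python sketch. The `Label` class and the `bank`/`value` helpers are invented stand-ins for assembler symbols and the BANK() built-in, and the bank and address values are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical stand-in for an assembler symbol."""
    name: str
    bank: int
    address: int


def bank(operand):
    # Models BANK(): it needs the symbol itself, not just a number.
    if not isinstance(operand, Label):
        raise TypeError("BANK() needs a label, not a plain number")
    return operand.bank


def value(operand):
    # Models using the operand as a number inside an expression.
    return operand.address if isinstance(operand, Label) else operand


def linearaddress(label):
    # linearaddress(label) = (BANK({label}) - 1) * $4000 + ({label})
    return (bank(label) - 1) * 0x4000 + value(label)


some_label = Label("SomeLabel", bank=3, address=0x4444)

# Lazy: the argument reaches the function body still as a symbol.
lazy_result = linearaddress(some_label)

# Eager: the argument is reduced to $4444 first, so BANK() has nothing to look up.
try:
    linearaddress(value(some_label))
    eager_failed = False
except TypeError:
    eager_failed = True
```

In this model the lazy call yields `0xC444`, while the eager call fails in the same way BANK($4444) would.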

That being said, I agree that named parameters and overloads (and even varargs) could be useful for macros, so a unified syntax sounds like a good idea. The function/endf syntax looks a bit long for functional-style one-liners, but the "= replaces func" syntax looks good. I'm not really biased towards any particular syntax, though.

#64 and the numerical lexer abuse in #63 are downright silly, and the assembler should probably just fail and exit if it detects weird behavior like that with functions. I can't see something like that appearing in real code.

By the way, I'm well aware that this is a big feature, so I don't expect it to be finished soon.

@yenatch
Contributor

yenatch commented Jan 14, 2018

Can named args be used in calls as well? And can a call span multiple lines?

    MyFunction (
        foo = 2,
        bar = 1
    )

@AntonioND AntonioND added the enhancement and rgbasm labels Apr 2, 2018
@ISSOtm ISSOtm added this to the v0.4.1 milestone Apr 4, 2020
@ISSOtm ISSOtm closed this as completed Apr 4, 2020
@ISSOtm ISSOtm reopened this Apr 4, 2020
@ISSOtm
Member

ISSOtm commented Apr 8, 2020

One thing is for sure: such functions will be "pure". No side effects; they only produce an expression.
I would like to avoid the approach macros currently take (re-lexing and re-parsing a buffer), since that's pretty slow. This brings up a problem we have in any case: how do captures work?

CONSTANT = 0
DEF foo() => CONSTANT
CONSTANT = 1
db foo()

Should this:

  • Output 0
  • Output 1
  • Be an error?

@aaaaaa123456789
Member Author

I don't think it should be an error. Both 0 and 1 are valid approaches: 0 is more useful, while 1 is more natural to people used to macros. I'd prefer 0.
(or make it configurable)

@ISSOtm
Member

ISSOtm commented Apr 8, 2020

If so, I'll go with 0 for a first implementation, and maybe extend with a "capture" syntax later so 1 can be produced.

@meithecatte
Contributor

If I'm understanding correctly, the behavior "output 1" could create dependency cycles...

@ISSOtm
Member

ISSOtm commented Apr 8, 2020

No, because the right-hand side is always evaluated before the left-hand side.
The behavior that's been decided on is "output 0", and a syntax akin to C++'s captures may be added in the future to allow the possibility for "output 1".
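For readers less familiar with capture semantics, the two non-error outcomes of the `foo()` example earlier in the thread can be reproduced in Python. This is only an analogy: Python functions look names up at call time by default ("output 1"), and a default argument is one common way to snapshot a value at definition time ("output 0"). The names `foo_late` and `foo_early` are made up.

```python
CONSTANT = 0


def foo_late():
    # The name is looked up when the function is called.
    return CONSTANT


def foo_early(captured=CONSTANT):
    # The default value was evaluated once, when the function was defined.
    return captured


CONSTANT = 1

late_result = foo_late()    # sees the redefinition
early_result = foo_early()  # still sees the value from definition time
```

Here `late_result` is 1 and `early_result` is 0, matching the "output 1" and "output 0" options respectively.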

@pinobatch
Member

For comparison, .define macros in ca65 behave much like #define macros in the C preprocessor, except they don't use parentheses when called.

@ISSOtm ISSOtm modified the milestones: v0.4.1, v0.4.2 Jul 21, 2020
@ISSOtm
Member

ISSOtm commented Dec 9, 2020

Making these work in a useful way would require #619, if only because recursive functions would require the same kind of expression system rewrite that short-circuiting operators would. Therefore, and because I think 0.4.2 has been delayed enough, this feature will be postponed (again) to 0.4.3.
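One way to read the link between recursion and short-circuiting: a recursive function only terminates if the branch that is not taken is left unevaluated. The Python sketch below illustrates that reading with a recursive popcount; `eager_if` and `lazy_if` are invented helpers, not anything from RGBASM.

```python
def eager_if(cond, then_value, else_value):
    # Both branch values were already computed by the caller.
    return then_value if cond else else_value


def popcnt_eager(x):
    # The recursive call is evaluated even when x == 0, so this never bottoms out.
    return eager_if(x, 1 + popcnt_eager(x & (x - 1)), 0)


def lazy_if(cond, then_thunk, else_thunk):
    # Only the selected branch is evaluated.
    return then_thunk() if cond else else_thunk()


def popcnt_lazy(x):
    return lazy_if(x, lambda: 1 + popcnt_lazy(x & (x - 1)), lambda: 0)


try:
    popcnt_eager(0)
    eager_overflowed = False
except RecursionError:
    eager_overflowed = True

bits_set = popcnt_lazy(0b1011)
```

The eager version overflows the stack even for an input of 0, while the lazy version counts the three set bits of `0b1011`.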

@ISSOtm
Member

ISSOtm commented Dec 22, 2020

Lazy evaluation of symbols is important. Eager evaluation can be forced using naked braces (since #634), but lazy evaluation cannot otherwise be done.

@Rangi42
Contributor

Rangi42 commented Feb 1, 2021

Sjasm has something similar: "text macros with arguments". (Although these are expanded as statements, not expressions, so more like macros with named parameters.)

@Rangi42
Contributor

Rangi42 commented Feb 1, 2021

Some implementations for common math functions that aren't built into rgbasm:

def abs(x) = x < 0 ? -x : x
def abs(x) = x * (1 - 2 * (x < 0))
def abs(x) = x * sgn(x)

def sgn(x) = x < 0 ? -1 : x > 0 ? 1 : 0
def sgn(x) = (x > 0) - (x < 0)
def sgn(x) = x ? x / abs(x) : 0

def sqrt(x) = pow(x, 0.5)

@Rangi42 Rangi42 modified the milestones: v0.5.0, v1.0.0 Feb 28, 2021
@ISSOtm ISSOtm modified the milestones: v1.0.0, v0.5.1 Mar 5, 2021
@Rangi42
Contributor

Rangi42 commented Apr 8, 2021

Functions could allow specifying their arguments by name, in case the given order doesn't match the declaration order.

def coord(x, y) = y * SCREEN_WIDTH + x

    ld hl, wTilemap + coord(y=5, x=10)
    ld hl, wTilemap + coord(10, 5)
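As a rough idea of what binding named arguments could involve on the assembler side, here is a Python sketch. The `bind_arguments` helper and its error rules (no unknown names, no parameter given twice, nothing missing) are assumptions for illustration, not a description of any existing implementation.

```python
def bind_arguments(params, positional=(), named=None):
    """Return the argument values in declaration order (hypothetical helper)."""
    named = dict(named or {})
    if len(positional) > len(params):
        raise TypeError("too many arguments")
    # Positional arguments fill the parameters from the left.
    bound = dict(zip(params, positional))
    # Named arguments fill in the rest, whatever order they were written in.
    for name, val in named.items():
        if name not in params:
            raise TypeError(f"unknown parameter {name!r}")
        if name in bound:
            raise TypeError(f"parameter {name!r} given twice")
        bound[name] = val
    missing = [p for p in params if p not in bound]
    if missing:
        raise TypeError("missing argument(s): " + ", ".join(missing))
    return [bound[p] for p in params]


# coord(y=5, x=10) and coord(10, 5) resolve to the same argument list.
by_name = bind_arguments(("x", "y"), named={"y": 5, "x": 10})
by_position = bind_arguments(("x", "y"), positional=(10, 5))
```

Both calls produce `[10, 5]`, so the function body never needs to know how the arguments were written.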

@ISSOtm
Member

ISSOtm commented Apr 8, 2021

This would be especially helpful to clarify intent, as it's neither obvious nor widely agreed upon whether X or Y goes first (typical usage is X, Y, but OAM, for example, uses Y, X).

@Rangi42 Rangi42 modified the milestones: v0.5.1, v0.6.0 Apr 25, 2021
@Rangi42 Rangi42 linked a pull request May 30, 2021 that will close this issue
@Rangi42 Rangi42 removed a link to a pull request May 30, 2021
@Rangi42
Contributor

Rangi42 commented Sep 28, 2022

Various useful functions:

;;; Mathematical functions
; https://en.cppreference.com/w/c/numeric/math

DEF abs(x) := x < 0 ? -x : x
DEF abs(x) := x & $7fff_ffff ; wrong for two's-complement negatives: -1 gives $7fff_ffff
DEF abs(x) := x * sgn(x)

DEF sgn(x) := x > 0 ? 1 : x < 0 ? -1 : 0
DEF sgn(x) := (x > 0) - (x < 0)
DEF sgn(x) := -(x < 0) ; only yields -1 or 0, never 1
DEF sgn(x) := x ? x / abs(x) : 0
DEF sgn(x) := x ? abs(x) / x : 0

DEF min(x, y) := x < y ? x : y
DEF min(x, y) := y ^ ((x ^ y) & -(x < y))
DEF max(x, y) := x > y ? x : y
DEF max(x, y) := x ^ ((x ^ y) & -(x < y))

DEF clamp(v, x, y) := v < x ? x : v > y ? y : v
DEF clamp(v, x, y) := min(max(v, x), y)

DEF dim(x, y) := x > y ? x - y : y - x
DEF dim(x, y) := max(x, y) - min(x, y)

DEF square(x) := x * x
DEF cube(x) := x * x * x


;;; Fixed-point functions

DEF fsquare(x) := MUL(x, x)
DEF fcube(x) := MUL(MUL(x, x), x)
DEF sqrt(x) := POW(x, 0.5)

DEF log2(x) := LOG(x, 2.0)
DEF log10(x) := LOG(x, 10.0)

; works for any `opt Q` fixed-point precision
DEF trunc(x) := x & -1.0


;;; Macro replacement functions

; `db byte(X, Y)` instead of `dn X, Y`
DEF byte(hi, lo) := ((hi & $f) << 4) | (lo & $f)
; `ld bc, word(X, Y)` instead of `lb bc, X, Y`
DEF word(hi, lo) := ((hi & $ff) << 8) | (lo & $ff)

; `dw bigw(X)` instead of `bigdw X`
DEF bigw(x) := ((x & $ff) << 8) | ((x & $ff00) >> 8)

; rgb(31, 16, 0) == rgbhex($ff8000)
DEF rgb(rr, gg, bb) := rr | (gg << 5) | (bb << 10)
DEF rgbhex(hex) := ((hex & $f80000) >> 19) | ((hex & $f800) >> 6) | ((hex & $f8) << 7)
DEF rgbhex(hex) := rgb((hex & $ff0000) >> 19, (hex & $ff00) >> 11, (hex & $ff) >> 3)

DEF tilecoord(x, y) := wTilemap + y * SCRN_X_B + x
DEF attrcoord(x, y) := wAttrmap + y * SCRN_X_B + x
DEF bg0coord(x, y) := _SCRN0 + y * SCRN_VX_B + x
DEF bg1coord(x, y) := _SCRN1 + y * SCRN_VX_B + x


;;; Bit twiddling functions
; http://graphics.stanford.edu/~seander/bithacks.html
; https://en.wikipedia.org/wiki/Find_first_set

DEF ispowof2(x) := x && !(x & (x - 1))

DEF nextpow2(x) := POW(2.0, CEIL(LOG(x, 2.0)))

DEF parity(x) := popcnt(x) & 1

DEF ilog2(x) := LOG(x * 1.0, 2.0) / 1.0 ; fails for x >= 32768
DEF ilog2(x) := 2 ** bsr(x)

; these require recursion support
DEF popcnt(x) := x ? (1 + popcnt(x & (x - 1))) : 0
DEF ctz(x) := x ? (x & 1) ? 0 : (1 + ctz(x >> 1)) : 0
DEF clz(x) := x ? ((x & $8000_0000) ? 0 : (1 + clz(x << 1))) : 0

DEF ffs(x) := 32 - clz(x & -x)
DEF ffs(x) := popcnt(x ^ ~-x)
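Several of the alternatives above can be sanity-checked by modelling them in Python. This sketch assumes 32-bit two's-complement values (hence the explicit masking), renames the variants so they can coexist (`min_xor`, `ffs_clz`, and so on are made-up names), and only spot-checks a handful of inputs.

```python
MASK = 0xFFFF_FFFF  # model 32-bit values


def to_signed(v):
    v &= MASK
    return v - (1 << 32) if v & 0x8000_0000 else v


def min_xor(x, y):
    return to_signed(y ^ ((x ^ y) & -(x < y)))


def max_xor(x, y):
    return to_signed(x ^ ((x ^ y) & -(x < y)))


def abs_mask(x):
    return x & 0x7FFF_FFFF


def popcnt(x):
    x &= MASK
    return 1 + popcnt(x & (x - 1)) if x else 0


def ctz(x):
    x &= MASK
    if not x:
        return 0
    return 0 if x & 1 else 1 + ctz(x >> 1)


def clz(x):
    x &= MASK
    if not x:
        return 0
    return 0 if x & 0x8000_0000 else 1 + clz(x << 1)


def ffs_clz(x):
    return 32 - clz(x & -x)


def ffs_popcnt(x):
    return popcnt(x ^ (x - 1))  # ~-x is x - 1


def rgb(rr, gg, bb):
    return rr | (gg << 5) | (bb << 10)


def rgbhex_shift(h):
    return ((h & 0xF80000) >> 19) | ((h & 0xF800) >> 6) | ((h & 0xF8) << 7)


def rgbhex_rgb(h):
    return rgb((h & 0xFF0000) >> 19, (h & 0xFF00) >> 11, (h & 0xFF) >> 3)


pairs = [(3, 7), (7, 3), (-4, 9), (-2, -8), (5, 5)]
samples = [1, 2, 12, 40, 0x8000_0000, 0xFFFF_FFFF]

checks = {
    "min": all(min_xor(x, y) == min(x, y) for x, y in pairs),
    "max": all(max_xor(x, y) == max(x, y) for x, y in pairs),
    "ffs": all(ffs_clz(x) == ffs_popcnt(x) == ctz(x) + 1 for x in samples),
    "rgbhex": rgb(31, 16, 0) == rgbhex_shift(0xFF8000) == rgbhex_rgb(0xFF8000),
    "abs_mask": all(abs_mask(x) == abs(x) for x in (5, -5)),
}
```

On these inputs every check passes except `abs_mask`: clearing the sign bit does not negate a two's-complement value, so the masking form of abs disagrees with the ternary one for negative inputs.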
