unbibium edited this page Mar 18, 2013 · 7 revisions

Overview

Here are some data structures that are important to the BASIC interpreter. They differ from the 6502 versions in that addresses are always stored in a single word, but otherwise are largely the same. Currently, even the two-byte variable name structure is still two words.

BASIC Program

Program text starts at the location stored in TXTTAB.

  • 0x0000 at the first location
  • For each line of program text:
    • One word containing a pointer to the next line of program text.
    • One word containing the line number, range 0 to 63999.
    • Program text, with all reserved words and mathematical operators tokenized
    • 0x0000 to signal end-of-line
  • 0x0000 at the last location -- the last line will point to this location.
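
The layout above amounts to a singly linked list of lines. A minimal Python sketch of walking it follows, assuming memory is modelled as a list of 16-bit words. The function name `list_lines`, the sample addresses, and the text words are hypothetical and only illustrate the structure; they are not taken from the interpreter.

```python
def list_lines(mem, txttab):
    """Return [(line_number, text_words), ...] by following the link words."""
    lines = []
    addr = txttab + 1            # skip the 0x0000 word at the first location
    while mem[addr] != 0x0000:   # the final link target holds 0x0000
        next_line = mem[addr]
        line_number = mem[addr + 1]
        text = []
        p = addr + 2
        while mem[p] != 0x0000:  # 0x0000 signals end-of-line
            text.append(mem[p])
            p += 1
        lines.append((line_number, text))
        addr = next_line
    return lines


# Hypothetical memory image: TXTTAB = 0x0100, two short lines.
# The text words (0x99, 0x41, 0x80) are placeholders, not real program text.
TXTTAB = 0x0100
line_10 = [0x0106, 10, 0x99, 0x41, 0x0000]   # occupies 0x0101..0x0105
line_20 = [0x010A, 20, 0x80, 0x0000]         # occupies 0x0106..0x0109
mem = [0] * (TXTTAB + 1) + line_10 + line_20 + [0x0000]  # last word at 0x010A
program = list_lines(mem, TXTTAB)
```

Because the last line's link points at a word holding 0x0000, the loop needs no separate end-of-program address.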

Variable data immediately follows. If there is ever any change to the program code, the variable table is cleared instead of moved.

BASIC Variables

VARTAB

VARTAB points to the start of the table for scalar variables. Each entry has two words for the name, and three words for the data or pointer.

  • Floats (type bits are 00)
    • See [floating point format](Floating Point) for the three-word format.
  • Functions (type bits are 01)
    • Pointer to function definition text, followed by a pointer to the argument.
  • Strings (type bits are 10)
    • Length followed by pointer; the other word is unused.
  • Integers (type bits are 11)
    • Value is stored in the first word; other words are unused. Integers only save memory in arrays.
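
Since every scalar entry has the same size (two name words plus three data words), a lookup can be a linear scan with a constant stride. The sketch below is a rough illustration only: `find_variable` is a hypothetical name, the sample addresses and data words are placeholders, and it assumes the scalar table ends where array memory begins (ARYTAB).

```python
ENTRY_WORDS = 5  # two name words + three data words, per the description above


def find_variable(mem, vartab, arytab, name_words):
    """Return the address of the first data word for name_words, or None.

    ASSUMPTION: the scalar table occupies [vartab, arytab).
    """
    addr = vartab
    while addr < arytab:
        if (mem[addr], mem[addr + 1]) == tuple(name_words):
            return addr + 2      # first of the three data words
        addr += ENTRY_WORDS
    return None


# Hypothetical table with two entries starting at 0x0200.
VARTAB = 0x0200
entry_a = [0x0041, 0x0000, 0x1111, 0x2222, 0x3333]   # name "A", placeholder data
entry_x1 = [0x0058, 0x0031, 0x4444, 0x5555, 0x6666]  # name "X1", placeholder data
mem = [0] * VARTAB + entry_a + entry_x1
ARYTAB = len(mem)                                     # 0x020A
```

A miss returns `None`; whether the interpreter then creates the variable at the end of the table is not covered by this sketch.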

VARNAM

Variable names start with a letter, optionally followed by a letter or a number, and optionally ending in a type sigil, optionally followed by an array subscript.

In the BASIC text, variable names are stored as plain text. Any additional letters or numbers after the first two are ignored.

In the variable table, variable names are stored in two words, VARNAM and VARNAM+1, in the same format as in memory. Each word has the following bit layout:

00000000baaaaaaa 00000000baaaaaaa
  • a: 7-bit ASCII value of one character of the variable name
  • bb: variable type, formed from the b bit of each word
    • 00: float (no sigil)
    • 01: function (FN prefix)
    • 10: string ($ suffix)
    • 11: integer (% suffix)
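
The packing can be sketched in Python as below. Two points are assumptions of this sketch rather than facts from the page: the first bit of `bb` is taken from the first word and the second bit from the second word, and the second word's character bits are zero for a one-character name. `encode_name` and `decode_name` are hypothetical helper names.

```python
TYPE_NAMES = {0b00: "float", 0b01: "function", 0b10: "string", 0b11: "integer"}


def encode_name(name, type_bits):
    """Pack the first two characters and two type bits into two words.

    ASSUMPTION: the high bit of type_bits goes in the first word and the
    low bit in the second; the page does not state the order.
    """
    chars = (name + "\0")[:2]               # characters past the second are ignored
    first = ord(chars[0]) & 0x7F
    second = ord(chars[1]) & 0x7F           # ASSUMPTION: 0 for a one-character name
    first |= ((type_bits >> 1) & 1) << 7
    second |= (type_bits & 1) << 7
    return first, second


def decode_name(first, second):
    """Inverse of encode_name: return (name, type_name)."""
    type_bits = (((first >> 7) & 1) << 1) | ((second >> 7) & 1)
    name = chr(first & 0x7F)
    if second & 0x7F:
        name += chr(second & 0x7F)
    return name, TYPE_NAMES[type_bits]
```

For example, under these assumptions `encode_name("COUNT", 0b11)` keeps only `CO` and sets bit 7 in both words.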

ARYTAB

ARYTAB points to the start of array memory. Arrays are stored one after the other, much like scalar variables. Because array elements are not padded to a fixed length, they can be stored much more compactly, with no wasted words in string pointers or integers. And since each integer is stored as a single 16-bit word, a one-dimensional integer array is essentially a pointer to a contiguous section of memory.

The structure of an array in memory is as follows:

  • Two words for the variable name and type, see VARNAM.
  • One word containing the complete length of the array in memory.
  • One word containing the number of dimensions.
    • For each dimension, one word containing the number of elements in that dimension.
      • For each element in each dimension:
        • For floats, three words for the value.
        • For integers, one word for the value.
        • For strings, two words for the pointer.

Therefore, if the pointer X points to an array location:

  • [X] and [X+1] contain the variable name and type
  • ADD X, [X+2] will skip to the next array. If this is the last array in memory, X will be equal to STREND.
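
The same traversal can be sketched in Python, again modelling memory as a list of words. `walk_arrays` is a hypothetical name and the sample arrays are invented for illustration; the sketch assumes the length word counts the whole array including its header, which is what makes `ADD X, [X+2]` land on the next array.

```python
def walk_arrays(mem, arytab, strend):
    """Yield (address, name_words, dimension_sizes) for each array."""
    x = arytab
    while x < strend:
        name_words = (mem[x], mem[x + 1])
        length = mem[x + 2]              # complete length, header included
        ndims = mem[x + 3]
        dims = mem[x + 4 : x + 4 + ndims]
        yield x, name_words, dims
        x += length                      # same effect as ADD X, [X+2]


# Hypothetical array memory starting at 0x0300.
ARYTAB = 0x0300
# Float array, 1 dimension of 2 elements: 4 header words + 1 + 2*3 = 11 words.
floats = [0x0041, 0x0000, 11, 1, 2] + [0] * 6
# Integer array, 2 x 2 elements: 4 header words + 2 + 4*1 = 10 words.
ints = [0x00C2, 0x0080, 10, 2, 2, 2] + [0] * 4
mem = [0] * ARYTAB + floats + ints
STREND = len(mem)                        # 0x0315
arrays = list(walk_arrays(mem, ARYTAB, STREND))
```

The loop uses `x < strend` rather than an equality test so that a corrupted length word cannot make it run past the end of array memory unnoticed.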

CPU stack variables

The CPU stack is used in the long term for GOSUB and FOR statement tracking, and in the short term for formula evaluation. In this section, the data is described in the order in which it is popped, so PEEK is first, PICK 1 is next, etc.

GOSUB return vectors

  • GOSUB token constant (0x8D).
  • Line number of originating GOSUB statement.
  • Text pointer to statement following originating GOSUB statement.

FOR loops

  • FOR token constant (0x81)
  • Pointer to variable being used as the loop's index.
  • Value of STEP parameter, in 4-word format with separate sign byte. The sign byte is evaluated separately in the NEXT statement to determine when a loop has ended.
  • Value of TO parameter, in 3-word format with sign byte replacing the normalization bit.
  • Line number of originating FOR statement
  • Text pointer to statement following originating FOR statement.
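
The two frame layouts can be read back from a stack snapshot as sketched below. This is an illustration under stated assumptions, not the interpreter's own routine: the stack is modelled as a Python list whose index 0 corresponds to PEEK, frames are assumed to sit back to back with no formula-evaluation temporaries between them, and the STEP and TO values are left as opaque word lists. `parse_frames` and the sample values are hypothetical.

```python
GOSUB_TOKEN = 0x8D
FOR_TOKEN = 0x81
GOSUB_FRAME_WORDS = 3                     # token, line number, text pointer
FOR_FRAME_WORDS = 1 + 1 + 4 + 3 + 1 + 1   # 11 words, per the list above


def parse_frames(stack):
    """Split a stack snapshot (index 0 = PEEK) into GOSUB and FOR frames."""
    frames = []
    i = 0
    while i < len(stack):
        token = stack[i]
        if token == GOSUB_TOKEN:
            frames.append({
                "kind": "GOSUB",
                "line": stack[i + 1],
                "text_ptr": stack[i + 2],
            })
            i += GOSUB_FRAME_WORDS
        elif token == FOR_TOKEN:
            frames.append({
                "kind": "FOR",
                "var_ptr": stack[i + 1],
                "step": stack[i + 2 : i + 6],   # 4 words, left opaque
                "to": stack[i + 6 : i + 9],     # 3 words, left opaque
                "line": stack[i + 9],
                "text_ptr": stack[i + 10],
            })
            i += FOR_FRAME_WORDS
        else:
            break                               # not a frame: stop scanning
    return frames


# Hypothetical snapshot: a FOR frame on top of a GOSUB frame.
stack = [0x81, 0x0202, 1, 2, 3, 4, 5, 6, 7, 100, 0x0150,
         0x8D, 50, 0x0120]
frames = parse_frames(stack)
```

Leading each frame with its token is what lets a scan like this tell the two frame types apart and step over each one by its fixed size.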

Formula evaluation

Intermediate processing

See also

[Floating point format](Floating Point)