struct.tex

% 1. Structure of the language

\chapter{Overview of Scheme}

\section{Semantics}
\label{semanticsection}

This section gives an overview of Scheme's semantics.  A
detailed informal semantics is the subject of
chapters~\ref{basicchapter} through~\ref{builtinchapter}.  For reference
purposes, section~\ref{formalsemanticssection} provides a formal
semantics of Pico Scheme.

\vest Scheme is a statically scoped programming
language.  Each use of a variable is associated with a lexically
apparent binding of that variable.

\vest Scheme is a dynamically typed language.  Types
are associated with values (also called objects\mainindex{object}) rather than
with variables.  
Statically typed languages, by contrast, associate types with
variables and expressions as well as with values.

\vest All objects created in the course of a Scheme computation, including
procedures, have unlimited extent.
No Scheme object is ever destroyed.  The reason that
implementations of Scheme do not (usually!)\ run out of storage is that
they are permitted to reclaim the storage occupied by an object if
they can prove that the object cannot possibly matter to any future
computation.

\vest Scheme procedures are objects in their own right.  Procedures can be
created dynamically, stored in data structures, returned as results of
procedures, and so on.

\vest Arguments to Scheme procedures are always passed by value, which
means that the actual argument expressions are evaluated before the
procedure gains control, regardless of whether the procedure needs the
result of the evaluation.

\vest Pico Scheme's model of arithmetic is simplified compared to
\rsevenrs{} and only integers are required.

\section{Syntax}

Scheme, like most dialects of Lisp, employs a fully parenthesized prefix
notation for programs and other data; the grammar of Scheme generates a
sublanguage of the language used for data.  An important
consequence of this simple, uniform representation is that
Scheme programs and data can easily be treated uniformly by other Scheme programs.

The formal syntax of Scheme is described in section~\ref{BNF}.

\subsection{Base and optional features}
\label{qualifiers}

Pico Scheme is already reduced, but if a smaller subset is desired,
either symbols or integers could be removed. Either {\cf cond} or {\cf
  if} could be removed. Both {\cf let} and {\cf apply} could be
removed. If only a REPL is provided, output could be removed.
If extended, we suggest using \rsevenrs\ as a
guide. For cases where both \rsevenrs\ and Pico Scheme are using
defined behavior, it is intended that Pico Scheme should have
identical results.


\subsection{Error situations and unspecified behavior}
\label{errorsituations}

\mainindex{error}
When speaking of an error situation, this report uses the phrase ``an
error is signaled'' to indicate that implementations must detect and
report the error.

\vest If such wording does not appear in the discussion of
an error, then implementations are not required to detect or report the
error, though they are encouraged to do so.

\vest If the value of an expression is said to be ``unspecified,'' then
the expression must evaluate to some object without signaling an error,
but the value depends on the implementation; this report explicitly does
not say what value is returned. \mainindex{unspecified}

\vest Finally, the words and phrases ``must,'' ``must not,'' ``shall,''
``shall not,'' ``should,'' ``should not,'' ``may,'' ``required,''
``recommended,'' and ``optional,'' although not capitalized in this
report, are to be interpreted as described in RFC~2119~\cite{rfc2119}.
They are used only with reference to implementer or implementation behavior,
not with reference to programmer or program behavior.

\subsection{Entry format}

Chapters~\ref{expressionchapter} and~\ref{builtinchapter} are organized
into entries.  Each entry describes one language feature or a group of
related features, where a feature is either a syntactic construct or a
procedure.  An entry begins with one or more header lines of the form

\noindent\pproto{\var{template}}{\var{category}}\unpenalty

for identifiers in the base language.

If \var{category} is ``\exprtype,'' the entry describes an expression
type, and the template gives the syntax of the expression type.
Components of expressions are designated by syntactic variables, which
are written using angle brackets, for example \hyper{expression} and
\hyper{variable}.  Syntactic variables are intended to denote segments of
program text; for example, \hyper{expression} stands for any string of
characters which is a syntactically valid expression.  The notation
\begin{tabbing}
\qquad \hyperi{thing} $\ldots$
\end{tabbing}
indicates zero or more occurrences of a \hyper{thing}, and
\begin{tabbing}
\qquad \hyperi{thing} \hyperii{thing} $\ldots$
\end{tabbing}
indicates one or more occurrences of a \hyper{thing}.

If \var{category} is ``auxiliary syntax,'' then the entry describes a
syntax binding that occurs only as part of specific surrounding
expressions. Any use as an independent syntactic construct or
variable is an error.

If \var{category} is ``procedure,'' then the entry describes a procedure, and
the header line gives a template for a call to the procedure.  Argument
names in the template are \var{italicized}.  Thus the header line

\noindent\pproto{(car \var{pair})}{procedure}\unpenalty

indicates that the procedure bound to the {\tt car} variable takes
one argument, a \var{pair} (see below).  The header lines

\noindent%
\pproto{(- \var{n})}{procedure}
\pproto{(- \vri{n} \vrii{n})}{procedure}\unpenalty

indicate that the {\tt -} procedure must be defined to take
either one or two arguments.

\label{typeconventions}
It is an error for a procedure to be presented with an argument that it
is not specified to handle.  For succinctness, we follow the convention
that if an argument name is also the name of a type listed in
section~\ref{disjointness}, then it is an error if that argument is not of the named type.
For example, the header line for {\tt car} given above dictates that the
only argument to {\tt car} is a pair.  The following naming
conventions also imply type restrictions:
\newcommand{\foo}[1]{\vr{#1}, \vri{#1}, $\ldots$ \vrj{#1}, $\ldots$}
$$
\begin{tabular}{ll}
\vr{boolean}&boolean value (\schtrue{} or \schfalse{})\\
\foo{list}&list (see section~\ref{listsection})\\
\foo{n}&integer\\
\var{obj}&any object\\
\vr{pair}&pair\\
\vr{proc}&procedure\\
\vr{symbol}&symbol\\
\end{tabular}
$$


\subsection{Evaluation examples}

The symbol ``\evalsto'' used in program examples is read
``evaluates to.''  For example,

\begin{scheme}
(* 5 8)      \ev  40%
\end{scheme}

means that the expression {\tt(* 5 8)} evaluates to the object {\tt 40}.
Or, more precisely:  the expression given by the sequence of characters
``{\tt(* 5 8)}'' evaluates, in the initial environment, to an object
that can be represented externally by the sequence of characters ``{\tt
40}.''  See section~\ref{externalreps} for a discussion of external
representations of objects.

\subsection{Naming conventions}

By convention, \ide{?} is the final character of the names
of procedures that always return a boolean value.
Such procedures are called \defining{predicates}.
Predicates, like all procedures in Pico Scheme, are generally
side-effect free, except that they
may have an error when passed the wrong type of argument.