\documentstyle[11pt]{article}
\oddsidemargin 0in
\evensidemargin 0in
%\footheight 1.0in
\textheight 8.5in
\textwidth 6.5in
\newcommand{\rlang}[1]{\texttt{#1}}
\renewcommand{\baselinestretch}{1}
\begin{document}
\title{Lexical Scope and Statistical Computing}
\author{R. Gentleman and R. Ihaka \\ Department of Statistics\\ University of Auckland}
\date{\today}
\maketitle
\begin{abstract}
The nature of statistical computing has changed a great deal in the past 10 years, mainly due to the influence of programming environments such as S (Becker, Chambers, and Wilks, 1988) and Lisp-Stat (Tierney, 1990). Both of these programming environments have languages for performing computations, data storage mechanisms and a graphical interface. They provide an invaluable interactive interface to data analysis. In order to take full advantage of these programming environments statisticians must understand the differences between them. Ihaka and Gentleman (1996) introduced R, a version of S which uses a different scoping regime. In some ways this makes R behave more like Lisp-Stat. In this paper we discuss the concept of scoping rules and show how lexical scope enhances the functionality of a language.
\end{abstract}
\noindent KEY WORDS: statistical computing, function closures, lexical scope, random number generators.

\section{Introduction}
The nature of statistical computing has changed a great deal in the past 10 years, mainly due to the influence of programming environments such as S (Becker, Chambers, and Wilks, 1988) and Lisp-Stat (Tierney, 1990). These environments have made it possible and useful for statisticians to also become programmers, thereby giving them more effective control over the data analytic process. Statisticians should, therefore, benefit from considering programming paradigms and their implications.

In this paper we will discuss scoping rules. These are the rules by which variables, i.e.\ symbols, and values are associated. We will show the effect scoping has on evaluation in computer languages. We will first introduce the rules themselves and then discuss several implications which follow from the use of lexical scope.

Adopting lexical scope enables two major changes in the programming language. It makes it easy to construct function closures and to construct objects with mutable state. While there are other means of achieving these two properties they are, in some sense, natural consequences of lexical scope. If the terms are unfamiliar you can think of a function closure as a function together with a permanent surrounding environment that may affect the way the function evaluates. This surrounding environment gives the function state. When the variable bindings in this surrounding environment can be altered the function closure is said to have mutable state.

To exemplify these ideas and their consequences we use code written in two languages with different scoping rules and compare the results. To this end our examples are coded in both R and S. The syntax and semantics of the two languages are very similar and the reader can easily compare the results. We consider examples such as likelihood functions, probability functions and random number generators, and show how the scoping rules used affect them. In many places we have chosen to use simple examples and simple data sets, not because we think that they are important in themselves, nor because we lack real data or real examples, but rather to make it easy for the reader to check the details.
In Section 2 we define a number of terms and discuss scoping rules. Then, in general terms, we indicate how the scoping rules affect evaluation. In Section 3 we provide examples of the advantages of function closures. Then in Section 4 we comment on the advantages of mutable state. Finally, in Section 5 we draw our conclusions.

\section{Scoping Rules and Evaluation}
Most computer languages have values, symbols (or variables), and functions. Often the same symbol appears in several different contexts within a computer program. For example, the symbol \verb+x+ can be a global variable, a local variable in a function, and a formal parameter to another function simultaneously. An important part of the computational process is associating symbols with values. The rules that are used to do this are called scoping rules. Virtually every computer language has its own set of scoping rules but for the most part they fall into two broad classes, dynamic scope and lexical scope. We further identify two other classes, trivial scope and static scope. We also note that in some languages, and in particular in both S and R, it is possible to alter the scoping rules. In this paper the term {\em scoping rules} always refers to the default scoping rules.

The symbols which occur in the body of a function can be divided into three classes: formal parameters, local variables and free variables. The formal parameters of a function are those occurring in the argument list of the function. Their values are determined by the process of {\em binding} the actual function arguments to the formal parameters. Local variables are those whose values are determined by the evaluation of expressions in the body of the function. Variables which are not formal parameters or local variables are called free variables. Free variables become local variables if they are assigned to. Consider the following function definition.
\begin{verbatim}
f<-function(x) {
   y<-2*x
   print(x)
   print(y)
   print(z)
}
\end{verbatim}
In this function \verb+x+ is a formal parameter, \verb+y+ is a local variable and \rlang{z} is a free variable. The value associated with the free variable is determined by the scoping rules.

Under trivial scope free variables are not allowed. Under dynamic scope the value associated with a free variable is determined by searching back up the sequence of calling functions and using the most recently defined value. Under static scope the values of the free variables are determined by a set of global variables. This is the kind of scoping used in C and S. If the values of the free variables are defined by the bindings that were in effect at the time the function was created then the scoping is called lexical. This is the kind of scoping used in R and Scheme.

While the usual definition of static or lexical scope in computer science is that the variable bindings can be determined from a printed copy of the code, this definition is not specific enough for our purposes. Computer scientists tend not to differentiate between the two because their concerns are different. However, if we consider some commonly used computer languages that satisfy this definition of lexical scope we see that there can be large differences between them. For example, in C you are not allowed to nest function definitions, so only global variables can be bound to free variables. In S nested function definitions are permitted, but S does not resolve free variable bindings within the enclosing function; instead it resolves them globally.
In both R and Scheme the free variable bindings are resolved by first looking in the environment in which the function was created. Since we want to contrast R and S with respect to this difference we have refined the scoping definitions for our purposes.

The following code shows the difference between static scope and lexical scope, as we have defined them. First we define a function called \verb+cube+.
\begin{verbatim}
cube<-function(n){
   sq<-function() n*n
   n*sq()
}
\end{verbatim}
The variable \verb+n+ in the function \verb+sq+ is not an argument to that function. Therefore it is a free variable and the scoping rules must be used to ascertain the value that is to be associated with it. Under static scope the value is that associated with a global variable named \verb+n+. Under lexical scope it is the parameter to the function \verb+cube+, since that is the active binding for the variable \verb+n+ at the time the function \verb+sq+ was defined. The difference between evaluation in R and evaluation in S is that S looks for a global variable called \verb+n+, while R first looks for a variable called \verb+n+ in the environment created when \verb+cube+ was invoked.
\begin{verbatim}
#first evaluation in S
S> cube(2)
Error in sq(): Object "n" not found
Dumped
S> n<-3
S> cube(2)
[1] 18

#then the same function evaluated in R
R> cube(2)
[1] 8
\end{verbatim}

\subsection{Evaluation}
We must first develop some more terminology. An {\em environment} is a collection of symbols together with the values that are to be associated with the symbols. It is not necessary that every symbol have a value associated with it. The value may be obtained as a result of computation or it may be obtained from some other associated environment. Usually there is some means of ordering the environments so that after one is examined there is a next one to be examined. We call the next environment the enclosing environment or parent environment. When an interactive language is invoked the user is put into an environment which we shall call the {\em top--level} environment.

At any point during a computation a single environment is {\em active}. The resolution of symbols to values takes place in this environment. When the evaluator requires a value for a particular symbol that it has encountered, it first examines the active environment. It then searches the enclosing environment and so on up to the top--level environment. At this point a special mechanism is used to locate built--in symbols. Specific languages have some differences but for the most part this is a reasonable representation of how they associate values and symbols.

Next we consider functions. Functions can be thought of as having three parts: the formal parameters, the body of the function and an environment. The formal parameters are the names to which the arguments supplied by the user will be bound. The body is the sequence of statements that are to be evaluated when the function is invoked. The associated environment is an environment that is typically retained throughout the life of the function. Under lexical scope, whenever a function is created the associated environment is set to the currently active environment. This ensures that there is always access to all variable definitions that were in effect at the time the function was created. Under static scope the associated environment is set to the top--level environment. The process of function evaluation is similar across programming languages.
When a function is invoked a new environment, the evaluation environment, is created and in this environment the formal arguments (symbols) are bound to the actual arguments (values). The parent of this new environment is set to be the environment associated with the function. Bindings for free variables are resolved by looking in the new environment, then the parent of this environment, then its parent and so on until the top--level environment is encountered. Consider the functions defined below.
\begin{verbatim}
fun1<-function(x) x+y

fun2<-function() {
   y<-20
   return(function(x) x+y)
}
\end{verbatim}
In a lexically scoped language we get the following results.
\begin{verbatim}
R> fun3<-fun2()
R> #now look at fun1 and fun3
R> fun1
function (x) x + y
R> fun3
function (x) x + y
R> y<-2
R> fun1(5)
[1] 7
R> fun3(5)
[1] 25
\end{verbatim}
Since \verb+fun1+ was created at the top--level, the top--level environment is its associated environment. For \verb+fun3+ the situation is more complicated. When \verb+fun2+ was invoked a new environment was created; it is in this environment that \rlang{fun3} was created, and hence this environment is its associated environment. So, for {\tt fun3}, {\tt y} is bound to 20. In S both \rlang{fun1} and \rlang{fun3} have the top--level environment as their associated environment and hence both evaluate to 7. Notice that in R functions that have environments other than the top--level environment associated with them have that environment explicitly printed.

What appears to be peculiar behaviour may result when two functions have the same body and formal parameters but different associated environments. If there is a free variable in the function body with different bindings in the associated environments then the functions may evaluate differently. In a lexically scoped language you need to be aware of the associated environments. In statically scoped languages this situation arises when a free variable is bound to a global variable that has been altered between function invocations.

There are other implications of lexical scope and environments. The environments for function evaluation must be maintained, that is, they must be persistent objects. If a function returns a function then the returned function must have access to the environment it was created in. Under static scope environments can be transient, that is, they can disappear once the function has returned a value; we can be sure that the environment will never be needed again. This implies that if we did not allow functions to be returned as values in a lexically scoped language we would not require environments to be persistent. Having persistent environments raises many other issues, some of which are addressed in Ihaka and Gentleman (1996). In particular it implies that functions cannot easily be stored as files. The function and the whole of the associated environment must be stored together. This does not seem formidable, but we must also store any other function that has access to that environment. Determining which functions have access to an environment is not easy. The implementation usually ensures that functions know which environments they need rather than having environments keep track of the functions that might change them.

\section{Function Closures}
In this section we will give some examples which indicate the usefulness of function closures. Function closures are function bodies bound together with values for the free variables.
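A minimal illustration may help to fix ideas (this small example is ours and does not appear in the sections that follow): a creator function returns a function that remembers the increment it was created with.
\begin{verbatim}
# make.adder returns a function; under lexical scope the free
# variable k in the returned function is bound to the value that
# was in effect when the returned function was created
make.adder<-function(k) function(x) x + k

R> add2<-make.adder(2)
R> add2(10)
[1] 12
\end{verbatim}
Under static scope the free variable \verb+k+ would instead be sought among the global variables, and the call would fail unless a global \verb+k+ happened to exist.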
In most lexically scoped languages this binding is achieved through the associated environment, but the binding can be achieved in many different ways. An associate editor supplied the following method of creating function closures in S. A function closure can be created in S by taking a base function and extending its formal arguments to include the new bindings. The function should be written so that the body contains free variables that will later have bindings supplied. An implementation of this is given by the function {\tt MC} below. It should be noted that this function relies very heavily on the implementation of functions in S.
\begin{verbatim}
MC <- function(f, env=NULL) {
   env<-as.list(env)
   if(mode(f) != "function")
      stop(paste("not a function:",f))
   if(length(env)>0 && any(names(env) == ""))
      stop(paste("all arguments are not named:", env))
   fargs<-if(length(f)>1) f[1:length(f)-1] else NULL
   fargs<-c(fargs,env)
   if(any(duplicated(names(fargs))))
      stop(paste("duplicated arguments:",
                 paste(names(fargs), collapse=", ")))
   fbody<-f[length(f)]
   cf<-c(fargs,fbody)
   mode(cf)<-"function"
   return(cf)
}
\end{verbatim}

\subsection{Likelihoods}
Consider the Exponential density, $f(x) = (1/\lambda) \exp(-x/\lambda)$, where both $\lambda$ and $x$ must be positive. This describes the probability distribution of an Exponential random variable. Suppose we had observed a sample of size $n$ which we believed to be from this density. The likelihood principle suggests that we use the product of the densities, considered as a function of $\lambda$, as a means of determining those values of $\lambda$ which are plausible in light of the data. The value of $\lambda$ which maximises this function (or any monotonic function of it) is called the maximum likelihood estimate. The log likelihood function for a sample, $(x_1,\ldots,x_n)$, from an Exponential$(\lambda)$ distribution is $l(\lambda)= -n \log(\lambda) - \sum x_i/\lambda$. Here we are thinking of this as a function of $\lambda$ with the data, $x_1,\ldots,x_n$, fixed.

Likelihood functions are commonly used in both research and teaching. It would be convenient to have some means of creating a likelihood function. This means that we want to have some function, which we will call a {\em creator}, that we pass data to and get back a likelihood function. We will call this function the {\em returned function}. This likelihood function would then take as arguments values of the parameter ($\lambda$ in the case above) and return the likelihood at that point for the data that was supplied to the creator. To do so the returned function needs to have access to the values of the data that were passed to the creator. If the programming language has lexical scope there is no problem because the returned function is created inside the creator and hence has access to all variable definitions that were in effect at the time that it was created.

In the following example \rlang{mlfun} is a creator. It sets up several local variables which will be needed by the likelihood function and whose values depend on the data supplied. Then the likelihood function \rlang{rfun} is created and returned. The environment associated with \rlang{rfun} is the environment that was created by the invocation of \rlang{mlfun}, which means that the variables \rlang{n} and \rlang{sumx} will have bindings in that environment.
\begin{verbatim}
mlfun<-function(x) {
   sumx<-sum(x)
   n<-length(x)
   rfun<-function(lambda) {
      if(lambda<0)
         stop("only positive values allowed")
      return( -n * log(lambda) - sumx/lambda )
   }
   return( rfun )
}
\end{verbatim}
Subsequent evaluations of \rlang{mlfun} cause the creation of a new environment with bindings for \rlang{n} and \rlang{sumx} which depend on the arguments supplied to \rlang{mlfun}. This environment does not interfere in any way with any environment created by previous invocations of \rlang{mlfun}.
\begin{verbatim}
R> efun<-mlfun(1:10)   # efun is a function!
R> efun(3)
[1] -29.31946
R> efun2<-mlfun(20:30)
R> efun2(3)
[1] -103.7514
R> efun(3)   # nothing has changed for efun
[1] -29.31946
\end{verbatim}
This example does not work in S. In S \rlang{mlfun} returns a function with the correct body, but when \rlang{efun} is evaluated an error occurs because the frame in which \rlang{n} and \rlang{sumx} were bound evaporated when \rlang{mlfun} returned its value. They are free variables in the returned function and, unless there are global variables with the same names, an error will be signalled. However, the function MC, given above, can be used to provide similar functionality in S.
\begin{verbatim}
## with the definition of mlfun then being
mlfun<-function(x) {
   sumx<-sum(x)
   n<-length(x)
   rfun<-function(lambda) {
      if(lambda<0)
         stop("only positive values allowed")
      return( -n * log(lambda) - sumx/lambda )
   }
   MC(rfun, list(sumx=sumx, n=n))
}

## and finally
S> efun<-mlfun(1:10)
S> efun(3)
[1] -29.31946
\end{verbatim}

\subsection{Probabilities and Related Concepts}
Suppose we want to take a set of data and make an empirical cumulative distribution function from it. One representation of an ecdf is as Pr($X \leq t$). A functional version of the ecdf accepts as input a real number $t$ and returns a value between 0 and 1.
\begin{verbatim}
mkecdf<-function(x) {
   n<-length(x)
   rval<-function(t) return( sum(x<=t)/n )
   return( rval )
}

mkinvecdf<-function(x) {
   n<-length(x)
   sortx<-sort(x)
   rval<-function(p) {
      if( (p<0) || (p>1) )
         stop("invecdf: wrong argument")
      t1<-ceiling(p*n)
      return(sortx[t1])
   }
   return( rval )
}
\end{verbatim}
The free variables in the returned functions are bound to the values they held at the time the returned functions were created. These examples are useful in the teaching of statistics and probability since they function in a manner consistent with our conceptual model and hence allow for easy abstraction to related problems.

In fact these functions and concepts are of more general use. Gentleman and Crowley (1990) examine their use for smoothing data. Consider a scatter plot with a response variable plotted in the $y$ direction and a covariate plotted in the $x$ direction. A {\em scatterplot smoother} is a line added to the plot which represents an estimate of the mean response conditional upon the value of the covariate. Adding a scatterplot smoother enhances our perception of the dependence between the response and the covariate. An example of a smoother is a running mean, which is given by
\[ \hat{m}_x = \frac{1}{k} \sum_{i \in N_x} y_i \]
where $N_x$ is the set of indices of the $k$ nearest neighbours of the point $x$ and the $y_i$ are the corresponding responses. This scatter plot smoother shows how the mean response changes as the value of the covariate changes. Many smoothers may be written as functionals of the estimated conditional distribution function.
For example, the running mean can be written as
\[ \hat{m}_z = \int u \, d\hat{F}_z(u \mid k) \]
where $\hat{F}_z(\cdot \mid k)$ is the empirical distribution function of the responses based on the $k$ nearest neighbours of $z$. This latter representation of the running mean can easily be generalised to use estimates of $F$ other than the empirical distribution function. It also lends itself easily to providing estimates other than a running mean (e.g. running quantiles) by simply changing the integrand.

This formulation is also useful computationally. Take the entire data set, break it down into neighbourhoods and estimate a conditional distribution in each. Return a vector or a list of all the conditional distributions. Now writing a function which takes a conditional distribution function as input and returns the desired quantity yields the appropriate smoother with little additional computational effort.

\section{Mutable State}
One could describe the association of an environment with a function as giving that function state. We saw in Section 3 that this association can be mimicked in many languages through clever programming. Next we will explore the effects of being able to change that local state information, that is, of changing the bindings of the variables which are used to give local state. At this point the situation becomes more complicated and the methods suggested above for achieving local state in S do not adapt well.

We can arrange to have several returned functions share the same state variable. Perhaps the simplest example is the bank account suggested in Abelson, Sussman and Sussman (1985).
\begin{verbatim}
account <- function(total)
   list(
      balance = function() total,
      deposit = function(amount) total <<- total + amount,
      withdraw = function(amount) total <<- total - amount
   )
\end{verbatim}
A functioning bank account needs to have a balance or total, a function for making withdrawals, a function for making deposits and a function for stating the current balance. We achieve this by creating the three functions within {\tt account} and then returning a list containing them. When {\tt account} is invoked it takes a numerical argument {\tt total} and returns a list containing the three functions. Because these functions are defined in an environment which contains {\tt total}, they will have access to its value. Now we can create accounts, make deposits and withdrawals and check on the balances.
\begin{verbatim}
R> Joe.account<-account(1000)
R> Joe.account$withdraw(50)
R> Joe.account$balance()
[1] 950
\end{verbatim}
The method of augmenting the argument list suggested earlier will not work here. The three returned functions must share the local variable {\tt total} in some mutually accessible location. If each of them has {\tt total} as a parameter then when {\tt total} is changed there will need to be some mechanism for finding the associated functions and changing their argument lists. Unfortunately this is simply not practicable. Creating and maintaining lists of such associations would be a horrendous task.

You will have noticed that we introduced a new type of assignment. The reason for this is that assignment is often synonymous with allocation in R (and S). Local variables do not need to be declared and are simply created as needed. If the \verb+<-+ form of assignment is used and the symbol on the left hand side does not exist in the current environment, it is created and associated with the value from the right hand side.
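To see why the ordinary assignment is not enough here, consider a variant of the account in which {\tt deposit} uses \verb+<-+ (this variant is ours, written only to illustrate the point). The assignment creates a new local variable {\tt total} inside {\tt deposit}, and the balance never changes.
\begin{verbatim}
# deposit's assignment with <- creates a local total inside the
# call to deposit; the total in the enclosing environment is
# left untouched
broken.account <- function(total)
   list(
      balance = function() total,
      deposit = function(amount) total <- total + amount
   )

R> a<-broken.account(100)
R> a$deposit(50)
R> a$balance()
[1] 100
\end{verbatim}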
So we must have a new {\em operator} to indicate to the interpreter when we want to assign to a variable in one of the parent environments rather than create a local variable. In R this is the special assignment operator, \rlang{<<-}.

\subsection{Random Number Generators}
Pseudo--random number generation is an important part of statistical computing. Two common uses are simulation and bootstrapping. Pseudo--random number generators require a seed and, for a given seed, they produce a sequence of numbers that, hopefully, has all the properties of a sequence of truly random numbers. The difference between pseudo--random number generators and random number generators is that given the same seed the pseudo--random number generator will reproduce the same sequence. Since we only discuss pseudo--random numbers we will drop the pseudo prefix.

With most random number generators the seed is updated for each random number generated, and in order to get the next number in the sequence you need the last value of the seed. This implies that in order to implement a simulation all routines which handle the random numbers need to have an extra parameter passed to them, the seed. This is often not practical and instead the seed is usually made a global variable. Having the seed be a global variable can have some undesirable effects on the simulation since it makes the seed accessible to every function and hence increases the likelihood of some function inadvertently changing the seed. There is also a chance that the seed could be reset to the original starting value. These sorts of occurrences can and usually do invalidate the results of the simulation and hence should be guarded against. Unfortunately there is seldom any evidence that such an event has occurred and the results are accepted as if they were valid.

These problems can be overcome by the use of lexical scope. Lexical scope allows an instance of the seed to be bound to a particular random number generator and to be inaccessible to any other random number generator. Thus, it cannot be altered inadvertently. The binding mechanism is sufficiently general to allow the user to set the seed, query the seed and generate the next number in the sequence through a collection of returned functions, but the seed is accessible only through these functions.

An example will clarify the situation. First consider the function {\tt make.random}.
\begin{verbatim}
make.random<-function(seed)
   return( list(
      rand=function() {
         seed<<-(9*seed+5)%%1024
         seed
      },
      setseed=function(nseed) seed<<-nseed,
      getseed=function() seed
   ) )
\end{verbatim}
The function \rlang{make.random} has one argument, \rlang{seed}, and the value it returns is a list of functions: one to generate random numbers, {\tt rand}; one to set the seed of the random number generator, {\tt setseed}; and one to get the current value of the seed, {\tt getseed}. When {\tt make.random} is invoked an environment is created and in this environment the symbol {\tt seed} is bound to the value supplied as an argument. Next the list of functions is created. Each of these functions has the environment with a binding for {\tt seed} as its associated environment and hence can access the current value associated with {\tt seed}.
\begin{verbatim}
R> rand<-make.random(1)
R> rand$rand()
[1] 14
R> rand$rand()
[1] 131
R> rand$getseed()
[1] 131
\end{verbatim}
Now \rlang{rand\$rand} is a function and evaluating it produces a sequence of random numbers.
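Because the seed lives only in the generator's associated environment, a routine that consumes random numbers can simply be handed the generator function; it needs neither a seed argument nor a global seed. A minimal sketch of such a routine follows ({\tt simulate.mean} is a hypothetical helper, not part of the generator above).
\begin{verbatim}
# average nrep draws from a generator such as make.random;
# the seed is updated inside the generator's own environment,
# so this function never sees or modifies it
simulate.mean<-function(rgen, nrep) {
   values<-numeric(nrep)
   for(i in 1:nrep)
      values[i]<-rgen()
   mean(values)
}

# usage:  gen<-make.random(1); simulate.mean(gen$rand, 100)
\end{verbatim}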
Again, it is essential that in \rlang{rand\$rand} the special assignment operator \rlang{<<-} is used, since this ensures that rather than creating a new variable called {\tt seed} the variable in the associated environment has a new value bound to it. Several versions of \rlang{rand} may exist simultaneously and, as they will all be the result of separate invocations of \rlang{make.random}, their associated environments will be distinct and each will have its own protected version of \rlang{seed}.
\begin{verbatim}
R> rand1<-make.random(1)
R> rand2<-make.random(101)
R> rand1$rand()
[1] 14
R> rand2$rand()
[1] 914
R> rand1$rand()
[1] 131
\end{verbatim}
We have created two random number generators, with different seeds, that do not interfere with each other. Even with the call to \rlang{rand2} between the two calls to \rlang{rand1} we see that we get the same sequence as above for \rlang{rand1}.

\subsection{An Interesting Aside}
At this point it is worth mentioning another consequence of having an associated environment. Both R and S have what is known as pass--by--value semantics for function arguments. This means that when a value is passed to a function the object itself is not passed but rather a copy of it. This ensures that any changes made to an object inside a function do not affect the object that was passed to the function. Consider the example below. When the function {\tt f} is called, a copy of the value associated with the symbol {\tt x} is passed to it. Thus, when the function changes the value associated with its parameter this has no effect on the global variable {\tt x}.
\begin{verbatim}
R> x<-10
R> f<-function(x) x<-12
R> f(x)
R> x
[1] 10
\end{verbatim}
Had this been a pass--by--reference language then the function {\tt f} would have been given a pointer to the storage area containing the value associated with {\tt x}. Then the assignment, {\tt x<-12}, would have changed the value in the storage location referenced by {\tt x} to 12. On return from the function we would have found that {\tt x} was now bound to the value 12 rather than 10.

Now, we can achieve some pass--by--reference results in a pass--by--value language because of the associated environments. These occur because in most implementations the associated environments are not copied when the function is invoked. Only the function body and formal arguments are copied, and thus changes made to variables in the associated environment during the function call remain in effect after the call returns. Again consider our function {\tt make.random}.
\begin{verbatim}
R> rand<-make.random(1)
R> f<-function(rgen) 10*rgen()/2
R> f(rand$rand)
[1] 70
R> rand$rand()
[1] 131
\end{verbatim}
We created a random number generator and then passed it to a function that evaluated it. The next evaluation of {\tt rand\$rand} gave the second value in the sequence, as reported above. Therefore, passing the generator to {\tt f} and evaluating it inside {\tt f} changed the state of the generator bound to the global variable {\tt rand}. We have achieved pass--by--reference in a pass--by--value language.

\section{Concluding Remarks}
In this paper we have argued the usefulness of lexical scoping and its consequences in several situations that are important to statisticians, the use of function closures being one of the more beneficial. Function closures make it much easier for statisticians to program complicated algorithms without having to delve too deeply into the basic functioning of the language.
In some ways one could argue that this is one of the true strengths of S: you can write programs without having to worry about memory allocation or type--checking of variables. We believe that lexical scope has large advantages and have incorporated it in R.

S has become a popular tool for statistical programming and research. It has many strengths, among these the fact that it is a functional language with functions as first class values. This means that functions can be passed to other functions as arguments and they can be returned as values. We argue that this latter property is very important but has been neglected, mainly because of the scoping rules used in S.

The reader is reminded that whether a particular method can be programmed in a particular language is not the issue here. All computations could be carried out in machine language (they are anyway), but the point of a computer language is to allow us to easily abstract and program the ideas we have. We want a language in which it is easy to do the things that we want or need to do. Users switch between languages to gain ease of programming. We are not suggesting that every user will want to switch to R, or some other lexically scoped language. However, it is useful for them to consider the differences.

Many of the examples were inspired by Abelson, Sussman and Sussman (1985) and by Kamin (1990). They are based on Scheme, a computer language that relies quite heavily on lexical scope and on having functions as first class objects. Abelson, Sussman and Sussman (1985) also provide much richer examples, including the development of an object oriented system through lexical closures.

\vspace{0.1in}
\Large
\noindent Acknowledgements
\vspace{0.1in}
\normalsize

The authors would like to thank an Associate Editor for his or her careful reading of a previous draft of this manuscript and the many helpful suggestions which have led to its improvement. We would also like to thank John Chambers, Mike Meyer and Duncan Murdoch for their helpful comments on a draft of this paper.

\section{References}
\begin{list}{}{\setlength{\leftmargin}{0in}}
\item{} Abelson, H., Sussman, G. J. and Sussman, J. (1985). {\em Structure and Interpretation of Computer Programs}. MIT Press.
\item{} Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). {\em The New S Language: A Programming Environment for Data Analysis and Graphics.} Wadsworth \& Brooks/Cole.
\item{} Gentleman, R. and Crowley, J. (1990). Smoothing censored data. Technical Report 90-13, Department of Statistics and Actuarial Science, University of Waterloo.
\item{} Ihaka, R. and Gentleman, R. (1996). R: A language for data analysis and graphics. {\em Journal of Computational and Graphical Statistics,} to appear.
\item{} Kamin, S. N. (1990). {\em Programming Languages.} Addison Wesley.
\item{} McCarthy, J. (1960). Recursive functions of symbolic expressions and their computation by machine, part I. {\em Communications of the ACM,} {\bf 3}, 185--195.
\item{} Tierney, L. (1990). {\em Lisp--Stat: An Object Oriented Environment for Statistical Computing and Data Analysis.} John Wiley and Sons.
\end{list}
\end{document}