\documentstyle[11pt]{article}
\oddsidemargin 0in
\evensidemargin 0in
%\footheight 1.0in
\textheight 8.5in
\textwidth 6.5in
\newcommand{\rlang}[1]{\texttt{#1}}
\renewcommand{\baselinestretch}{1}
\begin{document}
\title{Lexical Scope and Statistical Computing}
\author{R. Gentleman and R. Ihaka \\ Department of Statistics\\ University of Auckland}
\date{\today}
\maketitle
\begin{abstract}
The nature of statistical computing has changed a great deal in the past 10 years, mainly due to the influence of programming environments such as S (Becker, Chambers, and Wilks, 1988) and Lisp-Stat (Tierney, 1990). Both of these programming environments have languages for performing computations, data storage mechanisms and a graphical interface. They provide an invaluable interactive interface to data analysis. In order to take full advantage of these programming environments statisticians must understand the differences between them. Ihaka and Gentleman (1996) introduced R, a version of S which uses a different scoping regime. In some ways this makes R behave more like Lisp-Stat. In this paper we discuss the concept of scoping rules and show how lexical scope enhances the functionality of a language.
\end{abstract}
\noindent KEY WORDS: statistical computing, function closures, lexical scope, random number generators.

\section{Introduction}
The nature of statistical computing has changed a great deal in the past 10 years, mainly due to the influence of programming environments such as S (Becker, Chambers, and Wilks, 1988) and Lisp-Stat (Tierney, 1990). These environments have made it possible and useful for statisticians to also become programmers, thereby giving them more effective control over the data analytic process. Statisticians should, therefore, benefit from considering programming paradigms and their implications.

In this paper we will discuss scoping rules. These are the rules by which variables, i.e.\ symbols, and values are associated. We will show the effect scoping has on evaluation in computer languages. We will first introduce the rules themselves and then discuss several implications which follow from the use of lexical scope.

Adopting lexical scope enables two major changes in the programming language. It makes it easy to construct function closures and to construct objects with mutable state. While there are other means of achieving these two properties they are, in some sense, natural consequences of lexical scope. If the terms are unfamiliar you can think of a function closure as a function together with a permanent surrounding environment that may affect the way the function evaluates. This surrounding environment gives the function state. When the variable bindings in this surrounding environment can be altered the function closure is said to have mutable state.

To exemplify these ideas and their consequences we use code written in two languages with different scoping rules and compare the results. To this end our examples are coded in both R and S. The syntax and semantics of the two languages are very similar and the reader can easily compare the results. We consider examples such as likelihood functions, probability functions and random number generators, and show how the scoping rules used affect them. In many places we have chosen to use simple examples and simple data sets, not because we think that they are important in themselves, nor because we lack real data or real examples, but rather to make it easy for the reader to check the details.
In Section 2 we define a number of terms and discuss scoping rules. Then, in general terms, we indicate how the scoping rules affect evaluation. In Section 3 we provide examples of the advantages of function closures. Then in Section 4 we comment on the advantages of mutable state. Finally, in Section 5 we draw our conclusions.

\section{Scoping Rules and Evaluation}
Most computer languages have values, symbols (or variables), and functions. Often the same symbol appears in several different contexts within a computer program. For example, the symbol \verb+x+ can be a global variable, a local variable in a function, and a formal parameter to another function simultaneously. An important part of the computational process is associating symbols with values. The rules that are used to do this are called scoping rules. Virtually every computer language has its own set of scoping rules but for the most part they fall into two broad classes, dynamic scope and lexical scope. We further identify two other classes, trivial scope and static scope. We also note that in some languages, and in particular in both S and R, it is possible to alter the scoping rules. In this paper the term {\em scoping rules} always refers to the default scoping rules.

The symbols which occur in the body of a function can be divided into three classes: formal parameters, local variables and free variables. The formal parameters of a function are those occurring in the argument list of the function. Their values are determined by the process of {\em binding} the actual function arguments to the formal parameters. Local variables are those whose values are determined by the evaluation of expressions in the body of the function. Variables which are not formal parameters or local variables are called free variables. Free variables become local variables if they are assigned to. Consider the following function definition.
\begin{verbatim}
f<-function(x) {
   y<-2*x
   print(x)
   print(y)
   print(z)
}
\end{verbatim}
In this function \verb+x+ is a formal parameter, \verb+y+ is a local variable and \rlang{z} is a free variable. The value associated with the free variable is determined by the scoping rules.

Under trivial scope free variables are not allowed. Under dynamic scope the value associated with a free variable is determined by searching back up the sequence of calling functions and using the most recently defined value. Under static scope the values of the free variables are determined by a set of global variables. This is the kind of scoping used in C and S. If the values of the free variables are defined by the bindings that were in effect at the time the function was created then the scoping is called lexical. This is the kind of scoping used in R and Scheme.

While the usual definition of static or lexical scope in computer science is that the variable bindings can be determined from a printed copy of the code, this definition is not specific enough for our purposes. Computer scientists tend not to differentiate between the two because their concerns are different. However, if we consider some commonly used computer languages that satisfy this definition of lexical scope we see that there can be large differences between them. For example, in C you are not allowed to nest function definitions, so only global variables can be bound to free variables. In S nested function definitions are permitted, but S does not resolve free variable bindings within the enclosing function; instead it resolves them globally.
In both R and Scheme the free variable bindings are resolved by first looking in the environment in which the function was created. Since we want to contrast R and S with respect to this difference we have refined the scoping definitions for our purposes.

The following code shows the difference between static scope and lexical scope, as we have defined them. First we define a function called \verb+cube+.
\begin{verbatim}
cube<-function(n){
   sq<-function() n*n
   n*sq()
}
\end{verbatim}
The variable \verb+n+ in the function \verb+sq+ is not an argument to that function. Therefore it is a free variable and the scoping rules must be used to ascertain the value that is to be associated with it. Under static scope the value is that associated with a global variable named \verb+n+. Under lexical scope it is the parameter to the function \verb+cube+, since that is the active binding for the variable \verb+n+ at the time the function \verb+sq+ was defined. The difference between evaluation in R and evaluation in S is that S looks for a global variable called \verb+n+, while R first looks for a variable called \verb+n+ in the environment created when \verb+cube+ was invoked.
\begin{verbatim}
#first evaluation in S
S> cube(2)
Error in sq(): Object "n" not found
Dumped
S> n<-3
S> cube(2)
[1] 18

#then the same function evaluated in R
R> cube(2)
[1] 8
\end{verbatim}

\subsection{Evaluation}
We must first develop some more terminology. An {\em environment} is a collection of symbols together with the values that are to be associated with the symbols. It is not necessary that every symbol have a value associated with it. The value may be obtained as a result of computation or it may be obtained from some other associated environment. Usually there is some means of ordering the environments so that after one is examined there is a next one to be examined. We call the next environment the enclosing environment or parent environment. When an interactive language is invoked the user is put into an environment which we shall call the {\em top--level} environment.

At any point during a computation a single environment is {\em active}. The resolution of symbols to values takes place in this environment. When the evaluator requires a value for a particular symbol that it has encountered, it first examines the active environment. It then searches the enclosing environment and so on up to the top--level environment. At this point a special mechanism is used to locate built--in symbols. Specific languages have some differences but for the most part this is a reasonable representation of how they associate values and symbols.

Next we consider functions. Functions can be thought of as having three parts: the formal parameters, the body of the function and an environment. The formal parameters are the names to which the arguments supplied by the user will be bound. The body is the sequence of statements that are to be evaluated when the function is invoked. The associated environment is an environment that is typically retained throughout the life of the function. Under lexical scope, whenever a function is created the associated environment is set to the currently active environment. This ensures that there is always access to all variable definitions that were in effect at the time the function was created. Under static scope the associated environment is set to the top--level environment. The process of function evaluation is similar across programming languages.
When a function is invoked a new environment, the evaluation environment, is created and in this environment the formal arguments (symbols) are bound to the actual arguments (values). The parent of this new environment is set to be the environment associated with the function. Bindings for free variables are resolved by looking in the new environment, then the parent of this environment, then its parent and so on until the top--level environment is encountered. Consider the functions defined below.
\begin{verbatim}
fun1<-function(x) x+y

fun2<-function() {
   y<-20
   return(function(x) x+y)
}
\end{verbatim}
In a lexically scoped language we get the following results.
\begin{verbatim}
R> fun3<-fun2()
R> #now look at fun1 and fun3
R> fun1
function (x) x + y
R> fun3
function (x) x + y
R> y<-2
R> fun1(5)
[1] 7
R> fun3(5)
[1] 25
\end{verbatim}
Since \verb+fun1+ was created at the top--level, the top--level environment is its associated environment. For \verb+fun3+ the situation is more complicated. When \verb+fun2+ was invoked a new environment was created; it is in this environment that \rlang{fun3} was created, and hence this environment is its associated environment. So, for {\tt fun3}, {\tt y} is bound to 20. In S both \rlang{fun1} and \rlang{fun3} have the top--level environment as their associated environment and hence both evaluate to 7. Notice that in R functions that have environments other than the top--level environment associated with them have that environment explicitly printed.

What appears to be peculiar behaviour may result when two functions have the same body and formal parameters but different associated environments. If there is a free variable in the function body with different bindings in the associated environments then the functions may evaluate differently. In a lexically scoped language you need to be aware of the associated environments. In statically scoped languages this situation arises when a free variable is bound to a global variable that has been altered between function invocations.

There are other implications of lexical scope and environments. The environments for function evaluation must be maintained, that is, they must be persistent objects. If a function returns a function then the returned function must have access to the environment it was created in. Under static scope environments can be transient, that is, they can disappear once the function has returned a value; we can be sure that the environment will never be needed again. This implies that if we did not allow functions to be returned as values in a lexically scoped language we would not require environments to be persistent. Having persistent environments raises many other issues, some of which are addressed in Ihaka and Gentleman (1996). In particular it implies that functions cannot easily be stored as files. The function and the whole of the associated environment must be stored together. This does not seem formidable, but we must also store any other function that has access to that environment. Determining which functions have access to an environment is not easy. The implementation usually ensures that functions know which environments they need rather than having environments keep track of the functions that might change them.

\section{Function Closures}
In this section we will give some examples which indicate the usefulness of function closures. Function closures are function bodies bound together with values for the free variables.
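A minimal illustration may help to fix ideas (this small example is ours and does not appear in the sections that follow): a creator function returns a function that remembers the increment it was created with.
\begin{verbatim}
# make.adder returns a function; under lexical scope the free
# variable k in the returned function is bound to the value that
# was in effect when the returned function was created
make.adder<-function(k) function(x) x + k

R> add2<-make.adder(2)
R> add2(10)
[1] 12
\end{verbatim}
Under static scope the free variable \verb+k+ would instead be sought among the global variables, and the call would fail unless a global \verb+k+ happened to exist.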
In most lexically scoped languages this binding is achieved through the associated environment, but the binding can be achieved in many different ways. An associate editor supplied the following method of creating function closures in S. A function closure can be created in S by taking a base function and extending its formal arguments to include the new bindings. The function should be written so that the body contains free variables that will later have bindings supplied. An implementation of this is given by the function {\tt MC} below. It should be noted that this function relies very heavily on the implementation of functions in S.
\begin{verbatim}
MC <- function(f, env=NULL) {
   env<-as.list(env)
   if(mode(f) != "function")
      stop(paste("not a function:",f))
   if(length(env)>0 && any(names(env) == ""))
      stop(paste("all arguments are not named:", env))
   fargs<-if(length(f)>1) f[1:length(f)-1] else NULL
   fargs<-c(fargs,env)
   if(any(duplicated(names(fargs))))
      stop(paste("duplicated arguments:",
                 paste(names(fargs), collapse=", ")))
   fbody<-f[length(f)]
   cf<-c(fargs,fbody)
   mode(cf)<-"function"
   return(cf)
}
\end{verbatim}

\subsection{Likelihoods}
Consider the Exponential density, $f(x) = (1/\lambda) \exp(-x/\lambda)$, where both $\lambda$ and $x$ must be positive. This describes the probability distribution of an Exponential random variable. Suppose we had observed a sample of size $n$ which we believed to be from this density. The likelihood principle suggests that we use the product of the densities, considered as a function of $\lambda$, as a means of determining those values of $\lambda$ which are plausible in light of the data. The value of $\lambda$ which maximises this function (or any monotonic function of it) is called the maximum likelihood estimate. The log likelihood function for a sample, $(x_1,\ldots,x_n)$, from an Exponential$(\lambda)$ distribution is $l(\lambda)= -n \log(\lambda) - \sum x_i/\lambda$. Here we are thinking of this as a function of $\lambda$ with the data, $x_1,\ldots,x_n$, fixed.

Likelihood functions are commonly used in both research and teaching. It would be convenient to have some means of creating a likelihood function. This means that we want to have some function, which we will call a {\em creator}, that we pass data to and get back a likelihood function. We will call this function the {\em returned function}. This likelihood function would then take as arguments values of the parameter ($\lambda$ in the case above) and return the likelihood at that point for the data that was supplied to the creator. To do so the returned function needs to have access to the values of the data that were passed to the creator. If the programming language has lexical scope there is no problem because the returned function is created inside the creator and hence has access to all variable definitions that were in effect at the time that it was created.

In the following example \rlang{mlfun} is a creator. It sets up several local variables which will be needed by the likelihood function and whose values depend on the data supplied. Then the likelihood function \rlang{rfun} is created and returned. The environment associated with \rlang{rfun} is the environment that was created by the invocation of \rlang{mlfun}, which means that the variables \rlang{n} and \rlang{sumx} will have bindings in that environment.
\begin{verbatim}
mlfun<-function(x) {
   sumx<-sum(x)
   n<-length(x)
   rfun<-function(lambda) {
      if(lambda<0)
         stop("only positive values allowed")
      return( -n * log(lambda) - sumx/lambda )
   }
   return( rfun )
}
\end{verbatim}
Subsequent evaluations of \rlang{mlfun} cause the creation of a new environment with bindings for \rlang{n} and \rlang{sumx} which depend on the arguments supplied to \rlang{mlfun}. This environment does not interfere in any way with any environment created by previous invocations of \rlang{mlfun}.
\begin{verbatim}
R> efun<-mlfun(1:10)   # efun is a function!
R> efun(3)
[1] -29.31946
R> efun2<-mlfun(20:30)
R> efun2(3)
[1] -103.7514
R> efun(3)   # nothing has changed for efun
[1] -29.31946
\end{verbatim}
This example does not work in S. In S \rlang{mlfun} returns a function with the correct body, but when \rlang{efun} is evaluated an error occurs because the frame in which \rlang{n} and \rlang{sumx} were bound evaporated when \rlang{mlfun} returned its value. They are free variables in the returned function and, unless there are global variables with the same names, an error will be signalled. However, the function MC, given above, can be used to provide similar functionality in S.
\begin{verbatim}
## with the definition of mlfun then being
mlfun<-function(x) {
   sumx<-sum(x)
   n<-length(x)
   rfun<-function(lambda) {
      if(lambda<0)
         stop("only positive values allowed")
      return( -n * log(lambda) - sumx/lambda )
   }
   MC(rfun, list(sumx=sumx, n=n))
}

## and finally
S> efun<-mlfun(1:10)
S> efun(3)
[1] -29.31946
\end{verbatim}

\subsection{Probabilities and Related Concepts}
Suppose we want to take a set of data and make an empirical cumulative distribution function from it. One representation of an ecdf is as Pr($X \leq t$). A functional version of the ecdf accepts as input a real number $t$ and returns a value between 0 and 1.
\begin{verbatim}
mkecdf<-function(x) {
   n<-length(x)
   rval<-function(t) return( sum(x<=t)/n )
   return( rval )
}

mkinvecdf<-function(x) {
   n<-length(x)
   sortx<-sort(x)
   rval<-function(p) {
      if( (p<0) || (p>1) )
         stop("invecdf: wrong argument")
      t1<-ceiling(p*n)
      return(sortx[t1])
   }
   return( rval )
}
\end{verbatim}
The free variables in the returned functions are bound to the values they held at the time the returned functions were created. These examples are useful in the teaching of statistics and probability since they function in a manner consistent with our conceptual model and hence allow for easy abstraction to related problems.

In fact these functions and concepts are of more general use. Gentleman and Crowley (1990) examine their use for smoothing data. Consider a scatter plot with a response variable plotted in the $y$ direction and a covariate plotted in the $x$ direction. A {\em scatterplot smoother} is a line added to the plot which represents an estimate of the mean response conditional upon the value of the covariate. Adding a scatterplot smoother enhances our perception of the dependence between the response and the covariate. An example of a smoother is a running mean, which is given by
\[ \hat{m}_x = \frac{1}{k} \sum_{i \in N_x} y_i \]
where $N_x$ is the set of indices of the $k$ nearest neighbours of the point $x$ and the $y_i$ are the corresponding responses. This scatter plot smoother shows how the mean response changes as the value of the covariate changes. Many smoothers may be written as functionals of the estimated conditional distribution function.
For example, the running mean can be written as
\[ \hat{m}_z = \int u \, d\hat{F}_z(u \mid k) \]
where $\hat{F}_z(\cdot \mid k)$ is the empirical distribution function of the responses based on the $k$ nearest neighbours of $z$. This latter representation of the running mean can easily be generalised to use estimates of $F$ other than the empirical distribution function. It also lends itself easily to providing estimates other than a running mean (e.g. running quantiles) by simply changing the integrand.

This formulation is also useful computationally. Take the entire data set, break it down into neighbourhoods and estimate a conditional distribution in each. Return a vector or a list of all the conditional distributions. Now writing a function which takes a conditional distribution function as input and returns the desired quantity yields the appropriate smoother with little additional computational effort.

\section{Mutable State}
One could describe the association of an environment with a function as giving that function state. We saw in Section 3 that this association can be mimicked in many languages through clever programming. Next we will explore the effects of being able to change that local state information, that is, of changing the bindings of the variables which are used to give local state. At this point the situation becomes more complicated and the methods suggested above for achieving local state in S do not adapt well.

We can arrange to have several returned functions share the same state variable. Perhaps the simplest example is the bank account suggested in Abelson, Sussman and Sussman (1985).
\begin{verbatim}
account <- function(total)
   list(
      balance = function() total,
      deposit = function(amount) total <<- total + amount,
      withdraw = function(amount) total <<- total - amount
   )
\end{verbatim}
A functioning bank account needs to have a balance or total, a function for making withdrawals, a function for making deposits and a function for stating the current balance. We achieve this by creating the three functions within {\tt account} and then returning a list containing them. When {\tt account} is invoked it takes a numerical argument {\tt total} and returns a list containing the three functions. Because these functions are defined in an environment which contains {\tt total}, they will have access to its value. Now we can create accounts, make deposits and withdrawals and check on the balances.
\begin{verbatim}
R> Joe.account<-account(1000)
R> Joe.account$withdraw(50)
R> Joe.account$balance()
[1] 950
\end{verbatim}
The method of augmenting the argument list suggested earlier will not work here. The three returned functions must share the local variable {\tt total} in some mutually accessible location. If each of them has {\tt total} as a parameter then when {\tt total} is changed there will need to be some mechanism for finding the associated functions and changing their argument lists. Unfortunately this is simply not practicable. Creating and maintaining lists of such associations would be a horrendous task.

You will have noticed that we introduced a new type of assignment. The reason for this is that assignment is often synonymous with allocation in R (and S). Local variables do not need to be declared and are simply created as needed. If the \verb+<-+ form of assignment is used and the symbol on the left hand side does not exist in the current environment, it is created and associated with the value from the right hand side.
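To see why the ordinary assignment is not enough here, consider a variant of the account in which {\tt deposit} uses \verb+<-+ (this variant is ours, written only to illustrate the point). The assignment creates a new local variable {\tt total} inside {\tt deposit}, and the balance never changes.
\begin{verbatim}
# deposit's assignment with <- creates a local total inside the
# call to deposit; the total in the enclosing environment is
# left untouched
broken.account <- function(total)
   list(
      balance = function() total,
      deposit = function(amount) total <- total + amount
   )

R> a<-broken.account(100)
R> a$deposit(50)
R> a$balance()
[1] 100
\end{verbatim}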
So we must have a new {\em operator} to indicate to the interpreter when we want to assign to a variable in one of the parent environments rather than create a local variable. In R this is the special assignment operator, \rlang{<<-}.

\subsection{Random Number Generators}
Pseudo--random number generation is an important part of statistical computing. Two common uses are simulation and bootstrapping. Pseudo--random number generators require a seed and, for a given seed, they produce a sequence of numbers that, hopefully, has all the properties of a sequence of truly random numbers. The difference between pseudo--random number generators and random number generators is that given the same seed the pseudo--random number generator will reproduce the same sequence. Since we only discuss pseudo--random numbers we will drop the pseudo prefix.

With most random number generators the seed is updated for each random number generated, and in order to get the next number in the sequence you need the last value of the seed. This implies that in order to implement a simulation all routines which handle the random numbers need to have an extra parameter passed to them, the seed. This is often not practical and instead the seed is usually made a global variable. Having the seed be a global variable can have some undesirable effects on the simulation since it makes the seed accessible to every function and hence increases the likelihood of some function inadvertently changing the seed. There is also a chance that the seed could be reset to the original starting value. These sorts of occurrences can and usually do invalidate the results of the simulation and hence should be guarded against. Unfortunately there is seldom any evidence that such an event has occurred and the results are accepted as if they were valid.

These problems can be overcome by the use of lexical scope. Lexical scope allows an instance of the seed to be bound to a particular random number generator and to be inaccessible to any other random number generator. Thus, it cannot be altered inadvertently. The binding mechanism is sufficiently general to allow the user to set the seed, query the seed and generate the next number in the sequence through a collection of returned functions, but the seed is accessible only through these functions.

An example will clarify the situation. First consider the function {\tt make.random}.
\begin{verbatim}
make.random<-function(seed)
   return( list(
      rand=function() {
         seed<<-(9*seed+5)%%1024
         seed
      },
      setseed=function(nseed) seed<<-nseed,
      getseed=function() seed
   ) )
\end{verbatim}
The function \rlang{make.random} has one argument, \rlang{seed}, and the value it returns is a list of functions: one to generate random numbers, {\tt rand}; one to set the seed of the random number generator, {\tt setseed}; and one to get the current value of the seed, {\tt getseed}. When {\tt make.random} is invoked an environment is created and in this environment the symbol {\tt seed} is bound to the value supplied as an argument. Next the list of functions is created. Each of these functions has the environment with a binding for {\tt seed} as its associated environment and hence can access the current value associated with {\tt seed}.
\begin{verbatim}
R> rand<-make.random(1)
R> rand$rand()
[1] 14
R> rand$rand()
[1] 131
R> rand$getseed()
[1] 131
\end{verbatim}
Now \rlang{rand\$rand} is a function and evaluating it produces a sequence of random numbers.
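Because the seed lives only in the generator's associated environment, a routine that consumes random numbers can simply be handed the generator function; it needs neither a seed argument nor a global seed. A minimal sketch of such a routine follows ({\tt simulate.mean} is a hypothetical helper, not part of the generator above).
\begin{verbatim}
# average nrep draws from a generator such as make.random;
# the seed is updated inside the generator's own environment,
# so this function never sees or modifies it
simulate.mean<-function(rgen, nrep) {
   values<-numeric(nrep)
   for(i in 1:nrep)
      values[i]<-rgen()
   mean(values)
}

# usage:  gen<-make.random(1); simulate.mean(gen$rand, 100)
\end{verbatim}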
Again, it is essential that in \rlang{rand\$rand} the special assignment operator \rlang{<<-} is used, since this ensures that rather than creating a new variable called {\tt seed} the variable in the associated environment has a new value bound to it. Several versions of \rlang{rand} may exist simultaneously and, as they will all be the result of separate invocations of \rlang{make.random}, their associated environments will be distinct and each will have its own protected version of \rlang{seed}.
\begin{verbatim}
R> rand1<-make.random(1)
R> rand2<-make.random(101)
R> rand1$rand()
[1] 14
R> rand2$rand()
[1] 914
R> rand1$rand()
[1] 131
\end{verbatim}
We have created two random number generators, with different seeds, that do not interfere with each other. Even with the call to \rlang{rand2} between the two calls to \rlang{rand1} we see that we get the same sequence as above for \rlang{rand1}.

\subsection{An Interesting Aside}
At this point it is worth mentioning another consequence of having an associated environment. Both R and S have what is known as pass--by--value semantics for function arguments. This means that when a value is passed to a function the object itself is not passed but rather a copy of it. This ensures that any changes made to an object inside a function do not affect the object that was passed to the function. Consider the example below. When the function {\tt f} is called, a copy of the value associated with the symbol {\tt x} is passed to it. Thus, when the function changes the value associated with its parameter this has no effect on the global variable {\tt x}.
\begin{verbatim}
R> x<-10
R> f<-function(x) x<-12
R> f(x)
R> x
[1] 10
\end{verbatim}
Had this been a pass--by--reference language then the function {\tt f} would have been given a pointer to the storage area containing the value associated with {\tt x}. Then the assignment, {\tt x<-12}, would have changed the value in the storage location referenced by {\tt x} to 12. On return from the function we would have found that {\tt x} was now bound to the value 12 rather than 10.

Now, we can achieve some pass--by--reference results in a pass--by--value language because of the associated environments. These occur because in most implementations the associated environments are not copied when the function is invoked. Only the function body and formal arguments are copied, and thus changes made to variables in the associated environment during the function call remain in effect after the call returns. Again consider our function {\tt make.random}.
\begin{verbatim}
R> rand<-make.random(1)
R> f<-function(rgen) 10*rgen()/2
R> f(rand$rand)
[1] 70
R> rand$rand()
[1] 131
\end{verbatim}
We created a random number generator and then passed it to a function that evaluated it. The next evaluation of {\tt rand\$rand} gave the second value in the sequence, as reported above. Therefore, passing the generator to {\tt f} and evaluating it inside {\tt f} changed the state of the generator bound to the global variable {\tt rand}. We have achieved pass--by--reference in a pass--by--value language.

\section{Concluding Remarks}
In this paper we have argued the usefulness of lexical scoping and its consequences in several situations that are important to statisticians, the use of function closures being one of the more beneficial. Function closures make it much easier for statisticians to program complicated algorithms without having to delve too deeply into the basic functioning of the language.
In some ways one could argue that this is one of the true strengths of S: you can write programs without having to worry about memory allocation or type--checking of variables. We believe that lexical scope has large advantages and have incorporated it in R.

S has become a popular tool for statistical programming and research. It has many strengths, among these the fact that it is a functional language with functions as first class values. This means that functions can be passed to other functions as arguments and they can be returned as values. We argue that this latter property is very important but has been neglected, mainly because of the scoping rules used in S.

The reader is reminded that whether a particular method can be programmed in a particular language is not the issue here. All computations could be carried out in machine language (they are anyway), but the point of a computer language is to allow us to easily abstract and program the ideas we have. We want a language in which it is easy to do the things that we want or need to do. Users switch between languages to gain ease of programming. We are not suggesting that every user will want to switch to R, or some other lexically scoped language. However, it is useful for them to consider the differences.

Many of the examples were inspired by Abelson, Sussman and Sussman (1985) and by Kamin (1990). They are based on Scheme, a computer language that relies quite heavily on lexical scope and on having functions as first class objects. Abelson, Sussman and Sussman (1985) also provide much richer examples, including the development of an object oriented system through lexical closures.

\vspace{0.1in}
\Large
\noindent Acknowledgements
\vspace{0.1in}
\normalsize

The authors would like to thank an Associate Editor for his or her careful reading of a previous draft of this manuscript and the many helpful suggestions which have led to its improvement. We would also like to thank John Chambers, Mike Meyer and Duncan Murdoch for their helpful comments on a draft of this paper.

\section{References}
\begin{list}{}{\setlength{\leftmargin}{0in}}
\item{} Abelson, H., Sussman, G. J. and Sussman, J. (1985). {\em Structure and Interpretation of Computer Programs}. MIT Press.
\item{} Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). {\em The New S Language: A Programming Environment for Data Analysis and Graphics.} Wadsworth \& Brooks/Cole.
\item{} Gentleman, R. and Crowley, J. (1990). Smoothing censored data. Technical Report 90-13, Department of Statistics and Actuarial Science, University of Waterloo.
\item{} Ihaka, R. and Gentleman, R. (1996). R: A language for data analysis and graphics. {\em Journal of Computational and Graphical Statistics,} to appear.
\item{} Kamin, S. N. (1990). {\em Programming Languages.} Addison Wesley.
\item{} McCarthy, J. (1960). Recursive functions of symbolic expressions and their computation by machine, part I. {\em Communications of the ACM,} {\bf 3}, 185--195.
\item{} Tierney, L. (1990). {\em Lisp--Stat: An Object Oriented Environment for Statistical Computing and Data Analysis.} John Wiley and Sons.
\end{list}
\end{document}