Special Features of SML/NJ

Extra top-level structures

Standard ML of New Jersey provides the complete SML'97 Basis Library, and in addition provides several other library modules.

The SMLofNJ structure contains substructures that provide:

  • access to compiler internals,
  • weak pointers,
  • lazy suspensions,
  • first-class continuations,
  • interval timers, and
  • information about the underlying system.

Other top-level structures provide:

  • unsafe operations (such as unchecked array access),
  • general operating system signal values,
  • unix-specific signal values,
  • control of automatic compiler-generated polling,
  • hooks for cleanup operations at ML exit and restart, and
  • interface to runtime-system functions.

SML/NJ also provides a Compiler structure that controls the ML compiler itself and gives access to internal phases of the compiler. There are substructures for:

  • execution profiling,
  • user-customizable pretty printing,
  • control of compiler error-message printing, and
  • control of warning messages.

Other Compiler substructures provide access to the SML/NJ "visible compiler", including environments, syntax trees (Ast), abstract syntax (Absyn), parsing, and other basic compilation operations. These "visible compiler" structures are not yet documented, other than by the source code for their signatures.

    Vector expressions and patterns

    Vectors are homogeneous, immutable arrays (see the Vector structure). Vectors are a standard feature of SML'97, but SML/NJ also has special syntax for vector expressions and vector patterns. In SML'97, vectors can be created only by calling functions from the Vector structure, and cannot be pattern-matched.

    The vector expression

      #[exp_0, exp_1, ..., exp_(n-1)]
    
    (where n >= 0) creates a vector of length n whose elements are the values of the corresponding subexpressions. As with other aggregate expressions, the element expressions are evaluated from left to right. Vectors may be pattern-matched by vector patterns of the form
      #[pat_0, pat_1, ..., pat_(n-1)]
    
    Such a pattern will only match a vector value of the same length.

    Vector expressions and vector patterns are more compact and efficient than lists, and are comparable in cost to records.
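
As a minimal sketch of the syntax described above (the names v, same, and describe are illustrative and not part of SML/NJ), a vector expression and a function defined by vector patterns might look like this:

```sml
(* Illustrative names only: v, same and describe are hypothetical. *)

(* A vector expression; the elements are evaluated left to right. *)
val v : int vector = #[10, 20, 30]

(* The same value built with the Basis Library, as plain SML'97 requires. *)
val same = (v = Vector.fromList [10, 20, 30])

(* Each vector pattern matches only vectors of exactly that length. *)
fun describe (#[] : int vector) = "empty"
  | describe #[x] = "singleton " ^ Int.toString x
  | describe #[x, y] = "pair " ^ Int.toString (x + y)
  | describe w = "length " ^ Int.toString (Vector.length w)
```

The final clause uses an ordinary variable pattern, since no single vector pattern can match vectors of every length.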

    Or-patterns

    SML/NJ has also extended the syntax of patterns to allow "or-patterns". The basic syntax is:
      (apat_1 | ... | apat_n)
    
    where each apat_i is an atomic pattern. A further restriction is that every alternative must bind the same variables, and each variable must have the same type in every alternative. A simple example is:
      fun f ("y"|"yes") = true
        | f _ = false;
    
    which has the same meaning as:
      fun f "y" = true
        | f "yes" = true
        | f _ = false;
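
The example above binds no variables. As a further sketch (the function name nonZeroPart is hypothetical), the following or-pattern binds the variable x in both alternatives, in each case at type int, as the restriction requires:

```sml
(* Hypothetical example: both alternatives are atomic (tuple) patterns,
   and both bind x : int. Alternatives are tried from left to right. *)
fun nonZeroPart ((x, 0) | (0, x)) = SOME x
  | nonZeroPart _ = NONE
```

An alternative that bound a different variable, or bound x at a different type, would be rejected by the compiler.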
    

    First-class continuations

    A set of primitives has been added to ML to give access to continuations:
      structure SMLofNJ.Cont : sig
        type 'a cont
        val callcc : ('a cont -> 'a) -> 'a
        val throw : 'a cont -> 'a -> 'b
        . . .
      end
    
    The continuation of an expression is an abstraction of what the system will do with the value of the expression.

    The use of callcc is described in the documentation for the SMLofNJ.Cont structure.
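
As a small sketch of how callcc and throw fit together (the function product and the abbreviation C are illustrative, not part of the library), a continuation can serve as an early exit from a recursive computation:

```sml
(* Illustrative only: C abbreviates SMLofNJ.Cont; product is hypothetical. *)
structure C = SMLofNJ.Cont

(* Multiply the elements of a list. If a 0 is found, throw 0 to the
   continuation k captured at the call to callcc, abandoning the
   multiplications that are still pending. *)
fun product (xs : int list) : int =
  C.callcc (fn k =>
    let
      fun loop [] = 1
        | loop (0 :: _) = C.throw k 0
        | loop (x :: rest) = x * loop rest
    in
      loop xs
    end)
```

Here k has type int cont, and the result type 'b of throw is instantiated to int so that the middle clause of loop agrees with the others.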

    Quote/Antiquote

    An early use of ML was as a MetaLanguage for manipulating terms in an object language. Edinburgh LCF/ML had features to parse one particular object language (called OL). Standard ML of New Jersey has support for arbitrary object languages, with user-supplied object-language parsing.

    Higher-order Modules

    The module system of Standard ML has supported first-order parametric modules in the form of functors. But there are occasions when one would like to parameterize over functors as well as structures, which requires a truly higher-order module system (see, for instance, the powerset functor example). SML/NJ now provides a higher-order extension of the module system.

    Parameterization over functors can be provided in a straightforward way by allowing functors to be components of structures. Syntactically this can be accomplished merely by allowing functor declarations inside of structure bodies, and by providing syntax for functor specifications in signatures. Functor specifications were already part of the module syntax of the Definition of Standard ML (Figure 8, p. 14), so we have implemented that syntax and added it to the spec class (Figure 7, p. 13). In addition, it is convenient to have a way of declaring functor signatures and some syntactic sugar for curried functor definitions and partial application of curried functors, so these have also been provided. This extension is an "upward-compatible" enrichment of the language that should break no existing programs.

    Functors as structure components. In the extended language, a signature can contain a functor specification:

      signature SIG =
      sig
        type t
        val a : t
        functor F(X: sig type s
                         val b: s
                     end) : sig val x : t * X.s end
      end
    
    To match such a signature, a structure is allowed to contain a functor declaration:
      structure S : SIG =
      struct
        type t = int
        val a = 3
        functor F(X: sig type s val b: s end) = 
          struct val x = (a,X.b) end
      end
    
    This makes it possible to define higher-order functors by including a functor as a component of a parameter structure or of a result structure. The case of a functor parameter is illustrated by the following example.
      signature MONOID = 
      sig
        type t
        val plus: t*t -> t
        val e: t
      end;
    
      (* functor signature declaration *)
      funsig PROD (structure M: MONOID
                   structure N: MONOID) = MONOID
    
      functor Square(structure X: MONOID
                     functor Prod: PROD): MONOID =
          Prod(structure M = X
               structure N = X);
    
    
    Note that this example involves the definition of a functor signature PROD. Currently functor signature declarations take one of the following forms:
      funsig funid (strid: sigexp) = sigexp
      funsig funid (specs) = sigexp
    
    This syntax is viewed as provisional and subject to change.

    A common use of functors returning functors in their result is to approximate a curried functor with multiple parameters. Here is how one might define a curried monoid product functor:

      functor CurriedProd (M: MONOID) =
      struct
        functor Prod1 (N: MONOID) : MONOID =
          struct
            type t = M.t * N.t
            val e = (M.e, N.e)
            fun plus((m1,n1),(m2,n2))=(M.plus(m1,m2),N.plus(n1,n2))
          end;
      end
    
    This works, but the partial application of this functor is rather awkward because it requires the explicit creation of an intermediate structure:
      structure IntMonoid =
      struct
        type t = int
        val e = 0
        val plus = (op +): int*int -> int
      end;
    
      structure Temp = CurriedProd(IntMonoid);
    
      functor ProdInt = Temp.Prod1;
    
    To simplify the use of this sort of functor, some derived forms provide syntactic sugar for curried functor definition and partial application. Thus the above example can be written:
      functor CurriedProd (M: MONOID) (N: MONOID) : MONOID =
      struct
        type t = M.t * N.t
        val e = (M.e, N.e)
        fun plus((m1,n1),(m2,n2))=(M.plus(m1,m2),N.plus(n1,n2))
      end;
    
      functor ProdInt = CurriedProd(IntMonoid);
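
To see the curried form in use, the following self-contained sketch repeats the declarations from this section and then applies the functor both partially and fully. The structure names IntPair and IntPair2 and the value sum are illustrative additions; the sketch assumes that the transparent result signature MONOID leaves the representation of t visible to clients.

```sml
signature MONOID =
sig
  type t
  val plus : t * t -> t
  val e : t
end

structure IntMonoid =
struct
  type t = int
  val e = 0
  val plus = (op +) : int * int -> int
end

(* Curried functor definition, as in the text. *)
functor CurriedProd (M : MONOID) (N : MONOID) : MONOID =
struct
  type t = M.t * N.t
  val e = (M.e, N.e)
  fun plus ((m1, n1), (m2, n2)) = (M.plus (m1, m2), N.plus (n1, n2))
end

(* Partial application yields a functor ... *)
functor ProdInt = CurriedProd (IntMonoid)

(* ... which can then be applied to the remaining argument. *)
structure IntPair = ProdInt (IntMonoid)

(* Full application in one step (hypothetical name IntPair2). *)
structure IntPair2 = CurriedProd (IntMonoid) (IntMonoid)

(* Componentwise addition of two pairs of integers. *)
val sum = IntPair.plus ((1, 2), (3, 4))
```

No intermediate structure such as Temp has to be named by the programmer in either style of application.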
    
    The syntax for curried forms of functor signature and functor declarations and for the corresponding partial applications can be summarized as follows:
      funsig funsigid (par1) ... (parn) = sigexp
    
      functor funid (par1) ... (parn) = strexp
    
      functor funid1 = funid2 (arg1) ... (argn)
    
      structure strid = funid (arg1) ... (argn)
    
    where
      par ::= id : sigexp | specs
      arg ::= strexp | dec
    
    In the case of a partial application defining a functor, it is assumed that the funid2 on the right-hand side takes more than n arguments, while in the case of the structure declaration funid should take exactly n arguments. As a degenerate case where n=0 we have identity functor declarations:
      functor funid1 = funid2
    
    There is also a "let" form of functor expression:
      fctexp ::= let dec in fctexp end
    
    which can only be used in functor definitions of the form:
      functor funid = let dec in fctexp end
    

    The curried functor declaration

      functor F (par1) ... (parn) = strexp
    
    is a derived form that is translated into the following declaration
      functor F (par1) =
      struct
        functor %fct% (par2) ... (parn) = strexp
      end
    
    and the declarations
      structure S = F (arg1) ... (argn)
      functor G = F (arg1) ... (argn)
    
    are derived forms expanding into (respectively):
      local
        structure %hidden% = F (arg1)
      in
        structure S = %hidden%.%fct% (arg2) ... (argn)
      end
    
    and
      local
        structure %hidden% = F (arg1) ... (argn)
      in
        functor G = %hidden%.%fct%
      end
    
    Currently there is no checking that a complete set of arguments is supplied when a curried functor is applied to define a structure, as illustrated by the following example:
      functor Foo (X: sig type s end) (Y: sig type t end) =
      struct type u = X.s * Y.t end
    
      structure A = struct type s = int end
    
      structure S = Foo (A)  (* Foo A yields a (useless) structure *)
    
      functor G = Foo (A)    (* Foo A yields a functor *)
    
    Of course, the structure S defined in this way is useless, since we cannot use the pseudo-identifier %fct% to select its functor component. Arity checking to prevent this sort of error will be added in a future release.


    Send your comments to sml-nj@research.bell-labs.com
    Copyright © 1998, Lucent Technologies; Bell Laboratories.