You are currently viewing a snapshot of www.mozilla.org taken on April 21, 2008. Most of this content is highly out of date (some pages haven't been updated since the project began in 1998) and exists for historical purposes only. If there are any pages on this archive site that you think should be added back to www.mozilla.org, please file a bug.



Simple Packages in JavaScript

Norris Boyd



Why "simple" packages?

There are a large number of proposals on the table for the next version of the ECMA standard. Among these are ideas for packages, classes, and static types. Using these features should allow JavaScript programmers to write larger programs in a way that is robust and maintainable.

However, the question arises: Is it possible to define a simple packages feature that will provide a large amount of utility to JavaScript programmers in the short term? I believe so. This document describes a small set of additional features for JavaScript that provide for many of the modularity benefits that we expect from the eventual complete packages design.

Defining a package

A package is defined using a PackageDeclaration at the start, followed by normal JavaScript source (with a few restrictions detailed below). The PackageDeclaration has the following syntax:

PackageDeclaration ::= package Identifier ; The Identifier is the package name. The trailing semicolon is not optional; that way, syntax extensions can be added more easily.

The syntax of function definitions would be broadened to include a public attribute that will be used to indicate which functions may be accessed by importers of the package.

So a simple, complete package definition might look like

package pkg;
public function increment(x) { return x+1; }
This defines a package named pkg and a single public function named increment.

Packages can only contain function and variable declarations (and no variable initialization). This heavy restriction allows us to dodge the potential for problems of timing the execution of global code.

Using a package

To use a package, just begin the script with an import declaration. The syntax of the import declaration is

ImportDeclaration ::= import DottedList from URI;

DottedList ::= Identifier

| DottedList . Identifier

Again, the semicolon is not optional to allow for additional syntax to be added later. URI is a quoted string containing a Uniform Resource Identifier. We may also want to allow special unquoted identifiers in place of URI, like system, that would indicate that the package is to be loaded from a special standard place that the embedding knows about. We could do the File and Internationalization packages using this mechanism. (E.g., "classpath:com.netscape.examples.File".)

When an import declaration is encountered, the engine first checks to see if it has already loaded the package. If so, it can just import the names from that package. Otherwise, it locates the source of the package, creates a new object to use as the top-level object for the package, and executes the source of the package. Since the package is restricted from having top-level code, executing the source will merely initialize the package's top-level object with all the top-level functions and variables. It also creates a list of all the public functions in the package.

If DottedList ends with the package name, then all public names are imported into the top-level object of the importing script. Otherwise, DottedList must end with a package name followed by the name of a public function in the package, and just that name is imported. In either case, if any name being imported is already used for a function or variable, an error is reported. When the name is imported, a read-only, permanent property of the importer's top-level object is created and initialized with a closure of the associated function object in the package's top-level object. Issue: is the closure really needed?

When an imported function is called, the closure in turn calls the function object in the package's top-level object. The scope chain thus consists of the activation object of the function followed by the top-level object of the package. This means that any global objects other than public functions will be visible to functions inside the package but cannot be seen by functions outside the package unless returned from function calls.

Standard objects

ECMA defines several standard objects and functions (Function, isNaN, Date.prototype, etc.) that are properties of the top-level object or are reachable from the top-level. The requirement of separate namespaces for packages in turn leads to the requirement for several "global" objects in the ECMA sense, which are now called "top-level" objects since they are now no longer global. (This situation has existed forever in the browser embedding, where each window object is a top-level object.)

However, now that we have language constructs that result in several top-level objects, we now need to rationalize the behavior of these standard objects so that they can be shared across multiple top-level objects. The only other option is to have multiple copies of the global objects for each top-level scope. But then we run into the series of problems that Mike McCabe was unable to work around in his implementation of errors as exceptions in order to make instanceofand equality work as expected. Also, if scripts begin to make heavy use of the package mechanism, then we'll have a large number of duplicate copies of the built-in objects consuming the memory resources of the system. In fact, by defining packages as a global repository of code and state we fix the problem that plagues people trying to write JavaScript applications with multiple pages: where to put the shared functionality and state across multiple windows.

I believe, therefore, that we need to define the standard objects as being sealed with all properties ReadOnly. By "sealed" I mean that all properties are DontDelete and it is not possible to add additional properties. Would this impose a burden on existing scripts? We could come up with some way to impose this restriction selectively through versioning or by detecting the presence of import or package declarations. However, I believe this approach could be problematic (what if some scripts on the page are "old-style" and others are "new-style": we can't substitute out the global objects). We could also make the change more globally. I've already implemented this restriction on a private copy of 4.5 and haven't been able to find any sites that depend on this behavior.

Example session

The following sessions were run against a version of our Java version of our JavaScript engine (which we can't publish yet, unfortunately) in which I've made changes for simple packages. The implementation is not complete; in particular, error handling is often not implemented.

I've implemented a single set of standard objects that are shared across all packages. These standard objects are sealed and have readonly properties.

E:\src\ns\js\rhino> cat pkg.js
package pkg;

var count = 0;

public function incr() { return g(); }

function g() { return count++; }

public function getDate() { return new Date(); }
This JavaScript file defines a package pkg with two public functions, incr and getDate.
E:\src\ns\js\rhino> rhino
js import pkg.incr from "file:///E:/src/ns/js/rhino/pkg.js";
js incr()
0
js incr()
1
js typeof getDate
undefined
js typeof g
undefined
js quit()
Here we import a single function from the package and call it. Note that none of the other functions have been imported.
E:\src\ns\js\rhino> rhino
js import pkg from "file:///E:/src/ns/js/rhino/pkg.js";
js incr()
0
js incr()
1
js typeof getDate
function
js typeof g
undefined
js var d = getDate()
js d
Tue Jan 26 15:05:26 GMT-0800 (PST) 1999
js d instanceof Date
true
js quit()
Now we import the entire package. Both incr and getDate are available, but g is not. Note that since we're sharing standard objects, the instanceof check functions as you would expect.
E:\src\ns\js\rhino> jsc pkg.js

E:\src\ns\js\rhino> rhino
js import pkg from "classpath:pkg";
js incr()
0
js incr()
1
js typeof g
undefined
js typeof getDate
function
js getDate() instanceof Date
true
js quit()


Now we compile the package into a Java class and import the class. Here's an example of a non-URI form of the from clause. There's nothing special about the generated Java class except that it implements the Script interface, so it would also be possible to write a Java class and make it appear as a JavaScript package. I would expect that XPConnect objects could be loaded in a similar way.

E:\src\ns\js\rhino> rhino
js String.x = 87
js: Cannot add a property to a sealed object.
js String.prototype.x = 879
js: Cannot add a property to a sealed object.
js String.prototype.substring.x = 8374
js: Cannot add a property to a sealed object.
js String.prototype = 7
7
js String.prototype

js String.prototype.substring = 88
88
js String.prototype.substring
function substring() {
[native code]
}

js quit()
Here we show that the standard objects are sealed and have readonly properties.

Possible extensions

Several possible extensions come to mind to augment the simple scheme outlined above.

Versioning

It would be easy to add support for versioning by allowing multiple version names to be defined and then importing only the names corresponding to the appropriate version. Additional lookups could be performed as they are today, and then when classes are available, the special versioned member lookup could be added. Dynamic properties could continue to be looked up without regard to version. Otherwise, you'd have to specify a version corresponding to the dynamic property when you created it, which would mean that you'd have to declare the dynamic property in some way, which would mean that it is no longer dynamic.

Top-level renaming

I initially implemented top-level renaming so that name conflicts could be resolved at import time. However, Waldemar Howat pointed out that top-level renaming makes top-level scripts less like classes, so I've disabled the feature. Currently there's no way to resolve name conflicts. We could also add some object that works like LiveConnect's Packages object so that it is possible to walk down into the exported properties of a package. However, Waldemar has pointed out that he's considering some possibly different syntax that is an analog to the :: operator in C++ to perform this sort of operation.

Top-level scripts

Currently top-level scripts of packages are executed when the import declaration is processed. However, I've made no attempt to deal with possible cycles in a package loading graph and, as stated above, think the simplest thing would be to disallow top-level scripts. One possible extension would be to define top-level scripts as executing when the first function is called, but this doesn't appear to have a good migration path to classes.

Acknowlegements

Brendan Eich's original design of import/export in JavaScript 1.2 had the idea of names imported from multiple global objects.
Waldemar Horwat's ideas on language futures are the inspiration for versioning and the syntax, and has given good feedback on what not to do.