JavaScript 2.0 Member Lookup

There has been much discussion in the TC39 subgroup about the meaning of a member lookup operation. Numerous considerations intersect here.

We will express a general unqualified member lookup operation as a.b, where a is an expression and b is an identifier. We will also consider qualified member lookup operations and write them as a.n::b, where n is an expression that evaluates to some namespace. In almost all cases we will be interested in the dynamic type Td of a. In one scheme we will also consider the static type Ts of the expression a. If the language is sound, we will always have Td ≤ Ts; that is, Td is either Ts itself or a subtype of Ts.

In the simplest approach, we treat an object as merely an association table of member names and member values. In this interpretation we simply look inside object a and check if there is a member named b. If there is, we return the member's value; if not, we return undefined or signal an error.
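
As a minimal sketch of this association-table view, the lookup can be written in a few lines. The helper name `simpleLookup` and its `strict` option are hypothetical and only illustrate the two outcomes mentioned above (returning undefined or signaling an error).

```javascript
// Hypothetical helper: treats an object as a plain table of member names
// and member values, with no classes and no access control.
function simpleLookup(table, name, { strict = false } = {}) {
  if (table.has(name)) {
    return table.get(name);
  }
  if (strict) {
    // The "signal an error" alternative.
    throw new ReferenceError(`no member named ${name}`);
  }
  // The "return undefined" alternative.
  return undefined;
}

const a = new Map([["b", 42]]);
console.log(simpleLookup(a, "b")); // 42
console.log(simpleLookup(a, "c")); // undefined
```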

There are a number of difficulties with this simple approach, and most object-oriented languages have not adopted it:

Once we allow private or package-protected members, we must allow for the possibility that object a will have more than one member named b -- abstraction considerations require that users of a class C not be aware of C's private members, so, in particular, a user should be able to create a subclass D of C and add members to D without knowing the names of C's private members. Both C++ and Java allow this. We must also allow for the possibility that object a will have a member named b but we are not allowed to access it. We will assume that access control is specified by lexical scoping, as is traditional in modern languages.
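
The situation can be illustrated by analogy with the private fields of present-day ECMAScript. This is not the mechanism under discussion here, only a runnable illustration of one object carrying two distinct members with the same name, each reachable only from the lexical scope of the class that declared it. The class and method names are made up for the example.

```javascript
class C {
  #b = "C's private b";
  describeC() {
    return this.#b; // resolved lexically: this is C's #b
  }
}

// The author of D does not need to know that C already has a private b.
class D extends C {
  #b = "D's private b";
  describeD() {
    return this.#b; // resolved lexically: this is D's #b
  }
}

const a = new D();
console.log(a.describeC()); // C's private b
console.log(a.describeD()); // D's private b
```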

Desirable Criteria

Lookup Models

There are three main competing models for performing a general unqualified member lookup operation as a.b. Let S be the set of members named b of the object obtained by evaluating expression a (hereafter shortened to just "object a") that are accessible via the visibility rules applied in the lexical scope where a.b is evaluated. All three models pick some member s ∈ S. Clearly, if the set S is empty, then the member lookup fails. In addition, the Spice and pure Static models may sometimes deliberately fail even when set S is not empty. Except for such deliberate failures, if the set S contains only one member s, all three models return that element s. If the set S contains multiple members, the three models will likely choose different members.
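
The construction of S can be sketched as a filter over an object's members. The record layout (`name`, `cls`, `pkg`, `visibility`) and the `site` argument describing where a.b is evaluated are assumptions made for illustration; the accessibility rules are a simplified reading of private, package, and public visibility.

```javascript
// Hypothetical visibility check: `site` describes the lexical scope in
// which a.b is evaluated (its enclosing class and package).
function isAccessible(member, site) {
  switch (member.visibility) {
    case "public":
      return true;
    case "package":
      return member.pkg === site.pkg;
    case "private":
      return member.cls === site.cls;
    default:
      return false;
  }
}

// S: the members of the object named `name` that are accessible at `site`.
function accessibleMembers(objectMembers, name, site) {
  return objectMembers.filter((m) => m.name === name && isAccessible(m, site));
}

// Members of an instance of D, where D (package P2) extends C (package P1).
const objectA = [
  { name: "b", cls: "C", pkg: "P1", visibility: "private" },
  { name: "b", cls: "D", pkg: "P2", visibility: "public" },
  { name: "x", cls: "D", pkg: "P2", visibility: "package" },
];

console.log(accessibleMembers(objectA, "b", { cls: "C", pkg: "P1" }).length); // 2
console.log(accessibleMembers(objectA, "b", { cls: "E", pkg: "P3" }).length); // 1
console.log(accessibleMembers(objectA, "x", { cls: "E", pkg: "P3" }).length); // 0
```

Only the first case, where S has two elements, requires one of the three models to choose; in the last case S is empty and the lookup fails.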

Another interesting (and useful) tidbit is that the Static and Dynamic models always agree on the interpretation of member lookup operations of the form this.b. All three models agree on the interpretation of member lookup operations of the form this.b in the case where b is a member defined in the current class.

A note about overriding: When a subclass D overrides a member m of its superclass C, then the definition of the member m is conceptually replaced in all instances of D. However, the three models are only concerned with the topmost class in which member m is declared. All three models handle overriding the way one would expect of an object-oriented language. They differ in the cases where class C has a member named m, subclass D of C has a member with the same name m, but D's m does not override C's m because C's m is not visible inside D (it's not well known, but such non-overriding does and must happen in C++ and Java as well).

Static Model

In the Static model we look at the static type Ts of expression a. Let S1 be the subset of S whose class is either Ts or one of Ts's ancestors. We pick the member in S1 with the most derived class.
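
A minimal sketch of the pure static rule follows, assuming S has already been computed and that each member records the class that declares it. The `superclassOf` table and the function names are hypothetical.

```javascript
// Hypothetical class hierarchy: D extends C, C extends Object.
const superclassOf = { D: "C", C: "Object", Object: null };

// Returns the class followed by its ancestors, most derived first.
function ancestorsInclusive(cls) {
  const chain = [];
  for (let c = cls; c != null; c = superclassOf[c]) {
    chain.push(c);
  }
  return chain;
}

// Pure static model: restrict S to members declared in Ts or an ancestor
// of Ts, then pick the one declared in the most derived class.
function staticLookup(S, staticType) {
  const chain = ancestorsInclusive(staticType);
  const S1 = S.filter((m) => chain.includes(m.cls));
  if (S1.length === 0) {
    return null; // the pure static lookup fails
  }
  return S1.reduce((best, m) =>
    chain.indexOf(m.cls) < chain.indexOf(best.cls) ? m : best
  );
}

// Both members named b are accessible at the lookup site, and D's b does
// not override C's b.
const S = [
  { name: "b", cls: "C" },
  { name: "b", cls: "D" },
];

console.log(staticLookup(S, "D").cls); // D
console.log(staticLookup(S, "C").cls); // C
console.log(staticLookup(S, "Object")); // null
```

The same object yields a different member depending only on the declared type of the expression, and with a static type that declares no b the pure model fails outright.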

The pure static model above is implemented by Java and C++. It would not work well in that form in JavaScript because many, if not most, expressions have type Any. Because type Any has no members, users would have to cast expression a to a given type T before they could access members of type T. Because of this we must extend the static model to handle the case where the subset S1 is empty, or, in other words, the static lookup fails. (Rather than doing this, we could extend the static model in the case where the static type Ts is some special type, but then we would have to decide which types are special and which ones are not. Any is clearly special. What about Object? What about Array? It's hard to draw the line consistently.)

Whichever way we extend the static model, we must also decide which member to choose in the extended cases. We could back off to the dynamic model, we could choose the most derived member in S, or perhaps we could choose some other approach.

Safety	Good within the pure static model. Problems in the extended static model (a subclass could silently shadow a member) that could perhaps be addressed by warnings.
Abstraction	Good.
Robustness	Very bad. Updating the return type of a function or the type of a global variable silently changes the meaning of all code that uses that function or global variable; in a large project such a change would be quite difficult. Difficult to correctly split expressions into subexpressions.
Namespace independence	Good.
Compatibility	Bad within the pure static model (type casts needed everywhere). May be good in the extended static model, depending on the choice of how we extend it.
Other	This model may be difficult to compile well because the compiler may have difficulty in determining the intermediate types in compound expressions. Languages based on the static model have traditionally been compiled off-line, and such compilers tend to be difficult to write for on-line compilation without requiring the programmer to predeclare all of his data structures (if there are any forward-referenced ones, then the compiler doesn't know whether they should have a type or not). The streaming execution model may actually help because it defers compilation until more information is known.

Spice Model

In the Spice model we think of each member m defined in a class C as though it were a function definition for a (possibly overloaded) function whose first argument has type C. Definitions in an inner lexical scope shadow definitions in outer scopes. The Spice model does not consider the static type Ts of expression a.

Let L be the innermost lexical scope enclosing the member lookup expression a.b such that some member named b is defined in L. Let Lb be the set of all members named b defined in lexical scope L, and let S1 = S ∩ Lb (the intersection of S and Lb). If S1 is empty, we fail. If S1 contains exactly one member s, we use s. If S1 contains several members, we fail (this would only happen for import conflicts).
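
The rule above can be sketched as follows. The representation is an assumption made for illustration: each lexical scope is an object with a `defines` list of member records, the scope chain is ordered innermost first, and members are compared by identity.

```javascript
// Spice model sketch: find the innermost scope that defines any member
// named `name`, then intersect its definitions with S.
function spiceLookup(S, name, scopeChain) {
  const L = scopeChain.find((scope) =>
    scope.defines.some((m) => m.name === name)
  );
  if (!L) {
    return null; // no enclosing scope defines a member with this name
  }
  const Lb = L.defines.filter((m) => m.name === name);
  const S1 = S.filter((m) => Lb.includes(m));
  // Fail on an empty intersection or on an import conflict.
  return S1.length === 1 ? S1[0] : null;
}

// C's member b is defined at package level; the unrelated class E
// happens to define its own member b.
const cB = { name: "b", cls: "C" };
const eB = { name: "b", cls: "E" };
const packageScope = { defines: [cB] };
const classEScope = { defines: [eB] };

// Object a is an instance of C, so S contains only C's b.
const S = [cB];

console.log(spiceLookup(S, "b", [packageScope]) === cB); // true
console.log(spiceLookup(S, "b", [classEScope, packageScope])); // null
```

The second call fails even though S is not empty: E's definition of b shadows the outer one, and E's b is not a member of object a.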

Safety	Good.
Abstraction	Good.
Robustness	Poor. Renaming a `package`-visible member may break code outside the class that defines that member even if that code does not access that member. Converting a member from `private` to one of the other two visibilities also can introduce conflicts in other, unrelated classes in the same package that just happen to have an unrelated member with the same name. Fortunately these conflicts usually (but not always) result in errors rather than silent changes to the meaning of the program, so one can often find them by exhaustively testing the program after making a change.
Namespace independence	Bad. Members with the same name in unrelated classes often conflict.
Compatibility	Poor? Many existing programs rely on namespace independence and would have to be restructured.
Other	Most object-oriented programmers would be confused by a violation of namespace independence. Programming without this assumption requires a different point of view than most programmers are used to. (I am not talking about Lisp and Self programmers, who are familiar with that way of thinking.)

Dynamic Model

In the Dynamic model we pick the member s in S defined in the innermost lexical scope L enclosing the member lookup expression a.b. We fail if the innermost such lexical scope L contains more than one member in S (this would only happen for import conflicts).
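
A minimal sketch of this rule, under the same kind of assumptions as before: scopes are hypothetical objects with a `defines` list, the scope chain is ordered innermost first, and members are compared by identity.

```javascript
// Dynamic model sketch: walk outward through the enclosing scopes and
// stop at the first one that defines a member belonging to S.
function dynamicLookup(S, scopeChain) {
  for (const scope of scopeChain) {
    const candidates = S.filter((m) => scope.defines.includes(m));
    if (candidates.length === 1) {
      return candidates[0];
    }
    if (candidates.length > 1) {
      return null; // import conflict in the innermost matching scope
    }
  }
  return null; // S is empty or none of its members is defined in scope
}

// C's member b is defined at package level; the unrelated class E
// happens to define its own member b.
const cB = { name: "b", cls: "C" };
const eB = { name: "b", cls: "E" };
const packageScope = { defines: [cB] };
const classEScope = { defines: [eB] };

// Object a is an instance of C, so S contains only C's b.
const S = [cB];

console.log(dynamicLookup(S, [packageScope]) === cB); // true
console.log(dynamicLookup(S, [classEScope, packageScope]) === cB); // true
```

Because only members of S are considered, E's unrelated b does not interfere with a lookup performed inside E.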

Safety	Good at the language level, but see "other" below.
Abstraction	Good.
Robustness	Good. All of these changes are easy to do.
Namespace independence	Good.
Compatibility	Good.
Other	Packages using the dynamic model may be vulnerable to hijacking (coerced into doing something other than what the author intended) by a determined intruder. It is possible for a compiler to detect such vulnerabilities and warn about them.

Namespaces

The various models make it possible to get into situations where either there is no way to access a visible member of an object or it is not safe to do so (see member hijacking). In these cases we'd like to be able to explicitly choose one of several potential members with the same name. The :: namespace syntax allows this. The left operand of :: is an expression that evaluates to a package or class; we may also allow special keywords such as public, package, or private instead of an expression here, or omit the expression altogether. The right operand of :: is a name. The result is the name qualified by the namespace.

As we have seen, the name b in a member access expression a.b does not necessarily refer to a unique accessible member of object a. In a qualified member access expression a.n::b, the namespace n narrows the set of members considered, although it's possible that the set may still contain more than one member, in which case the lookup model again disambiguates. Let S be the set of members named b of object a that are accessible. The following table shows how a.n::b subsets set S depending on n:

n	Subset
None	Only the ad-hoc member named `b`, if any exists
A class `C`	The fixed member of `C` named `b`, if it exists; if not, try `C`'s superclass instead, and so on up the chain
A package `P`	The subset of `S` containing all accessible members of `P`
`private`	The fixed member named `b` of the current class
`package`	The subset of `S` containing all accessible members that have `package` visibility
`public`	The subset of `S` containing all accessible members that have `public` visibility
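
The table can be read as a function from S and n to a subset of S. The sketch below is one possible reading, not a definitive one: the record fields, the `kind` tags, and the `ctx` argument are hypothetical, and treating ad-hoc members as carrying public visibility is an assumption that the table itself does not settle.

```javascript
// Narrow S according to the namespace n. `ctx` supplies the class
// hierarchy and the class in which the lookup expression appears.
function qualify(S, n, ctx) {
  const fixed = S.filter((m) => !m.adHoc);
  switch (n.kind) {
    case "none": // ::b
      return S.filter((m) => m.adHoc);
    case "class": // C::b, walking up the superclass chain
      for (let c = n.cls; c != null; c = ctx.superclassOf[c]) {
        const hit = fixed.filter((m) => m.cls === c);
        if (hit.length > 0) return hit;
      }
      return [];
    case "packageValue": // P::b, where P evaluates to a package
      return fixed.filter((m) => m.pkg === n.pkg);
    case "private": // private::b
      return fixed.filter((m) => m.cls === ctx.currentClass);
    case "package": // package::b
      return S.filter((m) => m.visibility === "package");
    case "public": // public::b (assumption: includes ad-hoc members)
      return S.filter((m) => m.visibility === "public");
    default:
      throw new TypeError(`unknown namespace kind: ${n.kind}`);
  }
}

const cPkg = { name: "b", cls: "C", pkg: "P1", visibility: "package", adHoc: false };
const dPub = { name: "b", cls: "D", pkg: "P2", visibility: "public", adHoc: false };
const adHocB = { name: "b", cls: null, pkg: null, visibility: "public", adHoc: true };
const S = [cPkg, dPub, adHocB];
const ctx = { superclassOf: { F: "D", D: "C", C: null }, currentClass: "C" };

console.log(qualify(S, { kind: "none" }, ctx).length); // 1
console.log(qualify(S, { kind: "class", cls: "F" }, ctx)[0].cls); // D
console.log(qualify(S, { kind: "private" }, ctx)[0].cls); // C
console.log(qualify(S, { kind: "public" }, ctx).length); // 2
```

When the result still contains more than one member, as in the last call, the lookup model in force would disambiguate.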

The :: operator serves a different role from the . operator. The :: operator produces a qualified name, while the . operator produces a value. A qualified name can be used as the right operand of .; a value cannot. If a qualified name is used in a place where a value is expected, the qualified name is looked up using the lexical scoping rules to obtain the value (most likely a global variable).

Ad-Hoc Members

All of the models above address only access to fixed members of a class. JavaScript also allows one to dynamically add members to individual instances of a class. For simplicity we do not provide access control or versioning on these ad-hoc members -- all of them are public and open to everyone. Because of the safety criterion, a member lookup of a private or package-protected member must choose the private or package-protected member even if there is an ad-hoc member of the same name. To satisfy the robustness criterion, we should treat public members as similarly as possible to private or package-protected members, so we always give preference to a fixed member when there is an ad-hoc member of the same name.
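
The preference for fixed members can be sketched with two tables per object. The function name and the two-`Map` representation are assumptions for illustration, and `accessibleFixed` is taken to hold only the fixed members already chosen by whichever lookup model is in force.

```javascript
// Prefer an accessible fixed member; fall back to an ad-hoc member of the
// same name only when no fixed member was found.
function lookupWithAdHoc(accessibleFixed, adHoc, name) {
  if (accessibleFixed.has(name)) {
    return accessibleFixed.get(name);
  }
  if (adHoc.has(name)) {
    return adHoc.get(name);
  }
  return undefined;
}

const accessibleFixed = new Map([["b", "fixed b"]]);
const adHoc = new Map([
  ["b", "ad-hoc b"], // shadowed by the fixed member
  ["c", "ad-hoc c"],
]);

console.log(lookupWithAdHoc(accessibleFixed, adHoc, "b")); // fixed b
console.log(lookupWithAdHoc(accessibleFixed, adHoc, "c")); // ad-hoc c
```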

To access an ad-hoc member that is shadowed by a fixed member, we can either prefix the member's name with :: or use an indirect member access.

Indirect Member Access

How should we define the behavior of the expression a[b] (assuming the [] operator is not overridden by a's class)? There are a couple of possibilities:

In general it seems like it would be a bad idea to extend the syntax of the string "s" to allow :: operators inside the string. Such strings are too easily forged to play the role of pointers to members.
