ECMAScript 4 Netscape Proposal
Rationale
Member Lookup

Friday, September 20, 2002

This page is somewhat out of date.

Introduction

There has been much discussion in the TC39 subgroup about the meaning of a member lookup operation. Numerous considerations intersect here.

We will express a general unqualified member lookup operation as a.b, where a is an expression and b is an identifier. We will also consider qualified member lookup operations and write them as a.n::b, where n is an expression that evaluates to some namespace. In almost all cases we will be interested in the dynamic type Td of a. In one scheme we will also consider the static type Ts of the expression a. If the language is sound, we will always have Td ⊆ Ts; that is, the dynamic type is always the static type or one of its subtypes.

In the simplest approach, we treat an object as merely an association table of member names and member values. In this interpretation we simply look inside object a and check if there is a member named b. If there is, we return the member’s value; if not, we return undefined or signal an error.
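As a minimal sketch of this association-table interpretation, the following Python models an object as a plain dictionary. The names simple_lookup and UNDEFINED are hypothetical and only stand in for the behavior described above.

```python
UNDEFINED = object()  # hypothetical stand-in for ECMAScript's undefined


def simple_lookup(obj: dict, name: str):
    """Look up `name` in an object modelled as a name -> value table."""
    return obj.get(name, UNDEFINED)


point = {"x": 3, "y": 4}
found = simple_lookup(point, "x")     # the member's value
missing = simple_lookup(point, "z")   # no member named "z"
```

Note that a table keyed only by name can hold at most one member per name, which is exactly the limitation discussed next.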

There are a number of difficulties with this simple approach, and most object-oriented languages have not adopted it:

Once we allow private or internal members, we must allow for the possibility that object a will have more than one member named b. Abstraction considerations require that users of a class C not be aware of C’s private members, so, in particular, a user should be able to create a subclass D of C and add members to D without knowing the names of C’s private members. Both C++ and Java allow this.

We must also allow for the possibility that object a will have a member named b but we are not allowed to access it. We will assume that access control is specified by lexical scoping, as is traditional in modern languages.

Desirable Criteria

Some of the criteria we would like the member lookup model to satisfy are:

  1. Safety. The lookup does not permit access to a private member outside the class where the member is defined, nor does it allow access to an internal member outside the package where the member is defined. Furthermore, if a class C accesses its private member m, a hostile subclass D of C cannot silently substitute a member m' that would masquerade as m inside C’s code.
  2. Abstraction. private and internal members are invisible outside their respective classes or packages. For programming in the large, a class can provide several public versions to its importers, and public members of more recent versions are invisible to importers of older versions. This is needed to provide robust libraries.
  3. Robustness. We can make any of the following program changes without having to restructure the program:
    1. Add valid type annotations to variables and functions.
    2. Change a member’s visibility to private, internal, or public, assuming, of course, that that member is not used outside its new visibility.
    3. Split a complicated expression statement e into several statements that compute subexpressions of e, store them in local variables, and then combine them to compute e. We should be able to do this without intimate knowledge of what e does or calls.
    4. Rename a member to a different name, assuming, of course, that the new name does not cause conflicts and that we fix up all references to that member.
  4. Namespace independence. If one class C has a member named m, this should not place restrictions on an unrelated class D having an unrelated member with the same name m.
  5. Compatibility. An ECMAScript 4 class should be usable both from ECMAScript 3 code and from ECMAScript 3 code minimally upgraded to ECMAScript 4, without having to restructure the latter code. Achieving compatibility should not require the ECMAScript 4 class itself to be restructured or to give up any of the other desirable criteria. Code without type annotations should work as expected.

Lookup Models

There are three main competing models for performing a general unqualified member lookup operation a.b. Let S be the set of members named b of the object obtained by evaluating expression a (hereafter shortened to just "object a") that are accessible via the namespace rules applied in the lexical scope where a.b is evaluated. All three models pick some member s ∈ S. Clearly, if the set S is empty, then the member lookup fails. In addition, the Spice and pure Static models may sometimes deliberately fail even when set S is not empty. Except for such deliberate failures, if the set S contains only one member s, all three models return that element s. If the set S contains multiple members, the three models will likely choose different members.
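The construction of S can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the namespace rules are reduced to the three visibilities private, internal, and public, versioning is ignored, and the names Member, Scope, accessible, and candidate_set are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    owner: str        # class that declares the member
    package: str      # package of the declaring class
    visibility: str   # "private", "internal" or "public"


@dataclass(frozen=True)
class Scope:
    cls: str          # class enclosing the expression a.b ("" if none)
    package: str      # package enclosing the expression a.b


def accessible(m: Member, scope: Scope) -> bool:
    """Simplified access rule based only on the lexical scope."""
    if m.visibility == "private":
        return m.owner == scope.cls
    if m.visibility == "internal":
        return m.package == scope.package
    return True  # public


def candidate_set(members, name: str, scope: Scope) -> set:
    """Build S: members of the object named `name` that the scope may access."""
    return {m for m in members if m.name == name and accessible(m, scope)}


# An instance of D (a subclass of C) carries two members named "m".
c_m = Member("m", owner="C", package="P", visibility="private")
d_m = Member("m", owner="D", package="Q", visibility="public")
instance_of_d = [c_m, d_m]

inside_c = candidate_set(instance_of_d, "m", Scope(cls="C", package="P"))
outside = candidate_set(instance_of_d, "m", Scope(cls="", package="R"))
```

Inside class C the set S has two elements, so the choice of lookup model matters; outside C only D’s public member is visible and all three models agree.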

Another interesting (and useful) tidbit is that the Static and Dynamic models always agree on the interpretation of member lookup operations of the form this.b. All three models agree on the interpretation of member lookup operations of the form this.b in the case where b is a member defined in the current class.

A note about overriding: When a subclass D overrides a member m of its superclass C, then the definition of the member m is conceptually replaced in all instances of D. However, the three models are only concerned with the topmost class in which member m is declared. All three models handle overriding the way one would expect of an object-oriented language. They differ in the cases where class C has a member named m, subclass D of C has a member with the same name m, but D’s m does not override C’s m because C’s m is not visible inside D (it’s not well known, but such non-overriding does and must happen in C++ and Java as well).
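The non-overriding case can be observed in a running program. The sketch below uses Python purely as an analogy: Python’s name mangling of double-underscore attributes is not the ECMAScript 4 mechanism, but it produces the same situation in which C and D each have their own member with the same source name.

```python
class C:
    def __m(self):           # stored as _C__m, invisible to subclasses by name
        return "C.m"

    def call_m(self):
        return self.__m()    # inside C this always means _C__m


class D(C):
    def __m(self):           # stored as _D__m; does not override C's member
        return "D.m"

    def call_own_m(self):
        return self.__m()    # inside D this means _D__m


d = D()
from_c = d.call_m()
from_d = d.call_own_m()
```

The single object d carries two members written m in the source, and which one is selected depends on the class in which the lookup expression appears.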

Static Model

In the Static model we look at the static type Ts of expression a. Let S1 be the subset of S whose class is either Ts or one of Ts’s ancestors. We pick the member in S1 with the most derived class.
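A minimal Python sketch of the pure Static selection rule follows. The class hierarchy (D extends C), the PARENT table, and the names Member, chain, and static_lookup are assumptions made for illustration; S is taken as already computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    owner: str  # class that declares the member


# Hypothetical hierarchy: D extends C; Any has no members and no superclass.
PARENT = {"D": "C", "C": None, "Any": None}


def chain(cls):
    """Return cls followed by its ancestors, most derived first."""
    out = []
    while cls is not None:
        out.append(cls)
        cls = PARENT.get(cls)
    return out


def static_lookup(S, static_type):
    """Pick the most derived member of S declared in Ts or an ancestor of Ts."""
    ts_chain = chain(static_type)
    S1 = [m for m in S if m.owner in ts_chain]
    if not S1:
        raise LookupError("pure static lookup fails: S1 is empty")
    return min(S1, key=lambda m: ts_chain.index(m.owner))


c_m, d_m = Member("m", "C"), Member("m", "D")
S = {c_m, d_m}  # both accessible, for example inside class C
```

With this S, static_lookup(S, "C") selects C’s member while static_lookup(S, "D") selects D’s member, so changing only a type annotation changes which member is used. A static type of Any leaves S1 empty and the pure lookup fails.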

The pure static model above is implemented by Java and C++. It would not work well in that form in ECMAScript because many, if not most, expressions have type Any. Because type Any has no members, users would have to cast expression a to a given type T before they could access members of type T. Because of this we must extend the static model to handle the case where the subset S1 is empty, or, in other words, the static lookup fails. (Rather than doing this, we could extend the static model in the case where the static type Ts is some special type, but then we would have to decide which types are special and which ones are not. Any is clearly special. What about Object? What about Array? It’s hard to draw the line consistently.)

Whichever way we extend the static model, we also have a choice of which member we choose. We could back off to the dynamic model, we could choose the most derived member in S, or perhaps we could choose some other approach.

Constraints:

Safety                  Good within the pure static model. Problems in the extended static model (a subclass could silently shadow a member) that could perhaps be addressed by warnings.
Abstraction             Good.
Robustness              Very bad. Updating the return type of a function or the type of a global variable silently changes the meaning of all code that uses that function or global variable; in a large project such a change would be quite difficult. It is also difficult to correctly split expressions into subexpressions.
Namespace independence  Good.
Compatibility           Bad within the pure static model (type casts needed everywhere). May be good in the extended static model, depending on the choice of how we extend it.
Other                   This model may be difficult to compile well because the compiler may have difficulty in determining the intermediate types in compound expressions. Languages based on the static model have traditionally been compiled off-line, and such compilers tend to be difficult to write for on-line compilation without requiring the programmer to predeclare all data structures (if there are any forward-referenced ones, then the compiler doesn’t know whether they should have a type or not). A more dynamic execution model may actually help because it defers compilation until more information is known.

Spice Model

In the Spice model we think of each member m defined in a class C as though it were a function definition for a (possibly overloaded) function whose first argument has type C. Definitions in an inner lexical scope shadow definitions in outer scopes. The Spice model does not consider the static type Ts of expression a.

Let L be the innermost lexical scope enclosing the member lookup expression a.b such that some member named b is defined in L. Let Lb be the set of all members named b defined in lexical scope L, and let S1 = S ∩ Lb (the intersection of S and Lb). If S1 is empty, we fail. If S1 contains exactly one member s, we use s. If S1 contains several members, we fail (this would only happen for import conflicts).
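The rule above can be sketched in Python as follows. Lexical scopes are modelled as sets of members ordered from innermost to outermost; how members are assigned to scopes, and the names Member and spice_lookup, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    owner: str  # class that declares the member


def spice_lookup(S, scopes, name):
    """scopes: sets of members, ordered from innermost to outermost."""
    for scope in scopes:
        Lb = {m for m in scope if m.name == name}
        if Lb:                       # this scope is L
            S1 = set(S) & Lb
            if len(S1) == 1:
                return next(iter(S1))
            raise LookupError("S1 is empty or contains several members")
    raise LookupError("no enclosing scope defines a member named " + name)


c_m = Member("m", "C")    # public member of C, defined in the outer scope
d_m = Member("m", "D")    # private member of an unrelated class D
outer_scope = {c_m}
class_d_scope = {d_m}

S = {c_m}                 # object a is an instance of C
```

Evaluated in the outer scope alone, the lookup finds C’s member. Evaluated inside class D, the innermost scope defining m is D’s, S1 is empty, and the lookup fails even though S is not empty. This is the loss of namespace independence noted below.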

Constraints:

Safety                  Good.
Abstraction             Good.
Robustness              Poor. Renaming an internal member may break code outside the class that defines that member even if that code does not access that member. Converting a member from private to one of the other two visibilities also can introduce conflicts in other, unrelated classes in the same package that just happen to have an unrelated member with the same name. Fortunately these conflicts usually (but not always) result in errors rather than silent changes to the meaning of the program, so one can often find them by exhaustively testing the program after making a change.
Namespace independence  Bad. Members with the same name in unrelated classes often conflict.
Compatibility           Poor? Many existing programs rely on namespace independence and would have to be restructured.
Other                   Most object-oriented programmers would be confused by a violation of namespace independence. Programming without this assumption requires a different point of view than most programmers are used to. (I am not talking about Lisp and Self programmers, who are familiar with that way of thinking.)

[There are numerous other variants of the Spice model as well.]

Dynamic Model

In the Dynamic model we pick the member s in S defined in the innermost lexical scope L enclosing the member lookup expression a.b. We fail if the innermost such lexical scope L contains more than one member in S (this would only happen for import conflicts).
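A minimal Python sketch of the Dynamic rule follows, again modelling lexical scopes as sets of members ordered from innermost to outermost. The names Member and dynamic_lookup and the assignment of members to scopes are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    owner: str  # class that declares the member


def dynamic_lookup(S, scopes):
    """Pick the member of S defined in the innermost enclosing scope."""
    for scope in scopes:             # innermost first
        hits = set(S) & scope
        if len(hits) == 1:
            return next(iter(hits))
        if len(hits) > 1:
            raise LookupError("several members of S in the same scope")
    raise LookupError("no enclosing scope defines a member of S")


c_m = Member("m", "C")
d_m = Member("m", "D")

# Case 1: a is an instance of C, expression evaluated inside unrelated class D.
unrelated = dynamic_lookup({c_m}, [{d_m}, {c_m}])

# Case 2: a is an instance of D (a subclass of C), evaluated inside class C,
# where C's private m lives in the class scope and D's public m further out.
inside_c = dynamic_lookup({c_m, d_m}, [{c_m}, {d_m}])
```

In case 1 the scope of D contributes nothing to S, so the search continues outward and finds C’s member. In case 2 the code inside C keeps using C’s own member even though the object also has D’s.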

Constraints:

Safety                  Good at the language level, but see "Other" below.
Abstraction             Good.
Robustness              Good. All of these changes are easy to do.
Namespace independence  Good.
Compatibility           Good.
Other                   Packages using the dynamic model may be vulnerable to hijacking (coerced into doing something other than what the author intended) by a determined intruder. It is possible for a compiler to detect such vulnerabilities and warn about them.

Namespaces

The various models make it possible to get into situations where either there is no way to access a visible member of an object or it is not safe to do so (see member hijacking). In these cases we’d like to be able to explicitly choose one of several potential members with the same name. The :: namespace syntax allows this. The left operand of :: is an expression that evaluates to a package or class; we may also allow special keywords such as public, internal, or private instead of an expression here, or omit the expression altogether. The right operand of :: is a name. The result is the name qualified by the namespace.

As we have seen, the name b in a member access expression a.b does not necessarily refer to a unique accessible member of object a. In a qualified member access expression a.n::b, the namespace n narrows the set of members considered, although it’s possible that the set may still contain more than one member, in which case the lookup model again disambiguates. Let S be the set of members named b of object a that are accessible. The following table shows how a.n::b subsets set S depending on n:

n            Subset
None         Only the dynamic member named b, if any exists
A class C    The fixed member of C named b, if it exists; if not, try C’s superclass instead, and so on up the chain
A package P  The subset of S containing all accessible members of P
private      The fixed member named b of the current class
internal     The subset of S containing all accessible members that have package (internal) visibility
public       The subset of S containing all accessible members that have public visibility
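The table can be read as a filter applied to S. The Python sketch below is one possible reading, not a definitive one: the encoding of the qualifier as a (kind, value) pair, the use of owner None to mark a dynamic member, the exclusion of dynamic members from the internal and public rows, and the names Member, PARENT, and narrow are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Member:
    name: str
    owner: Optional[str]     # declaring class; None marks a dynamic member
    package: Optional[str]   # package of the declaring class
    visibility: str          # "private", "internal" or "public"


# Hypothetical hierarchy: D extends C.
PARENT = {"D": "C", "C": None}


def narrow(S, kind, value=None, current_class=None):
    """Return the subset of S selected by the qualifier n in a.n::b."""
    if kind == "none":       # a.::b selects only the dynamic member
        return {m for m in S if m.owner is None}
    if kind == "class":      # try the class, then its superclasses in turn
        cls = value
        while cls is not None:
            hit = {m for m in S if m.owner == cls}
            if hit:
                return hit
            cls = PARENT.get(cls)
        return set()
    if kind == "package":
        return {m for m in S if m.package == value}
    if kind == "private":
        return {m for m in S if m.owner == current_class}
    if kind in ("internal", "public"):
        return {m for m in S if m.owner is not None and m.visibility == kind}
    raise ValueError("unknown qualifier kind: " + kind)


c_m = Member("m", "C", "P", "private")
d_m = Member("m", "D", "Q", "public")
dyn_m = Member("m", None, None, "public")
S = {c_m, d_m, dyn_m}
```

For example, narrow(S, "class", "D") yields only D’s member, while narrow({c_m}, "class", "D") walks up the chain and yields C’s member.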

The :: operator serves a different role from the . operator. The :: operator produces a qualified name, while the . operator produces a value. A qualified name can be used as the right operand of .; a value cannot. If a qualified name is used in a place where a value is expected, the qualified name is looked up using the lexical scoping rules to obtain the value (most likely a global variable).

Dynamic Members

All of the models above address only access to fixed properties of a class. ECMAScript also allows one to dynamically add properties to individual instances of a class. For simplicity we do not provide access control or versioning on these dynamic properties — all of them are public and open to everyone. Because of the safety criterion, a member lookup of a private or internal member must choose the private or internal member even if there is a dynamic member of the same name. To satisfy the robustness criterion, we should treat public members as similarly as possible to private or internal members, so we always give preference to a fixed property when there is a dynamic property of the same name.

To access a dynamic property that is shadowed by a fixed property, we can either prefix the member’s name with :: or use an indirect property access.
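The preference for fixed properties can be sketched in Python as follows. The Instance class, the get function, and its dynamic_only flag (standing in for the :: prefix) are hypothetical, and None stands in for undefined.

```python
class Instance:
    """An object with class-defined (fixed) and per-instance (dynamic) properties."""

    def __init__(self, fixed):
        self.fixed = dict(fixed)
        self.dynamic = {}


def get(obj, name, dynamic_only=False):
    """Prefer a fixed property; dynamic_only models the a.::b form."""
    if not dynamic_only and name in obj.fixed:
        return obj.fixed[name]
    return obj.dynamic.get(name)  # None stands in for undefined


o = Instance({"size": 10})
o.dynamic["size"] = 99            # dynamic property shadowed by the fixed one
o.dynamic["color"] = "red"

plain = get(o, "size")
qualified = get(o, "size", dynamic_only=True)
```

The unqualified lookup returns the fixed value, and the qualified form reaches the shadowed dynamic property.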

Indirect Member Access

How should we define the behavior of the expression a[b] (assuming a’s class is not a typed array or other class that overrides the default meaning of the [] operator)? There are a couple of possibilities:

  1. We could evaluate the expression b to some string "s" and treat a[b] as though it were a.s. This is essentially what ECMAScript 3 does. Unfortunately it’s hard to keep this behavior consistent with ECMAScript 3 programs’ expectations (they expect no more than one member with the same name, etc.), and this kind of indirection is also vulnerable to hijacking. It may be possible to solve the hijacking problem by devising restricted variants of the [] operator such as a.n::[b] that follow the rules given in the namespaces section above.
  2. We could evaluate the expression b to some string "s" and treat a[b] as though it were a.::s, thus limiting our selection to dynamic members. Dynamic members are well-behaved, but this kind of behavior would violate the compatibility criterion when ECMAScript 3 scripts try to reflect an ECMAScript 4 object using the [] operator.
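The two possibilities can be contrasted with a small Python sketch. It is deliberately simplified: objects are modelled as a pair of dictionaries, at most one fixed member per name is assumed (so the lookup model plays no role), None stands in for undefined, and the function names are hypothetical.

```python
def index_as_member(fixed, dynamic, key):
    """Possibility 1: a[b] behaves like a.s, so fixed members are visible."""
    s = str(key)
    if s in fixed:
        return fixed[s]
    return dynamic.get(s)


def index_as_dynamic(fixed, dynamic, key):
    """Possibility 2: a[b] behaves like a.::s, so only dynamic members are visible."""
    return dynamic.get(str(key))


fixed = {"length": 2}              # fixed member defined by the class
dynamic = {"0": "a", "1": "b"}     # dynamic members added to the instance

first = index_as_member(fixed, dynamic, 0)
length_1 = index_as_member(fixed, dynamic, "length")
length_2 = index_as_dynamic(fixed, dynamic, "length")
```

Under the second possibility a script that reflects on the object through [] cannot see the fixed member length, which is the compatibility problem mentioned above.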

In general it seems like it would be a bad idea to extend the syntax of the string "s" to allow :: operators inside the string. Such strings are too easily forged to play the role of pointers to members.

Member Hijacking

[explain security attacks]


Waldemar Horwat
Last modified Friday, September 20, 2002