April 2002 Draft
Tuesday, October 9, 2001
There has been much discussion in the TC39 subgroup about the meaning of a member lookup operation. Numerous considerations intersect here.
We will express a general unqualified member lookup operation as a.b, where a is an expression and b is an identifier. We will also consider qualified member lookup operations and write them as a.n::b, where n is an expression that evaluates to some namespace. In almost all cases we will be interested in the dynamic type Td of a. In one scheme we will also consider the static type Ts of the expression a. If the language is sound, we will always have Td ≤ Ts; that is, Td is either Ts or a subtype of Ts.
In the simplest approach, we treat an object as merely an association table of member names and member values. In this
interpretation we simply look inside object a and check if there is a member named b. If there is, we return the
member’s value; if not, we return
undefined or signal an error.
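As a rough illustration of this simplest approach, the sketch below models an object as a plain Python dictionary. The names `UNDEFINED`, `simple_lookup`, and the `strict` flag are hypothetical and chosen only for this example; they are not part of any proposal.

```python
# Sketch of the "association table" view of an object: a mapping from
# member names to member values. UNDEFINED is a hypothetical stand-in
# for the language's undefined value.
UNDEFINED = object()

def simple_lookup(obj, name, strict=False):
    """Return the member called `name`; if absent, return UNDEFINED or raise."""
    if name in obj:
        return obj[name]
    if strict:
        raise AttributeError(f"no member named {name!r}")
    return UNDEFINED

a = {"b": 42}
print(simple_lookup(a, "b"))               # 42
print(simple_lookup(a, "c") is UNDEFINED)  # True
```

Note that a table like this has room for only one member per name, which is the limitation the following discussion turns on.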
There are a number of difficulties with this simple approach, and most object-oriented languages have not adopted it:
Once we allow private or internal members, we must allow for the possibility that object a
will have more than one member named b. Abstraction considerations require that users of a class C
not be aware of C’s private members, so, in particular, a user should be able to create a subclass D
of C and add members to D without knowing the names of C’s private members. Both C++ and
Java allow this. We must also allow for the possibility that object a will have a member named b but
we are not allowed to access it. We will assume that access control is specified by lexical scoping, as is traditional in
Some of the criteria we would like the member lookup model to satisfy are:
The model does not allow access to a private member outside the class where the member is defined, nor does it allow access to an internal member outside the package where the member is defined. Furthermore, if a class C accesses its private member m, a hostile subclass D of C cannot silently substitute a member m' that would masquerade as m inside C’s code.
private and internal members are invisible outside their respective classes or packages. For programming in the large, a class can provide several public versions to its importers, and public members of more recent versions are invisible to importers of older versions. This is needed to provide robust libraries.
public, assuming, of course, that that member is not used outside its new visibility.
There are three main competing models for performing a general unqualified member lookup operation a.b: the Static model, the Spice model, and the Dynamic model.
Let S be the set of members named b of the object obtained by evaluating expression a (hereafter shortened to just "object a") that are accessible via the namespace rules applied in the lexical scope where a.b is evaluated. All three models pick some member s ∈ S. Clearly, if the set S is empty, then the member lookup fails. In addition, the Spice and pure Static models may sometimes deliberately fail even when set S is not empty. Except for such deliberate failures, if the set S contains only one member s, all three models return that element s. If the set S contains multiple members, the three models will likely choose different members.
Another interesting (and useful) tidbit is that the Static and Dynamic models always agree on the interpretation of member lookup operations of the form this.b. All three models agree on the interpretation of member lookup operations of the form this.b in the case where b is a member defined in the current class.
A note about overriding: When a subclass D overrides a member m of its superclass C, then the definition of the member m is conceptually replaced in all instances of D. However, the three models are only concerned with the topmost class in which member m is declared. All three models handle overriding the way one would expect of an object-oriented language. They differ in the cases where class C has a member named m, subclass D of C has a member with the same name m, but D’s m does not override C’s m because C’s m is not visible inside D (it’s not well known, but such non-overriding does and must happen in C++ and Java as well).
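The non-overriding case can be seen in miniature with Python's name mangling, used here purely as an analogy: Python does not enforce access control, but a double-underscore attribute is renamed per class, so a subclass member with the same source name lands in a different slot. The class and method names below are hypothetical.

```python
class C:
    def __init__(self):
        self.__m = "C's private m"   # stored on the instance as _C__m

    def read_m(self):
        return self.__m              # inside C this always means _C__m


class D(C):
    def __init__(self):
        super().__init__()
        self.__m = "D's private m"   # stored as _D__m; C's m is untouched

    def read_own_m(self):
        return self.__m              # inside D this means _D__m


d = D()
print(d.read_m())       # C's private m
print(d.read_own_m())   # D's private m
print(sorted(vars(d)))  # ['_C__m', '_D__m']
```

The single object d therefore carries two members written as m in the source, and which one a lookup finds depends on where the lookup is written.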
In the Static model we look at the static type Ts of expression a. Let S1 be the subset of S whose class is either Ts or one of Ts’s ancestors. We pick the member in S1 with the most derived class.
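The selection rule of the pure Static model can be sketched as follows. This is a simulation under stated assumptions, not a proposed implementation: `Class`, `Member`, and `static_lookup` are hypothetical names, S is passed in as a precomputed list, and each class is assumed to contribute at most one member to S.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Class:
    name: str
    parent: Optional["Class"] = None

    def ancestors(self):
        """Yield this class, then each superclass, most derived first."""
        cls = self
        while cls is not None:
            yield cls
            cls = cls.parent


@dataclass(frozen=True)
class Member:
    name: str
    owner: Class   # class in which the member is declared


def static_lookup(S, static_type):
    """Pick the member of S declared in the most derived class at or above Ts."""
    for cls in static_type.ancestors():
        for member in S:
            if member.owner == cls:
                return member
    return None  # S1 is empty, so the pure static lookup fails


Any = Class("Any")
C = Class("C", Any)
D = Class("D", C)
# Object a is an instance of D; D's m does not override C's m.
S = [Member("m", C), Member("m", D)]

print(static_lookup(S, C).owner.name)  # C
print(static_lookup(S, D).owner.name)  # D
print(static_lookup(S, Any))           # None
```

The same object yields different members depending only on the static type of the expression, and a static type with no members yields no result at all.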
Many, if not most, expressions have type Any. Because type Any has no members, users would have to cast expression a to a given type T before they could access members of type T. Because of this, we must extend the static model to handle the case where the subset S1 is empty, or, in other words, the static lookup fails. (Rather than doing this, we could extend the static model in the case where the static type Ts is some special type, but then we would have to decide which types are special and which ones are not. Any is clearly special. What about Object? What about Array? It’s hard to draw the line consistently.)
In whichever way we extend the static model, we also have a choice of which member to pick. We could fall back to the dynamic model, we could choose the most derived member in S, or perhaps we could choose some other approach.
|Safety||Good within the pure static model. Problems in the extended static model (a subclass could silently shadow a member) that could perhaps be addressed by warnings.|
|Robustness||Very bad. Updating the return type of a function or the type of a global variable silently changes the meaning of all code that uses that function or global variable; in a large project such a change would be quite difficult. It is also difficult to correctly split expressions into subexpressions.|
|Compatibility||Bad within the pure static model (type casts needed everywhere). May be good in the extended static model, depending on the choice of how we extend it.|
This model may be difficult to compile well because the compiler may have difficulty in determining the intermediate types in compound expressions. Languages based on the static model have traditionally been compiled off-line, and such compilers tend to be difficult to write for on-line compilation without requiring the programmer to predeclare all of his data structures (if there are any forward-referenced ones, then the compiler doesn’t know whether they should have a type or not). A more dynamic execution model may actually help because it defers compilation until more information is known.
In the Spice model we think of each member m defined in a class C as though it were a function definition for a (possibly overloaded) function whose first argument has type C. Definitions in an inner lexical scope shadow definitions in outer scopes. The Spice model does not consider the static type Ts of expression a.
Let L be the innermost lexical scope enclosing the member lookup expression a.b such that some member named b is defined in L. Let Lb be the set of all members named b defined in lexical scope L, and let S1 = S ∩ Lb (the intersection of S and Lb). If S1 is empty, we fail. If S1 contains exactly
one member s, we use s. If S1 contains several members, we fail (this would only happen for
|Robustness||Poor. Renaming an
|Namespace independence||Bad. Members with the same name in unrelated classes often conflict.|
|Compatibility||Poor? Many existing programs rely on namespace independence and would have to be restructured.|
Most object-oriented programmers would be confused by a violation of namespace independence. Programming without this assumption requires a different point of view than most programmers are used to. (I am not talking about Lisp and Self programmers, who are familiar with that way of thinking.)
[There are numerous other variants of the Spice model as well.]
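The Spice selection rule described above (find scope L, form Lb, intersect with S) can be simulated as below. The modelling is an assumption made for illustration: each member records the lexical scope holding its declaration as a string, scopes are listed innermost first, and the names `Member`, `spice_lookup`, `Point`, `Window`, `ui`, and `global` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    owner: str   # class that declares the member
    scope: str   # lexical scope holding the declaration (modelling assumption)


def spice_lookup(name, S, visible_members, enclosing_scopes):
    """Find the innermost scope L defining any member `name`, then use S & Lb."""
    for scope in enclosing_scopes:  # innermost scope first
        Lb = {m for m in visible_members if m.scope == scope and m.name == name}
        if Lb:  # this scope is L
            S1 = set(S) & Lb
            return next(iter(S1)) if len(S1) == 1 else None
    return None


point_size = Member("size", "Point", "global")
window_size = Member("size", "Window", "ui")
everything = [point_size, window_size]

# Object a is a Point, so S holds only Point's size. The lookup is written
# inside scope "ui", where the unrelated Window.size is declared.
print(spice_lookup("size", [point_size], everything, ["ui", "global"]))  # None

# Written at global scope, the same lookup finds Point's size.
print(spice_lookup("size", [point_size], everything, ["global"]) == point_size)  # True
```

The first call fails even though S is not empty, which is the kind of conflict between unrelated classes that the namespace independence row refers to.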
In the Dynamic model we pick the member s in S defined in the innermost lexical scope L enclosing the member lookup expression a.b. We fail if the innermost such lexical scope L contains more than one member in S (this would only happen for import conflicts).
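A comparable simulation of the Dynamic rule is sketched below, under the same illustrative assumptions: each member records the scope of its declaration as a string, enclosing scopes are listed innermost first, and all names (`Member`, `dynamic_lookup`, `Point`, `Shape`, `ui`, `global`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    owner: str   # class that declares the member
    scope: str   # lexical scope holding the declaration (modelling assumption)


def dynamic_lookup(S, enclosing_scopes):
    """Pick the member of S declared in the innermost enclosing scope."""
    for scope in enclosing_scopes:  # innermost scope first
        candidates = [m for m in S if m.scope == scope]
        if len(candidates) == 1:
            return candidates[0]
        if len(candidates) > 1:
            return None  # several members of S in the same scope: conflict
    return None  # no member of S is declared in any enclosing scope


point_size = Member("size", "Point", "global")
shape_size = Member("size", "Shape", "global")

# Only members of object a are considered, so unrelated classes declared in
# an inner scope such as "ui" cannot interfere with the lookup.
print(dynamic_lookup([point_size], ["ui", "global"]) == point_size)  # True

# Two members of S declared in the same scope make the lookup fail.
print(dynamic_lookup([point_size, shape_size], ["ui", "global"]))    # None
```

Because the search starts from S rather than from every member named b in scope, same-named members of unrelated classes do not affect the result.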
|Safety||Good at the language level, but see "other" below.|
|Robustness||Good. All of these changes are easy to do.|
Packages using the dynamic model may be vulnerable to hijacking (coerced into doing something other than what the author intended) by a determined intruder. It is possible for a compiler to detect such vulnerabilities and warn about them.
The various models make it possible to get into situations where either there is no way to access a visible member of an
object or it is not safe to do so (see member hijacking). In these cases we’d like to be able to
explicitly choose one of several potential members with the same name. The
:: namespace syntax allows this. The
left operand of
:: is an expression that evaluates to a package or class; we may also allow special keywords
private instead of an expression here, or omit the expression
altogether. The right operand of
:: is a name. The result is the name qualified by the namespace.
As we have seen, the name b in a member access expression a.b does not necessarily refer to a unique accessible member of object a. In a qualified member access expression a.n::b, the namespace n narrows the set of members considered, although it’s possible that the set may still contain more than one member, in which case the lookup model again disambiguates. Let S be the set of members named b of object a that are accessible. The following table shows how a.n::b subsets set S depending on n:
|None||Only the dynamic member named b, if any exists|
|A class C||The fixed member of C named b, if it exists; if not, try C’s superclass instead, and so on up the chain|
|A package P||The subset of S containing all accessible members of P|
||The fixed member named b of the current class|
||The subset of S containing all accessible members that have package (
||The subset of S containing all accessible members that have
The :: operator serves a different role from the . operator. The :: operator produces a qualified name, while the . operator produces a value. A qualified name can be used as the right operand of .; a value cannot. If a qualified name is used in a place where a value is expected, the qualified name is looked up using the lexical scoping rules to obtain the value (most likely a global variable).
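The class-qualified row of the table above ("the fixed member of C named b, if it exists; if not, try C’s superclass instead, and so on up the chain") amounts to a walk up the superclass chain. The sketch below assumes a hypothetical `Class` record holding only the fixed members declared directly in that class; access checks are left out.

```python
class Class:
    def __init__(self, name, fixed, parent=None):
        self.name = name
        self.fixed = dict(fixed)   # fixed members declared directly in this class
        self.parent = parent       # superclass, or None at the root


def class_qualified_lookup(cls, name):
    """Return the fixed member `name` of cls, or of its nearest superclass."""
    while cls is not None:
        if name in cls.fixed:
            return cls.fixed[name]
        cls = cls.parent
    return None  # no class on the chain declares the member


C = Class("C", {"m": "C.m", "k": "C.k"})
D = Class("D", {"m": "D.m"}, parent=C)

print(class_qualified_lookup(D, "m"))  # D.m
print(class_qualified_lookup(D, "k"))  # C.k
print(class_qualified_lookup(C, "m"))  # C.m
```

Qualifying with C selects C’s own m even when the object is an instance of D, which is how a caller can name one of several same-named members explicitly.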
All of the models above address only access to fixed properties of a class.
There is no access control or versioning on dynamic properties; all of them are public and open to everyone. Because of the safety criterion, a member lookup of a private or internal member must choose the private or internal member even if there is a dynamic member of the same name. To satisfy the robustness criterion, we should treat public members as similarly as possible to private and internal members, so we always give preference to a fixed property when there is a dynamic property of the same name.
To access a dynamic property that is shadowed by a fixed property, we can either prefix the member’s name with :: or use an indirect property access.
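The preference for fixed properties over dynamic ones can be sketched as a two-table lookup. The `Obj` class, the `lookup` function, and the `dynamic_only` flag are hypothetical; the flag stands in for a lookup restricted to dynamic members, as in the table row where no namespace is given.

```python
class Obj:
    def __init__(self, fixed, dynamic=None):
        self.fixed = dict(fixed)            # members declared by the class
        self.dynamic = dict(dynamic or {})  # properties added at run time


def lookup(obj, name, dynamic_only=False):
    """Prefer a fixed property; consult dynamic properties otherwise."""
    if not dynamic_only and name in obj.fixed:
        return obj.fixed[name]
    return obj.dynamic.get(name)


a = Obj(fixed={"length": 3}, dynamic={"length": "shadowed", "tag": "x"})

print(lookup(a, "length"))                     # 3
print(lookup(a, "length", dynamic_only=True))  # shadowed
print(lookup(a, "tag"))                        # x
```

With this ordering, adding a dynamic property can never change which member an existing fixed-property lookup resolves to.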
How should we define the behavior of the expression a[b] (assuming the [] operator is not overridden by a’s class)? There are a couple
"and treat a
]as though it were a
operator such as a
]that follow the rules given in the namespaces section above.
"and treat a
]as though it were a
In general it seems like it would be a bad idea to extend the syntax of the string to allow :: operators inside the string. Such strings are too easily forged to play the role of pointers to members.
[explain security attacks]
Last modified Tuesday, October 9, 2001