April 2002 Draft
JavaScript 2.0
Rationale
Member Lookup
Tuesday, October 9, 2001
There has been much discussion in the TC39 subgroup about the meaning of a member lookup operation. Numerous considerations intersect here.
We will express a general unqualified member lookup operation as a.b, where a is an expression and b is an identifier. We will also consider qualified member lookup operations and write them as a.n::b, where n is an expression that evaluates to some namespace. In almost all cases we will be interested in the dynamic type Td of a. In one scheme we will also consider the static type Ts of the expression a. If the language is sound, we will always have Td ≤ Ts (Td is Ts or a subtype of Ts).
In the simplest approach, we treat an object as merely an association table of member names and member values. In this
interpretation we simply look inside object a and check if there is a member named b. If there is, we return the
member’s value; if not, we return undefined or signal an error.
There are a number of difficulties with this simple approach, and most object-oriented languages have not adopted it. In particular, it does not accommodate members marked private or internal. Once we allow private or internal members, we must allow for the possibility that object a will have more than one member named b: abstraction considerations require that users of a class C not be aware of C’s private members, so, in particular, a user should be able to create a subclass D of C and add members to D without knowing the names of C’s private members. Both C++ and Java allow this. We must also allow for the possibility that object a will have a member named b but we are not allowed to access it. We will assume that access control is specified by lexical scoping, as is traditional in modern languages.
Some of the criteria we would like the member lookup model to satisfy are:
- Safety: the lookup does not permit access to a private member outside the class where the member is defined, nor does it allow access to an internal member outside the package where the member is defined. Furthermore, if a class C accesses its private member m, a hostile subclass D of C cannot silently substitute a member m' that would masquerade as m inside C’s code.
- Abstraction: private and internal members are invisible outside their respective classes or packages. For programming in the large, a class can provide several public versions to its importers, and public members of more recent versions are invisible to importers of older versions. This is needed to provide robust libraries.
- Robustness: it should be possible to change a member’s visibility to private, internal, or public, assuming, of course, that the member is not used outside its new visibility.
There are three main competing models for performing a general unqualified member lookup operation a.b.
Let S be the set of members named b of the object obtained by evaluating expression a (hereafter shortened to just "object a") that are accessible via the namespace rules applied in the lexical scope where a.b is evaluated. All three models pick some member s ∈ S. Clearly, if the set S is empty, then the member lookup fails. In addition, the Spice and pure Static models may sometimes deliberately fail even when set S is not empty. Except for such deliberate failures, if the set S contains only one member s, all three models return that element s. If the set S contains multiple members, the three models will likely choose different members.
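As a rough illustration of how the set S might be computed, here is a minimal Python sketch. The `Member` record, the `accessible` helper, and the class and package names are all hypothetical, and the sketch assumes the simplest reading of the access rules (private means same class, internal means same package) while ignoring versioning entirely.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Member:
    name: str
    owner: str       # class that defines the member (hypothetical field)
    package: str     # package that contains the owner class
    visibility: str  # "private", "internal", or "public"

def accessible(members, name, scope_class, scope_package):
    """Return S: the members called `name` that are visible from the given scope."""
    def visible(m):
        if m.visibility == "private":
            return m.owner == scope_class
        if m.visibility == "internal":
            return m.package == scope_package
        return True  # public members are visible everywhere
    return {m for m in members if m.name == name and visible(m)}

# Object a is an instance of D, a subclass of C; both classes define a member b.
members_of_a = [
    Member("b", "C", "P", "private"),
    Member("b", "D", "Q", "public"),
]

# Inside C's code both members are accessible, so S has two elements.
inside_c = accessible(members_of_a, "b", scope_class="C", scope_package="P")
# In unrelated code only D's public b is accessible.
outside = accessible(members_of_a, "b", scope_class="E", scope_package="Q")
```

When S has more than one element, as it does inside C here, the choice of lookup model decides which member wins.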
Another interesting (and useful) tidbit is that the Static and Dynamic models always agree on the interpretation of member lookup operations of the form this.b. All three models agree on the interpretation of member lookup operations of the form this.b in the case where b is a member defined in the current class.
A note about overriding: When a subclass D overrides a member m of its superclass C, then the definition of the member m is conceptually replaced in all instances of D. However, the three models are only concerned with the topmost class in which member m is declared. All three models handle overriding the way one would expect of an object-oriented language. They differ in the cases where class C has a member named m, subclass D of C has a member with the same name m, but D’s m does not override C’s m because C’s m is not visible inside D (it’s not well known, but such non-overriding does and must happen in C++ and Java as well).
In the Static model we look at the static type Ts of expression a. Let S1 be the subset of S whose class is either Ts or one of Ts’s ancestors. We pick the member in S1 with the most derived class.
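The selection rule can be sketched in Python as follows. This is only an illustration under stated assumptions: `static_lookup`, the `superclasses` map, and the string labels for members are hypothetical, and S is assumed to have been computed already.

```python
def static_lookup(S, static_type, superclasses):
    """Pick the member of S whose class is the most derived among Ts and its ancestors.

    S is a list of (defining_class, member) pairs; superclasses maps each class
    to its ancestors, nearest first. Returns None when the subset S1 is empty.
    """
    chain = [static_type] + superclasses[static_type]  # Ts first, then its ancestors
    S1 = [(cls, member) for cls, member in S if cls in chain]
    if not S1:
        return None  # the pure static lookup fails
    # The earliest position in the chain is the most derived class.
    return min(S1, key=lambda pair: chain.index(pair[0]))[1]

superclasses = {"Any": [], "C": [], "D": ["C"]}
S = [("C", "C.b"), ("D", "D.b")]  # object a has dynamic type D

via_c = static_lookup(S, "C", superclasses)      # expression a typed as C
via_d = static_lookup(S, "D", superclasses)      # expression a typed as D
via_any = static_lookup(S, "Any", superclasses)  # expression a typed as Any
```

The same object yields C's member, D's member, or a failed lookup depending only on the static type of the expression, which is why changing a declared type can silently change which member is selected.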
The pure static model above is implemented by Java and C++. It would not work well in that form in JavaScript because many, if not most, expressions have type Any. Because type Any has no members, users would have to cast expression a to a given type T before they could access members of type T. Because of this we must extend the static model to handle the case where the subset S1 is empty, or, in other words, the static lookup fails. (Rather than doing this, we could extend the static model in the case where the static type Ts is some special type, but then we would have to decide which types are special and which ones are not. Any is clearly special. What about Object? What about Array? It’s hard to draw the line consistently.)
In whichever way we extend the static model, we also have a choice of which member to pick. We could back off to the dynamic model, we could choose the most derived member in S, or perhaps we could choose some other approach.
Constraints:
Safety | Good within the pure static model. Problems in the extended static model (a subclass could silently shadow a member) that could perhaps be addressed by warnings. |
Abstraction | Good. |
Robustness | Very bad. Updating the return type of a function or the type of a global variable silently changes the meaning of all code that uses that function or global variable; in a large project such a change would be quite difficult. Difficult to correctly split expressions into subexpressions. |
Namespace independence | Good. |
Compatibility | Bad within the pure static model (type casts needed everywhere). May be good in the extended static model, depending on the choice of how we extend it. |
Other | This model may be difficult to compile well because the compiler may have difficulty in determining the intermediate types in compound expressions. Languages based on the static model have traditionally been compiled off-line, and such compilers tend to be difficult to write for on-line compilation without requiring the programmer to predeclare all of his data structures (if there are any forward-referenced ones, then the compiler doesn’t know whether they should have a type or not). A more dynamic execution model may actually help because it defers compilation until more information is known. |
In the Spice model we think of each member m defined in a class C as though it were a function definition for a (possibly overloaded) function whose first argument has type C. Definitions in an inner lexical scope shadow definitions in outer scopes. The Spice model does not consider the static type Ts of expression a.
Let L be the innermost lexical scope enclosing the member lookup expression a.b such that some member named b is defined in L. Let Lb be the set of all members named b defined in lexical scope L, and let S1 = S ∩ Lb (the intersection of S and Lb). If S1 is empty, we fail. If S1 contains exactly one member s, we use s. If S1 contains several members, we fail (this would only happen for import conflicts).
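A minimal Python sketch of this rule follows. The function `spice_lookup`, the scope dictionaries, and the string labels are hypothetical; the sketch also assumes that C's member b is defined at package level and that the unrelated class E defines its own b in its class scope.

```python
def spice_lookup(S, scopes, name):
    """scopes lists lexical scopes innermost first; each scope maps a member
    name to the set of members with that name defined in the scope."""
    for scope in scopes:
        Lb = scope.get(name, set())
        if Lb:  # this scope is L, the innermost one defining `name`
            S1 = S & Lb
            if len(S1) == 1:
                return next(iter(S1))
            raise LookupError(f"Spice lookup of {name!r} failed: S1 = {sorted(S1)}")
    raise LookupError(f"no enclosing scope defines {name!r}")

package_scope = {"b": {"C.b"}}  # class C's member b, visible at package level
class_e_scope = {"b": {"E.b"}}  # unrelated class E also has a member named b
S = {"C.b"}                     # object a is an instance of C

# At package level, L is the package scope and the lookup succeeds.
found_outside_e = spice_lookup(S, [package_scope], "b")

# Inside E, L is E's scope, so Lb = {"E.b"} and S1 is empty: the lookup fails.
try:
    spice_lookup(S, [class_e_scope, package_scope], "b")
    failed_inside_e = False
except LookupError:
    failed_inside_e = True
```

The failure inside E, even though a has only one accessible member named b, is the kind of conflict between unrelated classes that the constraints table rates under namespace independence.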
Constraints:
Safety | Good. |
Abstraction | Good. |
Robustness | Poor. Renaming an internal member may break code outside the class that defines that member even if that code does not access that member. Converting a member from private to one of the other two visibilities can also introduce conflicts in other, unrelated classes in the same package that just happen to have an unrelated member with the same name. Fortunately these conflicts usually (but not always) result in errors rather than silent changes to the meaning of the program, so one can often find them by exhaustively testing the program after making a change. |
Namespace independence | Bad. Members with the same name in unrelated classes often conflict. |
Compatibility | Poor? Many existing programs rely on namespace independence and would have to be restructured. |
Other | Most object-oriented programmers would be confused by a violation of namespace independence. Programming without this assumption requires a different point of view than most programmers are used to. (I am not talking about Lisp and Self programmers, who are familiar with that way of thinking.) |
[There are numerous other variants of the Spice model as well.]
In the Dynamic model we pick the member s in S defined in the innermost lexical scope L enclosing the member lookup expression a.b. We fail if the innermost such lexical scope L contains more than one member in S (this would only happen for import conflicts).
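The Dynamic rule can be sketched in Python in the same style. The function `dynamic_lookup`, the scope sets, and the member labels are hypothetical, and the placement of members in scopes is an assumption made for illustration.

```python
def dynamic_lookup(S, scopes):
    """scopes lists lexical scopes innermost first; each scope is the set of
    members defined in it. Returns the member of S from the innermost scope
    that defines any member of S."""
    for scope in scopes:
        found = S & scope
        if len(found) == 1:
            return next(iter(found))
        if len(found) > 1:
            raise LookupError(f"ambiguous lookup: {sorted(found)}")
    raise LookupError("no enclosing scope defines an accessible member")

package_scope = {"C.b"}  # class C's member b, visible at package level
class_e_scope = {"E.b"}  # unrelated class E has its own member named b
S = {"C.b"}              # object a is an instance of C

# E's scope defines no member of S, so the search continues outward.
found_inside_e = dynamic_lookup(S, [class_e_scope, package_scope])

# Two imported members of S in the same scope make the lookup fail.
try:
    dynamic_lookup({"P1.b", "P2.b"}, [{"P1.b", "P2.b"}])
    import_conflict = False
except LookupError:
    import_conflict = True
```

Because only scopes that define a member of S are considered, a member named b in the unrelated class E does not interfere with the lookup on an instance of C.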
Constraints:
Safety | Good at the language level, but see "other" below. |
Abstraction | Good. |
Robustness | Good. All of these changes are easy to do. |
Namespace independence | Good. |
Compatibility | Good. |
Other | Packages using the dynamic model may be vulnerable to hijacking (coerced into doing something other than what the author intended) by a determined intruder. It is possible for a compiler to detect such vulnerabilities and warn about them. |
The various models make it possible to get into situations where either there is no way to access a visible member of an object or it is not safe to do so (see member hijacking). In these cases we’d like to be able to explicitly choose one of several potential members with the same name. The :: namespace syntax allows this. The left operand of :: is an expression that evaluates to a package or class; we may also allow special keywords such as public, internal, or private instead of an expression here, or omit the expression altogether. The right operand of :: is a name. The result is the name qualified by the namespace.
As we have seen, the name b in a member access expression a.b does not necessarily refer to a unique accessible member of object a. In a qualified member access expression a.n::b, the namespace n narrows the set of members considered, although it’s possible that the set may still contain more than one member, in which case the lookup model again disambiguates. Let S be the set of members named b of object a that are accessible. The following table shows how a.n::b subsets set S depending on n:
n | Subset |
---|---|
None | Only the dynamic member named b, if any exists |
A class C | The fixed member of C named b, if it exists; if not, try C’s superclass instead, and so on up the chain |
A package P | The subset of S containing all accessible members of P |
private | The fixed member named b of the current class |
internal | The subset of S containing all accessible members that have package (internal) visibility |
public | The subset of S containing all accessible members that have public visibility |
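The narrowing described in the table can be sketched in Python as below. Everything here is a hypothetical illustration: the `Member` record, the `narrow` function, and the encoding of n as a keyword string or a (kind, value) pair are assumptions, and the row for an omitted n (dynamic members) is left out because this sketch models fixed members only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Member:
    owner: str       # class that defines the fixed member
    package: str     # package that contains the owner class
    visibility: str  # "private", "internal", or "public"

def narrow(S, n, current_class=None, superclass=None):
    """Return the subset of S selected by namespace n.

    n is "private", "internal", "public", ("package", P), or ("class", C).
    """
    if n == "private":
        return {m for m in S if m.owner == current_class}
    if n in ("internal", "public"):
        return {m for m in S if m.visibility == n}
    kind, value = n
    if kind == "package":
        return {m for m in S if m.package == value}
    if kind != "class":
        raise ValueError(f"unknown namespace kind: {kind!r}")
    # For a class, try the class itself and then walk up the superclass chain.
    cls = value
    while cls is not None:
        hit = {m for m in S if m.owner == cls}
        if hit:
            return hit
        cls = (superclass or {}).get(cls)
    return set()

c_b = Member("C", "P", "public")
d_b = Member("D", "Q", "internal")
S = {c_b, d_b}
parents = {"E": "D", "D": "C"}  # E extends D, D extends C

by_class = narrow(S, ("class", "E"), superclass=parents)  # E has no b, so D's is used
by_package = narrow(S, ("package", "P"))
by_keyword = narrow(S, "internal")
```

If the narrowed set still holds more than one member, the lookup model in use would have to pick among them, as the text above notes.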
The :: operator serves a different role from the . operator. The :: operator produces a qualified name, while the . operator produces a value. A qualified name can be used as the right operand of .; a value cannot. If a qualified name is used in a place where a value is expected, the qualified name is looked up using the lexical scoping rules to obtain the value (most likely a global variable).
All of the models above address only access to fixed properties of a class. JavaScript also allows one to dynamically add properties to individual instances of a class. For simplicity we do not provide access control or versioning on these dynamic properties; all of them are public and open to everyone. Because of the safety criterion, a member lookup of a private or internal member must choose the private or internal member even if there is a dynamic member of the same name. To satisfy the robustness criterion, we should treat public members as similarly as possible to private or internal members, so we always give preference to a fixed property when there is a dynamic property of the same name.
To access a dynamic property that is shadowed by a fixed property, we can either prefix the member’s name with ::
or use an indirect property access.
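A small Python sketch of this preference rule is shown below. The `lookup` function and its `dynamic_only` flag (standing in for the a.::b form) are hypothetical, and None stands in for undefined.

```python
def lookup(fixed, dynamic, name, dynamic_only=False):
    """Prefer a fixed property over a dynamic one with the same name.

    dynamic_only=True models the a.::b form, which skips fixed properties.
    """
    if not dynamic_only and name in fixed:
        return fixed[name]
    return dynamic.get(name)  # None stands in for undefined

fixed = {"length": "fixed length"}
dynamic = {"length": "dynamic length", "extra": 1}

plain = lookup(fixed, dynamic, "length")                         # like a.length
qualified = lookup(fixed, dynamic, "length", dynamic_only=True)  # like a.::length
unshadowed = lookup(fixed, dynamic, "extra")                     # no fixed property in the way
```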
How should we define the behavior of the expression a[b] (assuming the [] operator is not overridden by a’s class)? There are a couple of possibilities:
- Convert b to a string "s" and treat a[b] as though it were a.s. This is essentially what JavaScript 1.5 does. Unfortunately it’s hard to keep this behavior consistent with JavaScript 1.5 programs’ expectations (they expect no more than one member with the same name, etc.), and this kind of indirection is also vulnerable to hijacking. It may be possible to solve the hijacking problem by devising restricted variants of the [] operator such as a.n::[b] that follow the rules given in the namespaces section above.
- Convert b to a string "s" and treat a[b] as though it were a.::s, thus limiting our selection to dynamic members. Dynamic members are well-behaved, but this kind of behavior would violate the compatibility criterion when JavaScript 1.5 scripts try to reflect a JavaScript 2.0 object using the [] operator.
In general it seems like it would be a bad idea to extend the syntax of the string "s" to allow :: operators inside the string. Such strings are too easily forged to play the role of pointers to members.
[explain security attacks]
Waldemar Horwat. Last modified Tuesday, October 9, 2001