April 2002 Draft
JavaScript 2.0
Formal Description
Semantic Notation
|
Monday, April 22, 2002
The semantics of JavaScript 2.0 are written in Algol-like pseudocode. The following sections define the notation used to write the semantics’ concepts, expressions, procedures, and actions.
The table below summarizes operators used in expressions in this document. The operators are listed in groups in order from the highest precedence (tightest-binding) to the lowest precedence (loosest-binding). Other than the relational operators, operators in the same group have the same precedence and associate left-to-right, so, for example, 7–3+2–1 means ((7–3)+2)–1 instead of 7–(3+(2–1)) or (7–(3+2))–1. As is traditional in mathematics, the relational operators cascade, so
a = b c < d
means
a = b and b c and c < d
Parentheses are used to override precedences or clarify expressions.
Expressions used in describing algorithms may perform side effects. Except for and, or, and ?:, the operators compute all of their operands left-to-right, and if computation of any operand throws an exception e, then the operator immediately propagates e without computing any following operands.
Group | Operator | Description |
---|---|---|
Nonassociative | (x) | Return x. Parentheses are used to override operator precedence. |
{x1, x2, ... , xn} | Set or semantic domain with the elements x1, x2, ... , xn | |
{f(x) | x A} {f(x) | x A such that predicate(x)} |
Set comprehension | |
[x0, x1, ... , xn–1] | List with the elements x0, x1, ... , xn–1 | |
[f(x) | x u] [f(x) | x u such that predicate(x)] |
List comprehension | |
Namelabel1: x1, ... , labeln: xn Name |
Tuple constructor | |
|x| | Cardinality of a set x or length of a list x | |
x | Floor of x | |
x | Ceiling of x | |
Action[nonterminali] | This notation is used inside an action for a grammar production that has nonterminal nonterminal on the production’s left or right side. Return the value of action Action invoked on the ith instance of nonterminal nonterminal on the left or right side of . The subscript i can be omitted if it is 1 and there is only one instance of nonterminal nonterminal on ’s right side. | |
nonterminali | This notation is used inside an action for a grammar production
that has nonterminal nonterminal on the production’s left or right side. Furthermore,
every complete expansion of grammar nonterminal nonterminal expands into a single character. Return the character to which the ith instance of nonterminal nonterminal on the right side of expands. The subscript i can be omitted if there is only one instance of nonterminal nonterminal in . If the subscript is omitted and nonterminal nonterminal appears on the left side of , then this expression returns the single character to which this whole production expands. |
|
Suffix | xy | x raised to the yth power |
u[i] | ith element of list u | |
u[i ... j] u[i ...] |
Slice of list u | |
u[i \ x] | List element substitution | |
a.label | Field named label of tuple or record a | |
T{} | Semantic domain of all sets whose elements are members of semantic domain T | |
T[] | Semantic domain of all lists whose elements are members of semantic domain T | |
f(x1, ..., xn) | Procedure call | |
Prefix | new Namelabel1: x1, ... , labeln: xn | Record constructor |
–x | Real number negation | |
min A | Smallest element of a set | |
max A | Largest element of a set | |
Factor | x y | Real number product |
x / y | Real number quotient (y must not be zero) | |
x mod y | Real number remainder (y must not be zero) | |
A B | Set intersection | |
T1 T2 ... Tn T () T T1 T2 ... Tn () () () |
Semantic domain of procedures | |
Term | x + y | Real number addition |
x – y | Real number subtraction or set difference | |
u v | List concatenation | |
A B | Set union | |
Relational | x = y x y |
Equality and inequality predicates on tags, real numbers,
sets, booleans, characters,
lists, strings, tuples,
and records. Values of differing kinds (such as the boolean true
and the character ‘A ’) are always considered unequal. |
x < y x y x > y x y |
Order predicates on real numbers, characters, and strings. | |
x A x A |
Set membership predicates | |
A B | Subset predicate | |
A B | Proper subset predicate | |
Negation | not a | Logical negation |
Conjunction | a and b | Short-circuiting logical conjunction |
Disjunction | a or b | Short-circuiting logical disjunction |
a xor b | Logical exclusive or | |
Conditional | a ? x : y | Conditional |
some x A satisfies predicate(x) every x A satisfies predicate(x) |
Set or list quantifiers |
Semantic domains describe the possible values that a variable might take on in an algorithm. The algorithms are constructed in a way that ensures that these constraints are always met, regardless of any valid or invalid programmer or user input or actions.
A semantic domain can be intuitively thought of as a set of possible values, and, in fact, any set of values explicitly described in this document is also a semantic domain. Nevertheless, semantic domains have a more precise mathematical definition in domain theory (see for example [Schmidt86]) that allows one to define semantic domains recursively without encountering paradoxes such as trying to define a set A whose members include all functions mapping values from A to Integer. The problem with an ordinary definition of such a set A is that the cardinality of the set of all functions mapping A to Integer is always strictly greater than the cardinality of A, leading to a contradiction. Domain theory uses a least fixed point construction to allow A to be defined as a semantic domain without encountering problems.
Semantic domains have names in Capitalized Red Small Caps. Such a name is to be considered distinct from a tag or regular variable with the same name, so Undefined, undefined, and undefined are three different and independent entities.
A variable v is constrained using the notation
v: T
where T is a semantic domain. This constraint indicates that the value of v will always be a member of the semantic domain T. These declarations are informative (they may be dropped without affecting the semantics’ correctness) but useful in understanding the semantics. For example, when the semantics state that x: Integer then one does not have to worry about what happens when x has the value true or +.
The constraints can be proven statically. The JavaScript semantics have been machine-checked to ensure that every constraint holds.
Tags are computational tokens with no internal structure. Tags are written using a dark red font. Two tags are equal if and only if they have the same name.
Each tag that does not also name a tuple or a record is defined using the notation:
In the HTML version of the semantics, each use of a tag’s name is linked back to its definition.
The tags true and false represent booleans. Boolean is the two-element semantic domain {true, false}.
Let a and b be booleans and x and y any values. In addition to = and , the following operations can be done on them:
Note that the and, or, and ?: operators short-circuit. These are the only operators that do not always compute all of their operands.
A set is an unordered, possibly infinite collection of elements. Each element may occur at most once in a set. There must be an equivalence relation = defined on all pairs of the set’s elements. Elements of a set may themselves be sets.
A set is denoted by enclosing a comma-separated list of values inside braces:
{element1, element2, ... , elementn}
The empty set is written as {}. Any duplicate elements are included only once in the set.
For example, the set {3, 0, 10, 11, 12, 13, –5} contains seven integers.
Sets of either integers or characters can be abbreviated using the ... range operator. For example, the above set can also be written as {0, –5, 3 ... 3, 10 ... 13}.
If the beginning of the range is equal to the end of the range, then the range consists of only one element: {7 ... 7} is the same as {7}. If the end of the range is one less than the beginning, then the range contains no elements: {7 ... 6} is the same as {}. The end of the range is never more than one less than the beginning.
A set can also be written using the set comprehension notation
{f(x) | x A}
which denotes the set of the results of computing expression f on all elements x of set A. A predicate can be added:
{f(x) | x A such that predicate(x)}
denotes the set of the results of computing expression f on all elements x of set A that satisfy the predicate expression. There can also be more than one free variable x and set A, in which case all combinations of free variables’ values are considered. For example,
{x | x Integer such that x2 < 10} = {–3, –2, –1, 0, 1, 2, 3};
{x2 | x {–5, –1, 1, 2, 4}} = {1, 4, 16, 25};
{x10 + y | x {1, 2, 4}, y {3, 5}} = {13, 15, 23, 25, 43, 45}.
The same notation is used for operations on sets and on semantic domains. Let A and B be sets (or semantic domains) and x and y be values. The following operations can be done on them:
If T is a semantic domain, then T{} is the semantic domain of all sets whose elements are members of T. For example, if T = {1,2,3}, then T{} = {{}, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}. The empty set {} is a member of T{} for any semantic domain T.
In addition to the above, the some and every quantifiers can be used on sets (see also lists). The quantifier
some x A satisfies predicate(x)
returns true if there exists at least one element x in set A such that predicate(x) computes to true. If there is no such element x, then the some quantifier’s result is false. If the some quantifier returns true, then variable x is left bound to any element of A for which predicate(x) computes to true; if there is more than one such element x, then one of them is chosen arbitrarily. For example,
some x {3, 16, 19, 26} satisfies x mod 10 = 6
evaluates to true and leaves x set to either 16 or 26. Other examples include:
(some x {3, 16, 19, 26} satisfies x mod 10 = 7) = false;
(some x {} satisfies x mod 10 = 7) = false;
(some x {“Hello
”} satisfies true) = true
and leaves x set to the string “Hello
”;
(some x {} satisfies true) = false.
The quantifier
every x A satisfies predicate(x)
returns true if there exists no element x in set A such that predicate(x) computes to false. If there is at least one such element x, then the every quantifier’s result is false. As a degenerate case, the every quantifier is always true if the set A is empty. For example,
(every x {3, 16, 19, 26} satisfies x mod 10 = 6) = false;
(every x {6, 26, 96, 106} satisfies x mod 10 = 6) = true;
(every x {} satisfies x mod 10 = 6) = true.
All numbers written in plain font are exact mathematical real numbers. Numbers can be written with or without a decimal point. Integers preceded with “0x” are hexadecimal (base 16). 4294967296, 4294967296.000, 0x100000000, and 232 are all the same number. 0.1 is the exact value 1/10.
Integer is the semantic domain of all integers {... –3, –2, –1, 0, 1, 2, 3 ...}. 3.0, 3, 0xFF, and –10100 are all integers.
Rational is the semantic domain of all rational numbers. Every integer is also a rational number: Integer Rational. 3, 1/3, 7.5, –12/7, and 2–5 are examples of rational numbers.
Real is the semantic domain of all real numbers. Every rational number is also a real number: Rational Real. is an example of a real number slightly larger than 3.14.
Let x and y be real numbers. The following operations can be done on them and always produce exact results:
Real numbers can be compared using =, , <, , >, and . The result is either true or false.
The four procedures below perform bitwise operations on integers. The integers are treated as though they were written in infinite-precision two’s complement binary notation, with each 1 bit representing true and 0 bit representing false.
More precisely, any integer x can be represented as an infinite sequence of bits ai where the index i ranges over the nonnegative integers and every ai {0, 1}. The sequence is traditionally written in reverse order:
..., a4, a3, a2, a1, a0
The unique sequence corresponding to an integer x is generated by the formula
ai = x / 2i mod 2
If x is zero or positive, then its sequence will have infinitely many consecutive leading 0’s, while a negative integer x will generate a sequence with infinitely many consecutive leading 1’s. For example, 6 generates the sequence ...0...0000110, while –6 generates ...1...1111010.
The logical and, or, and xor operations below operate on corresponding elements of the sequences ai and bi generated by the two parameters x and y. The result is another infinite sequence of bits ci. The result of the operation is the unique integer z that generates the sequence ci. For example, anding corresponding elements of the sequences generated by 6 and –6 yields the sequence ...0...0000010, which is the sequence generated by the integer 2. Thus, bitwiseAnd(6, –6) = 2.
Procedure | Description |
---|---|
bitwiseAnd(x: Integer, y: Integer): Integer | The bitwise and of x and y |
bitwiseOr(x: Integer, y: Integer): Integer | The bitwise or of x and y |
bitwiseXor(x: Integer, y: Integer): Integer | The bitwise xor of x and y |
bitwiseShift(x: Integer, count: Integer): Integer | Shift x to the left by count bits. If count is negative, shift x to the right by –count bits. Bits shifted out of the right end are lost; bit shifted in at the right end are zero. bitwiseShift(x, count) is exactly equivalent to x 2count. |
Characters enclosed in single quotes ‘ and ’ represent single Unicode 16-bit code points. Examples of
characters include ‘A
’, ‘b
’, ‘«LF»
’,
and ‘«uFFFF»
’ (see also the notation
for non-ASCII characters). Unicode surrogates are considered to be pairs of characters for the purpose of this specification.
Character is the semantic domain of all 65536 characters {‘«u0000»
’ ... ‘«uFFFF»
’}.
Characters can be compared using =, ,
<, , >, and .
These operators compare code point values, so ‘A
’ = ‘A
’,
‘A
’ < ‘B
’, and ‘A
’ < ‘a
’
are all true.
characterToCode and codeToCharacter convert between characters and their integer Unicode values.
The following procedures convert between characters and integers:
Procedure | Description |
---|---|
characterToCode(c: Character): Integer | The number of character c’s Unicode code point |
codeToCharacter(i: {0 ... 65535}): Character | The character whose Unicode code point number is i |
The procedure digitValue is defined as follows:
Procedure | Description |
---|---|
isInitialIdentifierCharacter(c: Character): Boolean | Return true if the nonterminal InitialIdentifierCharacter can expand into c and false otherwise |
isContinuingIdentifierCharacter(c: Character): Boolean | Return true if the nonterminal ContinuingIdentifierCharacter can expand into c and false otherwise |
A finite ordered list of zero or more elements is written by listing the elements inside bold brackets:
[element0, element1, ... , elementn–1]
For example, the following list contains four strings:
[“parsley
”, “sage
”, “rosemary
”, “thyme
”]
The empty list is written as [].
Unlike a set, the elements of a list are indexed by integers starting from 0. A list can contain duplicate elements.
A list can also be written using the list comprehension notation
[f(x) | x u]
which denotes the list [f(u[0]), f(u[1]), ... , f(u[|u|–1])] whose elements consist of the results of applying expression f to each corresponding element of list u. x is the name of the parameter in expression f. A predicate can be added:
[f(x) | x u such that predicate(x)]
denotes the list of the results of computing expression f on all elements x of list u that satisfy the predicate expression. The results are listed in the same order as the elements x of list u. For example,
[x2 | x [–1, 1, 2, 3, 4, 2, 5]] = [1, 1, 4, 9, 16, 4, 25]
[x+1 | x [–1, 1, 2, 3, 4, 5, 3, 10] such that x mod 2 = 1] = [0, 2, 4, 6, 4]
Let u = [e0, e1, ... , en–1] and v = [f0, f1, ... , fm–1] be lists, i and j be integers, and x be a value. The operations below can be done on lists. The operations are meaningful only when their preconditions are met; the semantics never use the operations below without meeting their preconditions.
Lists are functional — there is no notation for modifying a list in place.
If T is a semantic domain, then T[] is the semantic domain of all lists whose elements are members of T. The empty list [] is a member of T[] for any semantic domain T.
In addition to the above, the some and every quantifiers can be used on lists just as on sets:
some x u satisfies predicate(x)
every x u satisfies predicate(x)
These quantifiers’ behavior on lists is analogous to that on sets, except that, if the some quantifier returns true then it leaves variable x set to the first element of list u that satisfies condition predicate(x). For example,
some x [3, 36, 19, 26] satisfies x mod 10 = 6
evaluates to true and leaves x set to 36.
A list of characters is called a string. In addition to the normal list notation, for notational convenience a string can also be written as zero or more characters enclosed in double quotes (see also the notation for non-ASCII characters). Thus,
“Wonder«LF»
”
is equivalent to:
[‘W
’, ‘o
’, ‘n
’, ‘d
’, ‘e
’, ‘r
’, ‘«LF»
’]
The empty string is usually written as “”.
In addition to all of the other list operations, <, , >, and are defined on strings. A string x is less than string y when y is not the empty string and either x is the empty string, the first character of x is less than the first character of y, or the first character of x is equal to the first character of y and the rest of string x is less than the rest of string y.
String is the semantic domain of all strings. String = Character[].
A tuple is an immutable aggregate of values comprised of a name and zero or more labeled fields.
The pseudocode defines each tuple and describes its fields. A tuple definition has the form
and defines tuples with name Name to have n fields with semantic domains T1 through Tn respectively. In the HTML version of the semantics, each use of a tuple’s Name is linked back to its definition.
After Name is defined, the notation
Namelabel1: v1, ... , labeln: vn
represents a tuple with name Name and values v1 through vn for fields labeled label1 through labeln respectively. Each value vi is a member of the corresponding semantic domain Ti. When most of the fields are copied from an existing tuple a, this notation can be abbreviated as
Namelabeli1: vi1, ... , labelik: vik, other fields from a
which represents a tuple with name Name and values vi1 through vik for fields labeled labeli1 through labelik respectively and the values of correspondingly labeled fields from a for all other fields.
If a is the tuple Namelabel1: v1, ... , labeln: vn, then
a.labeli
returns the ith field’s value vi. Tuples are functional — there is no notation for modifying a tuple in place. In the HTML version of the semantics, each use of labeli is linked back to a’s type.
The equality operators = and may be used to compare tuples. Tuples are equal when they have the same name and their corresponding fields’ values are equal.
The notation
Name
represents the semantic domain of all tuples with name Name.
A record is a mutable aggregate of values similar to a tuple but with different equality behavior.
A record is comprised of a name and an address. The address points to a mutable data structure comprised of zero or more labeled fields. The address acts as the record’s serial number — every record allocated by new (see below) gets a different address, including records created by identical expressions or even the same expression used twice.
The pseudocode defines each record and describes its fields. A record definition has the form
and defines records with name Name to have n fields with semantic domains T1 through Tn respectively. In the HTML version of the semantics, each use of a record’s Name is linked back to its definition.
After Name is defined, the expression
new Namelabel1: v1, ... , labeln: vn
creates a record with name Name and a new address . The fields labeled label1 through labeln at address are initialized with values v1 through vn respectively. Each value vi is a member of the corresponding semantic domain Ti. A labelk: vk pair may be omitted from a new expression, which indicates that the initial value of field labelk does not matter because the semantics will always explicitly write a value into that field before reading it.
When most of the fields are copied from an existing record a, the new expression can be abbreviated as
new Namelabeli1: vi1, ... , labelik: vik, other fields from a
which represents a record b with name Name and a new address . The fields labeled labeli1 through labelik at address are initialized with values vi1 through vik respectively; the other fields at address are initialized with the values of correspondingly labeled fields from a’s address.
If a is a record with name Name and address , then
a.labeli
returns the current value v of the ith field at address . That field may be set to a new value w, which must be a member of the semantic domain Ti, using the assignment
a.labeli w
after which a.labeli will evaluate to w. Any record with a different address is unaffected by the assignment. In the HTML version of the semantics, each use of labeli is linked back to a’s type.
The equality operators = and may be used to compare records. Records are equal if and only if they have the same address.
The notation
Name
represents the infinite semantic domain of all records that have name Name and all addresses.
Float64 is the semantic domain of all representable double-precision floating-point IEEE 754 values, with all not-a-number values considered indistinguishable from each other. Float64 is the union of the following semantic domains:
There are 18428729675200069632 (that is, 264–254) normalized values:
m is called the significand.
There are also 9007199254740990 (that is, 253–2) denormalized non-zero values:
m is called the significand.
The remaining values are the tags +zero (positive zero), –zero (negative zero), + (positive infinity), – (negative infinity), and NaN (not a number). Although the integer 0 is not a floating-point number, the pseudocode converts 0 into either +zero or –zero whenever necessary.
Members of the semantic domain NormalizedFloat64 DenormalizedFloat64 that are greater than zero are called positive finite. The remaining members of NormalizedFloat64 DenormalizedFloat64 are less than zero and are called negative finite.
Since floating-point numbers are either rational numbers or tags, the notation
= and may be used to compare them. Note that = is false
for different tags, so +zero –zero
but NaN = NaN. The JavaScript x ==
y
and x ===
y operators have different behavior for floating-point numbers, defined
as float64Compare(x, y) = equal.
The procedure realToFloat64 converts a real number x into the applicable element of Float64 as follows:
This procedure corresponds exactly to the behavior of the IEEE 754 “round to nearest” mode.
The procedure truncateFiniteFloat64 truncates a FiniteFloat64 value to an integer, rounding towards zero:
Order is the four-element semantic domain of tags {less, equal, greater, unordered}.
The procedure rationalCompare compares two rational values x and y and returns one of the tags less, equal, or greater depending on the result of the comparison:
The procedure float64Compare compares two Float64 values x and y and returns one of the tags less, equal, greater, or unordered depending on the result of the comparison according to the table below.
x | y | ||||||
– | negative finite | –zero | +zero | positive finite | + | NaN | |
– | equal | less | less | less | less | less | unordered |
negative finite | greater | rationalCompare(x, y) | less | less | less | less | unordered |
–zero | greater | greater | equal | equal | less | less | unordered |
+zero | greater | greater | equal | equal | less | less | unordered |
positive finite | greater | greater | greater | greater | rationalCompare(x, y) | less | unordered |
+ | greater | greater | greater | greater | greater | equal | unordered |
NaN | unordered | unordered | unordered | unordered | unordered | unordered | unordered |
The following tables define procedures that perform common arithmetic on Float64 values using IEEE 754 rules. All procedures are strict and evaluate all of their arguments left-to-right.
x | Result |
– | + |
negative finite | –x |
–zero | +zero |
+zero | +zero |
positive finite | x |
+ | + |
NaN | NaN |
x | Result |
– | + |
negative finite | –x |
–zero | +zero |
+zero | –zero |
positive finite | –x |
+ | – |
NaN | NaN |
x | y | ||||||
– | negative finite | –zero | +zero | positive finite | + | NaN | |
– | – | – | – | – | – | NaN | NaN |
negative finite | – | realToFloat64(x + y) | x | x | realToFloat64(x + y) | + | NaN |
–zero | – | y | –zero | +zero | y | + | NaN |
+zero | – | y | +zero | +zero | y | + | NaN |
positive finite | – | realToFloat64(x + y) | x | x | realToFloat64(x + y) | + | NaN |
+ | NaN | + | + | + | + | + | NaN |
NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
x | y | ||||||
– | negative finite | –zero | +zero | positive finite | + | NaN | |
– | NaN | – | – | – | – | – | NaN |
negative finite | + | realToFloat64(x – y) | x | x | realToFloat64(x – y) | – | NaN |
–zero | + | –y | +zero | –zero | –y | – | NaN |
+zero | + | –y | +zero | +zero | –y | – | NaN |
positive finite | + | realToFloat64(x – y) | x | x | realToFloat64(x – y) | – | NaN |
+ | + | + | + | + | + | NaN | NaN |
NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
x | y | ||||||
– | negative finite | –zero | +zero | positive finite | + | NaN | |
– | + | + | NaN | NaN | – | – | NaN |
negative finite | + | realToFloat64(x y) | +zero | –zero | realToFloat64(x y) | – | NaN |
–zero | NaN | +zero | +zero | –zero | –zero | NaN | NaN |
+zero | NaN | –zero | –zero | +zero | +zero | NaN | NaN |
positive finite | – | realToFloat64(x y) | –zero | +zero | realToFloat64(x y) | + | NaN |
+ | – | – | NaN | NaN | + | + | NaN |
NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
x | y | ||||||
– | negative finite | –zero | +zero | positive finite | + | NaN | |
– | NaN | + | + | – | – | NaN | NaN |
negative finite | +zero | realToFloat64(x / y) | + | – | realToFloat64(x / y) | –zero | NaN |
–zero | +zero | +zero | NaN | NaN | –zero | –zero | NaN |
+zero | –zero | –zero | NaN | NaN | +zero | +zero | NaN |
positive finite | –zero | realToFloat64(x / y) | – | + | realToFloat64(x / y) | +zero | NaN |
+ | NaN | – | – | + | + | NaN | NaN |
NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
x | y | ||||||
– | negative finite | –zero | +zero | positive finite | + | NaN | |
– | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
negative finite | x | float64Negate(float64Remainder(–x, –y)) | NaN | NaN | float64Negate(float64Remainder(–x, y)) | x | NaN |
–zero | –zero | –zero | NaN | NaN | –zero | –zero | NaN |
+zero | +zero | +zero | NaN | NaN | +zero | +zero | NaN |
positive finite | x | float64Remainder(x, –y) | NaN | NaN | realToFloat64(x – yx/y) | x | NaN |
+ | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
A procedure is a function that receives zero or more arguments, performs computations, and optionally returns a result.
Procedures may perform side effects. In this document the word procedure is used to refer to internal algorithms; the
word function is used to refer to the programmer-visible function
JavaScript construct.
A procedure is denoted as:
If the procedure does not return a value, the : T on the first line is omitted.
f is the procedure’s name, param1 through paramn are the procedure’s parameters, T1 through Tn are the parameters’ respective semantic domains, T is the semantic domain of the procedure’s result, and step1 through stepm describe the procedure’s computation steps, which may produce side effects and/or return a result. If T is omitted, the procedure does not return a result. When the procedure is called with argument values v1 through vn, the procedure’s steps are performed and the result, if any, returned to the caller.
A procedure’s steps can refer to the parameters param1 through paramn; each reference to a parameter parami evaluates to the corresponding argument value vi. Procedure parameters are statically scoped. Arguments are passed by value.
The only operation done on a procedure f is calling it using the f(arg1, ..., argn) syntax. f is computed first, followed by the argument expressions arg1 through argn, in left-to-right order. If the result of computing f or any of the argument expressions throws an exception e, then the call immediately propagates e without computing any following argument expressions. Otherwise, f is invoked using the provided arguments and the resulting value, if any, returned to the caller.
Procedures are never compared using =, , or any of the other comparison operators.
The semantic domain of procedures that take n parameters in semantic domains T1 through Tn respectively and produce a result in semantic domain T is written as T1 T2 ... Tn T. If n = 0, this semantic domain is written as () T. If the procedure does not produce a result, the semantic domain of procedures is written either as T1 T2 ... Tn () or as () ().
Computation steps in procedures are described using a mixture of English and formal notation. The various kinds of formal steps are described in this section. Multiple steps are separated by semicolons and performed in order unless an earlier step exits via a return or propagates an exception.
Informal steps state invariants and provide comments.
A nothing step performs no operation.
A computation step may consist of an expression. The expression is computed and its value, if any, ignored.
An assignment step is indicated using the assignment operator . This step computes the value of expression and assigns the result to the temporary variable or mutable global v. If this is the first time the temporary variable is referenced in a procedure, the variable’s semantic domain T is listed; any value stored in v is guaranteed to be a member of the semantic domain T.
This step declares v to be a temporary variable with semantic domain T without assigning anything to the variable. v will not be read unless some other step first assigns a value to it.
Inside an action, the assignment operator can also be used to define the result of another action Action[nonterminali] applied to the current expansion of nonterminali, which must appear on the current production’s left or right side. Such an assignment is done only when the value of Action[nonterminali] is not defined explicitly via an action. The value of Action[nonterminali] is set at most once and never modified afterwards. If the same nonterminal is expanded several times while parsing a source program, all such expansions are treated independently.
Temporary variables are local to the procedures that define them (including any nested procedures). Each time a procedure is called it gets a new set of temporary variables.
This form of assignment sets the value of field label of record a to the value of expression.
An if step computes expression1, which will evaluate to either true or false. If it is true, the first list of steps is performed. Otherwise, expression2 is computed and tested, and so on. If no expression evaluates to true, the list of steps following the else is performed. The else clause may be omitted, in which case no action is taken when no expression evaluates to true.
A case step computes expression, which will evaluate to a value v. If v T1, then the first list of steps is performed. Otherwise, if v T2, then the second list of steps is performed, and so on. If v is not a member of any Ti, the list of steps following the else is performed. The else clause may be omitted, in which case v will always be a member of some Ti.
A while step computes expression, which will evaluate to either true or false. If it is false, no action is taken. If it is true, the list of steps is performed and then expression is computed and tested again. This repeats until expression returns true (or until the procedure exits via a return or an exception is propagated out).
A for each step computes expression, which will evaluate to either a set or a list A. The list of steps is performed repeatedly with variable x bound to each element of A. If A is a list, x is bound to each of its elements in order; if A is a set, the order in which x is bound to its elements is arbitrary. The repetition ends after x has been bound to all elements of A (or when either the procedure exits via a return or an exception is propagated out).
A return step computes expression to obtain a value v and returns from the enclosing procedure with the result v. No further steps in the enclosing procedure are performed. The expression may be omitted, in which case the enclosing procedure returns with no result.
A throw step computes expression to obtain a value v and begins propagating exception v outwards, exiting partially performed steps and procedure calls until the exception is caught by a catch step. Unless the enclosing procedure catches this exception, no further steps in the enclosing procedure are performed.
A try step performs the first list of steps. If they complete normally (or if they return), then the try step is done. If any of the steps propagates out an exception e, then if e T, then exception e stops propagating, variable v is bound to the value e, and the second list of steps is performed. If e T, then exception e keeps propagating out.
A try step does not intercept exceptions that may be propagated out of its second list of steps.
An inner proc may be nested as a step inside an outer proc. In this case the inner procedure is a closure and can access the parameters and temporaries of the outer procedure.
Semantic actions tie the grammar and the semantics together. A semantic action ascribes semantic meaning to a grammar production.
To illustrate the use of semantic actions, we shall look at an example, followed by a description of the notation for specifying semantic actions.
Consider the following grammar, with the start nonterminal Numeral:
This grammar defines the syntax of an acceptable input: “37
”, “33#4
”
and “30#2
” are acceptable syntactically, while “1a
” is not. However, the
grammar does not indicate what these various inputs mean. That is the job of the semantics, which are defined in terms of
actions on the parse tree of grammar rule expansions. Consider the following sample set of actions defined on this grammar,
with a starting Numeral action called (in this example) Value:
Action names are written in violet cursive type. The definition
states that the action Value can be applied to any expansion of the nonterminal Numeral,
and the result is an Integer. This action either maps an inputs to an integer or
throws an exception. The code above throws the exception syntaxError when presented
with the input “30#2
”.
There are two definitions of the Value action on Numeral, one for each grammar production that expands Numeral:
Each definition of an action is allowed to perform actions on the terminals and nonterminals on the right side of the expansion.
For example, Value applied to the first Numeral
production (the one that expands Numeral into Digits)
simply applies the DecimalValue action to the expansion of the nonterminal Digits
and returns the result. On the other hand, Value applied to the second Numeral
production (the one that expands Numeral into Digits #
Digits)
performs a computation using the results of the DecimalValue and BaseValue
applied to the two expansions of the Digits nonterminals. In this case there are
two identical nonterminals Digits on the right side of the expansion, so subscripts
are used to indicate on which the actions DecimalValue and BaseValue
are performed.
The definition
states that the action BaseValue can be applied to any expansion of the nonterminal Digits, and the result is a procedure that takes one Integer argument base and returns an Integer. The procedure’s body is comprised of independent cases for each production that expands Digits. When the procedure is called, the case corresponding to the expansion of the nonterminal Digits is evaluated.
The Value action on Digit illustrates the direct use of a nonterminal in a semantic expression: digitValue(Digit). Using the nonterminal Digit in this way refers to the character into which the Digit grammar rule expands.
We can fully evaluate the semantics on our sample inputs to get the following results:
Input | Semantic Result |
---|---|
37 |
37 |
33#4 |
15 |
30#2 |
throw syntaxError |
The following notation is used to define top-level semantic entities:
This notation defines Name to be a shorthand for the semantic domain expression. In the HTML version of the semantics, each use of Name is linked back to this definition.
This notation defines name to be a constant value given by the result of computing expression. The value is guaranteed to be a member of the semantic domain T. In the HTML version of the semantics, each use of name is linked back to this definition.
This notation defines name to be a mutable global value. Its initial value is the result of computing expression, but it may be subsequently altered using an assignment. The value is guaranteed to be a member of the semantic domain T. In the HTML version of the semantics, each use of name is linked back to this definition.
This notation defines f to be a procedure.
This notation defines a tag.
This notation defines a tuple.
This notation defines a record.
This notation states that action Action can be performed on nonterminal nonterminal and returns a value that is a member of the semantic domain T. The action’s value is either defined using the notation Action[nonterminal expansion] = expression below or set as a side effect of computing another action via an action assignment.
This notation specifies the value that action Action on nonterminal nonterminal computes in the case where nonterminal nonterminal expands to the given expansion. expansion can contain zero or more terminals and nonterminals (as well as other notations allowed on the right side of a grammar production). Furthermore, the terminals and nonterminals of expansion can be subscripted to allow them to be unambiguously referenced by action references or nonterminal references inside expression.
This notation combines the above two — it specifies the semantic domain of the action as well as its value.
This notation is used when the computation of the action is too complex for an expression. Here the steps to compute the action are listed as step1 through stepm. A return step produces the value of the action.
This notation is used only when Action returns a procedure when applied to nonterminal nonterminal with a single expansion expansion. Here the steps of the procedure are listed as step1 through stepm.
This notation is used only when Action returns a procedure when applied to nonterminal nonterminal with several expansions expansion1 through expansionn. The procedure is comprised of a series of cases, one for each expansion. Only the steps corresponding to the expansion found by the grammar parser used are evaluated.
Waldemar Horwat Last modified Monday, April 22, 2002 |