July 2000 Draft
JavaScript 2.0
Formal Description
Stages
Sunday, April 30, 2000
This page is out of date
The source code is processed in the following stages:
1. If necessary, convert the source code into the Unicode UTF-16 format, normalized
form C.
2. Split the source code into tokens using the lexer grammar and lexer
semantics.
3. Parse the resulting sequence of tokens using the parser grammar and evaluate it using
the parser semantics [To be provided].
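Stage 1 can be illustrated with a minimal Python sketch. The function name to_utf16_nfc is hypothetical and not part of the specification; the sketch only assumes that the source is already available as a decoded string, and it uses the standard unicodedata module for normalization form C.

```python
import unicodedata
from typing import List


def to_utf16_nfc(source: str) -> List[int]:
    """Return the UTF-16 code units of `source` after NFC normalization.

    Hypothetical helper for illustration; the specification only says
    that the conversion happens, not how.
    """
    # Normalization form C composes base characters with combining marks.
    normalized = unicodedata.normalize("NFC", source)
    # Encode as big-endian UTF-16 without a byte order mark, then split
    # the byte string into 16-bit code units.
    data = normalized.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# "e" followed by U+0301 (combining acute accent) composes to U+00E9.
units = to_utf16_nfc("cafe\u0301")
```

Characters outside the Basic Multilingual Plane come out as surrogate pairs, so the later stages see two code units for each such character.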
Lexing
Processing stage 2 is done as follows:
1. Let tokens be an empty array of Token
metalanguage records. (As defined in the lexer semantics, a Token
can be either an identifier, a keyword, a punctuation symbol, a number, a number with a unit, a string, or the end token.)
2. Let input be the input sequence of Unicode characters. Append a special placeholder
End to the end of input.
3. Let regExpMayFollow be a Boolean
variable. Initialize it to true.
4. Apply the lexer grammar to parse the longest possible prefix of input.
If regExpMayFollow is true, use the start symbol
NextToken^re.
If regExpMayFollow is false, use the start symbol
NextToken^div.
The result of the parse should be a parse tree T. If the parse failed, return a syntax error.
5. Compute the action Token on T to obtain a Token
t. If t is the end
token, return the tokens array and go to the parse stage.
6. Append t to the end of the tokens array.
7. Compute the action RegExpMayFollow on T to obtain a Boolean
value and assign that value to the regExpMayFollow variable.
8. Remove the characters matched by T from input, leaving only the yet-unparsed
suffix of input.
9. Go to step 4.
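The loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two regular expressions are toy stand-ins for the NextToken start symbols, not the real lexer grammar; the regexp token kind, the END sentinel, and the rule in reg_exp_may_follow are all hypothetical simplifications of the lexer semantics.

```python
import re

# Hypothetical stand-in for the End placeholder. A real implementation
# would use a value that cannot occur in the source text.
END = "\0"

# Alternatives shared by both toy start symbols. Python alternation is
# first-match rather than longest-match; these alternatives are chosen
# so that they do not overlap.
_COMMON = (
    r'(?P<end>\0)'
    r'|(?P<number>\d+)'
    r'|(?P<identifier>[A-Za-z_$][\w$]*)'
    r'|(?P<string>"[^"\n]*")'
)

# Toy version of the "re" start symbol: a slash begins a regular expression.
NEXT_TOKEN_RE = re.compile(
    r'\s*(?:' + _COMMON + r'|(?P<regexp>/[^/\n]+/)|(?P<punctuation>[-+*=();]))'
)
# Toy version of the "div" start symbol: a slash is a division operator.
NEXT_TOKEN_DIV = re.compile(
    r'\s*(?:' + _COMMON + r'|(?P<punctuation>/=?|[-+*=();]))'
)


def reg_exp_may_follow(kind, text):
    """Assumed rule: no regexp directly after an operand or a ')'."""
    if kind in ("identifier", "number", "string", "regexp"):
        return False
    return text != ")"


def lex(source):
    tokens = []                                   # step 1
    remaining = source + END                      # step 2
    may_follow = True                             # step 3
    while True:
        grammar = NEXT_TOKEN_RE if may_follow else NEXT_TOKEN_DIV
        match = grammar.match(remaining)          # step 4
        if match is None:
            raise SyntaxError("cannot lex: %r" % remaining[:10])
        kind = match.lastgroup                    # step 5
        text = match.group(kind)
        if kind == "end":
            return tokens
        tokens.append((kind, text))               # step 6
        may_follow = reg_exp_may_follow(kind, text)   # step 7
        remaining = remaining[match.end():]       # step 8
        # step 9: the loop returns to step 4
```

With these toy rules, lex("a = b / c") treats the slash as punctuation because it follows an identifier, while lex("a = /b/") produces a single regexp token because the slash follows the = sign.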
If an implementation encounters an error while lexing, it is permitted to either report the error immediately or defer
it until the affected token would actually be used by the parser. This flexibility allows an implementation to do lexing at
the same time it parses the source program.
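One way to picture deferred error reporting is a lexer written as a generator, so that each token is produced only when the parser asks for it. The sketch below is an assumption-laden illustration, not the specified behavior: lazy_tokens and its toy token pattern are hypothetical, and the pattern recognizes only identifiers, integers, and a few punctuation symbols.

```python
import re

# Toy token pattern (hypothetical, far smaller than the real lexer grammar).
WHITESPACE = re.compile(r"\s*")
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[-+*/=();]")


def lazy_tokens(source):
    """Yield tokens on demand.

    A lexical error is raised only when the consumer requests the
    offending token, which models deferring the error until the parser
    would actually use that token.
    """
    pos = 0
    while True:
        # Skip whitespace before each token.
        pos = WHITESPACE.match(source, pos).end()
        if pos == len(source):
            return
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError("unexpected character at offset %d" % pos)
        pos = match.end()
        yield match.group(0)


# The '#' is not a valid token here, but the first three tokens can be
# consumed without any error being reported.
stream = lazy_tokens("x = 1 # y")
first_three = [next(stream) for _ in range(3)]
```

A consumer that stops after the third token never sees the error; one that asks for a fourth token gets a SyntaxError at that point.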
[To do: show the mapping from Token structures
to parser grammar terminals. It is obvious, but it needs to be written down.]
Parsing
To be provided