April 2002 Draft
JavaScript 2.0
Formal Description
Stages

Thursday, October 18, 2001

The source code is processed in the following stages:

  1. If necessary, convert the source code into the Unicode UTF-16 format, normalized form C.
  2. Remove any Unicode format control characters (category Cf) from the source code.
  3. Simultaneously split the source code into input elements using the lexical grammar and semantics and parse it using the syntactic grammar to obtain a parse tree P.
  4. Evaluate P using the syntactic semantics by computing the action Eval on it.
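
Stages 1 and 2 can be sketched as follows. This is only an illustration in Python, not part of the specification: the function name preprocess is hypothetical, and the sketch assumes the source has already been decoded into a Python string (a sequence of code points), so the conversion to UTF-16 code units is left out.

```python
import unicodedata


def preprocess(source: str) -> str:
    """Hypothetical helper covering stages 1 and 2 for an already-decoded string."""
    # Stage 1 (in part): bring the text into Unicode Normalization Form C.
    normalized = unicodedata.normalize("NFC", source)
    # Stage 2: drop every character whose general category is Cf (format control).
    return "".join(ch for ch in normalized if unicodedata.category(ch) != "Cf")
```

For example, an "e" followed by a combining acute accent is composed into a single precomposed character, and a zero-width joiner or byte order mark inside the text is removed.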

Lexing and Parsing

Processing stage 3 is done as follows:

  1. Let inputElements be an empty array of input elements (syntactic grammar terminals and line breaks).
  2. Let input be the input sequence of Unicode characters. Append a special placeholder End to the end of input.
  3. Let state be a variable that holds one of the constants re, div, or unit. Initialize it to re.
  4. Apply the lexical grammar to parse the longest possible prefix of input. Use the start symbol NextInputElement^re, NextInputElement^div, or NextInputElement^unit depending on whether state is re, div, or unit, respectively. The result of the parse should be a lexical grammar parse tree T. If the parse fails, signal a syntax error and stop.
  5. Compute the action InputElement on T to obtain an InputElement e.
  6. If e is the end input element, go to step 15.
  7. Remove the characters matched by T from input, leaving only the yet-unlexed suffix of input.
  8. Interpret e as a syntactic grammar terminal or line break as follows:
  9. Append the resulting terminal or line break to the end of the inputElements array.
  10. If the inputElements array forms a valid prefix of the context-free language defined by the syntactic grammar, go to step 13.
  11. If e is not a lineBreak but the previous element of the inputElements array is a lineBreak, then insert a VirtualSemicolon terminal between that lineBreak and e in the inputElements array.
  12. If the inputElements array still does not form a valid prefix of the context-free language defined by the syntactic grammar, signal a syntax error and stop.
  13. If e is a Number, then set state to unit. Otherwise, if the inputElements array followed by the terminal / forms a valid prefix of the context-free language defined by the syntactic grammar, then set state to div; otherwise, set state to re.
  14. Go to step 4.
  15. If the inputElements array does not form a valid sentence of the context-free language defined by the syntactic grammar, signal a syntax error and stop.
  16. Return the parse tree obtained by the syntactic grammar’s derivation of the sentence formed by the inputElements array.
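
The control flow of the steps above can be sketched as follows. This is a toy illustration in Python under loudly stated assumptions, not the JavaScript 2.0 grammars: the lexical grammar is a handful of regular expressions, the syntactic grammar accepts only operand/operator chains terminated by semicolons, the unit state reuses the div patterns, blanks are skipped by hand, and the function returns the inputElements list rather than a parse tree. All names (LEXICAL_GRAMMAR, next_input_element, parser_state, lex_and_parse) are hypothetical.

```python
import re

# Toy lexical grammar (an assumption for illustration, not the real one).
COMMON = [("lineBreak", r"\n"), ("Number", r"\d+"), ("Identifier", r"[A-Za-z_]\w*")]
LEXICAL_GRAMMAR = {
    "re": COMMON + [("RegExp", r"/[^/\n]+/"), ("Punctuator", r"[=+;]")],
    "div": COMMON + [("Punctuator", r"[=+;/]")],
}
LEXICAL_GRAMMAR["unit"] = LEXICAL_GRAMMAR["div"]  # simplification: no unit syntax
OPERANDS = {"Number", "Identifier", "RegExp"}


def next_input_element(source, pos, state):
    """Steps 4-5: lex one input element at pos using the grammar for state."""
    while pos < len(source) and source[pos] in " \t":
        pos += 1  # the toy lexer simply skips blanks
    if pos == len(source):
        return "end", "", pos
    for kind, pattern in LEXICAL_GRAMMAR[state]:
        match = re.compile(pattern).match(source, pos)
        if match:
            return kind, match.group(), match.end()
    raise SyntaxError(f"cannot lex input at offset {pos}")


def parser_state(elements):
    """Run the toy syntactic grammar; None means elements is not a valid prefix."""
    st = "start"
    for kind, text in elements:
        if kind == "lineBreak":
            continue  # line breaks are not terminals of the toy grammar
        is_semicolon = kind == "VirtualSemicolon" or (kind, text) == ("Punctuator", ";")
        if kind in OPERANDS and st in ("start", "operator"):
            st = "operand"
        elif is_semicolon and st == "operand":
            st = "start"
        elif kind == "Punctuator" and not is_semicolon and st == "operand":
            st = "operator"
        else:
            return None
    return st


def lex_and_parse(source):
    elements, state, pos = [], "re", 0  # steps 1-3
    while True:
        kind, text, end = next_input_element(source, pos, state)
        if kind == "end":  # step 6
            break
        pos = end  # step 7
        elements.append((kind, text))  # steps 8-9
        if parser_state(elements) is None:  # step 10
            # Step 11: try a VirtualSemicolon between a preceding lineBreak and e.
            if kind != "lineBreak" and len(elements) >= 2 and elements[-2][0] == "lineBreak":
                elements.insert(len(elements) - 1, ("VirtualSemicolon", ""))
            if parser_state(elements) is None:  # step 12
                raise SyntaxError(f"unexpected {kind} {text!r}")
        # Step 13: choose the lexical start symbol for the next input element.
        if kind == "Number":
            state = "unit"
        elif parser_state(elements + [("Punctuator", "/")]) is not None:
            state = "div"
        else:
            state = "re"
    if parser_state(elements) not in ("start", "operand"):  # step 15
        raise SyntaxError("unexpected end of input")
    return elements  # step 16 would return a parse tree instead
```

Under these assumptions, lexing `a = b / 2`, a line break, and `c = /x/` treats the first `/` as a division operator (state was div after the identifier b) and `/x/` as a regular expression (state was re after `=`), and a VirtualSemicolon is inserted between the line break and `c`. The point of the design is that the parser's notion of "valid prefix" feeds back into the lexer, which is why stage 3 lexes and parses simultaneously.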

Waldemar Horwat
Last modified Thursday, October 18, 2001