April 2002 Draft
JavaScript 2.0
Formal Description
Regular Expression Grammar
previousupnext

Monday, October 8, 2001

This LR(1) grammar describes the regular expression syntax of the JavaScript 2.0 proposal. See also the description of the grammar notation.

This document is also available as a Word 98 rtf file.

Unicode Character Classes

UnicodeCharacter  Any Unicode character
UnicodeAlphanumeric  Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)
LineTerminator  «LF» | «CR» | «u2028» | «u2029»

Regular Expression Definitions

Regular Expression Patterns

RegularExpressionPattern  Disjunction

Disjunctions

Disjunction 
   Alternative
|  Alternative | Disjunction

Alternatives

Alternative 
   «empty»
|  Alternative Term

Terms

Term 
   Assertion
|  Atom
|  Atom Quantifier
Quantifier 
   QuantifierPrefix
|  QuantifierPrefix ?
QuantifierPrefix 
   *
|  +
|  ?
|  { DecimalDigits }
|  { DecimalDigits , }
|  { DecimalDigits , DecimalDigits }
DecimalDigits 
   DecimalDigit
|  DecimalDigits DecimalDigit
DecimalDigit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Assertions

Assertion 
   ^
|  $
|  \ b
|  \ B

Atoms

Atom 
   PatternCharacter
|  .
|  NullEscape
|  \ AtomEscape
|  CharacterClass
|  ( Disjunction )
|  ( ? : Disjunction )
|  ( ? = Disjunction )
|  ( ? ! Disjunction )
PatternCharacter  UnicodeCharacter except ^ | $ | \ | . | * | + | ? | ( | ) | [ | ] | { | } | |

Escapes

NullEscape  \ _
AtomEscape 
   DecimalEscape
|  CharacterEscape
|  CharacterClassEscape
CharacterEscape 
   ControlEscape
|  c ControlLetter
|  HexEscape
|  IdentityEscape
ControlLetter 
   A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
|  a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
IdentityEscape  UnicodeCharacter except _ | UnicodeAlphanumeric
ControlEscape 
   f
|  n
|  r
|  t
|  v

Decimal Escapes

DecimalEscape  DecimalIntegerLiteral [lookahead{DecimalDigit}]
DecimalIntegerLiteral 
   0
|  NonZeroDecimalDigits
NonZeroDecimalDigits 
   NonZeroDigit
|  NonZeroDecimalDigits DecimalDigit
NonZeroDigit  1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Hexadecimal Escapes

HexEscape 
   x HexDigit HexDigit
|  u HexDigit HexDigit HexDigit HexDigit
HexDigit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f

Character Class Escapes

CharacterClassEscape 
   s
|  S
|  d
|  D
|  w
|  W

User-Specified Character Classes

CharacterClass 
   [ [lookahead{^}] ClassRanges ]
|  [ ^ ClassRanges ]
ClassRanges 
   «empty»
|  NonemptyClassRangesdash
  {dashnoDash}
NonemptyClassRanges 
   ClassAtomdash
|  ClassAtom NonemptyClassRangesnoDash
|  ClassAtom - ClassAtomdash ClassRanges
|  NullEscape ClassRanges

Character Class Range Atoms

ClassAtom 
   ClassCharacter
|  \ ClassEscape
ClassCharacterdash  UnicodeCharacter except \ | ]
ClassCharacternoDash  ClassCharacterdash except -
ClassEscape 
   DecimalEscape
|  b
|  CharacterEscape
|  CharacterClassEscape

Waldemar Horwat
Last modified Monday, October 8, 2001
previousupnext