ECMAScript 4 Netscape Proposal
Formal Description
Regular Expression Grammar

Monday, June 9, 2003

This LR(1) grammar describes the regular expression syntax of the ECMAScript 4 proposal. See also the description of the grammar notation.

This document is also available as a Word RTF file.

Unicode Character Classes

UnicodeCharacter  Any Unicode character
UnicodeAlphanumeric  Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)
LineTerminator  «LF» | «CR» | «u0085» | «u2028» | «u2029»
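
As an illustration only, the LineTerminator class can be written as a small membership test. The names `LINE_TERMINATORS` and `is_line_terminator` are hypothetical and do not come from the proposal.

```python
# The five code points listed in the LineTerminator production:
# LF, CR, U+0085, U+2028, U+2029.
LINE_TERMINATORS = frozenset("\n\r\u0085\u2028\u2029")


def is_line_terminator(ch: str) -> bool:
    """Return True if ch is a single LineTerminator character."""
    return len(ch) == 1 and ch in LINE_TERMINATORS
```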

Regular Expression Definitions

Regular Expression Patterns

RegularExpressionPattern  Disjunction

Disjunctions

Disjunction 
   Alternative
|  Alternative | Disjunction

Alternatives

Alternative 
   «empty»
|  Alternative Term

Terms

Term 
   Assertion
|  Atom
|  Atom Quantifier
Quantifier 
   QuantifierPrefix
|  QuantifierPrefix ?
QuantifierPrefix 
   *
|  +
|  ?
|  { DecimalDigits }
|  { DecimalDigits , }
|  { DecimalDigits , DecimalDigits }
DecimalDigits 
   DecimalDigit
|  DecimalDigits DecimalDigit
DecimalDigit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
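
The Quantifier productions can be sketched as a small hand-written recognizer. This is a minimal sketch, not part of the proposal: the function name `parse_quantifier` and its return shape are hypothetical, and reading the bounds as minimum and maximum repeat counts (with a trailing `?` meaning non-greedy) follows common regular expression convention rather than anything stated in this grammar.

```python
from typing import Optional, Tuple

DIGITS = "0123456789"  # DecimalDigit


def parse_quantifier(s: str, i: int = 0) -> Optional[Tuple[int, Optional[int], bool, int]]:
    """Try to read a Quantifier starting at s[i].

    Returns (minimum, maximum, greedy, next_index), where maximum is None
    when there is no upper bound, or None if no Quantifier starts at i.
    """
    if i >= len(s):
        return None
    ch = s[i]
    if ch == "*":
        lo, hi, j = 0, None, i + 1
    elif ch == "+":
        lo, hi, j = 1, None, i + 1
    elif ch == "?":
        lo, hi, j = 0, 1, i + 1
    elif ch == "{":
        j = start = i + 1
        while j < len(s) and s[j] in DIGITS:
            j += 1
        if j == start:
            return None  # DecimalDigits must contain at least one digit
        lo = hi = int(s[start:j])
        if j < len(s) and s[j] == ",":
            j += 1
            start = j
            while j < len(s) and s[j] in DIGITS:
                j += 1
            # "{n,}" has no upper bound; "{n,m}" has upper bound m.
            hi = int(s[start:j]) if j > start else None
        if j >= len(s) or s[j] != "}":
            return None
        j += 1
    else:
        return None
    # Quantifier: QuantifierPrefix optionally followed by "?".
    greedy = True
    if j < len(s) and s[j] == "?":
        greedy = False
        j += 1
    return lo, hi, greedy, j
```

The sketch treats the spaces between grammar tokens as notation only, so `{2,5}` is accepted and `{ 2 , 5 }` is not.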

Assertions

Assertion 
   ^
|  $
|  \ b
|  \ B

Atoms

Atom 
   PatternCharacter
|  .
|  NullEscape
|  \ AtomEscape
|  CharacterClass
|  ( Disjunction )
|  ( ? : Disjunction )
|  ( ? = Disjunction )
|  ( ? ! Disjunction )
PatternCharacter  UnicodeCharacter except ^ | $ | \ | . | * | + | ? | ( | ) | [ | ] | { | } | |
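
For illustration, the PatternCharacter exclusion list can be expressed as a set lookup. The names `SYNTAX_CHARACTERS` and `is_pattern_character` are hypothetical.

```python
# The fourteen characters excluded from PatternCharacter:
# ^ $ \ . * + ? ( ) [ ] { } |
SYNTAX_CHARACTERS = frozenset("^$\\.*+?()[]{}|")


def is_pattern_character(ch: str) -> bool:
    """Return True if ch is a single character that may stand for itself."""
    return len(ch) == 1 and ch not in SYNTAX_CHARACTERS
```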

Escapes

NullEscape  \ _
AtomEscape 
   DecimalEscape
|  CharacterEscape
|  CharacterClassEscape
CharacterEscape 
   ControlEscape
|  c ControlLetter
|  HexEscape
|  IdentityEscape
ControlLetter 
   A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
|  a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
IdentityEscape  UnicodeCharacter except _ | UnicodeAlphanumeric
ControlEscape 
   f
|  n
|  r
|  t
|  v
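
The grammar above gives only the syntax of these escapes. The following sketch decodes two of the CharacterEscape alternatives under stated assumptions: the ControlEscape letters are taken to denote form feed, line feed, carriage return, tab, and vertical tab, and `c ControlLetter` is taken to denote the letter's code point modulo 32, as in ECMA-262 edition 3. Neither interpretation is stated in this document, and the name `decode_control` is hypothetical.

```python
import string

# Assumed meanings of the ControlEscape letters (not stated in the grammar).
CONTROL_ESCAPES = {"f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def decode_control(s: str, i: int = 0):
    """Decode a ControlEscape or "c ControlLetter" starting at s[i].

    s[i] is the character right after the backslash.
    Returns (character, next_index), or None if neither form matches.
    """
    if i < len(s) and s[i] in CONTROL_ESCAPES:
        return CONTROL_ESCAPES[s[i]], i + 1
    if i + 1 < len(s) and s[i] == "c" and s[i + 1] in string.ascii_letters:
        # Assumption: control character = code point of the letter mod 32.
        return chr(ord(s[i + 1]) % 32), i + 2
    return None
```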

Decimal Escapes

DecimalEscape  DecimalIntegerLiteral [lookahead∉{DecimalDigit}]
DecimalIntegerLiteral 
   0
|  NonZeroDecimalDigits
NonZeroDecimalDigits 
   NonZeroDigit
|  NonZeroDecimalDigits DecimalDigit
NonZeroDigit  1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
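
A minimal sketch of a DecimalEscape recognizer follows. It reads the lookahead annotation as a restriction that the character after the literal must not be a DecimalDigit, which is an assumption about the notation. The function name `parse_decimal_escape` is hypothetical.

```python
DIGITS = "0123456789"


def parse_decimal_escape(s: str, i: int = 0):
    """Read a DecimalEscape starting at s[i], the character after the backslash.

    Returns (value, next_index), or None if no DecimalEscape starts at i.
    """
    if i >= len(s):
        return None
    if s[i] == "0":
        j = i + 1  # DecimalIntegerLiteral: 0
    elif s[i] in "123456789":
        j = i + 1  # NonZeroDigit followed by any number of DecimalDigits
        while j < len(s) and s[j] in DIGITS:
            j += 1
    else:
        return None
    # Lookahead restriction: the next character must not be a DecimalDigit.
    if j < len(s) and s[j] in DIGITS:
        return None
    return int(s[i:j]), j
```

Under this reading the restriction only matters after a leading `0`: `\0` is a DecimalEscape, while `\01` is not.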

Hexadecimal Escapes

HexEscape 
   x HexDigit HexDigit
|  u HexDigit HexDigit HexDigit HexDigit
HexDigit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f
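
The two HexEscape forms differ only in the number of digits, which the sketch below makes explicit. Returning the character with the given code point is an assumed interpretation, since this grammar defines syntax only. The name `parse_hex_escape` is hypothetical.

```python
HEX_DIGITS = frozenset("0123456789ABCDEFabcdef")  # HexDigit


def parse_hex_escape(s: str, i: int = 0):
    """Read a HexEscape starting at s[i], the character after the backslash.

    Returns (character, next_index), or None if no HexEscape starts at i.
    """
    if i >= len(s) or s[i] not in ("x", "u"):
        return None
    width = 2 if s[i] == "x" else 4  # x takes two digits, u takes four
    digits = s[i + 1 : i + 1 + width]
    if len(digits) != width or any(d not in HEX_DIGITS for d in digits):
        return None
    return chr(int(digits, 16)), i + 1 + width
```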

Character Class Escapes

CharacterClassEscape 
   s
|  S
|  d
|  D
|  w
|  W

User-Specified Character Classes

CharacterClass 
   [ [lookahead∉{^}] ClassRanges ]
|  [ ^ ClassRanges ]
ClassRanges 
   «empty»
|  NonemptyClassRanges^dash
δ ∈ {dash, noDash}
NonemptyClassRanges^δ 
   ClassAtom^dash
|  ClassAtom^δ NonemptyClassRanges^noDash
|  ClassAtom^δ - ClassAtom^dash ClassRanges
|  NullEscape ClassRanges

Character Class Range Atoms

ClassAtom^δ 
   ClassCharacter^δ
|  \ ClassEscape
ClassCharacter^dash  UnicodeCharacter except \ | ]
ClassCharacter^noDash  ClassCharacter^dash except -
ClassEscape 
   DecimalEscape
|  b
|  CharacterEscape
|  CharacterClassEscape

Waldemar Horwat
Last modified Monday, June 9, 2003