ECMAScript 4 Netscape Proposal
Formal Description
Regular Expression Grammar

Monday, June 9, 2003

This LR(1) grammar describes the regular expression syntax of the ECMAScript 4 proposal. See also the description of the grammar notation.

This document is also available as a Word RTF file.

Unicode Character Classes

UnicodeCharacter ⇒ Any Unicode character
UnicodeAlphanumeric ⇒ Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)
LineTerminator ⇒ «LF» | «CR» | «u0085» | «u2028» | «u2029»
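
As an illustration only, the LineTerminator production can be read as a simple membership test. The following Python sketch is not part of the proposal; the names LINE_TERMINATORS and is_line_terminator are hypothetical.

```python
# Hypothetical helper: the five code points listed by LineTerminator
# («LF», «CR», «u0085», «u2028», «u2029»).
LINE_TERMINATORS = frozenset("\n\r\u0085\u2028\u2029")


def is_line_terminator(ch: str) -> bool:
    """Return True if ch is a single character matched by LineTerminator."""
    return len(ch) == 1 and ch in LINE_TERMINATORS
```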

Regular Expression Definitions

Regular Expression Patterns

RegularExpressionPattern ⇒ Disjunction

Disjunctions

Disjunction ⇒
   Alternative
|  Alternative | Disjunction

Alternatives

Alternative ⇒
   «empty»
|  Alternative Term

Terms

Term ⇒
   Assertion
|  Atom
|  Atom Quantifier
Quantifier ⇒
   QuantifierPrefix
|  QuantifierPrefix ?
QuantifierPrefix ⇒
   *
|  +
|  ?
|  { DecimalDigits }
|  { DecimalDigits , }
|  { DecimalDigits , DecimalDigits }
DecimalDigits ⇒
   DecimalDigit
|  DecimalDigits DecimalDigit
DecimalDigit ⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
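
The Quantifier productions above can be sketched as a small recognizer. This Python example is illustrative and not part of the proposal: parse_quantifier and its return shape are hypothetical, and reading a trailing ? as "non-greedy" is an assumption carried over from ECMAScript 3, since the grammar itself defines only syntax.

```python
import re
from typing import Optional, Tuple

# Matches the three brace forms of QuantifierPrefix:
# { DecimalDigits }, { DecimalDigits , } and { DecimalDigits , DecimalDigits }.
_BRACE = re.compile(r"\{([0-9]+)(?:(,)([0-9]*))?\}")


def parse_quantifier(text: str) -> Optional[Tuple[int, Optional[int], bool, int]]:
    """Recognize one Quantifier at the start of text (hypothetical helper).

    Returns (min, max, greedy, length), where max is None when there is no
    upper bound, or None if text does not start with a Quantifier.
    """
    if not text:
        return None
    simple = {"*": (0, None), "+": (1, None), "?": (0, 1)}
    if text[0] in simple:
        lo, hi = simple[text[0]]
        length = 1
    else:
        m = _BRACE.match(text)
        if m is None:
            return None
        lo = int(m.group(1))
        if m.group(2) is None:      # { DecimalDigits }
            hi = lo
        elif m.group(3) == "":      # { DecimalDigits , }
            hi = None
        else:                       # { DecimalDigits , DecimalDigits }
            hi = int(m.group(3))
        length = m.end()
    # Quantifier => QuantifierPrefix ? : a trailing "?" belongs to the quantifier.
    greedy = not text.startswith("?", length)
    if not greedy:
        length += 1
    return lo, hi, greedy, length
```

For example, parse_quantifier("{2,5}x") consumes five characters and leaves the x for the next Term.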

Assertions

Assertion ⇒
   ^
|  $
|  \ b
|  \ B

Atoms

Atom ⇒
   PatternCharacter
|  .
|  NullEscape
|  \ AtomEscape
|  CharacterClass
|  ( Disjunction )
|  ( ? : Disjunction )
|  ( ? = Disjunction )
|  ( ? ! Disjunction )
PatternCharacter ⇒ UnicodeCharacter except ^ | $ | \ | . | * | + | ? | ( | ) | [ | ] | { | } | |
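
As a sketch of how the PatternCharacter exclusion list might be applied, the following Python fragment tests a single character against the fourteen excluded syntax characters. The names SYNTAX_CHARACTERS and is_pattern_character are hypothetical and not part of the proposal.

```python
# Hypothetical helper: the characters excluded by PatternCharacter,
# namely ^ $ \ . * + ? ( ) [ ] { } |
SYNTAX_CHARACTERS = frozenset("^$\\.*+?()[]{}|")


def is_pattern_character(ch: str) -> bool:
    """Return True if ch is a single character that may appear unescaped as an Atom."""
    return len(ch) == 1 and ch not in SYNTAX_CHARACTERS
```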

Escapes

NullEscape ⇒ \ _
AtomEscape ⇒
   DecimalEscape
|  CharacterEscape
|  CharacterClassEscape
CharacterEscape ⇒
   ControlEscape
|  c ControlLetter
|  HexEscape
|  IdentityEscape
ControlLetter ⇒
   A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
|  a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
IdentityEscape ⇒ UnicodeCharacter except _ | UnicodeAlphanumeric
ControlEscape ⇒
   f
|  n
|  r
|  t
|  v
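
The following Python sketch illustrates two of the escape productions above. It is not part of the proposal. The character values in CONTROL_ESCAPES are an assumption taken from ECMAScript 3, since the grammar lists only the letters, and is_identity_escape approximates UnicodeAlphanumeric with str.isalpha and str.isdecimal. Both names are hypothetical.

```python
# Assumed values for ControlEscape, following ECMAScript 3:
# \f form feed, \n line feed, \r carriage return, \t tab, \v vertical tab.
CONTROL_ESCAPES = {"f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def is_identity_escape(ch: str) -> bool:
    """IdentityEscape: any character except "_" and UnicodeAlphanumeric.

    str.isalpha/str.isdecimal are used as an approximation of
    "alphabetic or decimal digit"; this is an assumption of the sketch.
    """
    if len(ch) != 1 or ch == "_":
        return False
    return not (ch.isalpha() or ch.isdecimal())
```

Under this reading, \. and \/ are identity escapes, while \_ is reserved for NullEscape and \q is not a valid escape at all.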

Decimal Escapes

DecimalEscape ⇒ DecimalIntegerLiteral [lookahead∉{DecimalDigit}]
DecimalIntegerLiteral ⇒
   0
|  NonZeroDecimalDigits
NonZeroDecimalDigits ⇒
   NonZeroDigit
|  NonZeroDecimalDigits DecimalDigit
NonZeroDigit ⇒ 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
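
The lookahead restriction on DecimalEscape can be illustrated with a short recognizer. This Python sketch is not part of the proposal and parse_decimal_escape is a hypothetical name. It operates on the text that follows the backslash.

```python
import re
from typing import Optional, Tuple

# DecimalIntegerLiteral is either "0" or a digit run that does not start with 0.
# The negative lookahead mirrors [lookahead∉{DecimalDigit}].
_DECIMAL_ESCAPE = re.compile(r"(0|[1-9][0-9]*)(?![0-9])")


def parse_decimal_escape(text: str) -> Optional[Tuple[int, int]]:
    """Return (value, length) for a DecimalEscape at the start of text, else None."""
    m = _DECIMAL_ESCAPE.match(text)
    if m is None:
        return None
    return int(m.group(1)), m.end()
```

Because of the lookahead, an input such as 01 is rejected rather than being read as the escape 0 followed by the character 1.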

Hexadecimal Escapes

HexEscape ⇒
   x HexDigit HexDigit
|  u HexDigit HexDigit HexDigit HexDigit
HexDigit ⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f
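
The two HexEscape forms can be sketched as follows. This Python example is not part of the proposal: parse_hex_escape is a hypothetical name, and mapping the digits to the character with that code point value is an assumption based on ECMAScript 3. The input is the text that follows the backslash.

```python
import re
from typing import Optional, Tuple

# x HexDigit HexDigit  |  u HexDigit HexDigit HexDigit HexDigit
_HEX_ESCAPE = re.compile(r"x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})")


def parse_hex_escape(text: str) -> Optional[Tuple[str, int]]:
    """Return (character, length) for a HexEscape at the start of text, else None."""
    m = _HEX_ESCAPE.match(text)
    if m is None:
        return None
    digits = m.group(1) or m.group(2)
    return chr(int(digits, 16)), m.end()
```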

Character Class Escapes

CharacterClassEscape ⇒
   s
|  S
|  d
|  D
|  w
|  W

User-Specified Character Classes

CharacterClass ⇒
   [ [lookahead∉{^}] ClassRanges ]
|  [ ^ ClassRanges ]
ClassRanges ⇒
   «empty»
|  NonemptyClassRanges^dash
δ ∈ {dash, noDash}
NonemptyClassRanges^δ ⇒
   ClassAtom^dash
|  ClassAtom^δ NonemptyClassRanges^noDash
|  ClassAtom^δ - ClassAtom^dash ClassRanges
|  NullEscape ClassRanges

Character Class Range Atoms

ClassAtom^δ ⇒
   ClassCharacter^δ
|  \ ClassEscape
ClassCharacter^dash ⇒ UnicodeCharacter except \ | ]
ClassCharacter^noDash ⇒ ClassCharacter^dash except -
ClassEscape ⇒
   DecimalEscape
|  b
|  CharacterEscape
|  CharacterClassEscape

Waldemar Horwat
Last modified Monday, June 9, 2003