Sunday, March 29, 2009

Formal Grammars and Tools for Java

A grammar is formally defined as the ordered quad-tuple (N,Σ,P,S),
where:
  • a finite set N of nonterminal symbols;
  • a finite set Σ of terminal symbols that is disjoint from N;
  • a finite set P of production rules;
  • a distinguished symbol S from set N that is the start symbol.
Chomsky hierarchy of classes of formal grammars:
  • type 0 - unrestricted grammars - include all formal grammars;
  • type 1 - context-sensitive grammars - generate the context-sensitive languages;
  • type 2 - context-free grammars - generate the context-free languages. Context free languages are the theoretical basis for the syntax of most programming languages;
  • type 3 - regular grammars - generate the regular languages. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.


In computer science, Extended Backus–Naur Form (EBNF) is a metasyntax notation used to express context-free grammars: that is, a formal way to describe computer programming languages and other formal languages. It is an extension of the basic Backus–Naur Form (BNF) metasyntax notation.

Compiler of computer programming languages consists of:
Types of parsers:
Sample:
  • grammar:
    S ::= Ax
    A ::= a
    A ::= b

  • input sequence:
    ax

  • Top-down parsing:
    S → Ax → ax

  • Bottom-up parsing:
    ax → Ax → S
Top-down parsers:
  • LL parser (Left-to-right, Leftmost derivation)
Bottom-up parsers:
  • LR parser (Left-to-right, Rightmost derivation)
  • SLR parser (Simple LR parser)
  • LALR parser (lookahead LR parser)
In good review of Lex and Yacc for Java next tools were selected:
I think that ANTLR is one of the best tools because:

Screens of ANTLR GUI:

No comments: