ZeePedia

Context Free Grammar (CFG), CFG terminologies

<< finiteness of a language
Trees >>
img
Theory of Automata
(CS402)
Theory of Automata
Lecture N0. 31
Reading Material
Chapter 12
Introduction to Computer Theory
Summary
Context Free Grammar, Terminals, non-terminals, productions, CFG, context Free language, examples.
Context Free Grammar (CFG)
The earliest computers accepted no instructions other then their own assembly language. Every procedure, no
matter how complicated , had to be encoded in the set of instructions, LOAD, STORE, ADD the contents of two
registers and so on. The major problem was to display mathematical formulas as follows
(8 - 0 ) 2 + ( 7 - 10 ) 2 + (11 - 10 ) 2
S=
2
or
1
+9
2
A=
8
5
4+  +
1
21
3+
2
So, it was necessary to develop a way of writing such expressions in one line of standard typewriter symbols, so
that in this way a high level language could be invented. Before the invention of computers, no one would ever
have dreamed of writing such complicated formula in parentheses e.g. the right side of formula can be written as
((1/2)+9)/(4+(8/21)+(5/(3+(1/2))))
The high level language is converted into assembly language codes by a program called compiler.
The compiler that takes the user's programs as its inputs and prints out an equivalent program written in
assembly language.
Like spoken languages, high level languages for computer have also, certain grammar. But in case of computers,
the grammatical rules, don't involve the meaning of the words.
It can be noted that the grammatical rules which involve the meaning of words are called Semantics, while
those don't involve the meaning of the words are called Syntactics.
e.g. in English language, it can not be written " Buildings sing ", while in computer language one number is as
good as another.
e.g. X = B + 10,  X = B + 999
Remark
In general, the rules of computer language grammar, are all syntactic and not semantic. A law of grammar is in
reality a suggestion for possible substitutions.
CFG terminologies
Terminals: The symbols that can't be replaced by anything are called terminals.
Non-Terminals: The symbols that must be replaced by other things are called non-terminals.
Productions: The grammatical rules are often called productions.
CFG
CFG is a collection of the followings
An alphabet S of letters called terminals from which the strings are formed, that will be the words of the
language.
A set of symbols called non-terminals, one of which is S, stands for "start here".
A finite set of productions of the form
non-terminal Æ finite string of terminals and /or non-terminals.
Note
The terminals are designated by small letters, while the non-terminals are designated by capital letters.
There is at least one production that has the non-terminal S as its left side.
92
img
Theory of Automata
(CS402)
Context Free Language (CFL)
The language generated by CFG is called Context Free Language (CFL).
Example
S = {a}
productions:
S ÆaS
SÆY
Applying production (1) six times and then production (2) once, the word aaaaaa is generated as
S fi aS
fi aaS
fi aaaS
fi aaaaS
fi aaaaaS
fi aaaaaaS
fi aaaaaaL
= aaaaaa
It can be observed that prod (2) generates L, a can be generated applying prod. (1) once and then prod. (2), aa
can be generated applying prod. (1) twice and then prod. (2) and so on. This shows that the grammar defines the
language expressed by a*.
Example
S = {a}
productions:
SÆSS
SÆa
SÆL
This grammar also defines the language expressed by a*.
Note: It is to be noted that L is not considered to be terminal. It has a special status. If for a certain non-terminal
N, there may be a production NÆL. This simply means that N can be deleted when it comes in the working
string.
Example
S = {a,b}
productions:
SÆX
SÆY
XÆL
YÆaY
YÆbY
YÆa
YÆb
All words of this language are of either X-type or of Y-type. i.e. while generating a word the first production
used is SÆX or SÆY. The words of X-type give only L, while the words of Y-type are words of finite strings
of a's or b's or both i.e. (a+b)+. Thus the language defined is expressed by (a+b)*.
Example
S = {a,b}
productions:
SÆaS
SÆbS
SÆa
SÆb
SÆL
This grammar also defines the language expressed by (a+b)*.
93
img
Theory of Automata
(CS402)
Example
S = {a,b}
productions:
SÆXaaX
XÆaX
XÆbX
XÆL
This grammar defines the language expressed by (a+b)*aa(a+b)*.
94