This project is intended to increase your understanding of languages and grammars. In a regular grammar (RG), all production rules must have one of the following forms:

computer science

Description

COMPUTER THEORY PROJECT #2:  Context-Free Grammars

Spring 2020

Due: 2/28/20

 

This project is intended to increase your understanding of languages and grammars. In a regular grammar (RG), all production rules must have one of the following forms:

 

                        Ni = t1t2t3...tkNj

Ni = t1t2t3...tk

 

where ti  denotes a terminal (alphabet symbol), and Nj and Nj denote nonterminals. Any language defined by a regular grammar is a regular language. Regular languages can also be defined by finite automata, transition graphs, and regular expressions.

 

More complex grammars, such as context-free grammars (CFG), can define additional languages beyond regular languages. In a context-free grammar, the production rules have the form:

 

                        Ni = any string of terminals and nonterminals

 

where Ni denotes a nonterminal. Any language defined by a context-free grammar is a context-free language. Regular languages are a proper subset of the set of all context-free languages.

 

Part A

 

Write a CFG class that meets the following design specifications:

 

Instance variables:

String[] Code                -- production rules as program code

char     startNT       -- starting nonterminal

 

Constructor:

CFG(String[] C)

 

Methods:

     char getStartNT()

     void setStartNT(char stNT)

     boolean processData(String inString, String wkString)

 

The CFG class includes an instance variable Code of type String. The Code array contains program statements that define the production rules for the grammar. The instruction format for a CFG is:

 

     LHS=>RHS

 

where LHS is the left-hand side and RHS is the right-hand side of a production rule. For a CFG, the LHS character is a single nonterminal. The RHS string can contain both terminals and nonterminals. For example, the statement  S=>aTa, with LHS = S and RHS = aTa, states that S can be replaced by aTa.

 

It is assumed that the starting nonterminal is the LHS value for the first production rule. The setStartNT method can be used to change the starting nonterminal.

 

Note that you will need a recursive algorithm for the processData method, since two production rules may have the same LHS value. Your CFG class should work for the sample test program you will be given. Turn in the source code for your CFG class.

 

Part B

 

Define a context-free grammar for each of the languages described below. Then write a test program to implement the grammar as an instance of your CFG class. Each context-free grammar should be defined in a separate test program.

 

1.     A CFG for alphabet {a,b} that recognizes the language consisting of all strings that start with an odd number of a's followed by the same number of b's. Test your program with the following input strings:

                        ab, aabb, aaabbb, aaabbbbb, aaaabbb

 

2.     A CFG for alphabet {a,b} that recognizes the language consisting of all strings of length 1 or greater that do not contain the substring aa.  Test your program with the following input strings:

                        abba, abbabaaa, abaabab, bababbab, bbbabba

 

3.     A CFG for alphabet {a,b} that recognizes the language consisting of all strings that contain exactly one b, have 2N a's (N >= 0, N is an integer) before the b, and 2N+1 a's after the b. Test your program with the following input strings:

                        ba, aaabaaaa, aabaaa, abaa, aaaabaaa

 

4.     A CFG for alphabet {x,y,z} that recognizes the language consisting of all strings that start with z, followed by N x's (N >= 0), followed by twice as many y's, and ending with z. Test your program with the following input strings:

                        zz, zxxyyz, zxxyyyy, zxyyz, zxxyyyyz

 

For each context-free grammar in Part B, turn in your definition of the grammar, the source code for the test program, and the output for the test input strings.


Related Questions in computer science category