Monday, 4 March 2013

3.1.6 Regular Expressions, Backus-Naur Form and Reverse Polish Notation

1.6 Regular Expressions, Backus-Naur Form and Reverse Polish Notation
Natural Language
Set of Words
Syntax – Rules for putting Words together
Semantics – Meaning
The peanut ate the monkey
                                Syntax – correct
                                Semantics - incorrect
Formal Language
Alphabet
Syntax
(No Semantics)
Examples
                                Post Code
                                Car Number Plates
Meta Languages
These are used to express the rules of a formal language.
Two example of Meta Languages :
                                                Regular Expressions
                                                Backus-Naur Form
Regular Language
Is one that can be expressed as a Finite State Automaton (Non-deterministic Finite State Machine)
Regular Expression
Describes Valid Strings in a Formal Language
Example of Regular Expression Notation
                                                a(ab)*       
               
                                = or      * = Zero or more occurrences
Valid                      a              aa           aabab
Invalid   b             ba           abc

Regular Expression Notation - Basic Symbols
* - Zero or More                                               a*
- Or                                                                    ab
+ - One or More                                                               a+
? – Zero or One                                                 aab?

More Symbols (Meta Characters)
( )   - Brackets
[ ]   - Alternate OR                                            [ab]
-     - Sequence                                  a-z
^    - Not                                                               ^t
\    - Escape Meta Meaning or Shorthand Classes                                               (See Book P.61)
.     - Wildcard Character

Applications
Used in searches for pattern matching…
Eg Google Searches

Backus-Naur Form
It is very difficult to define Programming Languages or Natural Language with regular expressions so we use Backus-Naur Form
Example BNF Definition
Unsigned Integer (this is recursive)
<unsigned integer> ::= <digit><digit><unsigned integer>
<digit> ::= 0123456789

Syntax Diagrams
This is another way of defining rules of syntax, in for example a programming language.

<expression> ::= <term> | <term> "+" <expression>


<term> ::= <factor> | <factor> "*" <term>



Reverse Polish Notation

Infix (Normal) is difficult for computers..
4 + 5 x 3                                = 19        Which bit do we do first?
Postfix (RPN) is easy for computers
4 5 +                                       = [in infix]            4 + 5    = 9            
x y – x y + /                         = [in infix]            ( x – y ) / ( x + y )
RPN Removes the need for Brackets


No comments:

Post a Comment