1.6 Regular Expressions, Backus-Naur Form and Reverse Polish Notation
Natural Language
Set of Words
Syntax – Rules for putting Words together
Semantics – Meaning
The peanut ate the monkey
Syntax – correct
Semantics - incorrect
Formal Language
Alphabet
Syntax
(No Semantics)
Examples
Post Code
Car Number Plates
Meta Languages
These are used to express the rules of a formal language.
Two example of Meta Languages :
Regular Expressions
Backus-Naur Form
Regular Language
Is one that can be expressed as a Finite State Automaton (Non-deterministic Finite State Machine)
Regular Expression
Describes Valid Strings in a Formal Language
Example of Regular Expression Notation
a(a│b)*
│ = or * = Zero or more occurrences
Valid a aa aabab
Invalid b ba abc
Regular Expression Notation - Basic Symbols
* - Zero or More a*
│ - Or a│b
+ - One or More a+
? – Zero or One aab?
More Symbols (Meta Characters)
( ) - Brackets
[ ] - Alternate OR [ab]
- - Sequence a-z
^ - Not ^t
\ - Escape Meta Meaning or Shorthand Classes (See Book P.61)
. - Wildcard Character
Applications
Used in searches for pattern matching…
Eg Google Searches
Backus-Naur Form
It is very difficult to define Programming Languages or Natural Language with regular expressions so we use Backus-Naur Form
Example BNF Definition
Unsigned Integer (this is recursive)
<unsigned integer> ::= <digit>│<digit><unsigned integer>
<digit> ::= 0│1│2│3│4│5│6│7│8│9
Syntax Diagrams
This is another way of defining rules of syntax, in for example a programming language.
<expression> ::= <term> | <term> "+" <expression>
<term> ::= <factor> | <factor> "*" <term>
Reverse Polish Notation
Infix (Normal) is difficult for computers..
4 + 5 x 3 = 19 Which bit do we do first?
Postfix (RPN) is easy for computers
4 5 + = [in infix] 4 + 5 = 9
x y – x y + / = [in infix] ( x – y ) / ( x + y )
RPN Removes the need for Brackets
No comments:
Post a Comment