Basic Idea |
|
Convert characters into tokens Run characters through a finite state machine, taking actions on the edges Use a table to represent the transition function of the FSM |
Finite State Machine Machine to accept words that contain 0101 A touch of Formalism that I'm too lazy to typeset A table-based lexer simulates a FSM by having an array that maps the current state and the current input character to new state. It is easy to add code to the transtions, providing for very elaborate behavior. Without this code a FSM can recoginize only regular languages, with it, it is easy to recoginize context-free languages, and even build a parse tree as you go. |
|
Exercise |
|
Create a simple lexer to handle the language of words inside of balanced parens. Your program should read from standard input until it encounters the character $, when it encounters it, it should produce either the message "Invalid Input" if the input was not valid, or the words from the input, separated by spaces.
Examples:
Monday Lab => Wednesday 5:00PM Wednesday Lab => Friday 5:00PM |