Intro to Compilers
- Fortrans : the first compiler
- Led to an enormous body of theoretical
- Mordern compilers preserve the outline of FORTRAN I
Compilers Structure
- Lexical Analysis
- Parsing
- Semantic Analysis
- Optimization
- Code Generation
Lexical analysis divideds program text into “words” or “tokens”
An implement must do two things:
- Recognize substrings corresponding to tokens (The lexemes)
- Identify the token class of each lexeme
Languages
- S: a set of characters
- A: a set of strings drawn from S
- Epsilon: set with a single, empty string
-
A+B: {s s from A or s from B} - AB: {ab: a from A and b from B}
- A*: A出现任意次
Finite Automata
Transition In state S1 on input a go to state S2 If end of input and in accepting state => accept Otherwise =>terminates in not accepting state or get stuck => reject
Langage of a FA = set of accept states
DFA: Deterministic Finite Automata
NFA: Nondeterministic Finite Automata
CFGs
we need
- A language for describing valid strings of tokens
- A method for distinguishing valid from invalid strings of tokens
Programming languges have recursive structure
A CFG consists of
- A set of terminals: T
- A set of non-terminals: N
- A start symbol: S
- A set of productions: X -> Y1..Yn (X is N, Y is N+T+{epsilon})