To compile a program to the Python Virtual Machine using a home brewed compiler
fn fib(n)
if n == 0 or n == 1
return n
else {
previous = 1
current = 1
for i in range(2, n) {
tmp = current + previous
previous = current
current = tmp
}
return current
}
From https://devguide.python.org/compiler
Input |
|
---|---|
Output |
|
sequence | e1 e2 |
|
ordered choice | e1 / e2 |
|
not predicate | !e |
|
and predicate | &e |
(sugar for !!e ) |
zero or more | e* |
|
one or more | e+ |
(sugar for ee* ) |
optional | e? |
(sugar for &ee / !e ) |
Grammar | Input |
---|---|
|
|
Output |
---|
|
Traverse the parse tree and apply user-defined functions to each node after parsing has finished successfully
const parser = peg
.pegc('./csv.peg')
.bind({
File: ({ action, visit }) => [action, visit()],
Line: ({ action, visit }) => [action, visit()],
Value: ({ action, visit }) => [action, visit().join('')],
});
parser(input);
What happens if semantic actions are applied before parsing is finished
A trade-off of having support for semantic actions embedded within the PEG text
With a PEG implementation at hand, the next step is to get the
language grammar ready to take input text and generate an Abstract
Syntax Tree (or AST
for short). Take a look at the grammar.
Where do we store x
in the example below when plus_n
is
finished executing
fn plus_n(x) fn(y) x + y
plus_two = plus_n(2)
plus_five = plus_n(5)
print(plus_two(2)) # Equals 4
print(plus_five(2)) # Equals 7
Data structure for bookkeeping information about variables. The Python compiler has one, so does the Effigy compiler.
|
Symbol Table | Annotated AST |
---|---|
|
|
Python | Effigy |
---|---|
|
|
The traversals on the AST for mapping the scope rules and for emitting code are guided by a PEG grammar with different sets of semantic actions. Take a look at the grammar.
PyObject_HEAD
int co_argcount; /* #arguments, except *args */
int co_posonlyargcount; /* #positional only arguments */
int co_kwonlyargcount; /* #keyword only arguments */
int co_nlocals; /* #local variables */
int co_stacksize; /* #entries needed for evaluation stack */
int co_firstlineno; /* first source line number */
PyObject *co_code; /* instruction opcodes */
PyObject *co_consts; /* list (constants used) */
PyObject *co_names; /* list of strings (names used) */
PyObject *co_varnames; /* tuple of strings (local variable names) */
PyObject *co_freevars; /* tuple of strings (free variable names) */
PyObject *co_cellvars; /* tuple of strings (cell variable names) */
Source | Code Object |
---|---|
|
|
The AST is traversed by a PEG grammar and code is emitted during the semantic action execution.
enter
: Enter a new scope (create Code Object)leave
: Leave scope & return Code Objectemit
: Add new instruction co_code
attr
: Setter & Getter for object tablespos
: Index of the current instructionref
: Create a new labelfix
: Fix parameter of already emitted instructionIfStm: ({ visit, node }) => {
const [test, body, elsestm] = node[1];
visit(test.value); // Visit the test expression
const lb0 = ref();
emit('pop-jump-if-false', lb0);
visit(body.value); // Visit the body of the statement
if (elsestm) {
const lb1 = ref();
emit('jump-forward', lb1);
const savedPos = pos();
fixjabs(lb0);
visit(elsestm.value); // Visit the body of `else' branche
fixjrel(lb1, savedPos);
} else {
fixjabs(lb0);
}
return true;
},
The top-level Code Object returned by the traversal for emitting code and then it is marshaled and written right after to header into a pyc file.