Commit 36ee3a00 authored by thiarichey

More grammar fuzz script.

parent f0043966
File added
......@@ -185,7 +185,9 @@ generally we expect all other results to become more frequent.
### The generating procedure
Similar to the previous fuzzer, except that it tracks an `int` that counts the
number of open-but-not-yet-closed parentheses. After generating our token string,
if this `int` is not equal to 0, we append closing parentheses to the end of
the string until it is! To make this tracking easier, this fuzzer does not
generate comments (since parentheses inside comments wouldn't count).
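
To make the counter idea concrete, here is a minimal Python sketch. It is not the actual fuzzer: the token list, the function name `balanced_token_fuzz`, and the choice to skip a `)` when nothing is open are all assumptions made for illustration.

```python
import random

# Hypothetical token set; the real fuzzer's tokens are not shown here.
TOKENS = ["(", ")", "define", "x", "1", "+", "true"]


def balanced_token_fuzz(rng, length):
    """Generate a space-separated token string with balanced parentheses."""
    tokens = []
    open_count = 0  # number of open-but-not-yet-closed parentheses
    for _ in range(length):
        tok = rng.choice(TOKENS)
        if tok == ")" and open_count == 0:
            # Assumption: a ')' with nothing open is skipped entirely.
            continue
        if tok == "(":
            open_count += 1
        elif tok == ")":
            open_count -= 1
        tokens.append(tok)
    # After generating, close whatever is still open.
    tokens.extend([")"] * open_count)
    return " ".join(tokens)


print(balanced_token_fuzz(random.Random(0), 20))
```

Passing in a seeded `random.Random` keeps runs reproducible, which helps when a particular generated program needs a second look.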
......@@ -195,6 +197,12 @@ Our previous fuzzer structurally eliminated balanced parentheses errors. Yay!
But the majority of the resulting programs had abstract syntax errors. Let's now
structurally eliminate those.
The general approach here is similar to the one we took to generate balanced
parentheses, but more complicated. Just as we know a valid Trefoil V2 node is
parenthesized, we know that a valid Trefoil V2 program is composed of a sequence
of bindings (whose arguments can be expressions!). So, let's think about how to
generate expressions and bindings.
The strategy will be to think about generating *trees* (in particular, ASTs)
instead of sequences of tokens. At each step, the fuzzer will randomly decide
what kind of binding to produce, and then for that binding, will generate the
......@@ -202,7 +210,7 @@ right number of arguments. Similarly, when generating expressions, it will first
select what kind of expression to generate, and then generate the right number
of arguments.
Let's take a look at the numbers.
```
Paren:0
......@@ -215,5 +223,36 @@ Programs:52562
Total:1000000
```
We've made a *lot* of progress! Valid programs now account for about 5% of our
results, up from virtually none before. This is because our AST-based approach to
generating bindings has eliminated abstract syntax errors; however, we now have
many more runtime errors and unbound variable and function binding errors.
### The generating procedure
Let's begin with expressions. Expressions can be a lot of things, and we won't review them
all here (see LANGUAGE in HW3 for that), but consider integer literals, boolean literals,
variable reference expressions, the `nil` literal, and if expressions. One of these is not
like the others: whereas the first four expression types are composed of literals or symbols,
an if expression takes other expressions as arguments! So, our generator needs to generate
more expressions, some of which will themselves be recursive expressions and some of which
will be literals or symbols. You can probably envision a problem here: unbounded recursion,
and eventually a stack overflow! To address this, we choose a reasonably small maximum depth
for expressions; once that depth is reached, we guarantee that a nonrecursive expression
will be generated. We can think of the first four expression types as our "base cases".
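
Here is a rough Python sketch of that depth-limited recursion. It is only an illustration: `MAX_DEPTH = 3`, the variable names, the 50/50 choice between base and recursive cases, and the `(if cond then else)` surface syntax are all assumptions, not details taken from grammarFuzz.

```python
import random

MAX_DEPTH = 3                # assumed; the text only says "reasonably small"
VARIABLES = ["x", "y", "z"]  # hypothetical variable names


def gen_expr(rng, depth=0):
    """Return a random expression as a string, nesting at most MAX_DEPTH deep."""
    base_cases = [
        lambda: str(rng.randint(0, 100)),       # integer literal
        lambda: rng.choice(["true", "false"]),  # boolean literal
        lambda: rng.choice(VARIABLES),          # variable reference
        lambda: "nil",                          # nil literal
    ]
    # At the depth limit we are forced into a base case; otherwise flip a coin.
    if depth >= MAX_DEPTH or rng.random() < 0.5:
        return rng.choice(base_cases)()
    # Recursive case: an if expression with three sub-expressions (assumed form).
    cond = gen_expr(rng, depth + 1)
    then_branch = gen_expr(rng, depth + 1)
    else_branch = gen_expr(rng, depth + 1)
    return f"(if {cond} {then_branch} {else_branch})"


print(gen_expr(random.Random(1)))
```

Because every recursive call increases `depth` by one and the base cases never recurse, the generator always terminates.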
Bindings are similar to expressions, but simpler: their arguments are nodes, symbols, or
expressions, never other (non-expression) bindings, so generating a binding never recurses
into another binding.
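
A small Python sketch of binding generation might look like the following. The `(define x e)` form matches the example program in this write-up; the `(test e)` form, the helper names, and the literals-only stand-in for the expression generator are assumptions made to keep the sketch short.

```python
import random


def gen_expr(rng):
    """Stand-in expression generator (literals and a variable only)."""
    return rng.choice(["1", "true", "nil", "x"])


def gen_binding(rng):
    """Pick a binding kind, then generate exactly the arguments it needs."""
    kind = rng.choice(["define", "test"])
    if kind == "define":
        name = rng.choice(["x", "y", "z"])  # symbol argument (hypothetical names)
        return f"(define {name} {gen_expr(rng)})"
    return f"(test {gen_expr(rng)})"        # assumed binding form


def gen_program(rng, n_bindings):
    """A program is a sequence of bindings, one per line."""
    return "\n".join(gen_binding(rng) for _ in range(n_bindings))


print(gen_program(random.Random(2), 3))
```

Note that `gen_binding` calls `gen_expr` but never calls itself, which is why no depth limit is needed at this level.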
So, we generate bindings, which generate expressions, which eventually terminate! This results in
only valid Trefoil V2 ASTs. Last time, in Trefoil V1, this was sufficient to generate exclusively
valid programs, but that is not the case here. Consider what happens in the following simple Trefoil
V2 program:
```
(define x (+ 1 y))
```
When we try to look up `y` in our dynamic environment, it's not there because we have not defined it in
our program! This is an unbound variable error. What other types of errors can you think of that might
come up in Trefoil V2 programs generated using grammarFuzz?
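
For the unbound variable case, here is a toy Python sketch of why the lookup fails. It models the dynamic environment as a dictionary; the names `lookup`, `eval_define_plus`, and `UnboundVariableError` are hypothetical and are not taken from the Trefoil interpreter.

```python
class UnboundVariableError(Exception):
    """Raised when a variable is looked up but was never defined."""


def lookup(env, name):
    """Find a variable in the dynamic environment (modeled as a dict)."""
    if name not in env:
        raise UnboundVariableError(name)
    return env[name]


def eval_define_plus(env, target, left, right):
    """Evaluate (define target (+ left right)); operands are ints or variable names."""
    def value(operand):
        return operand if isinstance(operand, int) else lookup(env, operand)

    env[target] = value(left) + value(right)


env = {}
try:
    eval_define_plus(env, "x", 1, "y")  # models (define x (+ 1 y))
except UnboundVariableError as err:
    print(f"unbound variable: {err}")   # prints "unbound variable: y"
```

The AST is perfectly well formed; the failure only shows up when the program runs, which is why a grammar-based generator alone cannot rule it out.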
\ No newline at end of file