### Validating arrays

Assume we have two bit streams for tokens: `valueToken` (explained above) and `anyToken`, a bit stream marking the end position of any legal or illegal token. Assume also that `LBrak`, `RBrak`, and `Comma` are streams marking the position of JSON `[`, `]`, and `,` tokens respectively.

```
errAfterLBrak = ScanTo(Advance(arrayStart, 1), anyToken) & ~ (nested | valueToken | RBrak)
```

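To make the marker-stream operations concrete, here is a small Python sketch that models each bit stream as an integer (bit *i* set means text position *i* is marked). The `markers`, `advance`, and `scan_to` helpers are illustrative stand-ins for the Parabix operations, not the real implementations, and the depth streams are simplified away (`nested = 0`) for this flat example.

```python
# Model: bit i of an int marks text position i.
def markers(text, chars):
    """Bit stream with a 1 at every position holding one of `chars`."""
    m = 0
    for i, c in enumerate(text):
        if c in chars:
            m |= 1 << i
    return m

def advance(stream, n=1):
    """Advance(m, n): shift every marker n positions forward."""
    return stream << n

def scan_to(stream, target):
    """ScanTo(m, t): move each marker to the next position marked in t."""
    out = 0
    s = stream
    while s:
        i = (s & -s).bit_length() - 1      # lowest remaining marker
        s &= s - 1
        rest = target >> i
        if rest:                           # next target at or after i
            out |= 1 << (i + (rest & -rest).bit_length() - 1)
    return out

text = '[ , 1]'                      # malformed: ',' directly after '['
arrayStart = markers(text, '[')      # depth handling elided in this sketch
RBrak      = markers(text, ']')
valueToken = markers(text, '1')      # end positions of value tokens
anyToken   = markers(text, ',1]')    # end positions of any token
nested     = 0

errAfterLBrak = scan_to(advance(arrayStart, 1), anyToken) \
                & ~(nested | valueToken | RBrak)
assert errAfterLBrak != 0            # the stray comma is flagged
```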
### Validating objects

Assume we have two bit streams for tokens: `valueToken` (explained above) and `anyToken`, a bit stream marking the end position of any legal or illegal token. Assume also that `LBrace`, `RBrace`, `DQuote`, `Colon`, and `Comma` are streams marking the position of JSON `{`, `}`, `"`, `:`, and `,` tokens respectively.

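The object equations below use `atDepth` and `nested`, derived from a BixNum `ND` holding each position's nesting depth via `bnc.EQ(ND, d)` and `bnc.UGT(ND, d)`. The following Python sketch models that idea with a plain list of depths instead of a BixNum; `nesting_depth`, `eq_mask`, and `ugt_mask` are hypothetical helpers for illustration, and the convention that a bracket itself sits at the outer depth is an assumption of this sketch.

```python
# Illustrative model of the NestingDepth idea: nd[i] is the bracket-nesting
# depth at text position i (a plain list here; in Parabix, ND is a BixNum,
# i.e. one bit stream per bit of the depth value).

def nesting_depth(text):
    depth, nd = 0, []
    for c in text:
        if c in '[{':
            nd.append(depth)   # an opener sits at the outer depth
            depth += 1
        elif c in ']}':
            depth -= 1
            nd.append(depth)   # a closer returns to the outer depth
        else:
            nd.append(depth)
    return nd

def eq_mask(nd, d):
    """Analogue of bnc.EQ(ND, d): marks positions at exactly depth d."""
    return sum(1 << i for i, v in enumerate(nd) if v == d)

def ugt_mask(nd, d):
    """Analogue of bnc.UGT(ND, d): marks positions strictly deeper than d."""
    return sum(1 << i for i, v in enumerate(nd) if v > d)

nd = nesting_depth('{"a": [1]}')
atDepth = eq_mask(nd, 0)    # the outer '{' and '}' sit at depth 0
nested  = ugt_mask(nd, 0)   # everything between them is deeper
```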
```
str = valueToken & DQuote
valueTokenMinusStr = valueToken ^ str

atDepth = bnc.EQ(ND, d)
nested = bnc.UGT(ND, d)

objStart = atDepth & LBrace
objEnd = ScanThru(objStart, nested | (atDepth & ~ (RBrak | RBrace)))
objAtEnd = objEnd & RBrace
objSpan = ExclusiveSpan(objStart, objEnd)

// Now validate that every value or nested item is followed
// either by a comma or by the closing RBrace.
afterNested = Advance(nested & objSpan, 1) & atDepth

// process all values that are not strings
afterTokenMinusStr = Advance(valueTokenMinusStr & objSpan, 1)
tokenNextMinusStr = ScanThru(afterNested | afterTokenMinusStr, whitespace)
errAfterValueMinusStr = tokenNextMinusStr & ~ (Comma | RBrace)

// process strings as both key and value
afterTokenStr = Advance(str & objSpan, 1)
tokenNextStr = ScanThru(afterNested | afterTokenStr, whitespace)
errAfterValueStr = tokenNextStr & ~ (Comma | RBrace | Colon)

// join errors
errAfterValue = errAfterValueMinusStr | errAfterValueStr

// Every Comma must be followed by a value
errAfterComma = ScanTo(Advance(Comma & objSpan, 1), anyToken) & ~ (nested | valueToken)

// After the LBrace we must have either a value or an RBrace.
errAfterLBrace = ScanTo(Advance(objStart, 1), anyToken) & ~ (nested | valueToken | RBrace)
```

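As with the array case, the string rule above (`errAfterValueStr`) can be exercised in a small Python model where bit streams are integers. Here `marks` and `scan_thru` are simplified illustrative stand-ins for the Parabix operations, and the end position of the key string is assumed to have been computed by earlier lexing stages.

```python
# Model: bit i of an int marks text position i.
def marks(text, chars):
    m = 0
    for i, c in enumerate(text):
        if c in chars:
            m |= 1 << i
    return m

def scan_thru(stream, thru):
    """ScanThru(m, c): advance each marker past consecutive positions in c."""
    out = 0
    s = stream
    while s:
        i = (s & -s).bit_length() - 1   # lowest remaining marker
        s &= s - 1
        while thru >> i & 1:            # skip the run of `thru` positions
            i += 1
        out |= 1 << i
    return out

text = '{"a" 1}'                # malformed: key string not followed by ':'
strEnd     = 1 << 3             # closing '"' of the key (from lexing, assumed)
whitespace = marks(text, ' ')
Comma      = marks(text, ',')
Colon      = marks(text, ':')
RBrace     = marks(text, '}')

afterTokenStr = strEnd << 1     # Advance(str, 1)
tokenNextStr  = scan_thru(afterTokenStr, whitespace)
errAfterValueStr = tokenNextStr & ~(Comma | RBrace | Colon)
assert errAfterValueStr != 0    # the missing ':' is caught
```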
# CFG Kernel (_note: this doc isn't ready_)

The CFG kernel is a Parabix kernel (to be implemented) that can parse input against a context-free grammar, given a set of rules in its own syntax (close to EBNF), a BixNum representing the nesting depth of its tokens (see the NestingDepth kernel), and a StreamSet representing all valid markers.