... | @@ -160,6 +160,17 @@ Given this mask, we need to apply FilterByMask to produce two streamsets: |

1. `FilteredBasisBits` is produced by filtering the eight basis bit streams.

2. `FilteredMarks` is produced by filtering three streams produced from parsing: (a) the positions of the start of each record, (b) the positions of the start of each field, and (c) the positions of any characters that need to be escaped with `\` in the JSON output (escaped_quotes and embedded newlines).

Filtering reduces the length of the stream: only the positions where the mask has a 1 bit are kept. For the example above, filtering the field starts yields the following.

```
Data stream:     "Free speech",limitation,"Never yell ""Fire!"" in a crowded theatre."
CSV_data_mask    .11111111111..1111111111..11111111111.111111.11111111111111111111111.
Field_starts     .1............1...........1..........................................
Filtered_starts  1..........1.........1.......................................
```
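
The filtering step can be modelled with a short Python sketch. This is only a character-at-a-time illustration of the idea, not the actual `FilterByMask` implementation; the helper names `filter_by_mask` and `marks` are hypothetical, and streams are written as strings of `.` and `1` to match the notation used here.

```python
def filter_by_mask(mask: str, stream: str) -> str:
    """Keep only the positions of `stream` where `mask` has a 1 bit."""
    if len(mask) != len(stream):
        raise ValueError("mask and stream must have the same length")
    return "".join(s for m, s in zip(mask, stream) if m == "1")


def marks(length: int, positions: set) -> str:
    """Hypothetical helper: build a marker stream with 1 bits at `positions`."""
    return "".join("1" if i in positions else "." for i in range(length))


data = '"Free speech",limitation,"Never yell ""Fire!"" in a crowded theatre."'

# The CSV data mask from the example, written as runs of kept/dropped positions.
csv_data_mask = (
    "." + "1" * 11 + ".." + "1" * 10 + ".." + "1" * 11
    + "." + "1" * 6 + "." + "1" * 23 + "."
)

# Field starts at (0-based) positions 1, 14 and 26, as in the example.
field_starts = marks(len(data), {1, 14, 26})

filtered_starts = filter_by_mask(csv_data_mask, field_starts)
filtered_data = filter_by_mask(csv_data_mask, data)

print(filtered_starts)
print(filtered_data)
```

Applying the same mask to the data stream itself shows what remains after filtering: the field contents with the delimiters and the first quote of each doubled pair removed.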

### Expansion of Compressed Input

In this phase, `ExpandedBasisBits` is computed from `FilteredBasisBits` by inserting zero bits at marked points. The number of zero bits inserted at each point depends on the template string to be inserted there.
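
As a rough illustration of expansion, the sketch below inserts runs of zero bits into a single stream. It rests on several assumptions that the text does not confirm: the zeros are inserted immediately before each marked position, the count equals the length of the template string, and the name `expand_at_marks` and the template strings are hypothetical.

```python
from typing import Mapping


def expand_at_marks(stream: str, insert_counts: Mapping[int, int]) -> str:
    """Insert `insert_counts[i]` zero bits ('.') before position i of `stream`.

    Assumption: zeros go before the marked position; the source text does not
    say whether insertion happens before or after the mark.
    """
    out = []
    for i, bit in enumerate(stream):
        out.append("." * insert_counts.get(i, 0))  # space reserved for a template
        out.append(bit)
    return "".join(out)


# Hypothetical templates keyed by marked position in the filtered stream.
templates = {0: '["', 2: '","'}

# Assumption: one zero bit is reserved per template character.
insert_counts = {pos: len(text) for pos, text in templates.items()}

expanded = expand_at_marks("1.11", insert_counts)
print(expanded)  # ..1....11
```

The expanded stream is longer than the filtered one by the total number of inserted bits, which leaves room for the template characters to be filled in at those positions.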