| ... | @@ -88,7 +88,9 @@ preceding comma. | ... | @@ -88,7 +88,9 @@ preceding comma. | 
|  |  |  |  | 
|  | ## Matching and Deleting a Row |  | ## Matching and Deleting a Row | 
|  |  |  |  | 
|  | Suppose that we have a regular expression R to select CSV rows for deletion. |  | Suppose that we have a regular expression R to select CSV rows for deletion, | 
|  |  |  | where R has no Unicode properties or other features. | 
|  |  |  |  | 
|  | The Parabix regular expression engine can be used to perform the matching and |  | The Parabix regular expression engine can be used to perform the matching and | 
|  | `FilterByMask` can be used for the deletion. |  | `FilterByMask` can be used for the deletion. | 
|  |  |  |  | 
| ... | @@ -103,6 +105,41 @@ Using the ICGrepKernel to perform the matching may be implemented as follows. | ... | @@ -103,6 +105,41 @@ Using the ICGrepKernel to perform the matching may be implemented as follows. | 
|  | P->CreateKernelCall<ICGrepKernel>(std::move(options)); |  | P->CreateKernelCall<ICGrepKernel>(std::move(options)); | 
|  | ``` |  | ``` | 
|  |  |  |  | 
|  |  |  | The resulting `MatchResults` stream has a 1 bit in every matching CSV row. | 
|  |  |  | To select the whole row, the next task is to move each match to the line end position of its row, | 
|  |  |  | assuming that the line ends are given by `mLineBreakStream`. | 
|  |  |  |  | 
|  |  |  | ``` | 
|  |  |  | StreamSet * const MovedMatches = P->CreateStreamSet(); | 
|  |  |  | P->CreateKernelCall<MatchedLinesKernel>(MatchResults, mLineBreakStream, MovedMatches); | 
|  |  |  | ``` | 
|  |  |  |  | 
|  |  |  | Next, we can compress these matches into a stream that is indexed by line number (1 bit per CSV row). | 
|  |  |  | ``` | 
|  |  |  | StreamSet * MatchesByLine = P->CreateStreamSet(1, 1); | 
|  |  |  | FilterByMask(P, mLineBreakStream, MovedMatches, MatchesByLine); | 
|  |  |  | ``` | 
|  |  |  |  | 
|  |  |  | Line starts can then be identified as the positions immediately after a line break | 
|  |  |  | or at the beginning of the file. These are computed by the `LineStartsKernel`. | 
|  |  |  |  | 
|  |  |  | ``` | 
|  |  |  | StreamSet * LineStarts = P->CreateStreamSet(1, 1); | 
|  |  |  | P->CreateKernelCall<LineStartsKernel>(mLineBreakStream, LineStarts); | 
|  |  |  | ``` | 
|  |  |  |  | 
|  |  |  | The starts of the matched lines are now computed by a `SpreadByMask`. | 
|  |  |  | ``` | 
|  |  |  | StreamSet * MatchedLineStarts = P->CreateStreamSet(1, 1); | 
|  |  |  | SpreadByMask(P, LineStarts, MatchesByLine, MatchedLineStarts); | 
|  |  |  | ``` | 
|  |  |  |  | 
|  |  |  | Now a mask for an entire matched row can be computed using the `LineSpansKernel`, with `MovedMatches` marking the ends of the matched lines. | 
|  |  |  | ``` | 
|  |  |  | StreamSet * MatchedLineSpans = P->CreateStreamSet(1, 1); | 
|  |  |  | P->CreateKernelCall<LineSpansKernel>(MatchedLineStarts, MovedMatches, MatchedLineSpans); | 
|  |  |  | ``` | 
|  |  |  |  | 
|  |  |  | If `FilterByMask` were applied at this point, it would keep only the matched rows. To delete | 
|  |  |  | the matched rows instead, `MatchedLineSpans` must first be negated (use a Pablo `createNot` operation). | 
|  |  |  |  |
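The sequence of stream operations described above can be checked on a small input without the Parabix framework. The following is only an illustrative sketch in plain C++: it models each bit stream as a `std::vector<bool>` with one entry per input byte, and the names `Bits`, `filterByMask`, `spreadByMask`, `markMatches` and `deleteMatchedRows` are hypothetical stand-ins, not the Parabix API. It also assumes that every row ends with `\n` and uses a literal substring search in place of the regular expression R.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a bit stream: one bool per input byte.
using Bits = std::vector<bool>;

// Model of a filter-by-mask step: keep the bits of `data` at the
// positions where `mask` is 1, packed together in order.
Bits filterByMask(const Bits& mask, const Bits& data) {
    Bits out;
    for (std::size_t i = 0; i < mask.size(); ++i)
        if (mask[i]) out.push_back(i < data.size() && data[i]);
    return out;
}

// Model of a spread-by-mask step: deposit successive bits of `packed`
// at the 1 positions of `mask`; all other positions are 0.
Bits spreadByMask(const Bits& mask, const Bits& packed) {
    Bits out(mask.size(), false);
    std::size_t next = 0;
    for (std::size_t i = 0; i < mask.size(); ++i)
        if (mask[i] && next < packed.size()) out[i] = packed[next++];
    return out;
}

// Stand-in for regex matching (assumption): mark the last byte of
// every occurrence of a literal substring.
Bits markMatches(const std::string& text, const std::string& needle) {
    Bits m(text.size(), false);
    if (needle.empty()) return m;
    for (std::size_t p = text.find(needle); p != std::string::npos;
         p = text.find(needle, p + 1))
        m[p + needle.size() - 1] = true;
    return m;
}

// Delete every row that contains at least one match bit.
// Assumes each row, including the last, is terminated by '\n'.
std::string deleteMatchedRows(const std::string& text, const Bits& matches) {
    const std::size_t n = text.size();
    Bits lineBreaks(n, false), lineStarts(n, false), movedMatches(n, false);
    bool seen = false;  // has the current line matched so far?
    for (std::size_t i = 0; i < n; ++i) {
        lineBreaks[i] = (text[i] == '\n');
        lineStarts[i] = (i == 0) || (text[i - 1] == '\n');
        seen = seen || (i < matches.size() && matches[i]);
        if (lineBreaks[i]) {  // move the match to the line end
            movedMatches[i] = seen;
            seen = false;
        }
    }
    // One bit per row, then spread back onto the row start positions.
    const Bits matchesByLine = filterByMask(lineBreaks, movedMatches);
    const Bits matchedLineStarts = spreadByMask(lineStarts, matchesByLine);

    // Spans run from a matched line start through its line break.
    Bits spans(n, false);
    bool inside = false;
    for (std::size_t i = 0; i < n; ++i) {
        if (matchedLineStarts[i]) inside = true;
        spans[i] = inside;
        if (movedMatches[i]) inside = false;
    }
    // Negate the spans and keep only the bytes outside matched rows.
    std::string out;
    for (std::size_t i = 0; i < n; ++i)
        if (!spans[i]) out.push_back(text[i]);
    return out;
}
```

For the input `"a,1\nb,2\nc,3\n"` with a match on `b`, the per-line stream is `0 1 0`, the matched span covers the bytes of the second row including its line break, and the negated mask keeps the first and third rows.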