cameron · b797ac47

Showing with 35 additions and 5 deletions

CSVediting.md +35 -5

--- a/CSVediting.md
+++ b/CSVediting.md
@@ -6,10 +6,40 @@ After successfully [parsing](CSVparsing) a CSV file, now let's consider how to e
 One of the basic editing operations that we might want to support is deleting a column from all records in a file.
-Suppose we want to delete the second column in the following CSV data.
+Suppose we want to delete the second column in every row of the following CSV data.
+```
+Data stream:         Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
+Field_separators:    .........1....1.........1...1........1...............1
+Record_separators:   ........................1............................1
+```
+The Parabix `FilterByMask` operation can do this for us, if we set up a mask stream that selects all of the data except the second column and its following comma.
+```
+Data stream:         Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
+To keep:             1111111111.....11111111111111.........1111111111111111
+```
+How do we calculate this mask? We can use the following set of operations on a
+`PabloBuilder pb`.
+```
+PabloAST * F1start = pb.createNot(pb.createAdvance(pb.createNot(Record_separators), 1));
+PabloAST * F1follow = pb.createScanTo(F1start, Field_separators);
+PabloAST * F2start = pb.createAdvance(F1follow, 1);
+PabloAST * F2follow = pb.createScanTo(F2start, Field_separators);
+PabloAST * toDelete = pb.createIntrinsicCall(pablo::Intrinsic::InclusiveSpan, {F2start, F2follow});
+PabloAST * toKeep = pb.createNot(toDelete);
+```
+```
+Data stream:         Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
+F1start:             1........................1............................
+F1follow:            .........1..................1.........................
+F2start:             ..........1..................1........................
+F2follow:            ..............1......................1................
+toDelete:            ..........11111..............111111111................
+toKeep:              1111111111.....11111111111111.........1111111111111111
 ```
-Data stream:     Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
-Field starts:    1.........1....1.........1...1........1...............
-Field follows:   .........1....1.........1...1........1...............1
-```
\ No newline at end of file