One of the basic editing operations that we might want to support is deleting a column from all records in a file.
Suppose we want to delete the second column in every row of the following CSV data.
```
Data_stream:       Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
Field_separators:  .........1....1.........1...1........1...............1
Record_separators: ........................1............................1
```
The Parabix `FilterByMask` operation can do this for us, if we set up a mask stream that selects all of the data except the second column and its following comma.
```
Data stream:   Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
Field starts:  1.........1....1.........1...1........1...............
Field follows: .........1....1.........1...1........1...............1
To keep:       1111111111.....11111111111111.........1111111111111111
```
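
The effect of applying such a mask can be sketched in plain C++. This is only a byte-level model of the behaviour described above, not the Parabix kernel, which works on parallel bit streams. The function name `filterByMask` and the string representation of the mask are illustrative assumptions, and `'\n'` stands in for the `⏎` shown in the traces.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical byte-level model of mask filtering (not the Parabix API):
// keep data[i] exactly where mask[i] == '1' and close up the gaps.
std::string filterByMask(const std::string& data, const std::string& mask) {
    assert(data.size() == mask.size());  // one mask position per data byte
    std::string out;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (mask[i] == '1') out.push_back(data[i]);
    }
    return out;
}
```

Applied to the example data with the `To keep` mask, this model yields `Henderson,ph@sfu.ca⏎Lin,1234@zju.edu.cn⏎`.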
How do we calculate this mask? The following sequence of Pablo operations, built with a `PabloBuilder pb`, computes it one marker stream at a time.
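```
// Mark the first position of each record: position 0 and each position after a record separator.
PabloAST * F1start = pb.createNot(pb.createAdvance(pb.createNot(Record_separators), 1));
// Scan each marker forward to the separator that ends field 1.
PabloAST * F1follow = pb.createScanTo(F1start, Field_separators);
// Field 2 starts one position after the separator that ends field 1.
PabloAST * F2start = pb.createAdvance(F1follow, 1);
// Scan forward to the separator that ends field 2.
PabloAST * F2follow = pb.createScanTo(F2start, Field_separators);
// Delete field 2 together with its following separator.
PabloAST * toDelete = pb.createIntrinsicCall(pablo::Intrinsic::InclusiveSpan, {F2start, F2follow});
PabloAST * toKeep = pb.createNot(toDelete);
```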
```
Data stream: Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
F1start:     1........................1............................
F1follow:    .........1..................1.........................
F2start:     ..........1..................1........................
F2follow:    ..............1......................1................
toDelete:    ..........11111..............111111111................
toKeep:      1111111111.....11111111111111.........1111111111111111
```
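
To check the trace by hand, the marker streams can be reproduced with a small stand-alone simulation. This is a sketch under stated assumptions, not Pablo itself: bit streams are modelled as strings of `1` and `.`, and the helper names (`bitNot`, `advanceBits`, `scanTo`, `inclusiveSpan`, `toKeepMask`) are hypothetical stand-ins for the corresponding Pablo operations. `F2start` is taken one position past `F1follow`, which is what the `F2start` row of the trace shows.

```cpp
#include <cstddef>
#include <string>

// Bit streams are modelled as strings of '1' and '.', one character per
// data position, to match the traces shown above.
using Bits = std::string;

// Bitwise complement.
Bits bitNot(const Bits& s) {
    Bits out(s);
    for (char& c : out) c = (c == '1') ? '.' : '1';
    return out;
}

// Shift every bit n positions forward; zeros are shifted in at the start.
Bits advanceBits(const Bits& s, std::size_t n) {
    Bits out(s.size(), '.');
    for (std::size_t i = 0; i + n < s.size(); ++i) out[i + n] = s[i];
    return out;
}

// Move each marker in `from` forward to the first position at or after it
// where `to` has a 1 bit. Assumes both streams have the same length.
Bits scanTo(const Bits& from, const Bits& to) {
    Bits out(from.size(), '.');
    for (std::size_t i = 0; i < from.size(); ++i) {
        if (from[i] != '1') continue;
        std::size_t j = i;
        while (j < to.size() && to[j] != '1') ++j;
        if (j < out.size()) out[j] = '1';
    }
    return out;
}

// Mark every position from each start marker through its matching end marker.
Bits inclusiveSpan(const Bits& starts, const Bits& ends) {
    Bits out(starts.size(), '.');
    bool inside = false;
    for (std::size_t i = 0; i < starts.size(); ++i) {
        if (starts[i] == '1') inside = true;
        if (inside) out[i] = '1';
        if (inside && ends[i] == '1') inside = false;
    }
    return out;
}

// Compose the steps: select everything except field 2 and its separator.
Bits toKeepMask(const Bits& recordSeps, const Bits& fieldSeps) {
    Bits f1start  = bitNot(advanceBits(bitNot(recordSeps), 1));
    Bits f1follow = scanTo(f1start, fieldSeps);
    Bits f2start  = advanceBits(f1follow, 1);
    Bits f2follow = scanTo(f2start, fieldSeps);
    Bits toDelete = inclusiveSpan(f2start, f2follow);
    return bitNot(toDelete);
}
```

Calling `toKeepMask` with the `Record_separators` and `Field_separators` streams of the example reproduces the `toKeep` row. The sequential loops are for readability only; the point of the Pablo formulation is that each step is a bit-parallel operation over whole blocks of the stream.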