CSV Editing
Having parsed a CSV file successfully, let's now consider how to edit it.
Deleting a column
One of the basic editing operations that we might want to support is deleting a column from all records in a file.
Suppose we want to delete the second column in every row of the following CSV data.
Data_stream:         Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
Field_separators:    .........1....1.........1...1........1...............1
Record_separators:   ........................1............................1
The Parabix FilterByMask operation can do this for us, if we set up a mask stream that selects all of the data except the second column and its preceding comma.
Data stream:         Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
To keep:             111111111.....11111111111111.........11111111111111111
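To see what such a mask does, the effect of filtering can be modelled on ordinary strings. The sketch below is only a scalar illustration, not the Parabix FilterByMask kernel itself; filterByMaskModel is a hypothetical name, and '\n' stands in for the ⏎ record separator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Scalar model of filtering by a mask (hypothetical helper, not Parabix API):
// keep data[i] wherever mask[i] is '1', drop it wherever mask[i] is '.'.
std::string filterByMaskModel(const std::string & data, const std::string & mask) {
    assert(data.size() == mask.size());  // one mask position per data position
    std::string out;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (mask[i] == '1') out.push_back(data[i]);
    }
    return out;
}
```

Applied to the data stream and the "To keep" mask above, this model returns Henderson,ph@sfu.ca⏎Lin,1234@zju.edu.cn⏎.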
How do we calculate this mask? The following operations, using a PabloBuilder pb,
compute it.
PabloAST * F1start = pb.createNot(pb.createAdvance(pb.createNot(Record_separators), 1));
PabloAST * F1follow = pb.createScanTo(F1start, Field_separators);
PabloAST * F2start = pb.createAdvance(F1follow, 1);
PabloAST * F2follow = pb.createScanTo(F2start, Field_separators);
PabloAST * toDelete = pb.createIntrinsicCall(pablo::Intrinsic::SpanUpTo, {F1follow, F2follow});
PabloAST * toKeep = pb.createNot(toDelete);
Data stream:         Henderson,Paul,ph@sfu.ca⏎Lin,Qingshan,1234@zju.edu.cn⏎
F1start:             1........................1............................
F1follow:            .........1..................1.........................
F2start:             ..........1..................1........................
F2follow:            ..............1......................1................
toDelete:            .........11111..............111111111.................
toKeep:              111111111.....11111111111111.........11111111111111111
A Pablo kernel that computes this mask can be written as follows.
MaskOutField2::MaskOutField2(BuilderRef b, StreamSet * Record_separators, 
                                           StreamSet * Field_separators, 
                                           StreamSet * toKeep)
: PabloKernel(b, "MaskOutField2",
  {Binding{"Record_separators", Record_separators}, 
   Binding{"Field_separators", Field_separators}},
  {Binding{"toKeep", toKeep}})  {}
void MaskOutField2::generatePabloMethod() {
    PabloBuilder pb(getEntryScope());
    Var * Record_separators = pb.createExtract(getInputStreamVar("Record_separators"), pb.getInteger(0));    
    Var * Field_separators = pb.createExtract(getInputStreamVar("Field_separators"), pb.getInteger(0));
    PabloAST * F1start = pb.createNot(pb.createAdvance(pb.createNot(Record_separators), 1));
    PabloAST * F1follow = pb.createScanTo(F1start, Field_separators);
    PabloAST * F2start = pb.createAdvance(F1follow, 1);
    PabloAST * F2follow = pb.createScanTo(F2start, Field_separators);
    PabloAST * toDelete = pb.createIntrinsicCall(pablo::Intrinsic::SpanUpTo, {F1follow, F2follow});
    PabloAST * toKeep = pb.createNot(toDelete);
    pb.createAssign(pb.createExtract(getOutputStreamVar("toKeep"), pb.getInteger(0)), pb.createInFile(toKeep));
}
Of course, a slightly different kernel is needed to mask out a column other than the
second one. A more generic kernel should take a columnNo parameter and perform the
necessary number of ScanTo and Advance operations. Because the generated code differs
for each columnNo, the kernel name should differ as well, as in the following constructor.
MaskOutField::MaskOutField(BuilderRef b, StreamSet * Record_separators, 
                                          StreamSet * Field_separators, 
                                          StreamSet * toKeep,
                                          unsigned columnNo)
: PabloKernel(b, "MaskOutField" + std::to_string(columnNo),
  {Binding{"Record_separators", Record_separators}, 
   Binding{"Field_separators", Field_separators}},
  {Binding{"toKeep", toKeep}})  {}
Finally, the first column must be handled differently. In this case, there is no preceding comma, so the mask should zero out the following comma rather than the preceding comma.