CSVediting · Changes

Update CSVediting authored Nov 27, 2021 by cameron
Showing with 23 additions and 2 deletions
CSVediting.md
View page @ f493132d
@@ -72,7 +72,7 @@ second one. This should be written using a `columnNo` parameter to a more gen
kernel, and performing the necessary number of `ScanTo` and `Advance` operations.
The name of the kernel should actually be different for each columnNo.
```
-MaskOutField::MaskOutField2(BuilderRef b, StreamSet * Record_separators,
+MaskOutField::MaskOutField(BuilderRef b, StreamSet * Record_separators,
StreamSet * Field_separators,
StreamSet * toKeep,
unsigned columnNo)
@@ -85,3 +85,24 @@ MaskOutField::MaskOutField2(BuilderRef b, StreamSet * Record_separators,
Finally, the first column must be handled differently. In this case, there is
no preceding comma, so the mask should zero out the following comma rather than the
preceding comma.
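As a rough illustration of the first-column rule, the following plain C++ sketch models the keep mask for a single record one byte at a time. It is not Parabix kernel code: `keepMask` and `applyMask` are hypothetical helper names, and the record is assumed to contain no quoted fields and no trailing newline.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scalar model of the keep mask for one CSV record.
// Assumption: no quoted fields, so every ',' is a field separator.
std::vector<bool> keepMask(const std::string & record, unsigned columnNo) {
    std::vector<bool> keep(record.size(), true);
    unsigned column = 0;
    for (std::size_t i = 0; i < record.size(); ++i) {
        const bool isComma = record[i] == ',';
        if (isComma) ++column;  // this comma precedes field number `column`
        if (columnNo == 0) {
            // First column: zero out the field and the comma that follows it.
            if (column == 0 || (isComma && column == 1)) keep[i] = false;
        } else if (column == columnNo) {
            // Later columns: zero out the preceding comma and the field.
            keep[i] = false;
        }
    }
    return keep;
}

// Copy only the bytes whose keep bit is set.
std::string applyMask(const std::string & record, const std::vector<bool> & keep) {
    std::string out;
    for (std::size_t i = 0; i < record.size(); ++i) {
        if (keep[i]) out.push_back(record[i]);
    }
    return out;
}
```

With the record `a,b,c`, masking column 0 leaves `b,c`, while masking column 1 leaves `a,c`. In both cases exactly one comma is removed along with the field.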
## Matching and Deleting a Row
Suppose that we have a regular expression R to select CSV rows for deletion.
The Parabix regular expression engine can be used to perform the matching and
`FilterByMask` can be used for the deletion.
The matching can be implemented with `ICGrepKernel` as follows.
```
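// Configure the grep kernel for UTF-8 input.
auto options = std::make_unique<GrepKernelOptions>(&cc::UTF8);
options->setSource(BasisBits);
// A single bit stream to receive the match results.
StreamSet * MatchResults = P->CreateStreamSet(1, 1);
options->setResults(MatchResults);
// R is the regular expression selecting the rows to delete.
options->setRE(R);
P->CreateKernelCall<ICGrepKernel>(std::move(options));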
```
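The deletion step can be pictured with a scalar model in plain C++. This is only a sketch under stated assumptions, not the Parabix pipeline: `std::regex` stands in for the Parabix matcher, the helper names `matchRows`, `rowsToKeep` and `filterByMask` are hypothetical, every row is assumed to end with a line feed, and the match results are assumed to carry one bit per byte, set at the line feed of each matching row.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for the match results: one bit per byte, set at the
// line feed ending each row that matches `re` (assumed layout).
std::vector<bool> matchRows(const std::string & text, const std::regex & re) {
    std::vector<bool> matches(text.size(), false);
    std::size_t rowStart = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\n') {
            const std::string row = text.substr(rowStart, i - rowStart);
            matches[i] = std::regex_search(row, re);
            rowStart = i + 1;
        }
    }
    return matches;
}

// Build the keep mask: clear every byte of a matched row, line feed included.
std::vector<bool> rowsToKeep(const std::string & text, const std::vector<bool> & matches) {
    std::vector<bool> keep(text.size(), true);
    std::size_t rowStart = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\n') {
            if (matches[i]) {
                for (std::size_t j = rowStart; j <= i; ++j) keep[j] = false;
            }
            rowStart = i + 1;
        }
    }
    return keep;
}

// Scalar model of filtering by mask: copy only the bytes whose keep bit is set.
std::string filterByMask(const std::string & text, const std::vector<bool> & keep) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (keep[i]) out.push_back(text[i]);
    }
    return out;
}
```

For the input `a,1\nb,2\nc,3\n` and the expression `^b,`, the mask clears the four bytes of the second row, so the filtered output is `a,1\nc,3\n`.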