Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in
P parabix-devel
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 9
    • Issues 9
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
  • Merge requests 2
    • Merge requests 2
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Operations
    • Operations
    • Incidents
    • Environments
  • Packages & Registries
    • Packages & Registries
    • Container Registry
  • Analytics
    • Analytics
    • CI/CD
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • cameron
  • parabix-devel
  • Wiki
  • Parallel Deletion

Parallel Deletion · Changes

Page history
Update Parallel Deletion authored Jul 13, 2021 by cameron's avatar cameron
Hide whitespace changes
Inline Side-by-side
Showing with 15 additions and 10 deletions
+15 -10
  • Parallel-Deletion.md Parallel-Deletion.md +15 -10
  • No files found.
Parallel-Deletion.md
View page @ 37f5a128
...@@ -27,14 +27,21 @@ In the event that N bit streams are processed, it may be possible to reduce the ...@@ -27,14 +27,21 @@ In the event that N bit streams are processed, it may be possible to reduce the
# Overview of Parallel Deletion Methods # Overview of Parallel Deletion Methods
## Intra-Block Deletion vs. Block-by-Block Processing ## Intra-Field Deletion vs. Field-by-Field Processing
When implementing parallel deletion operations on a data stream, we first perform deletions within each block (intra-block parallel deletion) and then arrange to join together the results of adjacent blocks. When implementing parallel deletion operations on a data stream, we first perform deletions within fields of a particular size (intra-field parallel deletion) and then arrange to join together the results of adjacent fields.
For example, working with 128-bit SSE registers, we may delete 7 positions in block A, and 23 positions in block B. Given these results, we typically then want to join the partial blocks together to form a continuous data stream. For example, working with 128-bit SSE registers with 128-bit fields, we may delete 7 positions in block A, and 23 positions in block B. Given these results, we typically then want to join the partial blocks together to form a continuous data stream.
One method is to always maintain a partial block of pending results. When a new partial block is produced, we transfer as many positions as possible into the pending block. If we fill it, then we emit the full block and move the remaining bits from the new block to become the new pending block. The field granularity for these tasks may be the full block width, or a smaller field width that is more convenient for processing.
Intra-Block Parallel Deletion
One method is to always maintain a partial block of pending results. When a new partial block is produced, we transfer as many positions as possible into the pending block. If we fill it, then we emit the full block and move the remaining data from the new block to become the new pending block.
In the Parabix framework, two separate kernels are used for these two tasks. The `FieldCompress` kernel performs intra-field compression, while the `StreamCompress` kernel perform the field-by-field compression of partial data
fields. The `FilterByMask` routine combines conveniently generates the
necessary kernel calls to perform both tasks.
## Intra-Field Parallel Deletion
There are several different methods that can be used to implement parallel deletion within SIMD registers. Depending on the problem variation, it is often beneficial to combine methods. There are several different methods that can be used to implement parallel deletion within SIMD registers. Depending on the problem variation, it is often beneficial to combine methods.
...@@ -44,7 +51,7 @@ There are several different methods that can be used to implement parallel delet ...@@ -44,7 +51,7 @@ There are several different methods that can be used to implement parallel delet
1. Deletion using PEXT bit manipulation instructions. 1. Deletion using PEXT bit manipulation instructions.
1. Deletion by permutation vector computation. 1. Deletion by permutation vector computation.
## Intra-Field Deletion ## Full Field Deletion
When the ultimate goal is to produce streams of bytes, doublebytes, or other units, then it is often beneficial to limit the bit stream deletion work to smaller field sizes. For example, suppose that a stream of bytes is to be produced after application of parallel deletion to a the corresponding 8 parallel bit streams. Working with 128-bit registers, each register will ultimately hold at most 16 bytes. Then we may proceed as follows: When the ultimate goal is to produce streams of bytes, doublebytes, or other units, then it is often beneficial to limit the bit stream deletion work to smaller field sizes. For example, suppose that a stream of bytes is to be produced after application of parallel deletion to a the corresponding 8 parallel bit streams. Working with 128-bit registers, each register will ultimately hold at most 16 bytes. Then we may proceed as follows:
...@@ -52,6 +59,7 @@ When the ultimate goal is to produce streams of bytes, doublebytes, or other uni ...@@ -52,6 +59,7 @@ When the ultimate goal is to produce streams of bytes, doublebytes, or other uni
1. Perform the inverse Parabix transform to byte stream form, producing a squence of 8 register values in which the nondeleted bytes are leftmost in each register. 1. Perform the inverse Parabix transform to byte stream form, producing a squence of 8 register values in which the nondeleted bytes are leftmost in each register.
1. Sequentially write the register values to the output stream, advancing the output stream pointer by the count of nondeleted bytes for each register in turn. 1. Sequentially write the register values to the output stream, advancing the output stream pointer by the count of nondeleted bytes for each register in turn.
# Summary of Deletion Methods # Summary of Deletion Methods
## Parallel Prefix Deletion ## Parallel Prefix Deletion
...@@ -72,8 +80,5 @@ Intel introduced the PEXT instruction with its Haswell processors. This operatio ...@@ -72,8 +80,5 @@ Intel introduced the PEXT instruction with its Haswell processors. This operatio
## Deletion by Permutation Vector Computation ## Deletion by Permutation Vector Computation
This method is suitable for performing deletion at byte or large granularities, given a processor supporting shuffle or permutation operations with variable selectors. This method is suitable for performing deletion at byte or larger granularities, given a processor supporting shuffle or permutation operations with variable selectors.
# Deletion Kernels in the Parabix+LLVM Framework
In the Parabix+LLVM Framework, stream computations are organized in layers implemented by kernels.
\ No newline at end of file
Clone repository
  • Bracket Matching
  • CSV Validation
  • CSVediting
  • CSVparsing
  • Character Code Compilers
  • KernelLibrary
  • Pablo
  • ParabixTransform
  • Parallel Deletion
  • Parallel Hashing
  • Performance Testing Script
  • Shuffle Pattern Library
  • StaticCCC
  • String Insertion
  • UCD: Unicode Property Database and Compilers
View All Pages