Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in
P parabix-devel
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 9
    • Issues 9
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
  • Merge requests 2
    • Merge requests 2
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Operations
    • Operations
    • Incidents
    • Environments
  • Packages & Registries
    • Packages & Registries
    • Container Registry
  • Analytics
    • Analytics
    • CI/CD
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • cameron
  • parabix-devel
  • Wiki
  • Shuffle Pattern Library

Shuffle Pattern Library · Changes

Page history
Update Shuffle Pattern Library authored Jul 13, 2021 by cameron's avatar cameron
Hide whitespace changes
Inline Side-by-side
Showing with 9 additions and 10 deletions
+9 -10
  • Shuffle-Pattern-Library.md Shuffle-Pattern-Library.md +9 -10
  • No files found.
Shuffle-Pattern-Library.md
View page @ 57bc0735
# Towards a Library of Shufflevector Patterns # Towards a Library of Shufflevector Patterns
The LLVM IR contains a very powerful vector operation called `shufflevector`. Unless the target architecture supports a similarly powerful operation directly, good code generation for shufflevector instructions with arbitrary shuffle patterns is very difficult. The LLVM IR contains a very powerful vector operation called `shufflevector`. Unless the target architecture supports a similarly powerful operation directly, good code generation for shufflevector instructions with arbitrary shuffle patterns is very difficult.
...@@ -32,7 +31,7 @@ Canonical Selection from Constant Splats ...@@ -32,7 +31,7 @@ Canonical Selection from Constant Splats
In general, in discussing canonical forms, we refer to the three consecutive arguments of `shufflevector` as vector 1 or v1, vector 2 or v2 and mask. In general, in discussing canonical forms, we refer to the three consecutive arguments of `shufflevector` as vector 1 or v1, vector 2 or v2 and mask.
A constant splat is a constant vector having the same value in every field. When used as a vector, `zeroinitializer`` is a constant splat of the value 0. A constant splat is a constant vector having the same value in every field. When used as a vector, `zeroinitializer` is a constant splat of the value 0.
Whenever one of the input vectors is a constant splat vector, then the canonical selection mask values for selecting a value from this vector must always specify selection from the first field in the vector. That is, if the size of the input vectors is n, and vector 2 is a constant splat, then the index n is the canonical index for selection from this vector. Conversion to constant indexes in this case is illustrated in the second step of the example above. If vector 1 is a constant splat, then the canonical index for selection from this vector is 0. Whenever one of the input vectors is a constant splat vector, then the canonical selection mask values for selecting a value from this vector must always specify selection from the first field in the vector. That is, if the size of the input vectors is n, and vector 2 is a constant splat, then the index n is the canonical index for selection from this vector. Conversion to constant indexes in this case is illustrated in the second step of the example above. If vector 1 is a constant splat, then the canonical index for selection from this vector is 0.
...@@ -58,10 +57,10 @@ mask'[i] = (mask[i] + n) mod (2n) ...@@ -58,10 +57,10 @@ mask'[i] = (mask[i] + n) mod (2n)
The following rules define canonical order for the vector arguments. The following rules define canonical order for the vector arguments.
If one of the vectors is `undef`, then vector 2 must be `undef` in canonical order. 1. If one of the vectors is `undef`, then vector 2 must be `undef` in canonical order.
If one of the vectors is `zeroinitializer `(or the equivalent constant), and the other vector is not undef, then vector 2 must be `zeroinitializer` in canonical order. 1. If one of the vectors is `zeroinitializer `(or the equivalent constant), and the other vector is not undef, then vector 2 must be `zeroinitializer` in canonical order.
If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order. 1. If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order.
If neither vector is a constant splat or undef, then the first selector value must be < n, (i.e., selecting from vector 1). 1. If neither vector is a constant splat or undef, then the first selector value must be < n, (i.e., selecting from vector 1).
#### Canonical Vector Size #### Canonical Vector Size
...@@ -93,8 +92,8 @@ Lane detection is a recursive binary process. The first step is to determine if ...@@ -93,8 +92,8 @@ Lane detection is a recursive binary process. The first step is to determine if
To determine whether a shuffle vector pattern may be divided into lanes, the following algorithm is applied. To determine whether a shuffle vector pattern may be divided into lanes, the following algorithm is applied.
Given a shuffle vector mask of 2n elements labelled m0, ... m2n-1, form the two submask vectors m0, m2n-1-1 and m2n-1, ... m2n-1. 1. Given a shuffle vector mask of 2n elements labelled m0, ... m2n-1, form the two submask vectors m0, m2n-1-1 and m2n-1, ... m2n-1.
Compare the two vectors, element by element. The mask is ruled a laned operation if the following conditions hold in each case. 1. Compare the two vectors, element by element. The mask is ruled a laned operation if the following conditions hold in each case.
The value of m2n-1+i - mi = 2n-1, or The value of m2n-1+i - mi = 2n-1, or
The value of m2n-1+i = mi = 2n, and vector 2 is a constant splat. The value of m2n-1+i = mi = 2n, and vector 2 is a constant splat.
...@@ -110,7 +109,7 @@ def pattern_signature(P): ...@@ -110,7 +109,7 @@ def pattern_signature(P):
### Standard Pattern Examples ### Standard Pattern Examples
In the following examples, the patterns are defined assuming that the shufflevector arguments have been first placed in canonical form as described above and lanes have been determined. In the following examples, the patterns are defined assuming that the `shufflevector` arguments have been first placed in canonical form as described above and lanes have been determined.
#### The Rotate Pattern #### The Rotate Pattern
...@@ -208,4 +207,4 @@ The Parallel Deposit Pattern is essentially a vector expand operation, which pre ...@@ -208,4 +207,4 @@ The Parallel Deposit Pattern is essentially a vector expand operation, which pre
The pattern must consist of a set of consecutive selectors from the first vector (starting with 0), interspersed with zero selectors (i.e., selecting from `zeroinitializer`. The pattern must consist of a set of consecutive selectors from the first vector (starting with 0), interspersed with zero selectors (i.e., selecting from `zeroinitializer`.
In the case of i1 vectors, this corresponds to the parallel bit deposit of the BitShuffle project. In the case of i1 vectors, this corresponds to the parallel bit deposit of the BitShuffle project.
\ No newline at end of file
Clone repository
  • Bracket Matching
  • CSV Validation
  • CSVediting
  • CSVparsing
  • Character Code Compilers
  • KernelLibrary
  • Pablo
  • ParabixTransform
  • Parallel Deletion
  • Parallel Hashing
  • Performance Testing Script
  • Shuffle Pattern Library
  • StaticCCC
  • String Insertion
  • UCD: Unicode Property Database and Compilers
View All Pages