|
|
|
|
|
# Towards a Library of Shufflevector Patterns
|
|
# Towards a Library of Shufflevector Patterns
|
|
|
|
|
|
The LLVM IR contains a very powerful vector operation called `shufflevector`. Unless the target architecture supports a similarly powerful operation directly, good code generation for shufflevector instructions with arbitrary shuffle patterns is very difficult.
|
|
The LLVM IR contains a very powerful vector operation called `shufflevector`. Unless the target architecture supports a similarly powerful operation directly, good code generation for shufflevector instructions with arbitrary shuffle patterns is very difficult.
|
... | @@ -32,7 +31,7 @@ Canonical Selection from Constant Splats |
... | @@ -32,7 +31,7 @@ Canonical Selection from Constant Splats |
|
|
|
|
|
In general, in discussing canonical forms, we refer to the three consecutive arguments of `shufflevector` as vector 1 or v1, vector 2 or v2 and mask.
|
|
In general, in discussing canonical forms, we refer to the three consecutive arguments of `shufflevector` as vector 1 or v1, vector 2 or v2 and mask.
|
|
|
|
|
|
A constant splat is a constant vector having the same value in every field. When used as a vector, `zeroinitializer`` is a constant splat of the value 0.
|
|
A constant splat is a constant vector having the same value in every field. When used as a vector, `zeroinitializer` is a constant splat of the value 0.
|
|
|
|
|
|
Whenever one of the input vectors is a constant splat vector, then the canonical selection mask values for selecting a value from this vector must always specify selection from the first field in the vector. That is, if the size of the input vectors is n, and vector 2 is a constant splat, then the index n is the canonical index for selection from this vector. Conversion to constant indexes in this case is illustrated in the second step of the example above. If vector 1 is a constant splat, then the canonical index for selection from this vector is 0.
|
|
Whenever one of the input vectors is a constant splat vector, then the canonical selection mask values for selecting a value from this vector must always specify selection from the first field in the vector. That is, if the size of the input vectors is n, and vector 2 is a constant splat, then the index n is the canonical index for selection from this vector. Conversion to constant indexes in this case is illustrated in the second step of the example above. If vector 1 is a constant splat, then the canonical index for selection from this vector is 0.
|
|
|
|
|
... | @@ -58,10 +57,10 @@ mask'[i] = (mask[i] + n) mod (2n) |
... | @@ -58,10 +57,10 @@ mask'[i] = (mask[i] + n) mod (2n) |
|
|
|
|
|
The following rules define canonical order for the vector arguments.
|
|
The following rules define canonical order for the vector arguments.
|
|
|
|
|
|
If one of the vectors is `undef`, then vector 2 must be `undef` in canonical order.
|
|
1. If one of the vectors is `undef`, then vector 2 must be `undef` in canonical order.
|
|
If one of the vectors is `zeroinitializer `(or the equivalent constant), and the other vector is not undef, then vector 2 must be `zeroinitializer` in canonical order.
|
|
1. If one of the vectors is `zeroinitializer `(or the equivalent constant), and the other vector is not undef, then vector 2 must be `zeroinitializer` in canonical order.
|
|
If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order.
|
|
1. If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order.
|
|
If neither vector is a constant splat or undef, then the first selector value must be < n, (i.e., selecting from vector 1).
|
|
1. If neither vector is a constant splat or undef, then the first selector value must be < n, (i.e., selecting from vector 1).
|
|
|
|
|
|
#### Canonical Vector Size
|
|
#### Canonical Vector Size
|
|
|
|
|
... | @@ -93,8 +92,8 @@ Lane detection is a recursive binary process. The first step is to determine if |
... | @@ -93,8 +92,8 @@ Lane detection is a recursive binary process. The first step is to determine if |
|
|
|
|
|
To determine whether a shuffle vector pattern may be divided into lanes, the following algorithm is applied.
|
|
To determine whether a shuffle vector pattern may be divided into lanes, the following algorithm is applied.
|
|
|
|
|
|
Given a shuffle vector mask of 2n elements labelled m0, ... m2n-1, form the two submask vectors m0, m2n-1-1 and m2n-1, ... m2n-1.
|
|
1. Given a shuffle vector mask of 2n elements labelled m0, ... m2n-1, form the two submask vectors m0, m2n-1-1 and m2n-1, ... m2n-1.
|
|
Compare the two vectors, element by element. The mask is ruled a laned operation if the following conditions hold in each case.
|
|
1. Compare the two vectors, element by element. The mask is ruled a laned operation if the following conditions hold in each case.
|
|
The value of m2n-1+i - mi = 2n-1, or
|
|
The value of m2n-1+i - mi = 2n-1, or
|
|
The value of m2n-1+i = mi = 2n, and vector 2 is a constant splat.
|
|
The value of m2n-1+i = mi = 2n, and vector 2 is a constant splat.
|
|
|
|
|
... | @@ -110,7 +109,7 @@ def pattern_signature(P): |
... | @@ -110,7 +109,7 @@ def pattern_signature(P): |
|
|
|
|
|
### Standard Pattern Examples
|
|
### Standard Pattern Examples
|
|
|
|
|
|
In the following examples, the patterns are defined assuming that the shufflevector arguments have been first placed in canonical form as described above and lanes have been determined.
|
|
In the following examples, the patterns are defined assuming that the `shufflevector` arguments have been first placed in canonical form as described above and lanes have been determined.
|
|
|
|
|
|
#### The Rotate Pattern
|
|
#### The Rotate Pattern
|
|
|
|
|
... | | ... | |