cameron · 57bc0735
Hide whitespace changes
Inline Side-by-side

Showing with 9 additions and 10 deletions

Shuffle-Pattern-Library.md Shuffle-Pattern-Library.md +9 -10

No files found.
--- a/Shuffle-Pattern-Library.md
+++ b/Shuffle-Pattern-Library.md
 # Towards a Library of Shufflevector Patterns
 The LLVM IR contains a very powerful vector operation called  `shufflevector`. Unless the target architecture supports a similarly powerful operation directly, good code generation for shufflevector instructions with arbitrary shuffle patterns is very difficult.
@@ -32,7 +31,7 @@ Canonical Selection from Constant Splats
 In general, in discussing canonical forms, we refer to the three consecutive arguments of `shufflevector` as vector 1 or v1, vector 2 or v2 and mask.
-A constant splat is a constant vector having the same value in every field. When used as a vector, `zeroinitializer`` is a constant splat of the value 0.
+A constant splat is a constant vector having the same value in every field. When used as a vector, `zeroinitializer` is a constant splat of the value 0.
 Whenever one of the input vectors is a constant splat vector, then the canonical selection mask values for selecting a value from this vector must always specify selection from the first field in the vector. That is, if the size of the input vectors is n, and vector 2 is a constant splat, then the index n is the canonical index for selection from this vector. Conversion to constant indexes in this case is illustrated in the second step of the example above. If vector 1 is a constant splat, then the canonical index for selection from this vector is 0.
@@ -58,10 +57,10 @@ mask'[i] = (mask[i] + n) mod (2n)
 The following rules define canonical order for the vector arguments.
-    If one of the vectors is `undef`, then vector 2 must be `undef` in canonical order.
+1.    If one of the vectors is `undef`, then vector 2 must be `undef` in canonical order.
-    If one of the vectors is `zeroinitializer `(or the equivalent constant), and the other vector is not undef, then vector 2 must be `zeroinitializer` in canonical order.
+1.    If one of the vectors is `zeroinitializer `(or the equivalent constant), and the other vector is not undef, then vector 2 must be `zeroinitializer` in canonical order.
-    If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order.
+1.    If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order.
-    If neither vector is a constant splat or undef, then the first selector value must be < n, (i.e., selecting from vector 1). 
+1.    If neither vector is a constant splat or undef, then the first selector value must be < n, (i.e., selecting from vector 1). 
 #### Canonical Vector Size
@@ -93,8 +92,8 @@ Lane detection is a recursive binary process. The first step is to determine if
 To determine whether a shuffle vector pattern may be divided into lanes, the following algorithm is applied.
-    Given a shuffle vector mask of 2n elements labelled m0, ... m2n-1, form the two submask vectors m0, m2n-1-1 and m2n-1, ... m2n-1.
+1.    Given a shuffle vector mask of 2n elements labelled m0, ... m2n-1, form the two submask vectors m0, m2n-1-1 and m2n-1, ... m2n-1.
-    Compare the two vectors, element by element. The mask is ruled a laned operation if the following conditions hold in each case.
+1.    Compare the two vectors, element by element. The mask is ruled a laned operation if the following conditions hold in each case.
        The value of m2n-1+i - mi = 2n-1, or
        The value of m2n-1+i = mi = 2n, and vector 2 is a constant splat. 
@@ -110,7 +109,7 @@ def pattern_signature(P):
 ### Standard Pattern Examples
-In the following examples, the patterns are defined assuming that the shufflevector arguments have been first placed in canonical form as described above and lanes have been determined.
+In the following examples, the patterns are defined assuming that the `shufflevector` arguments have been first placed in canonical form as described above and lanes have been determined.
 #### The Rotate Pattern
@@ -208,4 +207,4 @@ The Parallel Deposit Pattern is essentially a vector expand operation, which pre
 The pattern must consist of a set of consecutive selectors from the first vector (starting with 0), interspersed with zero selectors (i.e., selecting from `zeroinitializer`.
 In the case of i1 vectors, this corresponds to the parallel bit deposit of the BitShuffle project.
\ No newline at end of file