... | @@ -4,18 +4,46 @@ Parabix technology is a high-performance programming framework for streaming |
... | @@ -4,18 +4,46 @@ Parabix technology is a high-performance programming framework for streaming |
|
text processing applications, leveraging both SIMD and multicore
|
|
text processing applications, leveraging both SIMD and multicore
|
|
parallel processing features.
|
|
parallel processing features.
|
|
|
|
|
|
|
|
## Programming Model: Kernels + Stream Sets = Programs
|
|
|
|
|
|
|
|
Parabix programming is based on the concepts of computational kernels operating on sets of data streams.
|
|
|
|
|
|
|
|
### Data Streams and Stream Sets
|
|
|
|
|
|
|
|
Data streams are streams of data fields all of a given bit width. If the bit width is N, the type
|
|
|
|
of the field is said to be `iN`, an integer of N bits. Bit streams are streams of type `i1`.
|
|
|
|
|
|
|
|
Stream sets are sets of data streams all of the same type and in one-to-one correspondence.
|
|
|
|
An `8 x i1`
|
|
|
|
stream set is a set of eight parallel bit streams.
|
|
|
|
All streams in the set are of the same length and are allocated and processed together by the underlying system.
|
|
|
|
|
|
|
|
A `1 x i8` stream is a stream of bytes. Most often, Parabix programs read byte streams from
|
|
|
|
files or other input sources, transform those streams into sets of bit streams and process those bit streams using
|
|
|
|
bitwise logic and shifting.
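
As a rough illustration of what "bitwise logic and shifting" on bit streams means, the sketch below models a bit stream as a Python integer in which bit `i` corresponds to byte position `i`. This is only a toy model: the helper names `match_byte` and `show` are hypothetical and are not part of the Parabix API, and real Parabix programs operate on SIMD registers rather than Python integers.

```python
def match_byte(data: bytes, ch: bytes) -> int:
    """Return a bit stream (as an int) with bit i set where data[i] == ch."""
    target = ch[0]
    stream = 0
    for pos, byte in enumerate(data):
        if byte == target:
            stream |= 1 << pos
    return stream


def show(stream: int, length: int) -> str:
    """Render a bit stream left to right, position 0 first."""
    return "".join("1" if (stream >> i) & 1 else "." for i in range(length))


data = b"abcab a b"
a_marks = match_byte(data, b"a")   # positions holding 'a'
b_marks = match_byte(data, b"b")   # positions holding 'b'

# Shifting by one moves every 'a' mark to the following position; ANDing
# with the 'b' marks keeps only the positions where a 'b' directly
# follows an 'a'. All positions are handled by the same two operations.
ab_ends = (a_marks << 1) & b_marks

print(show(a_marks, len(data)))
print(show(b_marks, len(data)))
print(show(ab_ends, len(data)))
```

The point of the model is that one shift and one AND answer the question for every position in the stream at once, instead of looping over bytes.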
|
|
|
|
|
|
|
|
|
|
|
|
### Kernels: Stream Processing Functions
|
|
|
|
|
|
|
|
Parabix programs are assembled as sequences of kernels operating on stream sets. Kernels are generally
|
|
|
|
just functions, taking stream sets as input and producing stream sets as output.
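
The idea of kernels as functions from stream sets to stream sets can be sketched in plain Python. Everything here is an assumption made for illustration: the `StreamSet` class, the kernel names `paren_marks` and `merge_or`, and the `run_pipeline` helper are hypothetical and do not reflect the actual Parabix kernel interface.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, List


@dataclass(frozen=True)
class StreamSet:
    """Toy model of an `N x iK` stream set: N streams of K-bit fields."""
    field_width: int
    streams: List[List[int]]

    def __post_init__(self):
        # All streams in a set must have the same length.
        if len({len(s) for s in self.streams}) > 1:
            raise ValueError("streams in a set must have equal length")
        # Every field must fit in field_width bits.
        limit = 1 << self.field_width
        if any(not 0 <= v < limit for s in self.streams for v in s):
            raise ValueError("field value out of range")

    @property
    def type_name(self) -> str:
        return f"{len(self.streams)} x i{self.field_width}"


Kernel = Callable[[StreamSet], StreamSet]


def byte_source(data: bytes) -> StreamSet:
    """Wrap raw bytes as a `1 x i8` stream set."""
    return StreamSet(8, [list(data)])


def paren_marks(text: StreamSet) -> StreamSet:
    """Hypothetical kernel: `1 x i8` -> `2 x i1` (open and close parens)."""
    (codes,) = text.streams
    opens = [int(c == ord("(")) for c in codes]
    closes = [int(c == ord(")")) for c in codes]
    return StreamSet(1, [opens, closes])


def merge_or(marks: StreamSet) -> StreamSet:
    """Hypothetical kernel: `N x i1` -> `1 x i1` by OR-ing the streams."""
    merged = [reduce(lambda a, b: a | b, fields) for fields in zip(*marks.streams)]
    return StreamSet(1, [merged])


def run_pipeline(kernels: List[Kernel], source: StreamSet) -> StreamSet:
    """Apply the kernels in sequence, feeding each output to the next."""
    return reduce(lambda stream_set, kernel: kernel(stream_set), kernels, source)


result = run_pipeline([paren_marks, merge_or], byte_source(b"f(x)"))
print(result.type_name, result.streams)
```

Each kernel declares its input and output shape only through the stream sets it consumes and produces, which is what lets a program be assembled as a simple sequence.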
|
|
|
|
|
|
## Parabix Transform
|
|
## Parabix Transform
|
|
|
|
|
|
The Parabix framework is based on the concept of parallel bit streams,
|
|
The Parabix framework is based on the concept of parallel bit streams,
|
|
a fundamentally new transform representation of text. Byte-oriented character stream data
|
|
a fundamentally new transform representation of text. Byte-oriented character stream data
|
|
is first transformed into eight parallel bit streams, each bit stream comprising
|
|
is first transformed into eight parallel bit streams, each bit stream comprising
|
|
one bit per character code unit. Code units may be ASCII characters or
|
|
one bit per character code unit. The byte stream is represented as a `1 x i8` stream set.
|
|
UTF-8 bytes, for example, with one parallel bit stream defined for each of
|
|
The transposed parallel bit streams comprise an `8 x i1` stream set of the same length of the
|
|
bit 0 through bit 7 of each code unit. Given such a representation, the
|
|
basis streams. The code units of the byte stream
|
|
|
|
may be ASCII characters or UTF-8 bytes, for example. The Parabix transform extracts
|
|
|
|
the bits of each byte and produces separate streams for each of them.
|
|
|
|
Given such a representation, the
|
|
128-bit SIMD (single-instruction multiple-data) registers of the SSE (Intel
|
|
128-bit SIMD (single-instruction multiple-data) registers of the SSE (Intel
|
|
architecture SIMD technology) or Altivec (Power PC architecture) may be used
|
|
architecture SIMD technology) or Altivec (Power PC architecture) may be used
|
|
to process 128 code unit positions at a time.
|
|
to process 128 code unit positions at a time.
|
|
|
|
|
|
|
|
The transposition process is implemented by the Parabix S2P kernel (S2P stands for serial-to-parallel).
|
|
See the [Parabix Transform](ParabixTransform) page for details.
|
|
See the [Parabix Transform](ParabixTransform) page for details.
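
The transposition can be sketched in a few lines of Python. This is a minimal model, not the S2P kernel itself: the function names `s2p` and `p2s` are hypothetical, each bit stream is held in a Python integer (bit `i` of the integer is stream position `i`), and stream `k` is assumed to hold the bit of value `1 << k` from each byte, which may not match the bit numbering used by Parabix.

```python
from typing import List


def s2p(data: bytes) -> List[int]:
    """Transpose a byte stream into eight bit streams (serial to parallel)."""
    streams = [0] * 8
    for pos, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << pos
    return streams


def p2s(streams: List[int], length: int) -> bytes:
    """Inverse transposition: rebuild the byte stream from the bit streams."""
    out = bytearray(length)
    for k, stream in enumerate(streams):
        for pos in range(length):
            if (stream >> pos) & 1:
                out[pos] |= 1 << k
    return bytes(out)


data = b"ab12 c3"
b0, b1, b2, b3, b4, b5, b6, b7 = s2p(data)

# With the transposed form, a character class becomes a bitwise formula
# evaluated for all positions at once. ASCII digits 0x30..0x39 have high
# nibble 0011 and a low nibble of 0..9.
mask = (1 << len(data)) - 1
digits = (~b7 & ~b6 & b5 & b4 & (~b3 | (~b2 & ~b1))) & mask

print("".join("1" if (digits >> i) & 1 else "." for i in range(len(data))))
```

In this model the integer plays the role of a very wide register; with 128-bit SIMD registers the same formula would cover 128 code unit positions per step.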
|
|
|
|
|
|
## Alphabets, Character Classes, Unicode
|
|
## Alphabets, Character Classes, Unicode
|
... | @@ -38,24 +66,6 @@ input: This is just 1 abbreviated example of character stream input containing |
... | @@ -38,24 +66,6 @@ input: This is just 1 abbreviated example of character stream input containing |
|
|
|
|
|
Read about the [Parabix Character Class Compilers](CharacterClassCompiler) for more information.
|
|
Read about the [Parabix Character Class Compilers](CharacterClassCompiler) for more information.
|
|
|
|
|
|
## Programming Model: Kernels + Stream Sets = Programs
|
|
|
|
|
|
|
|
Parabix programming is based on the concepts of computational kernels operating on sets of data streams.
|
|
|
|
|
|
|
|
### Data Streams and Stream Sets
|
|
|
|
|
|
|
|
Data streams are streams of data fields all of a given bit width. If the bit width is N, the type
|
|
|
|
of the field is said to be `iN`, an integer of N bits. Bit streams are streams of type `i1`.
|
|
|
|
|
|
|
|
Stream sets are sets of data streams all of the same type and in one-to-one correspondence. An `8 x i1`
|
|
|
|
stream set is a set of eight parallel bit streams. All streams in the set are of the same length and are allocated and processed together by the underlying system.
|
|
|
|
|
|
|
|
### Kernels: Stream Processing Functions
|
|
|
|
|
|
|
|
Parabix programs are assembled as sequences of kernels operating on stream sets. Kernels are generally
|
|
|
|
just functions, taking stream sets as input and producing stream sets as output.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Build and Test
|
|
## Build and Test
|
|
|
|
|
... | | ... | |