Patterns
Patterns are the first of three key concepts in PyKokkos (Concepts). A pattern specifies the structure of a computation. PyKokkos provides three key patterns:
parallel_for, also known as a map operation in other frameworks and languages
parallel_reduce, also known as a fold operation in other frameworks and languages
parallel_scan, which implements a prefix scan
Parallel for
The most commonly used pattern is parallel_for. It is available as a function in the pykokkos library and has the following signature:
parallel_for([label], policy, workunit, [keyword arguments])
label is an optional string value helpful for debugging and profiling
policy specifies how the computation is executed (the execution place and the number of work units to run in parallel). In its simplest form, policy is an integer that specifies a range of values. More details about policies are provided on a separate page (Policies)
workunit is the name of the @pk.workunit function that performs one unit of work
keyword arguments are passed to the workunit
Based on the policy, parallel_for executes a number of work units. Each work unit is executed independently, and there are no guarantees about the execution order. Any number of work units may run in parallel, or they may run sequentially if the runtime determines that sequential execution would be better for overall performance.
Below is an example that illustrates the parallel_for pattern.
    import pykokkos as pk

    @pk.workunit
    def hello(i: int):
        # i is the unique id of this work unit
        pk.printf("Hello, World! from i = %d\n", i)

    def main():
        pk.parallel_for(10, hello)

    main()
In this example, the policy is simply the integer value 10, which specifies a range (0..9) of unique ids. One work unit is spawned for each id. Because the execution order is not guaranteed, the lines may appear in any order. Here is the output from one run of the example:
Hello, World! from i = 0
Hello, World! from i = 8
Hello, World! from i = 4
Hello, World! from i = 1
Hello, World! from i = 9
Hello, World! from i = 5
Hello, World! from i = 6
Hello, World! from i = 2
Hello, World! from i = 3
Hello, World! from i = 7
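The ids in the output above are out of order because work units carry no ordering guarantee. The following is a plain-Python sketch of that contract, not PyKokkos itself: emulate_parallel_for is a hypothetical helper that calls a function once per id in a shuffled order, standing in for whatever schedule the runtime happens to choose. The square work unit and the out list are also illustrative names.

```python
import random


def emulate_parallel_for(n, workunit, seed=None, **kwargs):
    """Hypothetical stand-in for pk.parallel_for (an assumption, not the real API).

    Calls workunit once for every id in 0..n-1, in a shuffled order that
    mimics the absence of ordering guarantees. Returns the order used.
    """
    ids = list(range(n))
    random.Random(seed).shuffle(ids)
    for i in ids:
        workunit(i, **kwargs)
    return ids


def square(i, out):
    # Each work unit writes only to its own slot, so the order does not matter.
    out[i] = i * i


out = [0] * 10
order = emulate_parallel_for(10, square, seed=1, out=out)
print(order)  # some permutation of 0..9
print(out)    # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because every work unit touches a distinct element, the result is the same for any execution order. A work unit that read or wrote another id's element would not have this property.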
Parallel reduce
The parallel_reduce pattern implements a reduction. It is similar in many ways to parallel_for, except that each work unit produces a value, and all the values are eventually accumulated into a single value (known as an accumulator). This pattern is available as a function in the pykokkos library and has the following signature:
parallel_reduce([label], policy, workunit, [keyword arguments])
label is an optional string value helpful for debugging and profiling
policy specifies how the computation is executed (the execution place and the number of work units to run in parallel). In its simplest form, policy is an integer that specifies a range of values. More details about policies are provided on a separate page (Policies)
workunit is the name of the @pk.workunit function that performs one unit of work
keyword arguments are passed to the workunit
Based on the policy, parallel_reduce runs a number of work units. Each work unit receives two arguments in addition to the specified keyword arguments: (1) the unique id of the work unit, and (2) an accumulator.
Below is an example that illustrates the parallel_reduce pattern:
    import pykokkos as pk
    import numpy as np

    @pk.workunit
    def work(wid, acc, a):
        # wid and acc are supplied by the framework; a comes from the keyword argument
        acc += a[wid]

    def main():
        N = 10
        a = np.random.randint(100, size=(N))
        print(a)
        total = pk.parallel_reduce("work", N, work, a=a)
        print(total)

    main()
In the example, we run N (which is set to 10) work units to compute the sum of all elements in a numpy array (a). Note that the first two arguments to the workunit (wid, which is a unique identifier of a work unit, and acc, which is an accumulator) are provided at runtime by the framework.
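To make the role of the accumulator concrete, here is a plain-Python sketch of a sum reduction. It is an illustration under stated assumptions, not PyKokkos's implementation: emulate_parallel_reduce is a hypothetical helper, the split into chunks is only one way a runtime might divide the work, and the sample values are arbitrary. Plain Python integers are immutable, so the sketch passes the accumulator as a one-element list that the work unit updates in place.

```python
def emulate_parallel_reduce(n, workunit, chunks=3, **kwargs):
    """Hypothetical stand-in for pk.parallel_reduce with a sum reduction.

    Each chunk of ids gets its own accumulator (as an independent worker
    might); the partial results are then combined into a single value.
    """
    partials = []
    for c in range(chunks):
        acc = [0]  # one-element list so the work unit can update it in place
        for wid in range(c, n, chunks):
            workunit(wid, acc, **kwargs)
        partials.append(acc[0])
    return sum(partials)


def work(wid, acc, a):
    # Same shape as the workunit above: add this id's element to the accumulator.
    acc[0] += a[wid]


a = [59, 60, 48, 65, 41, 22, 64, 59, 91, 24]
total = emulate_parallel_reduce(len(a), work, a=a)
print(total)  # 533
```

The combined result does not depend on how the ids are divided among chunks, because integer addition is associative and commutative.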
Parallel scan
The parallel_scan pattern implements a prefix scan. It is very much like parallel_reduce, but it also stores all intermediate results. The pattern is available as a function in the pykokkos library and has the following signature:
parallel_scan([label], policy, workunit, [keyword arguments])
label is an optional string value helpful for debugging and profiling
policy specifies how the computation is executed (the execution place and the number of work units to run in parallel). In its simplest form, policy is an integer that specifies a range of values. More details about policies are provided on a separate page (Policies)
workunit is the name of the @pk.workunit function that performs one unit of work
keyword arguments are passed to the workunit
As before, based on the policy, parallel_scan runs a number of work units. Each work unit receives three arguments in addition to the given keyword arguments: (1) the unique id of the work unit, (2) an accumulator, and (3) a boolean flag that indicates whether the scan for the current work unit is complete, so that the accumulated value can be stored.
Below is an example that illustrates the parallel_scan pattern:
    import pykokkos as pk
    import numpy as np

    @pk.workunit
    def work(wid, acc, final, a):
        acc += a[wid]
        if final:
            # store the prefix sum once the scan for this id is complete
            a[wid] = acc

    def main():
        N = 10
        a = np.random.randint(100, size=(N))
        print(a)
        pk.parallel_scan("work", N, work, a=a)
        print(a)

    main()
The output of a single run of the example above is shown below (the input array, followed by its prefix sums):
[59 60 48 65 41 22 64 59 91 24]
[ 59 119 167 232 273 295 359 418 509 533]
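The final flag is easier to follow with a model of how a scan might be scheduled. The sketch below is plain Python, not PyKokkos, and rests on an assumption the text above does not confirm: that the work unit may be called once with final set to False (to collect partial sums) and once with final set to True (when the accumulator holds the correct running total). emulate_parallel_scan and its chunking are hypothetical. As in a plain-Python reduction, the accumulator is a one-element list so that it can be updated in place.

```python
def emulate_parallel_scan(n, workunit, chunks=3, **kwargs):
    """Hypothetical two-pass, chunked model of an inclusive prefix scan."""
    bounds = [(c * n // chunks, (c + 1) * n // chunks) for c in range(chunks)]

    # Pass 1 (final=False): each chunk computes only its own total.
    totals = []
    for lo, hi in bounds:
        acc = [0]
        for wid in range(lo, hi):
            workunit(wid, acc, False, **kwargs)
        totals.append(acc[0])

    # Pass 2 (final=True): each chunk starts from the sum of all earlier chunks.
    offset = 0
    for (lo, hi), chunk_total in zip(bounds, totals):
        acc = [offset]
        for wid in range(lo, hi):
            workunit(wid, acc, True, **kwargs)
        offset += chunk_total


def work(wid, acc, final, a):
    acc[0] += a[wid]
    if final:
        # Only the final pass may overwrite the input with the prefix sum.
        a[wid] = acc[0]


a = [59, 60, 48, 65, 41, 22, 64, 59, 91, 24]
emulate_parallel_scan(len(a), work, a=a)
print(a)  # [59, 119, 167, 232, 273, 295, 359, 418, 509, 533]
```

Guarding the write with final matters in this model: if the first pass also stored the accumulator, the second pass would read already-modified elements and produce wrong sums.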