parallel_scan
Header File: <Kokkos_Core.hpp>
Usage
Kokkos::parallel_scan( name, policy, functor, result );
Kokkos::parallel_scan( name, policy, functor );
Kokkos::parallel_scan( policy, functor, result );
Kokkos::parallel_scan( policy, functor );
Dispatches parallel work defined by functor according to the ExecutionPolicy policy and performs a pre (exclusive) or post (inclusive) scan of the contributions provided by the work items. The optional label name is used by profiling and debugging tools. If result is provided, the final result of the scan is placed in it.
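The difference between the pre (exclusive) and post (inclusive) scan can be sketched without Kokkos. The block below is a minimal serial stand-in, not the library's implementation: `serial_scan`, `ScanOutput`, and `scan_of_indices` are hypothetical names introduced here for illustration, and the functor is simply called once per index in order with `is_final = true`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical serial stand-in for a scan dispatch (not a Kokkos API).
// Calls functor(i, partial, true) for i = 0..n-1 in order and returns the total.
template <class Functor>
std::int64_t serial_scan(std::int64_t n, const Functor& functor) {
  std::int64_t partial = 0;  // identity of a sum scan
  for (std::int64_t i = 0; i < n; ++i) functor(i, partial, true);
  return partial;
}

// Hypothetical container for the two scan flavours and the final result.
struct ScanOutput {
  std::vector<std::int64_t> pre;   // exclusive scan
  std::vector<std::int64_t> post;  // inclusive scan
  std::int64_t total;              // what `result` would receive
};

// Scans the contributions 0, 1, ..., n-1.
ScanOutput scan_of_indices(std::int64_t n) {
  ScanOutput out{std::vector<std::int64_t>(n), std::vector<std::int64_t>(n), 0};
  out.total = serial_scan(n, [&](std::int64_t i, std::int64_t& partial, bool is_final) {
    if (is_final) out.pre[i] = partial;   // value before this item's contribution
    partial += i;                         // this item's contribution
    if (is_final) out.post[i] = partial;  // value after this item's contribution
  });
  return out;
}
```

For `n = 5` this yields `pre = 0,0,1,3,6`, `post = 0,1,3,6,10`, and a total of 10. Whether a given functor produces a pre or a post scan depends only on whether it stores `partial` before or after adding its own contribution.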
Interface
template<class ExecPolicy, class FunctorType>
Kokkos::parallel_scan(const std::string &name, const ExecPolicy &policy, const FunctorType &functor);

template<class ExecPolicy, class FunctorType>
Kokkos::parallel_scan(const ExecPolicy &policy, const FunctorType &functor);

template<class ExecPolicy, class FunctorType, class ReturnType>
Kokkos::parallel_scan(const std::string &name, const ExecPolicy &policy, const FunctorType &functor, ReturnType &return_value);

template<class ExecPolicy, class FunctorType, class ReturnType>
Kokkos::parallel_scan(const ExecPolicy &policy, const FunctorType &functor, ReturnType &return_value);
Parameters:
name: A user-provided string which is used in profiling and debugging tools via the Kokkos Profiling Hooks.
ExecPolicy: An ExecutionPolicy which defines the iteration space and other execution properties. Valid policies are:
    IntegerType: defines a 1D iteration range, starting from 0 and going to a count.
    RangePolicy: defines a 1D iteration range.
    ThreadVectorRange: defines a 1D iteration range to be executed through vector parallelization dividing the threads within a team. Only valid inside a parallel region executed through a TeamPolicy or a TaskTeam.
FunctorType: A valid functor with (at minimum) an operator() with a matching signature for the ExecPolicy combined with the reduced type.
ReturnType: a POD type with operator += and operator =, or a Kokkos::View.
Requirements:
The functor has a member function of the form
    operator() (const HandleType& handle, ReturnType& value, const bool final) const
or
    operator() (const WorkTag, const HandleType& handle, ReturnType& value, const bool final) const
The WorkTag-free form of the operator is used if ExecPolicy is an IntegerType or ExecPolicy::work_tag is void.
HandleType is an IntegerType if ExecPolicy is an IntegerType; otherwise it is ExecPolicy::member_type.
The type ReturnType of the functor operator must be compatible with the ReturnType of the parallel_scan and must match the arguments of the init and join functions of the functor if provided. If the functor doesn't have an init member function, it is assumed that the identity for the scan operation is given by the default constructor of the value type (and not by reduction_identity).
The functor must define FunctorType::value_type the same as ReturnType.
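The required member functions can be illustrated with a functor that offers one `operator()` per work tag plus `value_type`, `init`, and `join`. This is a sketch in plain C++ under several assumptions: `InclusiveTag`, `ExclusiveTag`, `TaggedSumScan`, and `run_serial` are hypothetical names, the `init`/`join` signatures follow the common non-volatile form, and `run_serial` is a serial stand-in for the dispatch rather than a Kokkos call. In real Kokkos code the member functions would typically also carry `KOKKOS_INLINE_FUNCTION`, and the tag would be selected through the execution policy.

```cpp
#include <cstdint>

struct InclusiveTag {};  // hypothetical work tags
struct ExclusiveTag {};

struct TaggedSumScan {
  using value_type = std::int64_t;  // must be the same as ReturnType

  const std::int64_t* in;  // contributions
  std::int64_t* out;       // scan output

  // WorkTag form: post (inclusive) scan.
  void operator()(InclusiveTag, std::int64_t i, value_type& partial, const bool final) const {
    partial += in[i];
    if (final) out[i] = partial;
  }

  // WorkTag form: pre (exclusive) scan.
  void operator()(ExclusiveTag, std::int64_t i, value_type& partial, const bool final) const {
    if (final) out[i] = partial;
    partial += in[i];
  }

  // Optional: explicit identity and combination of two partial results.
  void init(value_type& update) const { update = 0; }
  void join(value_type& update, const value_type& input) const { update += input; }
};

// Hypothetical serial driver: selects the operator by tag and returns the total.
template <class Tag, class Functor>
typename Functor::value_type run_serial(std::int64_t n, const Functor& f) {
  typename Functor::value_type partial;
  f.init(partial);
  for (std::int64_t i = 0; i < n; ++i) f(Tag{}, i, partial, true);
  return partial;
}
```

With the contributions `1, 2, 3, 4`, `run_serial<InclusiveTag>` writes `1, 3, 6, 10`, `run_serial<ExclusiveTag>` writes `0, 1, 3, 6`, and both return 10.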
Semantics
Neither concurrency nor order of execution is guaranteed.
The ReturnType content will be overwritten, i.e. the value does not need to be initialized to the reduction-neutral element.
The input value to the operator may contain a partial result; Kokkos may only combine the thread-local contributions at the end. The operator should modify the input value according to the desired scan operation.
It is not guaranteed that the functor will ever be called with final = false.
The functor might be called multiple times with final = false, and the user has to make sure that the behavior in this case stays the same for repeated calls.
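One way to see why the functor must behave consistently across repeated calls is a two-pass blocked scan. The sketch below is an assumption about how a backend could schedule the work, not a description of any particular Kokkos backend; `two_pass_scan` and `exclusive_sums` are hypothetical helpers. The first pass calls the functor with `final = false` to collect per-block totals, and the second pass calls it again with `final = true`, starting each block from the combined total of the blocks before it.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical two-pass blocked sum scan; requires block > 0.
template <class Functor>
std::int64_t two_pass_scan(std::int64_t n, std::int64_t block, const Functor& f) {
  const std::int64_t nblocks = (n + block - 1) / block;
  std::vector<std::int64_t> offset(nblocks + 1, 0);

  // Pass 1 (final = false): each block starts from the identity and
  // accumulates only its own contributions.
  for (std::int64_t b = 0; b < nblocks; ++b) {
    std::int64_t local = 0;
    const std::int64_t end = std::min(n, (b + 1) * block);
    for (std::int64_t i = b * block; i < end; ++i) f(i, local, false);
    offset[b + 1] = offset[b] + local;  // combine block totals
  }

  // Pass 2 (final = true): each block restarts from the total of all
  // preceding blocks, so the input value already holds a partial result.
  for (std::int64_t b = 0; b < nblocks; ++b) {
    std::int64_t local = offset[b];
    const std::int64_t end = std::min(n, (b + 1) * block);
    for (std::int64_t i = b * block; i < end; ++i) f(i, local, true);
  }
  return offset[nblocks];
}

// Exclusive prefix sums of v, computed with the given block size.
std::vector<std::int64_t> exclusive_sums(const std::vector<std::int64_t>& v,
                                         std::int64_t block) {
  std::vector<std::int64_t> out(v.size());
  two_pass_scan(static_cast<std::int64_t>(v.size()), block,
                [&](std::int64_t i, std::int64_t& partial, bool is_final) {
                  if (is_final) out[i] = partial;  // write only in the final pass
                  partial += v[i];                 // same contribution in every pass
                });
  return out;
}
```

Because the functor writes output only when `is_final` is true and adds the same contribution in every call, the result does not depend on the block size. With a single block the `final = false` pass could be skipped entirely, which matches the statement that such calls are not guaranteed to happen.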
Examples
#include <Kokkos_Core.hpp>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    // Problem size: first command-line argument, or 100 by default.
    int N = argc > 1 ? std::atoi(argv[1]) : 100;
    int64_t result;
    Kokkos::View<int64_t*> post("postfix_sum", N);
    Kokkos::View<int64_t*> pre("prefix_sum", N);
    Kokkos::parallel_scan("Loop1", N,
      KOKKOS_LAMBDA(int64_t i, int64_t& partial_sum, bool is_final) {
        // Store the running sum before adding i (exclusive scan).
        if (is_final) pre(i) = partial_sum;
        partial_sum += i;
        // Store the running sum after adding i (inclusive scan).
        if (is_final) post(i) = partial_sum;
      }, result);
    // pre (exclusive): 0,0,1,3,6,10,...
    // post (inclusive): 0,1,3,6,10,...
    // result: N*(N-1)/2
    printf("Result: %i %lld\n", N, static_cast<long long>(result));
  }
  Kokkos::finalize();
}