TeamPolicy
Header File: <Kokkos_Core.hpp>
Usage
Kokkos::TeamPolicy<>(league_size, team_size [, vector_length])
Kokkos::TeamPolicy<ARGS>(league_size, team_size [, vector_length])
Kokkos::TeamPolicy<>(Space, league_size, team_size [, vector_length])
Kokkos::TeamPolicy<ARGS>(Space, league_size, team_size [, vector_length])
Execution policy for hierarchical parallelism: a league of league_size teams, each with team_size threads, and an optional vector_length for the innermost vector level.
See also: TeamMember
Description
-
template<class ...Args>
class TeamPolicy
Template Arguments
Valid template arguments for TeamPolicy are described here
Public nested typedefs
-
execution_space
-
schedule_type
-
work_tag
-
index_type
-
iteration_pattern
-
launch_bounds
-
member_type
Constructors
-
TeamPolicy()
Default constructor. Creates an uninitialized policy.
-
TeamPolicy(const TeamPolicy&) = default;
Copy constructor.
-
TeamPolicy(TeamPolicy&&) = default;
Move constructor.
-
TeamPolicy(index_type league_size, index_type team_size, index_type vector_length = 1)
Request to launch league_size work items, each of which is assigned to a team of team_size threads, using a vector length of vector_length. If the requested team size is not possible when the policy is used in a parallel dispatch, that kernel launch may throw.
-
TeamPolicy(index_type league_size, Impl::AUTO_t, index_type vector_length = 1)
Request to launch league_size work items, each of which is assigned to a team of threads of a size determined by Kokkos, using a vector length of vector_length. The team size may be determined lazily at launch time, taking into account properties of the functor.
-
TeamPolicy(execution_space space, index_type league_size, index_type team_size, index_type vector_length = 1)
Request to launch league_size work items, each of which is assigned to a team of team_size threads, using a vector length of vector_length. If the requested team size is not possible when the policy is used in a parallel dispatch, that kernel launch may throw. The provided execution space instance is used for the kernel launch.
-
TeamPolicy(execution_space space, index_type league_size, Impl::AUTO_t, index_type vector_length = 1)
Request to launch league_size work items, each of which is assigned to a team of threads of a size determined by Kokkos, using a vector length of vector_length. The team size may be determined lazily at launch time, taking into account properties of the functor. The provided execution space instance is used for the kernel launch.
Runtime Settings
-
inline TeamPolicy &set_chunk_size(int chunk);
Set the chunk size. Each physical team of threads will be assigned chunk consecutive teams. Default is 1.
Returns: reference to *this
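To make the notion of "chunk consecutive teams" concrete, the following standalone sketch groups league ranks into chunks. It does not call Kokkos: the helper names num_chunks and chunk_range are hypothetical, and the assumption that chunks are formed by simple ceiling division over the league is an illustration only, since the actual assignment of chunks to physical teams is decided by the backend.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper: number of chunks needed to cover league_size teams
// when each chunk holds up to `chunk` consecutive league ranks.
inline int num_chunks(int league_size, int chunk) {
  assert(league_size >= 0 && chunk > 0);
  return (league_size + chunk - 1) / chunk;  // ceiling division
}

// Hypothetical helper: half-open range [first, last) of league ranks
// that fall into chunk number c. The last chunk may be shorter.
inline std::pair<int, int> chunk_range(int league_size, int chunk, int c) {
  assert(c >= 0 && c < num_chunks(league_size, chunk));
  const int first = c * chunk;
  const int last  = std::min(first + chunk, league_size);
  return {first, last};
}
```

Under this model, a league of 10 teams with a chunk size of 4 is split into three chunks covering league ranks 0 to 3, 4 to 7, and 8 to 9.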
-
inline TeamPolicy &set_scratch_size(const int &level, const Impl::PerTeamValue &per_team);
-
inline TeamPolicy &set_scratch_size(const int &level, const Impl::PerThreadValue &per_thread);
-
inline TeamPolicy &set_scratch_size(const int &level, const Impl::PerTeamValue &per_team, const Impl::PerThreadValue &per_thread);
-
inline TeamPolicy &set_scratch_size(const int &level, const Impl::PerThreadValue &per_thread, const Impl::PerTeamValue &per_team);
Set the per team and per thread scratch size.
level: the storage level. 0 is the closest cache. 1 is the closest storage (e.g. high bandwidth memory).
per_team: wrapper for the per team size of scratch in bytes. Returned by the function PerTeam(int).
per_thread: wrapper for the per thread size of scratch in bytes. Returned by the function PerThread(int).
One can set the scratch size for level 0 and 1 independently by calling the function twice. Subsequent calls with the same level overwrite the previous value.
Returns: reference to *this
Query Limits of Runtime Settings
-
template<class FunctorType>
int team_size_max(const FunctorType &f, const ParallelForTag&) const;
-
template<class FunctorType>
int team_size_max(const FunctorType &f, const ParallelReduceTag&) const;
Query the maximum team size possible given a specific functor. The tag denotes whether this is for a parallel_for() or a parallel_reduce(). Note: this is not a static function! The function will take into account settings for vector length and scratch size of *this. Using a value larger than the return value will result in dispatch failure.
Returns: the maximum value for team_size that may be used with an otherwise identical TeamPolicy for dispatching the functor f.
-
template<class FunctorType>
int team_size_recommended(const FunctorType &f, const ParallelForTag&) const;
-
template<class FunctorType>
int team_size_recommended(const FunctorType &f, const ParallelReduceTag&) const;
Query the recommended team size for the specific functor f. The tag denotes whether this is for a parallel_for() or a parallel_reduce(). Note: this is not a static function! The function will take into account settings for vector length and scratch size of *this.
Returns: the recommended value for team_size to be used with an otherwise identical TeamPolicy for dispatching the functor f.
-
static int vector_length_max();
Returns: the maximum valid value for vector length.
-
static int scratch_size_max(int level);
Returns: the maximum total scratch size in bytes, for the given level. Note: If a kernel performs team-level reductions or scan operations, not all of this memory will be available for dynamic user requests. Some of that maximal scratch size is used for internal operations. The actual size of these internal allocations depends on the value type used in the reduction or scan.
Query Runtime Settings
-
int team_size() const;
Returns: the requested team size.
-
int league_size() const;
Returns: the requested league size.
-
int scratch_size(int level, int team_size_ = -1) const;
This function returns the total scratch size requested. If team_size_ is not provided, the team size for the calculation is taken from the internal setting (i.e. the result of calling this->team_size()). Otherwise, the provided team size is used.
Returns: the total scratch size in bytes for the specified scratch level.
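The relationship between the per team size, the per thread size, and the total can be sketched as plain arithmetic. The block below is a standalone model, not Kokkos code: ScratchRequest and total_scratch_bytes are hypothetical names, and the formula (per team bytes plus team size times per thread bytes) is an assumption about how the total is formed for one scratch level.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the scratch settings of one level of a policy.
struct ScratchRequest {
  std::size_t per_team_bytes;    // value that would be passed via PerTeam(...)
  std::size_t per_thread_bytes;  // value that would be passed via PerThread(...)
};

// Assumed formula: per team bytes plus team_size times per thread bytes.
// A negative team_size falls back to the policy's own team size, mirroring
// the default argument team_size_ = -1 of scratch_size().
inline std::size_t total_scratch_bytes(const ScratchRequest& req,
                                       int policy_team_size,
                                       int team_size = -1) {
  if (team_size < 0) team_size = policy_team_size;
  assert(team_size >= 0);
  return req.per_team_bytes +
         static_cast<std::size_t>(team_size) * req.per_thread_bytes;
}
```

With 1024 bytes per team, 16 bytes per thread, and a team size of 32, this model gives 1024 + 32 * 16 = 1536 bytes. Passing an explicit team size of 64 instead gives 2048 bytes.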
-
int team_scratch_size(int level) const;
Returns: the value for the per team scratch size in bytes in the specified scratch level.
-
int thread_scratch_size(int level) const;
Returns: the value for the per thread scratch size in bytes in the specified scratch level.
-
int chunk_size() const;
Returns: the chunk size, set via set_chunk_size().
-
execution_space#
Examples
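// Assumes #include <Kokkos_Core.hpp>, using namespace Kokkos, a league size N, and a team size T.
TeamPolicy<> policy_1(N, AUTO);                                  // default execution space, team size chosen by Kokkos
TeamPolicy<Cuda> policy_2(N, T);                                 // Cuda execution space, explicit team size T
TeamPolicy<Schedule<Dynamic>, OpenMP> policy_3(N, AUTO, 8);      // dynamic scheduling, vector length 8
TeamPolicy<IndexType<int>, Schedule<Dynamic>> policy_4(N, 1, 4); // int index type, team size 1, vector length 4
TeamPolicy<OpenMP> policy_5(OpenMP(), N, AUTO);                  // launch on a specific OpenMP instance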