TeamThreadMDRange
Header File: <Kokkos_Core.hpp>
Description
TeamThreadMDRange is a nested execution policy used inside of hierarchical parallelism.
Interface
template<class Rank, typename TeamHandle>
class TeamThreadMDRange
Constructor
TeamThreadMDRange(team, extent_1, extent_2, ...);
Splits the index range [0, extent) over the threads of the team, where extent is the extent of the backend-dependent rank that will be threaded.
Parameters:
team – TeamHandle to the calling team execution context
extent_1, extent_2, ... – index range lengths of each rank
Requirements
TeamHandle is a type that models TeamHandle.
extent_1, extent_2, ... are ints.
Every member thread of team must call the operation in the same branch, i.e. it is not legal to have some threads call this function in one branch while the other threads of team call it in another branch.
The number of extents i must satisfy i >= 2 && i <= 8. For example:
    TeamThreadMDRange(team, 4); // NOT OK, violates i>=2
    TeamThreadMDRange(team, 4,5); // OK
    TeamThreadMDRange(team, 4,5,6); // OK
    TeamThreadMDRange(team, 4,5,6,2,3,4,5,6); // OK, max num of extents allowed
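To make the splitting of the index range concrete, the following plain C++ sketch emulates it without Kokkos. It assumes a 2D range, assumes rank 0 is the threaded rank, and assumes a strided assignment of indices to threads. The real choice is backend-dependent, and `visit_counts_2d` and `covers_exactly_once` are hypothetical helpers written for illustration, not Kokkos functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Kokkos): visits every (i0, i1) in
// [0, n0) x [0, n1) the way a team of `team_size` threads might, ASSUMING
// rank 0 is the threaded rank. Returns how often each index pair was visited.
std::vector<int> visit_counts_2d(int team_size, int n0, int n1) {
  assert(team_size > 0 && n0 >= 0 && n1 >= 0);
  std::vector<int> counts(static_cast<std::size_t>(n0) *
                              static_cast<std::size_t>(n1),
                          0);
  // This loop stands in for the team's threads, which would run concurrently.
  for (int thread = 0; thread < team_size; ++thread) {
    // Each thread takes a strided share of the threaded rank...
    for (int i0 = thread; i0 < n0; i0 += team_size) {
      // ...and walks the remaining rank in full.
      for (int i1 = 0; i1 < n1; ++i1) {
        counts[static_cast<std::size_t>(i0) * n1 + i1] += 1;
      }
    }
  }
  return counts;
}

// True when every index pair in the range was visited exactly once.
bool covers_exactly_once(int team_size, int n0, int n1) {
  for (int c : visit_counts_2d(team_size, n0, n1)) {
    if (c != 1) return false;
  }
  return true;
}
```

Under these assumptions the threads jointly cover the whole range with no overlap, whether the team has fewer threads than `n0` or more.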
Restrictions
Note that when used in parallel_reduce, the reduction is limited to a sum.
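As a rough mental model of a sum-only reduction, the sketch below emulates one in plain C++ without Kokkos: each emulated thread accumulates a private partial sum, and the partials are then combined by addition. The function name `team_sum_2d`, the strided work split, and the flattened row-major storage are assumptions made for illustration and do not describe how Kokkos implements the reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in (not part of Kokkos) for a team-level sum reduction
// over a 2D range [0, n0) x [0, n1), with `data` stored row-major.
long team_sum_2d(const std::vector<int>& data, int n0, int n1, int team_size) {
  assert(team_size > 0 && n0 >= 0 && n1 >= 0);
  assert(data.size() ==
         static_cast<std::size_t>(n0) * static_cast<std::size_t>(n1));
  // One private accumulator per emulated thread (the role of threadSum).
  std::vector<long> thread_sum(static_cast<std::size_t>(team_size), 0);
  for (int thread = 0; thread < team_size; ++thread) {
    for (int i0 = thread; i0 < n0; i0 += team_size) {
      for (int i1 = 0; i1 < n1; ++i1) {
        thread_sum[static_cast<std::size_t>(thread)] +=
            data[static_cast<std::size_t>(i0) * n1 + i1];
      }
    }
  }
  // The only combining step is addition, matching the sum-only restriction.
  long team_total = 0;
  for (long partial : thread_sum) team_total += partial;
  return team_total;
}
```

Quantities that can be phrased as sums, such as a count obtained by adding 1 per matching element, fit this pattern. A minimum or maximum does not.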
Examples
using TeamHandle = TeamPolicy<>::member_type;
parallel_for(TeamPolicy<>(N, AUTO),
  KOKKOS_LAMBDA (TeamHandle const& team) {
    int leagueRank = team.league_rank();
    // Nested 4D range shared by the threads of this team
    auto range = TeamThreadMDRange<Rank<4>, TeamHandle>(team, n0, n1, n2, n3);
    parallel_for(range, [=](int i0, int i1, int i2, int i3) {
      A(leagueRank, i0, i1, i2, i3) = B(leagueRank, i1) + C(i1, i2, i3);
    });
    team.team_barrier();
    int teamSum = 0;
    // Sum reduction over the same range; the result lands in teamSum
    parallel_reduce(range,
      [=](int i0, int i1, int i2, int i3, int& threadSum) {
        threadSum += D(leagueRank, i0, i1, i2, i3);
      }, teamSum
    );
    // One thread per team stores the team's result
    single(PerTeam(team), [=]() { A_rowSum[leagueRank] = teamSum; });
  });