TeamVectorMDRange
#
Header File: <Kokkos_Core.hpp>
Description#
TeamVectorMDRange is a nested execution policy used inside of hierarchical parallelism.
Interface#
-
template<class Rank, typename TeamHandle>
class TeamVectorMDRange# Constructor
-
TeamVectorMDRange(team, extent_1, extent_2, ...);#
Splits an index range over the threads of the team and another index range over their vector lanes. Ranks for threading and vectorization determined by the backend.
- Parameters:
team – TeamHandle to the calling team execution context
extent_1, extent_2, ... – index range lengths of each rank
Requirements
TeamHandle
is a type that models TeamHandleextent_1, extent_2, ...
are intsEvery member thread of
team
must call the operation in the same branch, i.e. it is not legal to have some threads call this function in one branch, and the other threads ofteam
call it in another branchextent_i
is such thati >= 2 && i <= 8
is true. For example:TeamVectorMDRange(team, 4); // NOT OK, violates i>=2 TeamVectorMDRange(team, 4,5); // OK TeamVectorMDRange(team, 4,5,6); // OK TeamVectorMDRange(team, 4,5,6,2,3,4,5,6); // OK, max num of extents allowed
-
TeamVectorMDRange(team, extent_1, extent_2, ...);#
Examples#
using TeamHandle = TeamPolicy<>::member_type;
parallel_for(TeamPolicy<>(N,AUTO),
KOKKOS_LAMBDA(TeamHandle const& team) {
int leagueRank = team.league_rank();
auto range = TeamVectorMDRange<Rank<4>, TeamType>(team, n0, n1, n2, n3);
parallel_for(range,
[=](int i0, int i1, int i2, int i3) {
A(leagueRank, i0, i1, i2, i3) = B(leagueRank, i1) + C(i1, i2, i3);
});
team.team_barrier();
int teamSum = 0;
parallel_reduce(range,
[=](int i0, int i1, int i2, int i3, int& vectorSum) {
vectorSum += v(leagueRank, i, j, k, l);
}, teamSum
);
single(PerTeam(team), [&leagueSum, teamSum]() { leagueSum += teamSum; });
A_rowSum[leagueRank] = leagueSum;
});