fence#
Header File: <Kokkos_Core.hpp>
Usage:
Kokkos::fence();
Blocks on completion of all outstanding asynchronous Kokkos operations. That includes parallel dispatch (e.g. parallel_for(), parallel_reduce() and parallel_scan()) as well as asynchronous data operations such as three-argument deep_copy.
Note: there is also a fence specific to an execution space instance; see ExecutionSpaceConcept.
Interface#
void Kokkos::fence();
void Kokkos::fence(const std::string& label);
Parameters#
label
: A label to identify a specific fence in fence profiling operations. label does not have to be unique.
Requirements#
Kokkos::fence() cannot be called inside an existing parallel region (i.e. inside the operator() of a functor or lambda).
Semantics#
Blocks on completion of all outstanding asynchronous work. Side effects of outstanding work will be observable upon completion of the fence call; that is, Kokkos::fence() implies a memory fence.
Examples#
Timing kernels#
Kokkos::Timer timer;
// This operation is asynchronous; without a fence
// one would time only the launch overhead.
Kokkos::parallel_for("Test", N, functor);
Kokkos::fence();
double time = timer.seconds();
Use with asynchronous deep copy#
Kokkos::deep_copy(exec1, a,b);
Kokkos::deep_copy(exec2, a,b);
// do some stuff which doesn't touch a or b
Kokkos::parallel_for("Test", N, functor);
// wait for all three operations to finish
Kokkos::fence();
// do something with a and b