fence
¶
Header File: <Kokkos_Core.hpp>
Usage:
Kokkos::fence();
Blocks on completion of all outstanding asynchronous Kokkos operations. That includes parallel dispatch (e.g. parallel_for(), parallel_reduce() and parallel_scan()) as well as asynchronous data operations such as three-argument deep_copy.
Note: there is a execution space instance specific fence
too: ExecutionSpaceConcept
Interface¶
void Kokkos::fence();
void Kokkos::fence(const std::string& label);
Parameters¶
label
: A label to identify a specific fence in fence profiling operations.label
does not have to be unique.
Requirements¶
Kokkos::fence()
cannot be called inside an existing parallel region (i.e. inside theoperator()
of a functor or lambda).
Semantics¶
Blocks on completion of all outstanding asynchronous works. Side effects of outstanding work will be observable upon completion of the
fence
call - that meansKokkos::fence()
implies a memory fence.
Examples¶
Timing kernels¶
Kokkos::Timer timer;
// This operation is asynchronous, without a fence
// one would time only the launch overhead
Kokkos::parallel_for("Test", N, functor);
Kokkos::fence();
double time = timer.seconds();
Use with asynchronous deep copy¶
Kokkos::deep_copy(exec1, a,b);
Kokkos::deep_copy(exec2, a,b);
// do some stuff which doesn't touch a or b
Kokkos::parallel_for("Test", N, functor);
// wait for all three operations to finish
Kokkos::fence();
// do something with a and b