KokkosBatched::Pbtrs¶
Defined in header: KokkosBatched_Pbtrs.hpp
template <typename ArgUplo, typename ArgAlgo>
struct SerialPbtrs {
template <typename ABViewType, typename BViewType>
KOKKOS_INLINE_FUNCTION
static int
invoke(const ABViewType &ab,
const BViewType& b);
};
Solves a system of the linear equations \(A \cdot X = B\) with a symmetric (or Hermitian) positive definite band matrix A using the Cholesky factorization computed by Pbtrf
.
1. For a real symmetric positive definite band matrix A, this solves a system of the linear equations \(A \cdot X = B\) for X using the \(A = U^T \cdot U \: \text{(if ArgUplo == Uplo::Upper)}\) or \(A = L \cdot L^T \: \text{(if ArgUplo == Uplo::Lower)}\) factorization
computed by Pbtrf
. Here, U is an upper triangular matrix and L is a lower triangular matrix. This operation is equivalent to the LAPACK routine SPBTRS
or DPBTRS
for single or double precision.
2. For a complex Hermitian positive definite band matrix A, this solves a system of the linear equations \(A \cdot X = B\) for X using the \(A = U^H \cdot U \: \text{(if ArgUplo == Uplo::Upper)}\) or \(A = L \cdot L^H \: \text{(if ArgUplo == Uplo::Lower)}\) factorization
computed by Pbtrs
. Here, U is an upper triangular matrix and L is a lower triangular matrix. This operation is equivalent to the LAPACK routine CPBTRS
or ZPBTRS
for single or double precision.
Parameters¶
- ab:
Input view containing the triangular factor U or L from the Cholesky factorization. See LAPACK reference for the band storage format.
- b:
Input/output view containing the right-hand side on input and the solution on output.
Type Requirements¶
ArgUplo
must be one of the following:KokkosBatched::Uplo::Upper
for upper triangular factorizationKokkosBatched::Uplo::Lower
for lower triangular factorization
ArgAlgo
must beKokkosBatched::Algo::Pbtrs::Unblocked
for the unblocked algorithmABViewType
must be a Kokkos View of rank 2 containing the band matrix ABViewType
must be a Kokkos View of rank 1 containing the right-hand side that satisfies -std::is_same_v<typename BViewType::value_type, typename BViewType::non_const_value_type> == true
Example¶
//@HEADER
// ************************************************************************
//
// Kokkos v. 4.0
// Copyright (2022) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions.
// See https://kokkos.org/LICENSE for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//@HEADER
#include <Kokkos_Core.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosBatched_Pbtrf.hpp>
#include <KokkosBatched_Pbtrs.hpp>
using ExecutionSpace = Kokkos::DefaultExecutionSpace;
/// \brief Example of batched pbtrf/pbtrs
/// Solving A * x = b, where
/// A: [[4, 1, 0],
/// [1, 4, 1],
/// [0, 1, 4]]
/// b: [1, 1, 1]
/// x: [3/14, 1/7, 3/14]
/// A is a real symmetric (or complex Hermitian) positive definite matrix
///
/// In upper band storage,
/// Ab: [[0, 1, 1],
/// [4, 4, 4],]
/// In lower band storage,
/// Ab: [[4, 4, 4],
/// [1, 1, 0],]
///
/// This corresponds to the following system of equations:
/// 4 x0 + x1 = 1
/// x0 + 4 x1 + x2 = 1
/// x1 + 4 x2 = 1
///
int main(int /*argc*/, char** /*argv*/) {
Kokkos::initialize();
{
using View2DType = Kokkos::View<double**, ExecutionSpace>;
using View3DType = Kokkos::View<double***, ExecutionSpace>;
const int Nb = 10, n = 3, k = 1;
// Matrix Ab in band storage
View3DType Ab("Ab", Nb, k + 1, n);
// Solution
View2DType x("x", Nb, n);
// Initialize Ab and x
Kokkos::deep_copy(x, 1.0);
auto h_Ab = Kokkos::create_mirror_view(Ab);
// Upper triangular matrix
for (int ib = 0; ib < Nb; ib++) {
// Fill the diagonal elements
h_Ab(ib, 1, 0) = 4.0;
h_Ab(ib, 1, 1) = 4.0;
h_Ab(ib, 1, 2) = 4.0;
// Fill the super-diagonal elements
h_Ab(ib, 0, 1) = 1.0;
h_Ab(ib, 0, 2) = 1.0;
}
Kokkos::deep_copy(Ab, h_Ab);
// solve A * x = b with pbtrf and pbtrs
ExecutionSpace exec;
using policy_type = Kokkos::RangePolicy<ExecutionSpace, Kokkos::IndexType<int>>;
policy_type policy{exec, 0, Nb};
Kokkos::parallel_for(
"pbtrf-pbtrs", policy, KOKKOS_LAMBDA(int ib) {
auto sub_Ab = Kokkos::subview(Ab, ib, Kokkos::ALL, Kokkos::ALL);
auto sub_x = Kokkos::subview(x, ib, Kokkos::ALL);
// Factorize Ab by pbtrf
KokkosBatched::SerialPbtrf<KokkosBatched::Uplo::Upper, KokkosBatched::Algo::Pbtrf::Unblocked>::invoke(sub_Ab);
// Solve A * x = b with pbtrs
KokkosBatched::SerialPbtrs<KokkosBatched::Uplo::Upper, KokkosBatched::Algo::Pbtrs::Unblocked>::invoke(sub_Ab,
sub_x);
});
// Confirm that the results are correct
auto h_x = Kokkos::create_mirror_view_and_copy(Kokkos::HostSpace{}, x);
bool correct = true;
double eps = 1.0e-12;
for (int ib = 0; ib < Nb; ib++) {
if (Kokkos::abs(h_x(ib, 0) - 3.0 / 14.0) > eps) correct = false;
if (Kokkos::abs(h_x(ib, 1) - 1.0 / 7.0) > eps) correct = false;
if (Kokkos::abs(h_x(ib, 2) - 3.0 / 14.0) > eps) correct = false;
}
if (correct) {
std::cout << "pbtrf/pbtrs works correctly!" << std::endl;
}
}
Kokkos::finalize();
}
output:
pbtrf/pbtrs works correctly!