gko::matrix::Csr#

Compressed sparse row format. Stores each row’s nonzeros as a contiguous slice of value and column-index arrays, with a row-pointer array locating the start of each row. The general-purpose default for sparse matrices in Ginkgo.

template<typename ValueType = default_precision, typename IndexType = int32>
class Csr #

Inherits from

  • public gko::EnableLinOp<Csr<default_precision, int32>>

  • public ConvertibleTo<Csr<next_precision<default_precision>, int32>>

  • public ConvertibleTo<Csr<next_precision<default_precision, 2>, int32>>

  • public ConvertibleTo<Csr<next_precision<default_precision, 3>, int32>>

  • public ConvertibleTo<Dense<default_precision>>

  • public ConvertibleTo<Coo<default_precision, int32>>

  • public ConvertibleTo<Ell<default_precision, int32>>

  • public ConvertibleTo<Fbcsr<default_precision, int32>>

  • public ConvertibleTo<Hybrid<default_precision, int32>>

  • public ConvertibleTo<Sellp<default_precision, int32>>

  • public ConvertibleTo<SparsityCsr<default_precision, int32>>

  • public gko::DiagonalExtractable<default_precision>

  • public gko::ReadableFromMatrixData<default_precision, int32>

  • public gko::WritableToMatrixData<default_precision, int32>

  • public gko::Transposable

  • public gko::Permutable<int32>

  • public gko::EnableAbsoluteComputation<remove_complex<Csr<default_precision, int32>>>

  • public gko::ScaledIdentityAddable

CSR is a matrix format which stores only the nonzero coefficients by compressing each row of the matrix (compressed sparse row format).

The nonzero elements are stored in a 1D array row-wise, and accompanied with a row pointer array which stores the starting index of each row. An additional column index array is used to identify the column of each nonzero element.
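The three arrays described above can be illustrated with a small standalone sketch. It does not use Ginkgo; `SimpleCsr` and `dense_to_csr` are hypothetical names introduced only for this illustration, and the sketch assumes a row-major dense input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the three CSR arrays; not Ginkgo's Csr class.
struct SimpleCsr {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<int> row_ptrs;   // row i occupies [row_ptrs[i], row_ptrs[i+1])
    std::vector<int> col_idxs;   // column of each stored value
    std::vector<double> values;  // nonzeros, stored row by row
};

// Compresses a row-major dense matrix, keeping only the nonzero entries.
SimpleCsr dense_to_csr(const std::vector<double>& dense, std::size_t num_rows,
                       std::size_t num_cols)
{
    assert(dense.size() == num_rows * num_cols);
    SimpleCsr csr{num_rows, num_cols, {0}, {}, {}};
    for (std::size_t row = 0; row < num_rows; ++row) {
        for (std::size_t col = 0; col < num_cols; ++col) {
            const double v = dense[row * num_cols + col];
            if (v != 0.0) {
                csr.col_idxs.push_back(static_cast<int>(col));
                csr.values.push_back(v);
            }
        }
        // The end of this row doubles as the start of the next one.
        csr.row_ptrs.push_back(static_cast<int>(csr.values.size()));
    }
    return csr;
}
```

For the 3x3 matrix with rows (1, 0, 2), (0, 0, 3), (4, 0, 0), this yields four stored values and a row-pointer array with one more entry than there are rows.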

The Csr LinOp supports three families of apply operations, dispatched on the type of the right operand:

  • Against a Dense operand b, apply computes a sparse matrix-vector (or matrix-multivector) product:

    \[ x = A b, \qquad x = \alpha\, A b + \beta\, x. \]

  • Against another Csr operand B, apply computes a sparse-sparse matrix product (SpGEMM):

    \[ C = A B, \qquad C = \alpha\, A B + \beta\, C. \]

  • Against an Identity operand, apply reduces to a sparse-sparse matrix addition (SpGEAM):

    \[ B = \alpha\, A + \beta\, B. \]

In code:

matrix::Csr *A, *B, *C;      // matrices
matrix::Dense *b, *x;        // vectors or tall-and-skinny matrices
matrix::Dense *alpha, *beta; // scalars of dimension 1x1
matrix::Identity *I;         // identity matrix

// Applying to Dense matrices computes an SpMV/SpMM product
A->apply(b, x);              // x = A*b
A->apply(alpha, b, beta, x); // x = alpha*A*b + beta*x

// Applying to Csr matrices computes a SpGEMM product of two sparse matrices
A->apply(B, C);              // C = A*B
A->apply(alpha, B, beta, C); // C = alpha*A*B + beta*C

// Applying to an Identity matrix computes a SpGEAM sparse matrix addition
A->apply(alpha, I, beta, B); // B = alpha*A + beta*B

Both the SpGEMM and SpGEAM operations require the input matrices to be sorted by column index; otherwise the algorithms will produce incorrect results.
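To make the Dense case concrete, the following is a minimal sketch of the scaled product x = alpha*A*b + beta*x on plain arrays. It is not how Ginkgo implements its kernels; `csr_advanced_spmv` is a hypothetical helper, and the scalars are passed as plain `double` values rather than 1x1 Dense matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes x = alpha * A * b + beta * x, with A given by its CSR arrays.
void csr_advanced_spmv(double alpha, const std::vector<int>& row_ptrs,
                       const std::vector<int>& col_idxs,
                       const std::vector<double>& values,
                       const std::vector<double>& b, double beta,
                       std::vector<double>& x)
{
    assert(row_ptrs.size() == x.size() + 1);
    for (std::size_t row = 0; row < x.size(); ++row) {
        double sum = 0.0;
        // Each row is a contiguous slice of col_idxs and values.
        for (int nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            sum += values[nz] * b[col_idxs[nz]];
        }
        x[row] = alpha * sum + beta * x[row];
    }
}
```

Setting alpha to 1 and beta to 0 recovers the simple product x = A*b.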

Template Parameters:
  • ValueType – precision of matrix elements

  • IndexType – precision of matrix indexes

Public Functions

virtual std::unique_ptr<LinOp> transpose() const override#

Returns a LinOp representing the transpose of the Transposable object.

Returns:

a pointer to the new transposed object

virtual std::unique_ptr<LinOp> conj_transpose() const override#

Returns a LinOp representing the conjugate transpose of the Transposable object.

Returns:

a pointer to the new conjugate transposed object

std::unique_ptr<Csr> multiply(ptr_param<const Csr> other) const#

Computes the sparse matrix product this * other on the executor of this matrix.

Parameters:

other – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.

Returns:

the product of the two matrices, stored on the same executor as this matrix.
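As an illustration of what a sparse matrix product computes, here is a row-by-row sketch that accumulates each output row in an ordered map. This is an assumption-laden teaching example, not Ginkgo's SpGEMM kernel: `SimpleCsr` and `spgemm_sketch` are hypothetical names, and the map-based accumulator happens to tolerate unsorted input, which the library routines on some executors do not.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical plain-array CSR container used only for this sketch.
struct SimpleCsr {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> values;
};

// Computes C = A * B one output row at a time.
SimpleCsr spgemm_sketch(const SimpleCsr& a, const SimpleCsr& b)
{
    assert(a.num_cols == b.num_rows);
    SimpleCsr c{a.num_rows, b.num_cols, {0}, {}, {}};
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        // column -> accumulated value; std::map keeps columns sorted.
        std::map<int, double> acc;
        for (int a_nz = a.row_ptrs[row]; a_nz < a.row_ptrs[row + 1]; ++a_nz) {
            const int k = a.col_idxs[a_nz];
            // Row `row` of C gains A(row, k) times row k of B.
            for (int b_nz = b.row_ptrs[k]; b_nz < b.row_ptrs[k + 1]; ++b_nz) {
                acc[b.col_idxs[b_nz]] += a.values[a_nz] * b.values[b_nz];
            }
        }
        for (std::map<int, double>::const_iterator it = acc.begin();
             it != acc.end(); ++it) {
            c.col_idxs.push_back(it->first);
            c.values.push_back(it->second);
        }
        c.row_ptrs.push_back(static_cast<int>(c.values.size()));
    }
    return c;
}
```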

std::pair<std::unique_ptr<Csr>, multiply_reuse_info> multiply_reuse(
ptr_param<const Csr> other,
) const#

Computes the sparse matrix product this * other on the executor of this matrix, and necessary data for value updates:

auto [C, reuse] = A->multiply_reuse(B);
change_values(A, B);
reuse.update_values(A, B, C);

Parameters:

other – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.

Returns:

std::pair containing the product of the two matrices, stored on the same executor as this matrix, and a multiply_reuse_info object allowing value updates to the output matrix.

std::unique_ptr<Csr> multiply_add(
ptr_param<const Dense<value_type>> scale_mult,
ptr_param<const Csr> mtx_mult,
ptr_param<const Dense<value_type>> scale_add,
ptr_param<const Csr> mtx_add,
) const#

Computes the sparse matrix product scale_mult * this * mtx_mult + scale_add * mtx_add on the executor of this matrix.

Parameters:
  • scale_mult – the scalar by which the matrix product will be scaled.

  • mtx_mult – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.

  • scale_add – the scalar by which the matrix mtx_add will be scaled.

  • mtx_add – the matrix which will be added to the product, scaled by scale_add.

Returns:

the result of the computation, stored on the same executor as this matrix.

std::pair<std::unique_ptr<Csr>, multiply_add_reuse_info> multiply_add_reuse(
ptr_param<const Dense<value_type>> scale_mult,
ptr_param<const Csr> mtx_mult,
ptr_param<const Dense<value_type>> scale_add,
ptr_param<const Csr> mtx_add,
) const#

Computes the sparse matrix product scale_mult * this * mtx_mult + scale_add * mtx_add on the executor of this matrix, and necessary data for value updates:

auto [result, reuse] = mtx->multiply_add_reuse(sm, mm, sa, ma);
change_values(mtx, sm, mm, sa, ma);
reuse.update_values(mtx, sm, mm, sa, ma, result);

Parameters:
  • scale_mult – the scalar by which the matrix product will be scaled.

  • mtx_mult – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.

  • scale_add – the scalar by which the matrix mtx_add will be scaled.

  • mtx_add – the matrix which will be added to the product, scaled by scale_add.

Returns:

std::pair containing the result of the computation, stored on the same executor as this matrix, and a multiply_add_reuse_info object allowing value updates to the output matrix.

std::unique_ptr<Csr> scale_add(
ptr_param<const Dense<value_type>> scale_this,
ptr_param<const Dense<value_type>> scale_other,
ptr_param<const Csr> mtx_other,
) const#

Computes the sparse matrix sum scale_this * this + scale_other * mtx_other on the executor of this matrix. This matrix needs to be sorted by column index, otherwise the result will be incorrect.

Parameters:
  • scale_this – the scalar by which this matrix will be scaled.

  • scale_other – the scalar by which mtx_other will be scaled.

  • mtx_other – the matrix which will be added to this, scaled by scale_other. It needs to be sorted by column index, otherwise the result will be incorrect.

Returns:

the result of the computation, stored on the same executor as this matrix.
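The sorting requirement becomes clear when the sum is written as a two-pointer merge over each pair of rows. The sketch below is one plausible way to compute such a sum and is not taken from Ginkgo; `SimpleCsr` and `csr_scale_add_sketch` are hypothetical names, and the scalars are plain `double` values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical plain-array CSR container used only for this sketch.
struct SimpleCsr {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> values;
};

// Computes C = alpha * A + beta * B by merging the rows of A and B.
SimpleCsr csr_scale_add_sketch(double alpha, const SimpleCsr& a, double beta,
                               const SimpleCsr& b)
{
    assert(a.num_rows == b.num_rows && a.num_cols == b.num_cols);
    const int sentinel = std::numeric_limits<int>::max();
    SimpleCsr c{a.num_rows, a.num_cols, {0}, {}, {}};
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        int ia = a.row_ptrs[row];
        int ib = b.row_ptrs[row];
        const int a_end = a.row_ptrs[row + 1];
        const int b_end = b.row_ptrs[row + 1];
        // The merge is only correct if both rows are sorted by column index.
        while (ia < a_end || ib < b_end) {
            const int col_a = ia < a_end ? a.col_idxs[ia] : sentinel;
            const int col_b = ib < b_end ? b.col_idxs[ib] : sentinel;
            const int col = std::min(col_a, col_b);
            double val = 0.0;
            if (col_a == col) {
                val += alpha * a.values[ia++];
            }
            if (col_b == col) {
                val += beta * b.values[ib++];
            }
            c.col_idxs.push_back(col);
            c.values.push_back(val);
        }
        c.row_ptrs.push_back(static_cast<int>(c.values.size()));
    }
    return c;
}
```

With unsorted rows, the merge would emit the same column more than once or in the wrong order, which matches the warning that unsorted input gives incorrect results.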

std::pair<std::unique_ptr<Csr>, scale_add_reuse_info> add_scale_reuse(
ptr_param<const Dense<value_type>> scale_this,
ptr_param<const Dense<value_type>> scale_other,
ptr_param<const Csr> mtx_other,
) const#

Computes the sparse matrix sum scale_this * this + scale_other * mtx_other on the executor of this matrix, and necessary data for value updates:

auto [result, reuse] = mtx->add_scale_reuse(alpha, beta, mtx2);
change_values(alpha, mtx, beta, mtx2);
reuse.update_values(alpha, mtx, beta, mtx2, result);

This matrix needs to be sorted by column index, otherwise the result will be incorrect.

Parameters:
  • scale_this – the scalar by which this matrix will be scaled.

  • scale_other – the scalar by which mtx_other will be scaled.

  • mtx_other – the matrix which will be added to this, scaled by scale_other. It needs to be sorted by column index, otherwise the result will be incorrect.

Returns:

std::pair containing the result of the computation, stored on the same executor as this matrix, and a scale_add_reuse_info object allowing value updates to the output matrix.

std::pair<std::unique_ptr<Csr>, permuting_reuse_info> transpose_reuse(
) const#

Computes the necessary data to update a transposed matrix from its original matrix.

auto [transposed, reuse] = matrix->transpose_reuse();
change_values(matrix);
reuse.update_values(matrix, transposed);

Returns:

an std::pair consisting of the transposed matrix and a reuse info struct that can be used to update values in the transposed matrix.
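The idea behind reuse data for a transpose can be sketched as a permutation of the value array: once the position of every entry in the transposed matrix is known, later value changes only need a gather. The code below is an illustrative sketch under that assumption; `SimpleCsr`, `TransposeReuse`, `transpose_reuse_sketch` and `update_values_sketch` are hypothetical names, not Ginkgo types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-array CSR container used only for this sketch.
struct SimpleCsr {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> values;
};

// Transposed matrix plus the permutation mapping input values onto it:
// transposed.values[i] == input.values[value_permutation[i]].
struct TransposeReuse {
    SimpleCsr transposed;
    std::vector<int> value_permutation;
};

TransposeReuse transpose_reuse_sketch(const SimpleCsr& a)
{
    const std::size_t nnz = a.values.size();
    TransposeReuse result;
    SimpleCsr& t = result.transposed;
    t.num_rows = a.num_cols;
    t.num_cols = a.num_rows;
    t.row_ptrs.assign(a.num_cols + 1, 0);
    t.col_idxs.resize(nnz);
    t.values.resize(nnz);
    result.value_permutation.resize(nnz);
    // Count the entries of each column, then prefix-sum into row pointers.
    for (std::size_t nz = 0; nz < nnz; ++nz) {
        ++t.row_ptrs[a.col_idxs[nz] + 1];
    }
    for (std::size_t col = 0; col < a.num_cols; ++col) {
        t.row_ptrs[col + 1] += t.row_ptrs[col];
    }
    // Scatter every entry to its slot and remember where it came from.
    std::vector<int> next(t.row_ptrs.begin(), t.row_ptrs.end() - 1);
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        for (int nz = a.row_ptrs[row]; nz < a.row_ptrs[row + 1]; ++nz) {
            const int dst = next[a.col_idxs[nz]]++;
            t.col_idxs[dst] = static_cast<int>(row);
            t.values[dst] = a.values[nz];
            result.value_permutation[dst] = nz;
        }
    }
    return result;
}

// Value-only update: the sparsity pattern of `out` is left untouched.
void update_values_sketch(const std::vector<int>& value_permutation,
                          const SimpleCsr& in, SimpleCsr& out)
{
    assert(in.values.size() == value_permutation.size());
    assert(out.values.size() == value_permutation.size());
    for (std::size_t i = 0; i < out.values.size(); ++i) {
        out.values[i] = in.values[value_permutation[i]];
    }
}
```

The update step touches only the value array, so it is valid only while the sparsity pattern of the input matrix stays the same.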

std::unique_ptr<Csr> permute(
ptr_param<const Permutation<index_type>> permutation,
permute_mode mode = permute_mode::symmetric,
) const#

Creates a permuted copy \(A'\) of this matrix \(A\) with the given permutation \(P\). By default, this computes a symmetric permutation (permute_mode::symmetric). For the effect of the different permutation modes, see permute_mode.

Parameters:
  • permutation – The input permutation.

  • mode – The permutation mode. If permute_mode::inverse is set, we use the inverse permutation \(P^{-1}\) instead of \(P\). If permute_mode::rows is set, the rows will be permuted. If permute_mode::columns is set, the columns will be permuted.

Returns:

The permuted matrix.

std::unique_ptr<Csr> permute(
ptr_param<const Permutation<index_type>> row_permutation,
ptr_param<const Permutation<index_type>> column_permutation,
bool invert = false,
) const#

Creates a non-symmetrically permuted copy \(A'\) of this matrix \(A\) with the given row and column permutations \(P\) and \(Q\). The operation will compute \(A'(i, j) = A(p[i], q[j])\), or \(A' = P A Q^T\) if invert is false, and \(A'(p[i], q[j]) = A(i,j)\), or \(A' = P^{-1} A Q^{-T}\) if invert is true.

Parameters:
  • row_permutation – The permutation \(P\) to apply to the rows

  • column_permutation – The permutation \(Q\) to apply to the columns

  • invert – If set to false, uses the input permutations, otherwise uses their inverses \(P^{-1}, Q^{-1}\)

Returns:

The permuted matrix.
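The index formulas for the non-symmetric permutation can be checked on a small dense array. The sketch below applies them literally to a row-major matrix; `permute_dense_sketch` is a hypothetical helper that operates on dense storage for readability, whereas the actual function works on CSR data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// invert == false: A'(i, j) = A(p[i], q[j])
// invert == true:  A'(p[i], q[j]) = A(i, j)
std::vector<double> permute_dense_sketch(const std::vector<double>& a,
                                         std::size_t num_rows,
                                         std::size_t num_cols,
                                         const std::vector<int>& p,
                                         const std::vector<int>& q,
                                         bool invert = false)
{
    assert(a.size() == num_rows * num_cols);
    assert(p.size() == num_rows && q.size() == num_cols);
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < num_rows; ++i) {
        for (std::size_t j = 0; j < num_cols; ++j) {
            const std::size_t plain = i * num_cols + j;
            const std::size_t permuted = p[i] * num_cols + q[j];
            if (invert) {
                out[permuted] = a[plain];
            } else {
                out[plain] = a[permuted];
            }
        }
    }
    return out;
}
```

Applying the function with invert set to true to the result of a call with invert set to false restores the original matrix, which is a quick way to check a pair of permutations.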

std::pair<std::unique_ptr<Csr>, permuting_reuse_info> permute_reuse(
ptr_param<const Permutation<index_type>> permutation,
permute_mode mode = permute_mode::symmetric,
) const#

Computes the operations necessary to propagate changed values from a matrix A to a permuted matrix. The semantics of this function match those of permute(ptr_param<const Permutation<index_type>>, permute_mode). Updating values works as follows:

auto [permuted, reuse] = matrix->permute_reuse(permutation, mode);
change_values(matrix);
reuse.update_values(matrix, permuted);

Parameters:
  • permutation – The input permutation.

  • mode – The permutation mode. If permute_mode::inverse is set, we use the inverse permutation \(P^{-1}\) instead of \(P\). If permute_mode::rows is set, the rows will be permuted. If permute_mode::columns is set, the columns will be permuted.

Returns:

an std::pair consisting of the permuted matrix and the reuse info that can be used to update values in the permuted matrix.

std::pair<std::unique_ptr<Csr>, permuting_reuse_info> permute_reuse(
ptr_param<const Permutation<index_type>> row_permutation,
ptr_param<const Permutation<index_type>> column_permutation,
bool invert = false,
) const#

Computes the operations necessary to propagate changed values from a matrix A to a permuted matrix. The semantics of this function match those of permute(ptr_param<const Permutation<index_type>>, ptr_param<const Permutation<index_type>>, bool). Updating values works as follows:

auto [permuted, reuse] = matrix->permute_reuse(row_perm, col_perm, inv);
change_values(matrix);
reuse.update_values(matrix, permuted);

Parameters:
  • row_permutation – The permutation \(P\) to apply to the rows

  • column_permutation – The permutation \(Q\) to apply to the columns

  • invert – If set to false, uses the input permutations, otherwise uses their inverses \(P^{-1}, Q^{-1}\)

Returns:

an std::pair consisting of the permuted matrix and the reuse info that can be used to update values in the permuted matrix.

std::unique_ptr<Csr> scale_permute(
ptr_param<const ScaledPermutation<value_type, index_type>> permutation,
permute_mode = permute_mode::symmetric,
) const#

Creates a scaled and permuted copy of this matrix. For an explanation of the permutation modes, see permute(ptr_param<const Permutation<index_type>>, permute_mode).

Parameters:
  • permutation – The scaled permutation.

  • mode – The permutation mode.

Returns:

The permuted matrix.

std::unique_ptr<Csr> scale_permute(
ptr_param<const ScaledPermutation<value_type, index_type>> row_permutation,
ptr_param<const ScaledPermutation<value_type, index_type>> column_permutation,
bool invert = false,
) const#

Creates a scaled and permuted copy of this matrix. For an explanation of the parameters, see permute(ptr_param<const Permutation<index_type>>, ptr_param<const Permutation<index_type>>, bool).

Parameters:
  • row_permutation – The scaled row permutation.

  • column_permutation – The scaled column permutation.

  • invert – If set to false, uses the input permutations, otherwise uses their inverses \(P^{-1}, Q^{-1}\)

Returns:

The permuted matrix.

virtual std::unique_ptr<Diagonal<ValueType>> extract_diagonal(
) const override#

Extracts the diagonal entries of the matrix into a vector.

Returns:

the diagonal of the matrix, as a Diagonal object
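For intuition, diagonal extraction from CSR arrays can be sketched as a scan of each row for the entry whose column equals the row index. `extract_diagonal_sketch` is a hypothetical helper, and the sketch assumes that a diagonal position without a stored entry counts as zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the main diagonal of a CSR matrix as a plain vector.
std::vector<double> extract_diagonal_sketch(
    std::size_t num_rows, std::size_t num_cols,
    const std::vector<int>& row_ptrs, const std::vector<int>& col_idxs,
    const std::vector<double>& values)
{
    assert(row_ptrs.size() == num_rows + 1);
    const std::size_t diag_size = std::min(num_rows, num_cols);
    std::vector<double> diag(diag_size, 0.0);  // absent entries stay zero
    for (std::size_t row = 0; row < diag_size; ++row) {
        for (int nz = row_ptrs[row]; nz < row_ptrs[row + 1]; ++nz) {
            if (col_idxs[nz] == static_cast<int>(row)) {
                diag[row] = values[nz];
                break;
            }
        }
    }
    return diag;
}
```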

virtual std::unique_ptr<absolute_type> compute_absolute(
) const override#

Computes a new LinOp holding the absolute values of this matrix's entries (the AbsoluteLinOp).

Returns:

a pointer to the new absolute object

virtual void compute_absolute_inplace() override#

Computes the absolute value of each element in place.

void sort_by_column_index()#

Sorts all (value, col_idx) pairs in each row by column index.
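What sorting by column index means for the stored arrays can be shown on plain vectors. This is a simple host-side sketch, not the library's implementation; `sort_by_column_index_sketch` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, double> Entry;  // (column index, value)

static bool column_less(const Entry& lhs, const Entry& rhs)
{
    return lhs.first < rhs.first;
}

// Reorders the (col_idx, value) pairs of every row by ascending column.
void sort_by_column_index_sketch(const std::vector<int>& row_ptrs,
                                 std::vector<int>& col_idxs,
                                 std::vector<double>& values)
{
    assert(col_idxs.size() == values.size());
    std::vector<Entry> entries;
    for (std::size_t row = 0; row + 1 < row_ptrs.size(); ++row) {
        const int begin = row_ptrs[row];
        const int end = row_ptrs[row + 1];
        entries.clear();
        for (int nz = begin; nz < end; ++nz) {
            entries.push_back(Entry(col_idxs[nz], values[nz]));
        }
        // Values travel with their column index, so pairs stay intact.
        std::stable_sort(entries.begin(), entries.end(), column_less);
        for (int nz = begin; nz < end; ++nz) {
            col_idxs[nz] = entries[nz - begin].first;
            values[nz] = entries[nz - begin].second;
        }
    }
}
```

The row pointers are unchanged, because sorting only rearranges entries within each row.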

inline value_type *get_values() noexcept#

Returns the values of the matrix.

Returns:

the values of the matrix.

inline const value_type *get_const_values() const noexcept#

Returns the values of the matrix.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the values of the matrix.

std::unique_ptr<Dense<ValueType>> create_value_view()#

Creates a Dense view of the value array of this matrix as a column vector of dimensions nnz x 1.

std::unique_ptr<const Dense<ValueType>> create_const_value_view(
) const#

Creates a const Dense view of the value array of this matrix as a column vector of dimensions nnz x 1.

inline index_type *get_col_idxs() noexcept#

Returns the column indexes of the matrix.

Returns:

the column indexes of the matrix.

inline const index_type *get_const_col_idxs() const noexcept#

Returns the column indexes of the matrix.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the column indexes of the matrix.

inline index_type *get_row_ptrs() noexcept#

Returns the row pointers of the matrix.

Returns:

the row pointers of the matrix.

inline const index_type *get_const_row_ptrs() const noexcept#

Returns the row pointers of the matrix.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the row pointers of the matrix.

inline index_type *get_srow() noexcept#

Returns the starting rows.

Returns:

the starting rows.

inline const index_type *get_const_srow() const noexcept#

Returns the starting rows.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the starting rows.

inline size_type get_num_srow_elements() const noexcept#

Returns the number of elements stored in srow (the number of involved warps).

Returns:

the number of elements stored in srow (the number of involved warps)

inline size_type get_num_stored_elements() const noexcept#

Returns the number of elements explicitly stored in the matrix.

Returns:

the number of elements explicitly stored in the matrix

inline std::shared_ptr<strategy_type> get_strategy() const noexcept#

Returns the strategy

Returns:

the strategy

inline void set_strategy(std::shared_ptr<strategy_type> strategy)#

Set the strategy

Parameters:

strategy – the csr strategy

inline void scale(ptr_param<const LinOp> alpha)#

Scales the matrix with a scalar.

Parameters:

alpha – The entire matrix is scaled by alpha. alpha has to be a 1x1 Dense matrix.

inline void inv_scale(ptr_param<const LinOp> alpha)#

Scales the matrix with the inverse of a scalar.

Parameters:

alpha – The entire matrix is scaled by 1 / alpha. alpha has to be a 1x1 Dense matrix.

std::unique_ptr<Csr<ValueType, IndexType>> create_submatrix(
const index_set<IndexType> &row_index_set,
const index_set<IndexType> &column_index_set,
) const#

Creates a submatrix from this Csr matrix given row and column index_set objects.

Note

This is not a view but creates a new, separate CSR matrix.

Parameters:
  • row_index_set – the row index set containing the set of rows to be in the submatrix.

  • column_index_set – the col index set containing the set of columns to be in the submatrix.

Returns:

A new CSR matrix with the elements that belong to the row and columns of this matrix as specified by the index sets.

std::unique_ptr<Csr<ValueType, IndexType>> create_submatrix(
const span &row_span,
const span &column_span,
) const#

Creates a submatrix from this Csr matrix given row and column spans.

Note

This is not a view but creates a new, separate CSR matrix.

Parameters:
  • row_span – the row span containing the contiguous set of rows to be in the submatrix.

  • column_span – the column span containing the contiguous set of columns to be in the submatrix.

Returns:

A new CSR matrix with the elements that belong to the rows and columns of this matrix as specified by the spans.
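Extracting a contiguous block can be sketched on plain arrays as follows. The sketch assumes half-open ranges [begin, end) for rows and columns and shifts column indices so that the block starts at column zero; `SimpleCsr` and `create_submatrix_sketch` are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-array CSR container used only for this sketch.
struct SimpleCsr {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> values;
};

// Copies rows [row_begin, row_end) and columns [col_begin, col_end).
SimpleCsr create_submatrix_sketch(const SimpleCsr& a, std::size_t row_begin,
                                  std::size_t row_end, std::size_t col_begin,
                                  std::size_t col_end)
{
    assert(row_begin <= row_end && row_end <= a.num_rows);
    assert(col_begin <= col_end && col_end <= a.num_cols);
    SimpleCsr sub{row_end - row_begin, col_end - col_begin, {0}, {}, {}};
    for (std::size_t row = row_begin; row < row_end; ++row) {
        for (int nz = a.row_ptrs[row]; nz < a.row_ptrs[row + 1]; ++nz) {
            const std::size_t col = static_cast<std::size_t>(a.col_idxs[nz]);
            if (col >= col_begin && col < col_end) {
                // Re-base the column index relative to the block.
                sub.col_idxs.push_back(static_cast<int>(col - col_begin));
                sub.values.push_back(a.values[nz]);
            }
        }
        sub.row_ptrs.push_back(static_cast<int>(sub.values.size()));
    }
    return sub;
}
```

Like the documented function, the sketch returns an independent copy, so later changes to the source matrix do not affect the block.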

Csr &operator=(const Csr&)#

Copy-assigns a Csr matrix. Preserves executor, copies everything else.

Csr &operator=(Csr&&)#

Move-assigns a Csr matrix. Preserves executor, moves the data and leaves the moved-from object in an empty state (0x0 LinOp with unchanged executor and strategy, no nonzeros and valid row pointers).

Csr(const Csr&)#

Copy-constructs a Csr matrix. Inherits executor, strategy and data.

Csr(Csr&&)#

Move-constructs a Csr matrix. Inherits executor and strategy, moves the data and leaves the moved-from object in an empty state (0x0 LinOp with unchanged executor and strategy, no nonzeros and valid row pointers).

Public Static Functions

static std::unique_ptr<Csr> create(
std::shared_ptr<const Executor> exec,
std::shared_ptr<strategy_type> strategy,
)#

Creates an uninitialized CSR matrix of the specified size.

Parameters:
  • exec – Executor associated to the matrix

  • strategy – the strategy of CSR

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Csr> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size = {},
size_type num_nonzeros = {},
std::shared_ptr<strategy_type> strategy = nullptr,
)#

Creates an uninitialized CSR matrix of the specified size.

Parameters:
  • exec – Executor associated to the matrix

  • size – size of the matrix

  • num_nonzeros – number of nonzeros

  • strategy – the strategy of CSR, or the default strategy if set to nullptr

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Csr> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
array<value_type> values,
array<index_type> col_idxs,
array<index_type> row_ptrs,
std::shared_ptr<strategy_type> strategy = nullptr,
)#

Creates a CSR matrix from already allocated (and initialized) row pointer, column index and value arrays.

Note

If one of row_ptrs, col_idxs or values is not an rvalue, not an array of IndexType, IndexType and ValueType, respectively, or is on the wrong executor, an internal copy of that array will be created, and the original array data will not be used in the matrix.

Parameters:
  • exec – Executor associated to the matrix

  • size – size of the matrix

  • values – array of matrix values

  • col_idxs – array of column indexes

  • row_ptrs – array of row pointers

  • strategy – the strategy the matrix uses for SpMV operations

Returns:

A smart pointer to the newly created matrix.

template<typename InputValueType, typename InputColumnIndexType, typename InputRowPtrType>
static inline std::unique_ptr<Csr> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
std::initializer_list<InputValueType> values,
std::initializer_list<InputColumnIndexType> col_idxs,
std::initializer_list<InputRowPtrType> row_ptrs,
)#

Creates a CSR matrix from initializer lists. See create(std::shared_ptr<const Executor>, const dim<2>&, array<value_type>, array<index_type>, array<index_type>).

static std::unique_ptr<const Csr> create_const(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
gko::detail::const_array_view<ValueType> &&values,
gko::detail::const_array_view<IndexType> &&col_idxs,
gko::detail::const_array_view<IndexType> &&row_ptrs,
std::shared_ptr<strategy_type> strategy = nullptr,
)#

Creates a constant (immutable) Csr matrix from a set of constant arrays.

Parameters:
  • exec – the executor to create the matrix on

  • size – the dimensions of the matrix

  • values – the value array of the matrix

  • col_idxs – the column index array of the matrix

  • row_ptrs – the row pointer array of the matrix

  • strategy – the strategy the matrix uses for SpMV operations

Returns:

A smart pointer to the constant matrix wrapping the input arrays (if they reside on the same executor as the matrix) or a copy of these arrays on the correct executor.

class strategy_type#

strategy_type decides which algorithm the Csr matrix uses.

A concrete strategy should inherit from strategy_type and implement its process and clac_size functions, along with the corresponding device kernel.

Subclassed by

  • classical

  • merge_path

  • cusparse

  • sparselib

  • load_balance

  • automatical

Public Functions

inline strategy_type(std::string name)#

Creates a strategy_type.

Parameters:

name – the name of strategy

inline std::string get_name()#

Returns the name of strategy

Returns:

the name of strategy

virtual void process(
const array<index_type> &mtx_row_ptrs,
array<index_type> *mtx_srow,
) = 0#

Computes srow according to row pointers.

Parameters:
  • mtx_row_ptrs – the row pointers of the matrix

  • mtx_srow – the srow of the matrix

virtual int64_t clac_size(const int64_t nnz) = 0#

Computes the srow size according to the number of nonzeros.

Parameters:

nnz – the number of nonzeros

Returns:

the size of srow

virtual std::shared_ptr<strategy_type> copy() = 0#

Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.

class classical #

Inherits from

  • public strategy_type

classical is a strategy_type which uses the same number of threads on each row. It assigns several threads to each row, lets each thread compute a partial result over part of the row, and then reduces these partial results. The number of threads per row depends on the maximum number of stored elements per row.

Public Functions

inline classical()#

Creates a classical strategy.

inline virtual void process(
const array<index_type> &mtx_row_ptrs,
array<index_type> *mtx_srow,
) override#

Computes srow according to row pointers.

Parameters:
  • mtx_row_ptrs – the row pointers of the matrix

  • mtx_srow – the srow of the matrix

inline virtual int64_t clac_size(const int64_t nnz) override#

Computes the srow size according to the number of nonzeros.

Parameters:

nnz – the number of nonzeros

Returns:

the size of srow

inline virtual std::shared_ptr<strategy_type> copy() override#

Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.

class merge_path #

Inherits from

  • public strategy_type

merge_path is a strategy_type which uses the merge_path algorithm. The algorithm follows Merrill and Garland: Merge-Based Parallel Sparse Matrix-Vector Multiplication.

Public Functions

inline merge_path()#

Creates a merge_path strategy.

inline virtual void process(
const array<index_type> &mtx_row_ptrs,
array<index_type> *mtx_srow,
) override#

Computes srow according to row pointers.

Parameters:
  • mtx_row_ptrs – the row pointers of the matrix

  • mtx_srow – the srow of the matrix

inline virtual int64_t clac_size(const int64_t nnz) override#

Computes the srow size according to the number of nonzeros.

Parameters:

nnz – the number of nonzeros

Returns:

the size of srow

inline virtual std::shared_ptr<strategy_type> copy() override#

Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.

class cusparse #

Inherits from

  • public strategy_type

cusparse is a strategy_type which uses the sparselib csr.

Note

cusparse is also known to the hip executor which converts between cuda and hip.

Public Functions

inline cusparse()#

Creates a cusparse strategy.

inline virtual void process(
const array<index_type> &mtx_row_ptrs,
array<index_type> *mtx_srow,
) override#

Computes srow according to row pointers.

Parameters:
  • mtx_row_ptrs – the row pointers of the matrix

  • mtx_srow – the srow of the matrix

inline virtual int64_t clac_size(const int64_t nnz) override#

Computes the srow size according to the number of nonzeros.

Parameters:

nnz – the number of nonzeros

Returns:

the size of srow

inline virtual std::shared_ptr<strategy_type> copy() override#

Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.

class sparselib #

Inherits from

  • public strategy_type

sparselib is a strategy_type which uses the sparselib csr.

Note

Uses cusparse in cuda and hipsparse in hip.

Public Functions

inline sparselib()#

Creates a sparselib strategy.

inline virtual void process(
const array<index_type> &mtx_row_ptrs,
array<index_type> *mtx_srow,
) override#

Computes srow according to row pointers.

Parameters:
  • mtx_row_ptrs – the row pointers of the matrix

  • mtx_srow – the srow of the matrix

inline virtual int64_t clac_size(const int64_t nnz) override#

Computes the srow size according to the number of nonzeros.

Parameters:

nnz – the number of nonzeros

Returns:

the size of srow

inline virtual std::shared_ptr<strategy_type> copy() override#

Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.

class load_balance #

Inherits from

  • public strategy_type

load_balance is a strategy_type which uses the load balance algorithm.

Public Functions

inline load_balance()#

Creates a load_balance strategy.

Warning

This constructor is deprecated. Rely on the new automatic strategy instantiation or use one of the other constructors instead.

inline load_balance(std::shared_ptr<const CudaExecutor> exec)#

Creates a load_balance strategy with CUDA executor.

Parameters:

exec – the CUDA executor

inline load_balance(std::shared_ptr<const HipExecutor> exec)#

Creates a load_balance strategy with HIP executor.

Parameters:

exec – the HIP executor

inline load_balance(std::shared_ptr<const DpcppExecutor> exec)#

Creates a load_balance strategy with DPCPP executor.

Note

TODO (porting): the subgroup size is currently hardcoded to 32.

Parameters:

exec – the DPCPP executor

inline load_balance(
int64_t nwarps,
int warp_size = 32,
bool cuda_strategy = true,
std::string strategy_name = "none",
)#

Creates a load_balance strategy with specified parameters

Note

The warp_size must be the size of a full warp. When using this constructor, set_strategy needs to be called with the correct parameters, which are replaced during the conversion.

Parameters:
  • nwarps – the number of warps in the executor

  • warp_size – the warp size of the executor

  • cuda_strategy – whether the cuda_strategy needs to be used.

inline virtual void process(
const array<index_type> &mtx_row_ptrs,
array<index_type> *mtx_srow,
) override#

Computes srow according to row pointers.

Parameters:
  • mtx_row_ptrs – the row pointers of the matrix

  • mtx_srow – the srow of the matrix

inline virtual int64_t clac_size(const int64_t nnz) override#

Computes the srow size according to the number of nonzeros.

Parameters:

nnz – the number of nonzeros

Returns:

the size of srow

inline virtual std::shared_ptr<strategy_type> copy() override#

Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.

class automatical #

Inherits from

  • public strategy_type

Public Functions

inline automatical()#

Creates an automatical strategy.

Warning

This constructor is deprecated. Rely on the new automatic strategy instantiation or use one of the other constructors instead.

inline automatical(std::shared_ptr<const CudaExecutor> exec)#

Creates an automatical strategy with CUDA executor.

Parameters:

exec – the CUDA executor

inline automatical(std::shared_ptr<const HipExecutor> exec)#

Creates an automatical strategy with HIP executor.

Parameters:

exec – the HIP executor

inline automatical(std::shared_ptr<const DpcppExecutor> exec)#

Creates an automatical strategy with Dpcpp executor.

Note

TODO (porting): the subgroup size is currently hardcoded to 32.

Parameters:

exec – the Dpcpp executor

inline automatical(
int64_t nwarps,
int warp_size = 32,
bool cuda_strategy = true,
std::string strategy_name = "none",
)#

Creates an automatical strategy with specified parameters

Note

The warp_size must be the size of a full warp. When using this constructor, set_strategy needs to be called with the correct parameters, which are replaced during the conversion.

Parameters:
  • nwarps – the number of warps in the executor

  • warp_size – the warp size of the executor

  • cuda_strategy – whether the cuda_strategy needs to be used.

class multiply_reuse_info#

Class describing the internal lookup structures created by multiply_reuse(const Csr*) to recompute a sparse matrix-matrix product with updated values.

Public Functions

void update_values(
ptr_param<const Csr> mtx1,
ptr_param<const Csr> mtx2,
ptr_param<Csr> out,
) const#

Recomputes the sparse matrix-matrix product out = mtx1 * mtx2 when only the values of mtx1 and mtx2 changed, but the sparsity patterns of mtx1, mtx2 and out are unchanged.

class multiply_add_reuse_info#

Class describing the internal lookup structures created by multiply_add_reuse to recompute a sparse matrix-matrix product with updated values.

Public Functions

void update_values(
ptr_param<const Csr> mtx,
ptr_param<const Dense<value_type>> scale_mult,
ptr_param<const Csr> mtx_mult,
ptr_param<const Dense<value_type>> scale_add,
ptr_param<const Csr> mtx_add,
ptr_param<Csr> out,
) const#

Recomputes the sparse matrix-matrix product out = scale_mult * mtx * mtx_mult + scale_add * mtx_add when only the values of mtx, scale_mult, mtx_mult, scale_add, mtx_add changed, but the sparsity patterns of mtx, mtx_mult, mtx_add and out are unchanged.

class scale_add_reuse_info#

Class describing the internal lookup structures created by add_scale_reuse to recompute a sparse matrix-matrix sum with updated values.

Public Functions

void update_values(
ptr_param<const Dense<value_type>> scale1,
ptr_param<const Csr> mtx1,
ptr_param<const Dense<value_type>> scale2,
ptr_param<const Csr> mtx2,
ptr_param<Csr> out,
) const#

Recomputes the sparse matrix-matrix sum out = scale1 * mtx1 + scale2 * mtx2 when only the values of mtx1, scale1, mtx2, scale2 changed, but the sparsity patterns of mtx1, mtx2 and out are unchanged.

struct permuting_reuse_info#

A struct describing a transformation of the matrix that reorders the values of the matrix into the transformed matrix.

Public Functions

explicit permuting_reuse_info()#

Creates an empty reuse info.

explicit permuting_reuse_info(
std::unique_ptr<Permutation<index_type>> value_permutation,
)#

Creates a reuse info structure from its value permutation.

void update_values(
ptr_param<const Csr> input,
ptr_param<Csr> output,
) const#

Propagates the values from an input matrix to the transformed matrix. The output matrix needs to have been computed using the transformation that was also used to generate this reuse data. Internally, this permutes the input value vector into the output value vector.