gko::matrix::Csr#
Compressed sparse row format. Stores each row’s nonzeros as a contiguous slice of value and column-index arrays, with a row-pointer array locating the start of each row. The general-purpose default for sparse matrices in Ginkgo.
-
template<typename ValueType = default_precision, typename IndexType = int32>
class Csr # Inherits from
public gko::EnableLinOp<Csr<default_precision, int32>>
public ConvertibleTo<Csr<next_precision<default_precision>, int32>>
public ConvertibleTo<Csr<next_precision<default_precision, 2>, int32>>
public ConvertibleTo<Csr<next_precision<default_precision, 3>, int32>>
public ConvertibleTo<Dense<default_precision>>
public ConvertibleTo<Coo<default_precision, int32>>
public ConvertibleTo<Ell<default_precision, int32>>
public ConvertibleTo<Fbcsr<default_precision, int32>>
public ConvertibleTo<Hybrid<default_precision, int32>>
public ConvertibleTo<Sellp<default_precision, int32>>
public ConvertibleTo<SparsityCsr<default_precision, int32>>
public gko::DiagonalExtractable<default_precision>
public gko::ReadableFromMatrixData<default_precision, int32>
public gko::WritableToMatrixData<default_precision, int32>
public gko::Transposable
public gko::Permutable<int32>
public gko::EnableAbsoluteComputation<remove_complex<Csr<default_precision, int32>>>
public gko::ScaledIdentityAddable
CSR is a matrix format which stores only the nonzero coefficients by compressing each row of the matrix (compressed sparse row format).
The nonzero elements are stored row-wise in a 1D array, accompanied by a row pointer array which stores the starting index of each row. An additional column index array identifies the column of each nonzero element.
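To make the three-array layout concrete, here is a minimal sketch in plain C++ that builds the value, column index and row pointer arrays from a dense row-major matrix. The `CsrArrays` struct and the `dense_to_csr` function are hypothetical names introduced for illustration; they are not part of Ginkgo and say nothing about how Ginkgo stores or converts data internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-C++ container for illustration; not Ginkgo's Csr class.
struct CsrArrays {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<double> values;  // nonzero values, stored row by row
    std::vector<int> col_idxs;   // column index of each stored value
    std::vector<int> row_ptrs;   // row i occupies [row_ptrs[i], row_ptrs[i + 1])
};

// Builds the three CSR arrays from a dense row-major matrix, skipping zeros.
CsrArrays dense_to_csr(const std::vector<double>& dense, std::size_t num_rows,
                       std::size_t num_cols)
{
    assert(dense.size() == num_rows * num_cols);
    CsrArrays csr{num_rows, num_cols, {}, {}, {0}};
    for (std::size_t row = 0; row < num_rows; ++row) {
        for (std::size_t col = 0; col < num_cols; ++col) {
            const double v = dense[row * num_cols + col];
            if (v != 0.0) {
                csr.values.push_back(v);
                csr.col_idxs.push_back(static_cast<int>(col));
            }
        }
        // After finishing a row, record where the next row starts.
        csr.row_ptrs.push_back(static_cast<int>(csr.values.size()));
    }
    return csr;
}
```

For the 3x3 matrix with rows (1, 0, 2), (0, 0, 3), (4, 0, 5), this yields values 1, 2, 3, 4, 5, column indices 0, 2, 2, 0, 2 and row pointers 0, 2, 3, 5.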
The Csr LinOp supports three families of apply operations, dispatched on the type of the right operand:
- Against a Dense operand b, apply computes a sparse matrix-vector (or matrix-multivector) product: \[ x = A b, \qquad x = \alpha\, A b + \beta\, x. \]
- Against another Csr operand B, apply computes a sparse-sparse matrix product (SpGEMM): \[ C = A B, \qquad C = \alpha\, A B + \beta\, C. \]
- Against an Identity operand, apply reduces to a sparse-sparse matrix addition (SpGEAM): \[ B = \alpha\, A + \beta\, B. \]
In code:
    matrix::Csr *A, *B, *C;       // matrices
    matrix::Dense *b, *x;         // vectors (tall-and-skinny matrices)
    matrix::Dense *alpha, *beta;  // scalars of dimension 1x1
    matrix::Identity *I;          // identity matrix

    // Applying to Dense matrices computes an SpMV/SpMM product
    A->apply(b, x);               // x = A*b
    A->apply(alpha, b, beta, x);  // x = alpha*A*b + beta*x

    // Applying to Csr matrices computes a SpGEMM product of two sparse matrices
    A->apply(B, C);               // C = A*B
    A->apply(alpha, B, beta, C);  // C = alpha*A*B + beta*C

    // Applying to an Identity matrix computes a SpGEAM sparse matrix addition
    A->apply(alpha, I, beta, B);  // B = alpha*A + beta*B
Both the SpGEMM and SpGEAM operations require the input matrices to be sorted by column index, otherwise the algorithms will produce incorrect results.
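The Dense case, x = alpha*A*b + beta*x for a single right-hand side, can be sketched as a sequential reference loop over the three CSR arrays. The function name `csr_spmv` and its signature are assumptions made for this example; Ginkgo itself dispatches to executor-specific kernels chosen by the strategy, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference SpMV: x = alpha * A * b + beta * x, with A given by its CSR arrays.
// Hypothetical helper for illustration; not a Ginkgo function.
void csr_spmv(double alpha, const std::vector<int>& row_ptrs,
              const std::vector<int>& col_idxs,
              const std::vector<double>& values, const std::vector<double>& b,
              double beta, std::vector<double>& x)
{
    assert(row_ptrs.size() == x.size() + 1);
    for (std::size_t row = 0; row < x.size(); ++row) {
        double sum = 0.0;
        // Only the stored entries of this row contribute to the dot product.
        for (int k = row_ptrs[row]; k < row_ptrs[row + 1]; ++k) {
            sum += values[k] * b[col_idxs[k]];
        }
        x[row] = alpha * sum + beta * x[row];
    }
}
```

Each row reads one contiguous slice of the value and column index arrays, which is why the row pointer array is all that is needed to locate a row.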
- Template Parameters:
ValueType – precision of matrix elements
IndexType – precision of matrix indexes
Public Functions
-
virtual std::unique_ptr<LinOp> transpose() const override#
Returns a LinOp representing the transpose of the Transposable object.
- Returns:
a pointer to the new transposed object
-
virtual std::unique_ptr<LinOp> conj_transpose() const override#
Returns a LinOp representing the conjugate transpose of the Transposable object.
- Returns:
a pointer to the new conjugate transposed object
-
std::unique_ptr<Csr> multiply(ptr_param<const Csr> other) const#
Computes the sparse matrix product this * other on the executor of this matrix.
- Parameters:
other – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.
- Returns:
the product of the two matrices, stored on the same executor as this matrix.
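As a rough illustration of what a sparse-sparse product computes, the following sketch forms C = A * B row by row with an ordered per-row accumulator. `CsrArrays` and `csr_multiply` are hypothetical names; the `std::map` accumulator is chosen for brevity and is not the algorithm Ginkgo uses. Unlike the executor-specific kernels described above, this variant does not rely on sorted input, and it emits each output row sorted by column.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical plain-C++ container for illustration; not Ginkgo's Csr class.
struct CsrArrays {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<double> values;
    std::vector<int> col_idxs;
    std::vector<int> row_ptrs;
};

// Row-by-row sparse product C = A * B.
CsrArrays csr_multiply(const CsrArrays& a, const CsrArrays& b)
{
    assert(a.num_cols == b.num_rows);
    CsrArrays c{a.num_rows, b.num_cols, {}, {}, {0}};
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        std::map<int, double> acc;  // output column -> accumulated value
        for (int ka = a.row_ptrs[row]; ka < a.row_ptrs[row + 1]; ++ka) {
            const int inner = a.col_idxs[ka];
            // Scale row `inner` of B by A(row, inner) and add it to the output row.
            for (int kb = b.row_ptrs[inner]; kb < b.row_ptrs[inner + 1]; ++kb) {
                acc[b.col_idxs[kb]] += a.values[ka] * b.values[kb];
            }
        }
        for (const auto& entry : acc) {
            c.col_idxs.push_back(entry.first);
            c.values.push_back(entry.second);
        }
        c.row_ptrs.push_back(static_cast<int>(c.values.size()));
    }
    return c;
}
```

The sparsity pattern of the result depends only on the patterns of the inputs, which is the property the reuse variants exploit when only values change.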
- std::pair<std::unique_ptr<Csr>, multiply_reuse_info> multiply_reuse(
- ptr_param<const Csr> other,
Computes the sparse matrix product this * other on the executor of this matrix, together with the data necessary for value updates:
    auto [C, reuse] = A->multiply_reuse(B);
    change_values(A, B);
    reuse->update_values(A, B, C);
- Parameters:
other – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.
- Returns:
std::pair containing the product of the two matrices, stored on the same executor as this matrix, and a multiply_reuse_info object allowing value updates to the output matrix.
- std::unique_ptr<Csr> multiply_add(
- ptr_param<const Dense<value_type>> scale_mult,
- ptr_param<const Csr> mtx_mult,
- ptr_param<const Dense<value_type>> scale_add,
- ptr_param<const Csr> mtx_add,
Computes the sparse matrix product scale_mult * this * mtx_mult + scale_add * mtx_add on the executor of this matrix.
- Parameters:
scale_mult – the scalar by which the matrix product will be scaled.
mtx_mult – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.
scale_add – the scalar by which the matrix mtx_add will be scaled.
mtx_add – the matrix which will be added to the product, scaled by scale_add.
- Returns:
the result of the computation, stored on the same executor as this matrix.
- std::pair<std::unique_ptr<Csr>, multiply_add_reuse_info> multiply_add_reuse(
- ptr_param<const Dense<value_type>> scale_mult,
- ptr_param<const Csr> mtx_mult,
- ptr_param<const Dense<value_type>> scale_add,
- ptr_param<const Csr> mtx_add,
Computes the sparse matrix product scale_mult * this * mtx_mult + scale_add * mtx_add on the executor of this matrix, together with the data necessary for value updates:
    auto [result, reuse] = mtx->multiply_add_reuse(sm, mm, sa, ma);
    change_values(mtx, sm, mm, sa, ma);
    reuse->update_values(mtx, sm, mm, sa, ma, result);
- Parameters:
scale_mult – the scalar by which the matrix product will be scaled.
mtx_mult – the matrix with which the product will be computed. It needs to be sorted by column indices when using OmpExecutor or DpcppExecutor for this.
scale_add – the scalar by which the matrix mtx_add will be scaled.
mtx_add – the matrix which will be added to the product, scaled by scale_add.
- Returns:
std::pair containing the result of the computation, stored on the same executor as this matrix, and a multiply_add_reuse_info object allowing value updates to the output matrix.
- std::unique_ptr<Csr> scale_add(
- ptr_param<const Dense<value_type>> scale_this,
- ptr_param<const Dense<value_type>> scale_other,
- ptr_param<const Csr> mtx_other,
Computes the sparse matrix sum scale_this * this + scale_other * mtx_other on the executor of this matrix. This matrix needs to be sorted by column index, otherwise the result will be incorrect.
- Parameters:
scale_this – the scalar by which this matrix will be scaled.
scale_other – the scalar by which mtx_other will be scaled.
mtx_other – the matrix which will be added to this, scaled by scale_other. It needs to be sorted by column index, otherwise the result will be incorrect.
- Returns:
the result of the computation, stored on the same executor as this matrix.
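The sorting requirement is easiest to see in a merge-based formulation of the sum. The sketch below walks two rows with a two-pointer merge, which only produces the right result when both rows are sorted by column index. `CsrArrays` and `csr_scale_add` are hypothetical names for this example, and the merge is one common way to implement the operation, not necessarily the one Ginkgo uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-C++ container for illustration; not Ginkgo's Csr class.
struct CsrArrays {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<double> values;
    std::vector<int> col_idxs;
    std::vector<int> row_ptrs;
};

// C = alpha * A + beta * B, assuming every row of A and B is sorted by column.
CsrArrays csr_scale_add(double alpha, const CsrArrays& a, double beta,
                        const CsrArrays& b)
{
    assert(a.num_rows == b.num_rows && a.num_cols == b.num_cols);
    CsrArrays c{a.num_rows, a.num_cols, {}, {}, {0}};
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        int ka = a.row_ptrs[row];
        int kb = b.row_ptrs[row];
        const int a_end = a.row_ptrs[row + 1];
        const int b_end = b.row_ptrs[row + 1];
        while (ka < a_end || kb < b_end) {
            // Take the entry with the smaller column; take both on a tie.
            const bool take_a =
                kb == b_end || (ka < a_end && a.col_idxs[ka] <= b.col_idxs[kb]);
            const bool take_b =
                ka == a_end || (kb < b_end && b.col_idxs[kb] <= a.col_idxs[ka]);
            const int col = take_a ? a.col_idxs[ka] : b.col_idxs[kb];
            double value = 0.0;
            if (take_a) value += alpha * a.values[ka++];
            if (take_b) value += beta * b.values[kb++];
            c.col_idxs.push_back(col);
            c.values.push_back(value);
        }
        c.row_ptrs.push_back(static_cast<int>(c.values.size()));
    }
    return c;
}
```

With unsorted rows, the merge would miss matching columns and emit duplicate entries, which is one way the result can become incorrect.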
- std::pair<std::unique_ptr<Csr>, scale_add_reuse_info> add_scale_reuse(
- ptr_param<const Dense<value_type>> scale_this,
- ptr_param<const Dense<value_type>> scale_other,
- ptr_param<const Csr> mtx_other,
Computes the sparse matrix sum scale_this * this + scale_other * mtx_other on the executor of this matrix, together with the data necessary for value updates:
    auto [result, reuse] = mtx->add_scale_reuse(alpha, beta, mtx2);
    change_values(alpha, mtx, beta, mtx2);
    reuse->update_values(alpha, mtx, beta, mtx2, result);
This matrix needs to be sorted by column index, otherwise the result will be incorrect.
- Parameters:
scale_this – the scalar by which this matrix will be scaled.
scale_other – the scalar by which mtx_other will be scaled.
mtx_other – the matrix which will be added to this, scaled by scale_other. It needs to be sorted by column index, otherwise the result will be incorrect.
- Returns:
std::pair containing the result of the computation, stored on the same executor as this matrix, and a scale_add_reuse_info object allowing value updates to the output matrix.
- std::pair<std::unique_ptr<Csr>, permuting_reuse_info> transpose_reuse(
Computes the necessary data to update a transposed matrix from its original matrix.
    auto [transposed, reuse] = matrix->transpose_reuse();
    change_values(matrix);
    reuse->update_values(matrix, transposed);
- Returns:
a std::pair consisting of the transposed matrix and a reuse info struct that can be used to update values in the transposed matrix.
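The idea behind value reuse can be sketched in plain C++: while transposing, record for every output entry which input entry it came from, then replay that mapping whenever only the values change. All names here (`CsrArrays`, `TransposeWithReuse`, `transpose_with_reuse`, `update_values`) are hypothetical and chosen for this example; the sketch assumes the reuse data is a permutation of the value array and does not reflect Ginkgo's internal data layout.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical plain-C++ container for illustration; not Ginkgo's Csr class.
struct CsrArrays {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<double> values;
    std::vector<int> col_idxs;
    std::vector<int> row_ptrs;
};

struct TransposeWithReuse {
    CsrArrays transposed;
    std::vector<int> value_perm;  // transposed.values[k] == input.values[value_perm[k]]
};

// Counting-sort transpose that also records where each value came from.
TransposeWithReuse transpose_with_reuse(const CsrArrays& a)
{
    const std::size_t nnz = a.values.size();
    CsrArrays t{a.num_cols, a.num_rows, std::vector<double>(nnz),
                std::vector<int>(nnz), std::vector<int>(a.num_cols + 1, 0)};
    // Count the entries in each column of A, then prefix-sum into row pointers.
    for (int col : a.col_idxs) ++t.row_ptrs[col + 1];
    for (std::size_t r = 0; r < a.num_cols; ++r) t.row_ptrs[r + 1] += t.row_ptrs[r];
    std::vector<int> next(t.row_ptrs.begin(), t.row_ptrs.end() - 1);
    std::vector<int> perm(nnz);
    for (std::size_t row = 0; row < a.num_rows; ++row) {
        for (int k = a.row_ptrs[row]; k < a.row_ptrs[row + 1]; ++k) {
            const int dst = next[a.col_idxs[k]]++;
            t.col_idxs[dst] = static_cast<int>(row);
            t.values[dst] = a.values[k];
            perm[dst] = k;
        }
    }
    return {std::move(t), std::move(perm)};
}

// Propagates changed input values without recomputing the sparsity pattern.
void update_values(const std::vector<int>& value_perm, const CsrArrays& in,
                   CsrArrays& out)
{
    assert(value_perm.size() == in.values.size());
    assert(value_perm.size() == out.values.size());
    for (std::size_t k = 0; k < value_perm.size(); ++k) {
        out.values[k] = in.values[value_perm[k]];
    }
}
```

The update step is a single gather over the value array, which is much cheaper than recomputing row pointers and column indices.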
- std::unique_ptr<Csr> permute(
- ptr_param<const Permutation<index_type>> permutation,
- permute_mode mode = permute_mode::symmetric,
Creates a permuted copy \(A'\) of this matrix \(A\) with the given permutation \(P\). By default, this computes a symmetric permutation (permute_mode::symmetric). For the effect of the different permutation modes, see permute_mode
- Parameters:
permutation – The input permutation.
mode – The permutation mode. If permute_mode::inverse is set, we use the inverse permutation \(P^{-1}\) instead of \(P\). If permute_mode::rows is set, the rows will be permuted. If permute_mode::columns is set, the columns will be permuted.
- Returns:
The permuted matrix.
- std::unique_ptr<Csr> permute(
- ptr_param<const Permutation<index_type>> row_permutation,
- ptr_param<const Permutation<index_type>> column_permutation,
- bool invert = false,
Creates a non-symmetrically permuted copy \(A'\) of this matrix \(A\) with the given row and column permutations \(P\) and \(Q\). The operation will compute \(A'(i, j) = A(p[i], q[j])\), or \(A' = P A Q^T\) if invert is false, and \(A'(p[i], q[j]) = A(i,j)\), or \(A' = P^{-1} A Q^{-T}\) if invert is true.
- Parameters:
row_permutation – The permutation \(P\) to apply to the rows
column_permutation – The permutation \(Q\) to apply to the columns
invert – If set to false, uses the input permutations, otherwise uses their inverses \(P^{-1}, Q^{-1}\)
- Returns:
The permuted matrix.
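The index relation \(A'(i, j) = A(p[i], q[j])\) for the non-inverted case can be checked with a small plain-C++ sketch. `CsrArrays`, `csr_at` and `csr_permute` are hypothetical helpers written for this example. Note that this version leaves the permuted rows unsorted by column, which may differ from what Ginkgo returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-C++ container for illustration; not Ginkgo's Csr class.
struct CsrArrays {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<double> values;
    std::vector<int> col_idxs;
    std::vector<int> row_ptrs;
};

// Looks up entry (row, col); entries that are not stored count as zero.
double csr_at(const CsrArrays& m, std::size_t row, int col)
{
    for (int k = m.row_ptrs[row]; k < m.row_ptrs[row + 1]; ++k) {
        if (m.col_idxs[k] == col) return m.values[k];
    }
    return 0.0;
}

// Returns A' with A'(i, j) = A(p[i], q[j]).
CsrArrays csr_permute(const CsrArrays& a, const std::vector<int>& p,
                      const std::vector<int>& q)
{
    assert(p.size() == a.num_rows && q.size() == a.num_cols);
    // Column c of A becomes column j of A' where q[j] == c, so invert q once.
    std::vector<int> q_inv(q.size());
    for (std::size_t j = 0; j < q.size(); ++j) {
        q_inv[q[j]] = static_cast<int>(j);
    }
    CsrArrays out{a.num_rows, a.num_cols, {}, {}, {0}};
    for (std::size_t i = 0; i < a.num_rows; ++i) {
        const int src_row = p[i];  // row i of A' is row p[i] of A
        for (int k = a.row_ptrs[src_row]; k < a.row_ptrs[src_row + 1]; ++k) {
            out.col_idxs.push_back(q_inv[a.col_idxs[k]]);
            out.values.push_back(a.values[k]);
        }
        out.row_ptrs.push_back(static_cast<int>(out.values.size()));
    }
    return out;
}
```

Rows are gathered directly through p, while columns need the inverse of q because CSR stores the column of each entry rather than the entries of each column.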
- std::pair<std::unique_ptr<Csr>, permuting_reuse_info> permute_reuse(
- ptr_param<const Permutation<index_type>> permutation,
- permute_mode mode = permute_mode::symmetric,
Computes the operations necessary to propagate changed values from a matrix A to a permuted matrix. The semantics of this function match those of permute(ptr_param<const Permutation<index_type>>, permute_mode). Updating values works as follows:
    auto [permuted, reuse] = matrix->permute_reuse(permutation, mode);
    change_values(matrix);
    reuse->update_values(matrix, permuted);
- Parameters:
permutation – The input permutation.
mode – The permutation mode. If permute_mode::inverse is set, we use the inverse permutation \(P^{-1}\) instead of \(P\). If permute_mode::rows is set, the rows will be permuted. If permute_mode::columns is set, the columns will be permuted.
- Returns:
a std::pair consisting of the permuted matrix and the reuse info that can be used to update values in the permuted matrix.
- std::pair<std::unique_ptr<Csr>, permuting_reuse_info> permute_reuse(
- ptr_param<const Permutation<index_type>> row_permutation,
- ptr_param<const Permutation<index_type>> column_permutation,
- bool invert = false,
Computes the operations necessary to propagate changed values from a matrix A to a permuted matrix. The semantics of this function match those of permute(ptr_param<const Permutation<index_type>>, ptr_param<const Permutation<index_type>>, bool). Updating values works as follows:
    auto [permuted, reuse] = matrix->permute_reuse(row_perm, col_perm, inv);
    change_values(matrix);
    reuse->update_values(matrix, permuted);
- Parameters:
row_permutation – The permutation \(P\) to apply to the rows
column_permutation – The permutation \(Q\) to apply to the columns
invert – If set to false, uses the input permutations, otherwise uses their inverses \(P^{-1}, Q^{-1}\)
- Returns:
a std::pair consisting of the permuted matrix and the reuse info that can be used to update values in the permuted matrix.
- std::unique_ptr<Csr> scale_permute(
- ptr_param<const ScaledPermutation<value_type, index_type>> permutation,
- permute_mode = permute_mode::symmetric,
Creates a scaled and permuted copy of this matrix. For an explanation of the permutation modes, see permute(ptr_param<const Permutation<index_type>>, permute_mode)
- Parameters:
permutation – The scaled permutation.
mode – The permutation mode.
- Returns:
The permuted matrix.
- std::unique_ptr<Csr> scale_permute(
- ptr_param<const ScaledPermutation<value_type, index_type>> row_permutation,
- ptr_param<const ScaledPermutation<value_type, index_type>> column_permutation,
- bool invert = false,
Creates a scaled and permuted copy of this matrix. For an explanation of the parameters, see permute(ptr_param<const Permutation<index_type>>, ptr_param<const Permutation<index_type>>, bool).
- Parameters:
row_permutation – The scaled row permutation.
column_permutation – The scaled column permutation.
invert – If set to false, uses the input permutations, otherwise uses their inverses \(P^{-1}, Q^{-1}\)
- Returns:
The permuted matrix.
- virtual std::unique_ptr<Diagonal<ValueType>> extract_diagonal(
Extracts the diagonal entries of the matrix into a vector.
- Parameters:
diag – the vector into which the diagonal will be written
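Conceptually, diagonal extraction scans each row for the entry whose column index equals the row index. A minimal plain-C++ sketch follows. The function `csr_diagonal` is a hypothetical name, and treating a missing diagonal entry as zero is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the main diagonal of a CSR matrix; entries not stored are zero.
// Hypothetical helper for illustration; not a Ginkgo function.
std::vector<double> csr_diagonal(std::size_t num_rows, std::size_t num_cols,
                                 const std::vector<int>& row_ptrs,
                                 const std::vector<int>& col_idxs,
                                 const std::vector<double>& values)
{
    assert(row_ptrs.size() == num_rows + 1);
    const std::size_t diag_size = std::min(num_rows, num_cols);
    std::vector<double> diag(diag_size, 0.0);
    for (std::size_t row = 0; row < diag_size; ++row) {
        for (int k = row_ptrs[row]; k < row_ptrs[row + 1]; ++k) {
            if (col_idxs[k] == static_cast<int>(row)) {
                diag[row] = values[k];
                break;  // at most one stored entry per (row, column) is assumed
            }
        }
    }
    return diag;
}
```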
- virtual std::unique_ptr<absolute_type> compute_absolute(
Gets the AbsoluteLinOp
- Returns:
a pointer to the new absolute object
-
virtual void compute_absolute_inplace() override#
Computes the absolute value of each element in place.
-
void sort_by_column_index()#
Sorts all (value, col_idx) pairs in each row by column index
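What the sort does to the underlying arrays can be sketched in plain C++: within each row's slice, the (column index, value) pairs are reordered together so that column indices ascend, while the row pointers stay untouched. `sort_rows_by_column` is a hypothetical name for this example and the per-row `std::sort` is only one possible way to do it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts the (col_idx, value) pairs of every row by column index, in place.
// Hypothetical helper for illustration; not a Ginkgo function.
void sort_rows_by_column(const std::vector<int>& row_ptrs,
                         std::vector<int>& col_idxs,
                         std::vector<double>& values)
{
    assert(col_idxs.size() == values.size());
    for (std::size_t row = 0; row + 1 < row_ptrs.size(); ++row) {
        const int begin = row_ptrs[row];
        const int end = row_ptrs[row + 1];
        // Pair each column with its value so they move together.
        std::vector<std::pair<int, double>> entries;
        for (int k = begin; k < end; ++k) {
            entries.emplace_back(col_idxs[k], values[k]);
        }
        std::sort(entries.begin(), entries.end(),
                  [](const std::pair<int, double>& lhs,
                     const std::pair<int, double>& rhs) {
                      return lhs.first < rhs.first;
                  });
        for (int k = begin; k < end; ++k) {
            col_idxs[k] = entries[k - begin].first;
            values[k] = entries[k - begin].second;
        }
    }
}
```

Sorting before calling the SpGEMM and SpGEAM operations satisfies the ordering requirement stated for them.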
-
inline value_type *get_values() noexcept#
Returns the values of the matrix.
- Returns:
the values of the matrix.
-
inline const value_type *get_const_values() const noexcept#
Returns the values of the matrix.
Note
This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.
- Returns:
the values of the matrix.
-
std::unique_ptr<Dense<ValueType>> create_value_view()#
Creates a Dense view of the value array of this matrix as a column vector of dimensions nnz x 1.
- std::unique_ptr<const Dense<ValueType>> create_const_value_view(
Creates a const Dense view of the value array of this matrix as a column vector of dimensions nnz x 1.
-
inline index_type *get_col_idxs() noexcept#
Returns the column indexes of the matrix.
- Returns:
the column indexes of the matrix.
-
inline const index_type *get_const_col_idxs() const noexcept#
Returns the column indexes of the matrix.
Note
This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.
- Returns:
the column indexes of the matrix.
-
inline index_type *get_row_ptrs() noexcept#
Returns the row pointers of the matrix.
- Returns:
the row pointers of the matrix.
-
inline const index_type *get_const_row_ptrs() const noexcept#
Returns the row pointers of the matrix.
Note
This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.
- Returns:
the row pointers of the matrix.
-
inline index_type *get_srow() noexcept#
Returns the starting rows.
- Returns:
the starting rows.
-
inline const index_type *get_const_srow() const noexcept#
Returns the starting rows.
Note
This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.
- Returns:
the starting rows.
-
inline size_type get_num_srow_elements() const noexcept#
Returns the number of elements stored in srow (the number of involved warps).
- Returns:
the number of elements stored in srow (the number of involved warps)
-
inline size_type get_num_stored_elements() const noexcept#
Returns the number of elements explicitly stored in the matrix.
- Returns:
the number of elements explicitly stored in the matrix
-
inline std::shared_ptr<strategy_type> get_strategy() const noexcept#
Returns the strategy
- Returns:
the strategy
Set the strategy
- Parameters:
strategy – the csr strategy
-
inline void scale(ptr_param<const LinOp> alpha)#
Scales the matrix with a scalar.
- Parameters:
alpha – The entire matrix is scaled by alpha. alpha has to be a 1x1 Dense matrix.
-
inline void inv_scale(ptr_param<const LinOp> alpha)#
Scales the matrix with the inverse of a scalar.
- Parameters:
alpha – The entire matrix is scaled by 1 / alpha. alpha has to be a 1x1 Dense matrix.
- std::unique_ptr<Csr<ValueType, IndexType>> create_submatrix( ) const#
Creates a submatrix from this Csr matrix given row and column index_set objects.
Note
This is not a view but creates a new, separate CSR matrix.
- Parameters:
row_index_set – the row index set containing the set of rows to be in the submatrix.
column_index_set – the col index set containing the set of columns to be in the submatrix.
- Returns:
A new CSR matrix with the elements that belong to the rows and columns of this matrix as specified by the index sets.
- std::unique_ptr<Csr<ValueType, IndexType>> create_submatrix(
- const span &row_span,
- const span &column_span,
Creates a submatrix from this Csr matrix given row and column spans
Note
This is not a view but creates a new, separate CSR matrix.
- Parameters:
row_span – the row span containing the contiguous set of rows to be in the submatrix.
column_span – the column span containing the contiguous set of columns to be in the submatrix.
- Returns:
A new CSR matrix with the elements that belong to the rows and columns of this matrix as specified by the spans.
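Extracting a contiguous block can be sketched as a filter over the selected rows that keeps only entries inside the column range and shifts their indices. `CsrArrays` and `csr_submatrix` are hypothetical names, and the ranges are assumed to be half-open, `[begin, end)`. As in the note above, the result is a separate copy, not a view.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-C++ container for illustration; not Ginkgo's Csr class.
struct CsrArrays {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<double> values;
    std::vector<int> col_idxs;
    std::vector<int> row_ptrs;
};

// Copies rows [row_begin, row_end) and columns [col_begin, col_end) of A.
CsrArrays csr_submatrix(const CsrArrays& a, std::size_t row_begin,
                        std::size_t row_end, std::size_t col_begin,
                        std::size_t col_end)
{
    assert(row_begin <= row_end && row_end <= a.num_rows);
    assert(col_begin <= col_end && col_end <= a.num_cols);
    CsrArrays sub{row_end - row_begin, col_end - col_begin, {}, {}, {0}};
    for (std::size_t row = row_begin; row < row_end; ++row) {
        for (int k = a.row_ptrs[row]; k < a.row_ptrs[row + 1]; ++k) {
            const std::size_t col = static_cast<std::size_t>(a.col_idxs[k]);
            if (col >= col_begin && col < col_end) {
                // Shift the column index so the block starts at column 0.
                sub.col_idxs.push_back(static_cast<int>(col - col_begin));
                sub.values.push_back(a.values[k]);
            }
        }
        sub.row_ptrs.push_back(static_cast<int>(sub.values.size()));
    }
    return sub;
}
```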
Public Static Functions
- std::shared_ptr<const Executor> exec,
- std::shared_ptr<strategy_type> strategy,
Creates an uninitialized CSR matrix of the specified size.
- Parameters:
exec – Executor associated to the matrix
strategy – the strategy of CSR
- Returns:
A smart pointer to the newly created matrix.
- std::shared_ptr<const Executor> exec,
- const dim<2> &size = {},
- size_type num_nonzeros = {},
- std::shared_ptr<strategy_type> strategy = nullptr,
Creates an uninitialized CSR matrix of the specified size.
- Parameters:
exec – Executor associated to the matrix
size – size of the matrix
num_nonzeros – number of nonzeros
strategy – the strategy of CSR, or the default strategy if set to nullptr
- Returns:
A smart pointer to the newly created matrix.
- std::shared_ptr<const Executor> exec,
- const dim<2> &size,
- array<value_type> values,
- array<index_type> col_idxs,
- array<index_type> row_ptrs,
- std::shared_ptr<strategy_type> strategy = nullptr,
Creates a CSR matrix from already allocated (and initialized) row pointer, column index and value arrays.
Note
If one of row_ptrs, col_idxs or values is not an rvalue, not an array of IndexType, IndexType and ValueType, respectively, or is on the wrong executor, an internal copy of that array will be created, and the original array data will not be used in the matrix.
- Parameters:
exec – Executor associated to the matrix
size – size of the matrix
values – array of matrix values
col_idxs – array of column indexes
row_ptrs – array of row pointers
strategy – the strategy the matrix uses for SpMV operations
- Returns:
A smart pointer to the newly created matrix.
- std::shared_ptr<const Executor> exec,
- const dim<2> &size,
- std::initializer_list<InputValueType> values,
- std::initializer_list<InputColumnIndexType> col_idxs,
- std::initializer_list<InputRowPtrType> row_ptrs,
create(std::shared_ptr<const Executor>,const dim<2>&, array<value_type>, array<index_type>, array<index_type>)
- std::shared_ptr<const Executor> exec,
- const dim<2> &size,
- gko::detail::const_array_view<ValueType> &&values,
- gko::detail::const_array_view<IndexType> &&col_idxs,
- gko::detail::const_array_view<IndexType> &&row_ptrs,
- std::shared_ptr<strategy_type> strategy = nullptr,
Creates a constant (immutable) Csr matrix from a set of constant arrays.
- Parameters:
exec – the executor to create the matrix on
size – the dimensions of the matrix
values – the value array of the matrix
col_idxs – the column index array of the matrix
row_ptrs – the row pointer array of the matrix
strategy – the strategy the matrix uses for SpMV operations
- Returns:
A smart pointer to the constant matrix wrapping the input arrays (if they reside on the same executor as the matrix) or a copy of these arrays on the correct executor.
- Returns:
A smart pointer to the newly created matrix.
-
class strategy_type#
strategy_type decides which algorithm the Csr matrix uses.
A concrete strategy should inherit from strategy_type and implement its process and clac_size functions and the corresponding device kernel.
Subclassed by
Public Functions
-
inline strategy_type(std::string name)#
Creates a strategy_type.
- Parameters:
name – the name of strategy
-
inline std::string get_name()#
Returns the name of strategy
- Returns:
the name of strategy
- virtual void process( ) = 0#
Computes srow according to row pointers.
- Parameters:
mtx_row_ptrs – the row pointers of the matrix
mtx_srow – the srow of the matrix
-
virtual int64_t clac_size(const int64_t nnz) = 0#
Computes the srow size according to the number of nonzeros.
- Parameters:
nnz – the number of nonzeros
- Returns:
the size of srow
-
virtual std::shared_ptr<strategy_type> copy() = 0#
Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.
-
class classical #
Inherits from
classical is a strategy_type which uses the same number of threads on each row. The classical strategy uses multiple threads to compute partial results for parts of a row and then reduces these partial results. The number of threads per row depends on the maximum number of stored elements per row.
Public Functions
-
inline classical()#
Creates a classical strategy.
- inline virtual void process( ) override#
Computes srow according to row pointers.
- Parameters:
mtx_row_ptrs – the row pointers of the matrix
mtx_srow – the srow of the matrix
-
inline virtual int64_t clac_size(const int64_t nnz) override#
Computes the srow size according to the number of nonzeros.
- Parameters:
nnz – the number of nonzeros
- Returns:
the size of srow
-
inline virtual std::shared_ptr<strategy_type> copy() override#
Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.
-
class merge_path #
Inherits from
merge_path is a strategy_type which uses the merge_path algorithm. It follows Merrill and Garland, "Merge-Based Parallel Sparse Matrix-Vector Multiplication".
Public Functions
-
inline merge_path()#
Creates a merge_path strategy.
- inline virtual void process( ) override#
Computes srow according to row pointers.
- Parameters:
mtx_row_ptrs – the row pointers of the matrix
mtx_srow – the srow of the matrix
-
inline virtual int64_t clac_size(const int64_t nnz) override#
Computes the srow size according to the number of nonzeros.
- Parameters:
nnz – the number of nonzeros
- Returns:
the size of srow
-
inline virtual std::shared_ptr<strategy_type> copy() override#
Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.
-
class cusparse #
Inherits from
cusparse is a strategy_type which uses the sparselib csr.
Note
cusparse is also known to the hip executor which converts between cuda and hip.
Public Functions
-
inline cusparse()#
Creates a cusparse strategy.
- inline virtual void process( ) override#
Computes srow according to row pointers.
- Parameters:
mtx_row_ptrs – the row pointers of the matrix
mtx_srow – the srow of the matrix
-
inline virtual int64_t clac_size(const int64_t nnz) override#
Computes the srow size according to the number of nonzeros.
- Parameters:
nnz – the number of nonzeros
- Returns:
the size of srow
-
inline virtual std::shared_ptr<strategy_type> copy() override#
Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.
-
class sparselib #
Inherits from
sparselib is a strategy_type which uses the sparselib csr.
Note
Uses cusparse in cuda and hipsparse in hip.
Public Functions
-
inline sparselib()#
Creates a sparselib strategy.
- inline virtual void process( ) override#
Computes srow according to row pointers.
- Parameters:
mtx_row_ptrs – the row pointers of the matrix
mtx_srow – the srow of the matrix
-
inline virtual int64_t clac_size(const int64_t nnz) override#
Computes the srow size according to the number of nonzeros.
- Parameters:
nnz – the number of nonzeros
- Returns:
the size of srow
-
inline virtual std::shared_ptr<strategy_type> copy() override#
Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.
-
class load_balance #
Inherits from
load_balance is a strategy_type which uses the load balance algorithm.
Public Functions
-
inline load_balance()#
Creates a load_balance strategy.
Warning
this is deprecated! Please rely on the new automatic strategy instantiation or use one of the other constructors.
Creates a load_balance strategy with CUDA executor.
- Parameters:
exec – the CUDA executor
Creates a load_balance strategy with HIP executor.
- Parameters:
exec – the HIP executor
Creates a load_balance strategy with DPCPP executor.
Note
TODO: porting - the subgroup size is hardcoded to 32
- Parameters:
exec – the DPCPP executor
- inline load_balance(
- int64_t nwarps,
- int warp_size = 32,
- bool cuda_strategy = true,
- std::string strategy_name = "none",
Creates a load_balance strategy with specified parameters
Note
The warp_size must be the size of a full warp. When using this constructor, set_strategy needs to be called with the correct parameters, which are replaced during the conversion.
- Parameters:
nwarps – the number of warps in the executor
warp_size – the warp size of the executor
cuda_strategy – whether the cuda_strategy needs to be used.
- inline virtual void process( ) override#
Computes srow according to row pointers.
- Parameters:
mtx_row_ptrs – the row pointers of the matrix
mtx_srow – the srow of the matrix
-
inline virtual int64_t clac_size(const int64_t nnz) override#
Computes the srow size according to the number of nonzeros.
- Parameters:
nnz – the number of nonzeros
- Returns:
the size of srow
-
inline virtual std::shared_ptr<strategy_type> copy() override#
Copy a strategy. This is a workaround until strategies are revamped, since strategies like automatical do not work when actually shared.
-
class automatical #
Inherits from
Public Functions
-
inline automatical()#
Creates an automatical strategy.
Warning
this is deprecated! Please rely on the new automatic strategy instantiation or use one of the other constructors.
Creates an automatical strategy with CUDA executor.
- Parameters:
exec – the CUDA executor
Creates an automatical strategy with HIP executor.
- Parameters:
exec – the HIP executor
Creates an automatical strategy with Dpcpp executor.
Note
TODO: porting - the subgroup size is hardcoded to 32
- Parameters:
exec – the Dpcpp executor
- inline automatical(
- int64_t nwarps,
- int warp_size = 32,
- bool cuda_strategy = true,
- std::string strategy_name = "none",
Creates an automatical strategy with specified parameters
Note
The warp_size must be the size of a full warp. When using this constructor, set_strategy needs to be called with the correct parameters, which are replaced during the conversion.
- Parameters:
nwarps – the number of warps in the executor
warp_size – the warp size of the executor
cuda_strategy – whether the cuda_strategy needs to be used.
-
class multiply_reuse_info#
Class describing the internal lookup structures created by multiply_reuse(const Csr*) to recompute a sparse matrix-matrix product with updated values.
-
class multiply_add_reuse_info#
Class describing the internal lookup structures created by multiply_add_reuse to recompute a sparse matrix-matrix product with updated values.
Public Functions
- void update_values(
- ptr_param<const Csr> mtx,
- ptr_param<const Dense<value_type>> scale_mult,
- ptr_param<const Csr> mtx_mult,
- ptr_param<const Dense<value_type>> scale_add,
- ptr_param<const Csr> mtx_add,
- ptr_param<Csr> out,
Recomputes the sparse matrix-matrix product out = scale_mult * mtx * mtx_mult + scale_add * mtx_add when only the values of mtx, scale_mult, mtx_mult, scale_add, mtx_add changed, but the sparsity patterns of mtx, mtx_mult, mtx_add and out are unchanged.
-
class scale_add_reuse_info#
Class describing the internal lookup structures created by add_scale_reuse to recompute a sparse matrix-matrix sum with updated values.
Public Functions
- void update_values(
- ptr_param<const Dense<value_type>> scale1,
- ptr_param<const Csr> mtx1,
- ptr_param<const Dense<value_type>> scale2,
- ptr_param<const Csr> mtx2,
- ptr_param<Csr> out,
Recomputes the sparse matrix-matrix sum out = scale1 * mtx1 + scale2 * mtx2 when only the values of mtx1, scale1, mtx2, scale2 changed, but the sparsity patterns of mtx1, mtx2 and out are unchanged.
-
struct permuting_reuse_info#
A struct describing a transformation of the matrix that reorders the values of the matrix into the transformed matrix.
Public Functions
-
explicit permuting_reuse_info()#
Creates an empty reuse info.
- explicit permuting_reuse_info(
- std::unique_ptr<Permutation<index_type>> value_permutation,
Creates a reuse info structure from its value permutation.
- void update_values( ) const#
Propagates the values from an input matrix to the transformed matrix. The output matrix needs to have been computed using the transformation that was also used to generate this reuse data. Internally, this permutes the input value vector into the output value vector.