gko::matrix::Hybrid#

Hybrid ELL + COO. Stores the bulk of each row’s nonzeros in an ELL-padded portion and spills the remainder into a COO tail. Useful when most rows have a similar nonzero count but a few outliers would otherwise inflate the ELL padding.

template<typename ValueType = default_precision, typename IndexType = int32>
class Hybrid #

Inherits from

  • public gko::EnableLinOp<Hybrid<default_precision, int32>>

  • public ConvertibleTo<Hybrid<next_precision<default_precision>, int32>>

  • public ConvertibleTo<Hybrid<next_precision<default_precision, 2>, int32>>

  • public ConvertibleTo<Hybrid<next_precision<default_precision, 3>, int32>>

  • public ConvertibleTo<Dense<default_precision>>

  • public ConvertibleTo<Csr<default_precision, int32>>

  • public gko::DiagonalExtractable<default_precision>

  • public gko::ReadableFromMatrixData<default_precision, int32>

  • public gko::WritableToMatrixData<default_precision, int32>

  • public gko::EnableAbsoluteComputation<remove_complex<Hybrid<default_precision, int32>>>

Hybrid is a matrix format that splits the matrix into an ELLPACK part and a COO part. It performs well when the partition between the ELLPACK part and the COO part is chosen properly.

Template Parameters:
  • ValueType – precision of matrix elements

  • IndexType – precision of matrix indexes
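The ELL/COO split described above can be pictured with a small standalone sketch. It does not use Ginkgo; `Entry`, `HybridSketch` and `split` are hypothetical names, and the ELL block is assumed to be stored column-major with zero padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One nonzero in coordinate form.
struct Entry {
    std::size_t row;
    std::size_t col;
    double value;
};

// Hypothetical container for the two parts; not a Ginkgo type.
struct HybridSketch {
    std::size_t num_rows;
    std::size_t ell_width;              // stored elements per row in the ELL part
    std::vector<double> ell_values;     // num_rows * ell_width slots, zero-padded
    std::vector<std::size_t> ell_cols;  // column index of each ELL slot
    std::vector<Entry> coo;             // entries that did not fit into ELL
};

// The first ell_width entries of each row (in input order) go to the ELL
// part; everything beyond that spills into the COO tail.
HybridSketch split(const std::vector<Entry>& entries, std::size_t num_rows,
                   std::size_t ell_width)
{
    HybridSketch h{num_rows, ell_width,
                   std::vector<double>(num_rows * ell_width, 0.0),
                   std::vector<std::size_t>(num_rows * ell_width, 0),
                   {}};
    std::vector<std::size_t> filled(num_rows, 0);  // ELL slots used per row
    for (const auto& e : entries) {
        assert(e.row < num_rows);
        if (filled[e.row] < ell_width) {
            // Column-major: slot k of a row sits num_rows * k entries in.
            const auto pos = e.row + num_rows * filled[e.row];
            h.ell_values[pos] = e.value;
            h.ell_cols[pos] = e.col;
            ++filled[e.row];
        } else {
            h.coo.push_back(e);
        }
    }
    return h;
}
```

With an ELL width of 1, a row holding three nonzeros keeps one of them in the ELL part and sends the other two to the COO part, while rows with a single nonzero cause no padding at all.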

Public Functions

virtual std::unique_ptr<Diagonal<ValueType>> extract_diagonal(
) const override#

Extracts the diagonal entries of the matrix into a vector.

Returns:

a Diagonal matrix holding the diagonal entries of this matrix

virtual std::unique_ptr<absolute_type> compute_absolute(
) const override#

Returns a new AbsoluteLinOp that holds the element-wise absolute values of this matrix.

Returns:

a pointer to the new absolute object

virtual void compute_absolute_inplace() override#

Compute absolute inplace on each element.

inline value_type *get_ell_values() noexcept#

Returns the values of the ell part.

Returns:

the values of the ell part

inline const value_type *get_const_ell_values() const noexcept#

Returns the values of the ell part.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the values of the ell part

inline index_type *get_ell_col_idxs() noexcept#

Returns the column indexes of the ell part.

Returns:

the column indexes of the ell part

inline const index_type *get_const_ell_col_idxs() const noexcept#

Returns the column indexes of the ell part.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the column indexes of the ell part

inline size_type get_ell_num_stored_elements_per_row() const noexcept#

Returns the number of stored elements per row of ell part.

Returns:

the number of stored elements per row of ell part

inline size_type get_ell_stride() const noexcept#

Returns the stride of the ell part.

Returns:

the stride of the ell part

inline size_type get_ell_num_stored_elements() const noexcept#

Returns the number of elements explicitly stored in the ell part.

Returns:

the number of elements explicitly stored in the ell part

inline value_type &ell_val_at(size_type row, size_type idx) noexcept#

Returns the idx-th non-zero element of the row-th row in the ell part.

Note

The method has to be called on the same Executor the matrix is stored at (e.g. trying to call this method on a GPU matrix from an OMP executor results in a runtime error).

Parameters:
  • row – the row of the requested element

  • idx – the idx-th stored element of the row

inline value_type ell_val_at(
size_type row,
size_type idx,
) const noexcept#

Returns the idx-th non-zero element of the row-th row in the ell part.

Note

The method has to be called on the same Executor the matrix is stored at (e.g. trying to call this method on a GPU matrix from an OMP executor results in a runtime error).

Parameters:
  • row – the row of the requested element

  • idx – the idx-th stored element of the row

inline index_type &ell_col_at(size_type row, size_type idx) noexcept#

Returns the idx-th column index of the row-th row in the ell part.

Note

The method has to be called on the same Executor the matrix is stored at (e.g. trying to call this method on a GPU matrix from an OMP executor results in a runtime error).

Parameters:
  • row – the row of the requested element

  • idx – the idx-th stored element of the row

inline index_type ell_col_at(
size_type row,
size_type idx,
) const noexcept#

Returns the idx-th column index of the row-th row in the ell part.

Note

The method has to be called on the same Executor the matrix is stored at (e.g. trying to call this method on a GPU matrix from an OMP executor results in a runtime error).

Parameters:
  • row – the row of the requested element

  • idx – the idx-th stored element of the row
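How `row`, `idx` and the stride relate can be sketched on plain host arrays. The sketch below assumes a column-major ELL layout in which slot `(row, idx)` lives at offset `row + stride * idx`. That layout is an assumption of this example, and `EllView` with its members is a hypothetical type, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side view of an ELL block; not a Ginkgo type.
struct EllView {
    std::size_t num_rows;
    std::size_t num_stored_per_row;
    std::size_t stride;  // distance between consecutive idx slots, >= num_rows
    std::vector<double> values;
    std::vector<int> col_idxs;

    // Assumed column-major layout: slot (row, idx) lives at row + stride * idx.
    std::size_t linearize(std::size_t row, std::size_t idx) const
    {
        assert(row < num_rows && idx < num_stored_per_row);
        return row + stride * idx;
    }

    // Value of the idx-th stored element of the given row.
    double val_at(std::size_t row, std::size_t idx) const
    {
        return values[linearize(row, idx)];
    }

    // Column index of the idx-th stored element of the given row.
    int col_at(std::size_t row, std::size_t idx) const
    {
        return col_idxs[linearize(row, idx)];
    }
};
```

Under this assumption a stride larger than the number of rows leaves unused slots at the end of each column of the block, so the arrays hold `stride * num_stored_per_row` entries.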

inline const ell_type *get_ell() const noexcept#

Returns the matrix of the ell part

Returns:

the matrix of the ell part

inline value_type *get_coo_values() noexcept#

Returns the values of the coo part.

Returns:

the values of the coo part.

inline const value_type *get_const_coo_values() const noexcept#

Returns the values of the coo part.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the values of the coo part.

inline index_type *get_coo_col_idxs() noexcept#

Returns the column indexes of the coo part.

Returns:

the column indexes of the coo part.

inline const index_type *get_const_coo_col_idxs() const noexcept#

Returns the column indexes of the coo part.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the column indexes of the coo part.

inline index_type *get_coo_row_idxs() noexcept#

Returns the row indexes of the coo part.

Returns:

the row indexes of the coo part.

inline const index_type *get_const_coo_row_idxs() const noexcept#

Returns the row indexes of the coo part.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the row indexes of the coo part.

inline size_type get_coo_num_stored_elements() const noexcept#

Returns the number of elements explicitly stored in the coo part.

Returns:

the number of elements explicitly stored in the coo part

inline const coo_type *get_coo() const noexcept#

Returns the matrix of the coo part

Returns:

the matrix of the coo part

inline size_type get_num_stored_elements() const noexcept#

Returns the number of elements explicitly stored in the matrix.

Returns:

the number of elements explicitly stored in the matrix

inline std::shared_ptr<strategy_type> get_strategy() const noexcept#

Returns the strategy

Returns:

the strategy

template<typename HybType>
std::shared_ptr<typename HybType::strategy_type> get_strategy(
) const#

Returns the current strategy, expressed as the strategy type of the given hybrid format.

Template Parameters:

HybType – hybrid type

Returns:

the strategy

Hybrid &operator=(const Hybrid&)#

Copy-assigns a Hybrid matrix. Preserves the executor, copy-assigns the Ell and Coo matrices.

Hybrid &operator=(Hybrid&&)#

Move-assigns a Hybrid matrix. Preserves the executor, move-assigns the Ell and Coo matrices. The moved-from matrix is empty (0x0 with empty Ell/Coo matrices).

Hybrid(const Hybrid&)#

Copy-constructs a Hybrid matrix. Inherits the executor, copies the Ell and Coo matrices.

Hybrid(Hybrid&&)#

Move-constructs a Hybrid matrix. Inherits the executor, moves the Ell and Coo matrices. The moved-from matrix is empty (0x0 with empty Ell/Coo matrices).

Public Static Functions

static std::unique_ptr<Hybrid> create(
std::shared_ptr<const Executor> exec,
std::shared_ptr<strategy_type> strategy = std::make_shared<automatic>(),
)#

Creates an uninitialized Hybrid matrix with the specified strategy. (ell_num_stored_elements_per_row is set to the number of cols of the matrix. ell_stride is set to the number of rows of the matrix.)

Parameters:
  • exec – Executor associated with the matrix

  • strategy – strategy of deciding the Hybrid config

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Hybrid> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
std::shared_ptr<strategy_type> strategy = std::make_shared<automatic>(),
)#

Creates an uninitialized Hybrid matrix of the specified size with the specified strategy. (ell_num_stored_elements_per_row is set to the number of cols of the matrix. ell_stride is set to the number of rows of the matrix.)

Parameters:
  • exec – Executor associated with the matrix

  • size – size of the matrix

  • strategy – strategy of deciding the Hybrid config

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Hybrid> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
size_type num_stored_elements_per_row,
std::shared_ptr<strategy_type> strategy = std::make_shared<automatic>(),
)#

Creates an uninitialized Hybrid matrix of the specified size with the specified strategy. (ell_stride is set to the number of rows of the matrix.)

Parameters:
  • exec – Executor associated with the matrix

  • size – size of the matrix

  • num_stored_elements_per_row – the number of stored elements per row

  • strategy – strategy of deciding the Hybrid config

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Hybrid> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
size_type num_stored_elements_per_row,
size_type stride,
std::shared_ptr<strategy_type> strategy,
)#

Creates an uninitialized Hybrid matrix of the specified size with the specified strategy, using the given number of stored elements per row and stride.

Parameters:
  • exec – Executor associated with the matrix

  • size – size of the matrix

  • num_stored_elements_per_row – the number of stored elements per row

  • stride – stride of the rows

  • strategy – strategy of deciding the Hybrid config

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Hybrid> create(
std::shared_ptr<const Executor> exec,
const dim<2> &size,
size_type num_stored_elements_per_row,
size_type stride,
size_type num_nonzeros = {},
std::shared_ptr<strategy_type> strategy = std::make_shared<automatic>(),
)#

Creates an uninitialized Hybrid matrix of the specified size with the specified strategy, using the given number of stored elements per row, stride and number of nonzeros.

Parameters:
  • exec – Executor associated with the matrix

  • size – size of the matrix

  • num_stored_elements_per_row – the number of stored elements per row

  • stride – stride of the rows

  • num_nonzeros – number of nonzeros

  • strategy – strategy of deciding the Hybrid config

Returns:

A smart pointer to the newly created matrix.

class strategy_type#

strategy_type decides how to set the hybrid config. It computes the number of stored elements per row of the ell part and then sets the number of residual nonzeros as the number of nonzeros of the coo part.

A concrete strategy should inherit from strategy_type and implement its compute_ell_num_stored_elements_per_row function.

Subclassed by column_limit, imbalance_limit, imbalance_bounded_limit, minimal_storage_limit, automatic

Public Functions

inline strategy_type()#

Creates a strategy_type.

inline void compute_hybrid_config(
const array<size_type> &row_nnz,
size_type *ell_num_stored_elements_per_row,
size_type *coo_nnz,
)#

Computes the config of the Hybrid matrix (ell_num_stored_elements_per_row and coo_nnz). For now, it copies row_nnz to the reference executor and performs all operations on the reference executor.

Parameters:
  • row_nnz – the number of nonzeros of each row

  • ell_num_stored_elements_per_row – the output number of stored elements per row of the ell part

  • coo_nnz – the output number of nonzeros of the coo part
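The division of labour between compute_hybrid_config and the virtual compute_ell_num_stored_elements_per_row can be sketched without the library. In this minimal example `strategy_sketch`, `compute_ell_width` and `fixed_width` are hypothetical stand-ins, and `std::vector` replaces `array<size_type>`. The residual count follows the description above: every nonzero that does not fit into the ELL width is counted for the COO part.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the strategy interface; not the Ginkgo class.
class strategy_sketch {
public:
    virtual ~strategy_sketch() = default;

    // Asks the subclass for the ELL width, then counts the nonzeros that do
    // not fit into that width as the COO part.
    void compute_hybrid_config(const std::vector<std::size_t>& row_nnz,
                               std::size_t* ell_width, std::size_t* coo_nnz)
    {
        assert(ell_width != nullptr && coo_nnz != nullptr);
        std::vector<std::size_t> scratch(row_nnz);  // subclasses may reorder it
        *ell_width = compute_ell_width(&scratch);
        *coo_nnz = 0;
        for (auto nnz : row_nnz) {
            if (nnz > *ell_width) {
                *coo_nnz += nnz - *ell_width;
            }
        }
    }

    // The only function a concrete strategy has to provide.
    virtual std::size_t compute_ell_width(
        std::vector<std::size_t>* row_nnz) const = 0;
};

// Simplest possible subclass: a fixed width chosen by the caller.
class fixed_width : public strategy_sketch {
public:
    explicit fixed_width(std::size_t width) : width_{width} {}

    std::size_t compute_ell_width(std::vector<std::size_t>*) const override
    {
        return width_;
    }

private:
    std::size_t width_;
};
```

For row counts 1, 2, 5 and 3 with a fixed width of 2, the rows with 5 and 3 nonzeros overflow by 3 and 1 entries, so the COO part would hold 4 nonzeros.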

inline size_type get_ell_num_stored_elements_per_row() const noexcept#

Returns the number of stored elements per row of the ell part.

Returns:

the number of stored elements per row of the ell part

inline size_type get_coo_nnz() const noexcept#

Returns the number of nonzeros of the coo part.

Returns:

the number of nonzeros of the coo part

virtual size_type compute_ell_num_stored_elements_per_row(
array<size_type> *row_nnz,
) const = 0#

Computes the number of stored elements per row of the ell part.

Parameters:

row_nnz – the number of nonzeros of each row

Returns:

the number of stored elements per row of the ell part

class column_limit #

Inherits from

  • public strategy_type

column_limit is a strategy_type which decides the number of stored elements per row of the ell part by specifying the number of columns.

Public Functions

inline explicit column_limit(size_type num_column = 0)#

Creates a column_limit strategy.

Parameters:

num_column – the specified number of columns of the ell part

inline virtual size_type compute_ell_num_stored_elements_per_row(
array<size_type> *row_nnz,
) const override#

Computes the number of stored elements per row of the ell part.

Parameters:

row_nnz – the number of nonzeros of each row

Returns:

the number of stored elements per row of the ell part

inline auto get_num_columns() const#

Returns the column limit of the ell part.

Returns:

the column limit

class imbalance_limit #

Inherits from

  • public strategy_type

imbalance_limit is a strategy_type which decides the number of stored elements per row of the ell part according to a percentage. It sorts the number of nonzeros of each row and takes the value at position floor(percent * num_rows) as the number of stored elements per row of the ell part. Thus, at least a fraction percent of all rows is stored entirely in the ell part.

Public Functions

inline explicit imbalance_limit(double percent = 0.8)#

Creates an imbalance_limit strategy.

Parameters:

percent – the fraction of rows used to pick the ell width: after sorting row_nnz in ascending order, row_nnz[floor(num_rows * percent)] is the number of stored elements per row of the ell part

inline virtual size_type compute_ell_num_stored_elements_per_row(
array<size_type> *row_nnz,
) const override#

Computes the number of stored elements per row of the ell part.

Parameters:

row_nnz – the number of nonzeros of each row

Returns:

the number of stored elements per row of the ell part
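The percentile rule of imbalance_limit can be sketched in a few lines of standard C++. `percentile_ell_width` is a hypothetical free function. The ascending sort order and the clamping for percent >= 1 are assumptions of this sketch, not documented behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the ELL width chosen by a percentile rule: sort the per-row
// nonzero counts in ascending order and take the entry at position
// floor(percent * num_rows). Takes row_nnz by value because it sorts it.
std::size_t percentile_ell_width(std::vector<std::size_t> row_nnz,
                                 double percent)
{
    assert(percent >= 0.0);
    if (row_nnz.empty()) {
        return 0;
    }
    std::sort(row_nnz.begin(), row_nnz.end());
    const auto pos = static_cast<std::size_t>(percent * row_nnz.size());
    // Clamp so that percent >= 1 selects the longest row instead of
    // reading past the end of the array.
    return row_nnz[std::min(pos, row_nnz.size() - 1)];
}
```

Because every row at or before the selected position has no more nonzeros than the returned width, those rows fit entirely into the ELL part, and only the longer rows contribute to the COO part.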

inline auto get_percentage() const#

Get the percent setting

Returns:

percent

class imbalance_bounded_limit #

Inherits from

  • public strategy_type

imbalance_bounded_limit is a strategy_type which decides the number of stored elements per row of the ell part. It applies imbalance_limit and then caps the number of ell columns at an upper bound that depends on the number of rows, controlled by the ratio parameter.

Public Functions

inline imbalance_bounded_limit(
double percent = 0.8,
double ratio = 0.0001,
)#

Creates an imbalance_bounded_limit strategy.

inline virtual size_type compute_ell_num_stored_elements_per_row(
array<size_type> *row_nnz,
) const override#

Computes the number of stored elements per row of the ell part.

Parameters:

row_nnz – the number of nonzeros of each row

Returns:

the number of stored elements per row of the ell part
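One way to read the bounded variant is as the percentile rule followed by a cap. The sketch below assumes the cap is `floor(ratio * num_rows)`. The text here does not spell out that formula, so it is a guess for illustration only. `bounded_ell_width` is a hypothetical free function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Percentile rule capped at floor(ratio * num_rows). The cap formula is an
// assumption of this sketch. Takes row_nnz by value because it sorts it.
std::size_t bounded_ell_width(std::vector<std::size_t> row_nnz,
                              double percent, double ratio)
{
    assert(percent >= 0.0 && ratio >= 0.0);
    if (row_nnz.empty()) {
        return 0;
    }
    const auto num_rows = row_nnz.size();
    std::sort(row_nnz.begin(), row_nnz.end());
    // Position selected by the percentile rule, clamped to the last row.
    const auto pos = std::min(static_cast<std::size_t>(percent * num_rows),
                              num_rows - 1);
    // Upper bound on the ELL width, proportional to the number of rows.
    const auto cap = static_cast<std::size_t>(ratio * num_rows);
    return std::min(row_nnz[pos], cap);
}
```

Under this reading, a small ratio combined with a small matrix pushes the cap to zero, which would place every nonzero in the COO part.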

inline auto get_percentage() const#

Get the percent setting

Returns:

percent

inline auto get_ratio() const#

Get the ratio setting

Returns:

ratio

class minimal_storage_limit #

Inherits from

  • public strategy_type

minimal_storage_limit is a strategy_type which decides the number of stored elements per row of the ell part. The choice depends on the sizes of ValueType and IndexType and aims at the partition that needs the least storage among all possible partitions.

Public Functions

inline minimal_storage_limit()#

Creates a minimal_storage_limit strategy.

inline virtual size_type compute_ell_num_stored_elements_per_row(
array<size_type> *row_nnz,
) const override#

Computes the number of stored elements per row of the ell part.

Parameters:

row_nnz – the number of nonzeros of each row

Returns:

the number of stored elements per row of the ell part
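Why the sizes of ValueType and IndexType matter can be seen from a brute-force sketch that evaluates the storage of every candidate ELL width. It assumes an ELL slot costs one value plus one column index (padding included) and a COO entry costs one value plus a row and a column index. It ignores stride padding and any bookkeeping data. `storage_bytes` and `minimal_storage_width` are hypothetical helpers, and this exhaustive search is not claimed to be how the library computes the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes needed when the ELL part keeps `width` slots per row and all
// remaining nonzeros go to the COO part.
std::size_t storage_bytes(const std::vector<std::size_t>& row_nnz,
                          std::size_t width, std::size_t value_bytes,
                          std::size_t index_bytes)
{
    std::size_t coo_nnz = 0;
    for (auto nnz : row_nnz) {
        coo_nnz += nnz > width ? nnz - width : 0;
    }
    return row_nnz.size() * width * (value_bytes + index_bytes) +
           coo_nnz * (value_bytes + 2 * index_bytes);
}

// Tries every width from 0 to the longest row and returns the smallest
// width that attains the minimal storage.
std::size_t minimal_storage_width(const std::vector<std::size_t>& row_nnz,
                                  std::size_t value_bytes,
                                  std::size_t index_bytes)
{
    assert(value_bytes > 0 && index_bytes > 0);
    const std::size_t max_nnz =
        row_nnz.empty() ? 0
                        : *std::max_element(row_nnz.begin(), row_nnz.end());
    std::size_t best_width = 0;
    std::size_t best_bytes =
        storage_bytes(row_nnz, 0, value_bytes, index_bytes);
    for (std::size_t width = 1; width <= max_nnz; ++width) {
        const auto bytes =
            storage_bytes(row_nnz, width, value_bytes, index_bytes);
        if (bytes < best_bytes) {
            best_bytes = bytes;
            best_width = width;
        }
    }
    return best_width;
}
```

Widening the ELL part by one column costs one slot in every row but saves one COO entry only in rows that are still overflowing, so the break-even point shifts with the ratio of the two per-entry costs.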

inline auto get_percentage() const#

Get the percent setting

Returns:

percent

class automatic #

Inherits from

  • public strategy_type

automatic is a strategy_type which decides the number of stored elements per row of the ell part automatically.

Public Functions

inline automatic()#

Creates an automatic strategy.

inline virtual size_type compute_ell_num_stored_elements_per_row(
array<size_type> *row_nnz,
) const override#

Computes the number of stored elements per row of the ell part.

Parameters:

row_nnz – the number of nonzeros of each row

Returns:

the number of stored elements per row of the ell part