`gko::factorization::ParIct`

Parallel threshold-based incomplete Cholesky factorization. It refines the sparsity pattern of \(L\) and its numerical entries simultaneously: each sweep adds fill-in candidates from the current residual, re-runs a fixed-point iteration, and drops the smallest entries by magnitude. The algorithm belongs to the ParILUT family of Anzt et al.

template<typename ValueType = default_precision, typename IndexType = int32> class ParIct

Inherits from

public gko::Composition<ValueType>

ParICT is an incomplete threshold-based Cholesky factorization which is computed in parallel.

\(L\) is a lower triangular matrix which approximates a given symmetric positive definite matrix \(A\) with \(A \approx LL^T\). Here, \(L\) has a sparsity pattern that is improved iteratively based on its element-wise magnitude. The initial sparsity pattern is chosen based on the lower triangle of \(A\).

One iteration of the ParICT algorithm consists of the following steps:

1. Calculate the residual \(R = A - LL^T\).
2. Add new non-zero locations from \(R\) to \(L\). The new non-zero locations are initialized from the corresponding residual entries.
3. Execute a fixed-point iteration on \(L\) according to

\[\begin{split} F(L)_{ij} = \begin{cases} \frac{1}{l_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} l_{ik}\, l_{jk} \right), & i \neq j, \\ \sqrt{ a_{ij} - \sum_{k=1}^{j-1} l_{ik}\, l_{jk} }, & i = j. \end{cases} \end{split}\]
4. Remove the smallest entries (by magnitude) from \(L\).
5. Execute a fixed-point iteration on the (now sparser) \(L\).

This ParICT algorithm thus improves the sparsity pattern and the approximation of \(L\) simultaneously.
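To make the residual computation and the fixed-point map \(F(L)\) concrete, here is a minimal sequential sketch on small dense matrices. It is an illustration only, not Ginkgo's implementation: the library works on sparse storage and runs the iteration in parallel. The names `Dense`, `Pattern`, `residual` and `fixed_point_sweep` are hypothetical and do not exist in Ginkgo.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense helper types, used only for this illustration.
using Dense = std::vector<std::vector<double>>;
using Pattern = std::vector<std::vector<bool>>;

// Residual R = A - L L^T for a lower triangular L.
Dense residual(const Dense& a, const Dense& l)
{
    const std::size_t n = a.size();
    Dense r(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            // l[i][k] is zero for k > i, so k only runs up to min(i, j).
            for (std::size_t k = 0; k <= std::min(i, j); ++k) {
                sum += l[i][k] * l[j][k];
            }
            r[i][j] = a[i][j] - sum;
        }
    }
    return r;
}

// One in-place, row-by-row sweep of the fixed-point map F over the
// lower-triangular entries selected by `pattern`. Entries outside the
// pattern are left untouched (they are assumed to be zero).
void fixed_point_sweep(const Dense& a, const Pattern& pattern, Dense& l)
{
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            if (!pattern[i][j]) {
                continue;
            }
            double sum = 0.0;
            for (std::size_t k = 0; k < j; ++k) {
                sum += l[i][k] * l[j][k];
            }
            const double diff = a[i][j] - sum;
            if (i == j) {
                assert(diff > 0.0);  // sketch assumes no breakdown
                l[i][j] = std::sqrt(diff);
            } else {
                assert(l[j][j] != 0.0);  // diagonal must be in the pattern
                l[i][j] = diff / l[j][j];
            }
        }
    }
}
```

With a full lower-triangular pattern and this row-by-row order, a single sweep already yields the exact Cholesky factor: for \(A = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix}\) it gives \(l_{11} = 2\), \(l_{21} = 1\), \(l_{22} = 2\), and the residual is zero. With a sparser pattern the sweeps only approximate \(A \approx LL^T\).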

References

Anzt, H., Ribizel, T., Flegar, G., Chow, E., Dongarra, J. ParILUT - A Parallel Threshold ILU for GPUs. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 231–241. https://doi.org/10.1109/IPDPS.2019.00033

Template Parameters:

ValueType – Type of the values of all matrices used in this class
IndexType – Type of the indices of all matrices used in this class

Public Static Functions

static parameters_type parse(const config::pnode &config, const config::registry &context, const config::type_descriptor &td_for_child = config::make_type_descriptor<ValueType, IndexType>())

Create the parameters from the property_tree. Because this is directly tied to the specific type, the value/index type settings within config are ignored and type_descriptor is only used for children configs.

Parameters:

config – the property tree that holds the settings
context – the registry
td_for_child – the type descriptor for children configs. The default uses the value/index type of this class.

Returns:

parameters

struct parameters_type

Public Members

size_type iterations

The total number of ParICT iterations that will be executed. The default value is 5.

bool skip_sorting

true means the matrix given to this factory is known to be sorted first by row, then by column index. false means this is unknown or the matrix is not sorted, so an additional sorting step is performed during the factorization (the matrix given is not changed).

The system_matrix given to this factory must be sorted (first by row, then by column) for the algorithm to work. If the matrix is known to be sorted, this parameter can be set to true to skip the sorting step and shorten the runtime. If it is unknown, or if the matrix is known to be unsorted, the parameter must remain false; otherwise the factorization might be incorrect.
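When it is unclear whether skip_sorting can safely be set to true, the sortedness requirement can be checked up front. The following is a minimal sketch on raw CSR arrays, assuming 0-based row pointers and column indices; the function name `rows_sorted_by_column` is hypothetical and not part of Ginkgo.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if, within every row of a CSR matrix, the column indices
// appear in non-decreasing order. CSR already stores rows in order, so
// this is the "first by row, then by column" condition.
bool rows_sorted_by_column(const std::vector<int>& row_ptrs,
                           const std::vector<int>& col_idxs)
{
    assert(!row_ptrs.empty());
    for (std::size_t row = 0; row + 1 < row_ptrs.size(); ++row) {
        const auto begin = col_idxs.begin() + row_ptrs[row];
        const auto end = col_idxs.begin() + row_ptrs[row + 1];
        if (!std::is_sorted(begin, end)) {
            return false;
        }
    }
    return true;
}
```

For a 2x2 matrix with `row_ptrs = {0, 2, 4}`, the column indices `{0, 1, 0, 1}` pass the check, while `{1, 0, 0, 1}` fail it because the first row is out of order. Only in the first case would setting skip_sorting to true be safe.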

bool approximate_select

true means the candidate selection will use an inexact selection algorithm. false means an exact selection algorithm will be used.

Using the approximate selection algorithm can give a significant speed-up, but may in the worst case cause the algorithm to vastly exceed its fill_in_limit. The exact selection needs more time, but more closely fulfills the fill_in_limit except for pathological cases (many candidates with equal magnitude).

The default behavior is to use approximate selection.

bool deterministic_sample

true means the sample used for the selection algorithm will be chosen deterministically. This is only relevant when using approximate_select. It is mostly used for testing.

The selection algorithm used for approximate_select uses a small sample of the input data to determine an approximate threshold. The choice of elements can either be randomized, i.e., we may use different elements during each execution, or deterministic, i.e., the element choices are always the same.

Note that even though the threshold selection step may be made deterministic this way, the calculation of the IC factors can still be non-deterministic due to its asynchronous iterations.

The default behavior is to use a random sample.
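The difference between a deterministic and a random sample can be sketched as follows. This is a deliberately simplified stand-in for sample-based selection, not the algorithm Ginkgo uses: it estimates the magnitude of the `keep`-th largest entry from a small sample. The function name `approximate_threshold`, its parameters, and the evenly spaced deterministic sample are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Estimates the magnitude of the keep-th largest entry of `values`
// from a sample of at most `sample_size` entries.
double approximate_threshold(const std::vector<double>& values,
                             std::size_t keep, std::size_t sample_size,
                             bool deterministic_sample)
{
    const std::size_t n = values.size();
    assert(n > 0 && sample_size > 0 && keep > 0 && keep <= n);
    sample_size = std::min(sample_size, n);

    std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<std::size_t> dist(0, n - 1);

    std::vector<double> sample(sample_size);
    for (std::size_t s = 0; s < sample_size; ++s) {
        // Deterministic: evenly spaced positions, identical on every call.
        // Random: positions may differ between executions.
        const std::size_t idx =
            deterministic_sample ? s * n / sample_size : dist(rng);
        sample[s] = std::abs(values[idx]);
    }

    // Map the rank in the full data to the corresponding rank in the sample.
    const std::size_t rank = (keep - 1) * sample_size / n;
    std::nth_element(sample.begin(), sample.begin() + rank, sample.end(),
                     std::greater<double>());
    return sample[rank];
}
```

For the values 8, 7, ..., 1 with `keep = 4`, a deterministic sample of size 4 picks 8, 6, 4, 2 and returns 6, whereas the exact fourth-largest magnitude is 5. Keeping only entries of magnitude at least 6 retains three entries instead of four, which shows how a sampled threshold can miss the target count.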

double fill_in_limit

The amount of fill-in that is allowed in L compared to the lower triangle of A.

The threshold for removing candidates from the intermediate L is set such that the resulting sparsity pattern has at most fill_in_limit times the number of non-zeros of the lower triangle of A.

The default value 2.0 allows twice the number of non-zeros in L compared to the lower triangle of A.
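As a rough illustration of how a fill-in limit translates into a drop threshold, the sketch below selects the threshold exactly with `std::nth_element`. It assumes all candidate values of the intermediate L are available in one array and ignores details a real implementation has to handle, such as always keeping the diagonal. The function name `select_threshold` and its signature are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Returns a magnitude threshold such that keeping all entries with
// |value| >= threshold retains about fill_in_limit * nnz_lower_a entries.
// The vector is taken by value because nth_element reorders it.
double select_threshold(std::vector<double> values, std::size_t nnz_lower_a,
                        double fill_in_limit)
{
    assert(fill_in_limit >= 0.0);
    const auto keep = static_cast<std::size_t>(fill_in_limit * nnz_lower_a);
    if (keep >= values.size()) {
        return 0.0;  // within the limit already, nothing is dropped
    }
    assert(keep > 0);
    for (auto& v : values) {
        v = std::abs(v);
    }
    // Move the keep-th largest magnitude to position keep - 1.
    std::nth_element(values.begin(), values.begin() + (keep - 1),
                     values.end(), std::greater<double>());
    return values[keep - 1];
}
```

With the candidates 0.1, -5, 3, -0.2, 1, a lower triangle of A holding 2 non-zeros, and a limit of 1.5, three entries may stay, so the threshold is 1 and the entries -5, 3 and 1 are kept. If many candidates share the threshold magnitude (for example four entries equal to 1 with room for only two), all of them pass the test and the limit is exceeded.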

std::shared_ptr<typename matrix_type::strategy_type> l_strategy

Strategy which will be used by the L matrix. The default value nullptr will result in the strategy classical.

std::shared_ptr<typename matrix_type::strategy_type> lt_strategy

Strategy which will be used by the L^T matrix. The default value nullptr will result in the strategy classical.
