gko::factorization::ParIct
Parallel threshold-based incomplete Cholesky. Refines both the sparsity pattern of \(L\) and its numerical entries simultaneously: at each sweep it adds fill-in candidates from the current residual, re-runs a fixed-point iteration, and drops the smallest entries by magnitude. Algorithm of Anzt et al. (ParILUT family).
-
template<typename ValueType = default_precision, typename IndexType = int32>
class ParIct
Inherits from public gko::Composition<ValueType>
ParICT is an incomplete threshold-based Cholesky factorization which is computed in parallel.
\(L\) is a lower triangular matrix which approximates a given symmetric positive definite matrix \(A\) with \(A \approx LL^T\). Here, \(L\) has a sparsity pattern that is improved iteratively based on its element-wise magnitude. The initial sparsity pattern is chosen based on the lower triangle of \(A\).
One iteration of the ParICT algorithm consists of the following steps:
1. Calculate the residual \(R = A - LL^T\).
2. Add new non-zero locations from \(R\) to \(L\). The new non-zero locations are initialised from the corresponding residual entries.
3. Execute a fixed-point iteration on \(L\) according to
\[\begin{split} F(L)_{ij} = \begin{cases} \frac{1}{l_{jj}} \left( a_{ij} - \sum_{k=1}^{j-1} l_{ik}\, l_{jk} \right), & i \neq j, \\ \sqrt{ a_{ij} - \sum_{k=1}^{j-1} l_{ik}\, l_{jk} }, & i = j. \end{cases} \end{split}\]
4. Remove the smallest entries (by magnitude) from \(L\).
5. Execute a fixed-point iteration on the (now sparser) \(L\).
This ParICT algorithm thus improves the sparsity pattern and the approximation of \(L\) simultaneously.
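The fixed-point map \(F(L)\) from the steps above can be illustrated with a small dense sketch. This is not Ginkgo's kernel: it assumes dense storage, a boolean sparsity mask, and a sequential in-place sweep in row-major order, whereas the library works on sparse data and updates entries asynchronously in parallel. The names `Dense`, `Pattern`, `fixed_point_sweep` and `sweep_example` are hypothetical, and keeping the old value when an update is not finite is an assumption made here for robustness.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Dense = std::vector<std::vector<double>>;
using Pattern = std::vector<std::vector<bool>>;

// One in-place sweep of F(L) over the lower-triangular entries selected by
// `pattern`, visiting entries row by row. Entries outside the pattern are
// never touched and therefore keep their current value (normally zero).
void fixed_point_sweep(const Dense& a, const Pattern& pattern, Dense& l)
{
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            if (!pattern[i][j]) {
                continue;
            }
            // s = a_ij - sum_{k < j} l_ik * l_jk
            double s = a[i][j];
            for (std::size_t k = 0; k < j; ++k) {
                s -= l[i][k] * l[j][k];
            }
            const double updated = (i == j) ? std::sqrt(s) : s / l[j][j];
            // Assumption of this sketch: skip updates that are not finite.
            if (std::isfinite(updated)) {
                l[i][j] = updated;
            }
        }
    }
}

// Hypothetical example: start from the lower triangle of a 3x3 SPD matrix
// and apply a single sweep on the full lower-triangular pattern.
Dense sweep_example()
{
    const Dense a{{4.0, 2.0, 2.0}, {2.0, 5.0, 3.0}, {2.0, 3.0, 6.0}};
    const Pattern full(3, std::vector<bool>(3, true));
    Dense l{{4.0, 0.0, 0.0}, {2.0, 5.0, 0.0}, {2.0, 3.0, 6.0}};
    fixed_point_sweep(a, full, l);
    return l;
}
```

With the full lower-triangular pattern and this sequential ordering, a single sweep already reproduces the exact Cholesky factor of the example matrix. With a restricted pattern or asynchronous updates, several sweeps are generally needed and the result is only an approximation.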
- References
Anzt, H., Ribizel, T., Flegar, G., Chow, E., Dongarra, J. ParILUT - A Parallel Threshold ILU for GPUs. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 231–241. https://doi.org/10.1109/IPDPS.2019.00033
- Template Parameters:
ValueType – Type of the values of all matrices used in this class
IndexType – Type of the indices of all matrices used in this class
Public Static Functions
- static parameters_type parse(
- const config::pnode &config,
- const config::registry &context,
- const config::type_descriptor &td_for_child = config::make_type_descriptor<ValueType, IndexType>(),
- )
Create the parameters from the property_tree. Because this is directly tied to the specific type, the value/index type settings within config are ignored and type_descriptor is only used for children configs.
- Parameters:
config – the property tree for setting
context – the registry
td_for_child – the type descriptor for children configs. The default uses the value/index type of this class.
- Returns:
parameters
-
struct parameters_type
Public Members
-
size_type iterations
The number of total iterations of ParICT that will be executed. The default value is 5.
-
bool skip_sorting
true means it is known that the matrix given to this factory will be sorted first by row, then by column index; false means it is unknown or not sorted, so an additional sorting step will be performed during the factorization (it will not change the matrix given). The matrix must be sorted for this factorization to work.
The system_matrix, which will be given to this factory, must be sorted (first by row, then by column) in order for the algorithm to work. If it is known that the matrix will be sorted, this parameter can be set to true to skip the sorting (therefore, shortening the runtime). However, if it is unknown or if the matrix is known to be not sorted, it must remain false; otherwise, the factorization might be incorrect.
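Before setting skip_sorting to true, it can help to verify the ordering explicitly. The sketch below assumes the matrix is available as plain CSR arrays (`row_ptrs`, `col_idxs`); the function name `is_sorted_by_column` is hypothetical and not part of the documented interface. Since CSR stores rows in order by construction, "sorted first by row, then by column" reduces to ascending column indices within each row.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns true if the column indices of every CSR row are in ascending
// order. `row_ptrs` has one entry per row plus a final end offset.
bool is_sorted_by_column(const std::vector<int>& row_ptrs,
                         const std::vector<int>& col_idxs)
{
    for (std::size_t row = 0; row + 1 < row_ptrs.size(); ++row) {
        const auto begin = col_idxs.begin() + row_ptrs[row];
        const auto end = col_idxs.begin() + row_ptrs[row + 1];
        if (!std::is_sorted(begin, end)) {
            return false;
        }
    }
    return true;
}
```

If such a check fails, or cannot be afforded, leaving skip_sorting at false is the safe choice.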
-
bool approximate_select
true means the candidate selection will use an inexact selection algorithm. false means an exact selection algorithm will be used.
Using the approximate selection algorithm can give a significant speed-up, but may in the worst case cause the algorithm to vastly exceed its fill_in_limit. The exact selection needs more time, but more closely fulfills the fill_in_limit except for pathological cases (many candidates with equal magnitude).
The default behavior is to use approximate selection.
-
bool deterministic_sample
true means the sample used for the selection algorithm will be chosen deterministically. This is only relevant when using approximate_select. It is mostly used for testing.
The selection algorithm used for approximate_select uses a small sample of the input data to determine an approximate threshold. The choice of elements can either be randomized, i.e., we may use different elements during each execution, or deterministic, i.e., the element choices are always the same.
Note that even though the threshold selection step may be made deterministic this way, the calculation of the IC factors can still be non-deterministic due to its asynchronous iterations.
The default behavior is to use a random sample.
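The idea of estimating a threshold from a small, deterministically chosen sample can be sketched as follows. This is an illustration only, not the library's selection kernel: the evenly spaced sampling scheme, the proportional rank mapping, and the name `approximate_threshold` are assumptions of this sketch. It expects a non-empty input and `0 < sample_size <= values.size()`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Estimates the magnitude below which roughly `drop_count` of the entries
// fall, looking only at `sample_size` evenly spaced entries. The same
// entries are picked on every call, so the result is reproducible.
double approximate_threshold(const std::vector<double>& values,
                             std::size_t drop_count, std::size_t sample_size)
{
    const std::size_t n = values.size();
    std::vector<double> sample;
    sample.reserve(sample_size);
    for (std::size_t s = 0; s < sample_size; ++s) {
        sample.push_back(std::abs(values[s * n / sample_size]));
    }
    std::sort(sample.begin(), sample.end());
    // Map the requested rank in the full data onto the sample.
    const std::size_t sample_rank =
        std::min(drop_count * sample_size / n, sample_size - 1);
    return sample[sample_rank];
}
```

Because only the sample is inspected, the number of entries that end up below the returned threshold can differ from `drop_count`, which mirrors the caveat that approximate selection may exceed the fill_in_limit.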
-
double fill_in_limit
The amount of fill-in that is allowed in L compared to the lower triangle of A.
The threshold for removing candidates from the intermediate L is set such that the resulting sparsity pattern has at most fill_in_limit times the number of non-zeros of the lower triangle of A.
The default value 2.0 allows twice the number of non-zeros in L compared to the lower triangle of A.
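How a fill_in_limit translates into a drop threshold can be sketched with an exact selection on a flat list of candidate values. The helper names `nnz_limit` and `exact_threshold` are hypothetical, and the sketch ignores details a real implementation would need, such as always retaining the diagonal entries.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Maximum number of entries L may keep for a given fill_in_limit.
std::size_t nnz_limit(std::size_t lower_nnz, double fill_in_limit)
{
    return static_cast<std::size_t>(fill_in_limit *
                                    static_cast<double>(lower_nnz));
}

// Exact selection: returns the smallest magnitude that survives when only
// the `keep` largest entries are retained. Entries whose magnitude is at
// least the returned value are kept, so ties at the threshold can push the
// final count above `keep`.
double exact_threshold(std::vector<double> values, std::size_t keep)
{
    if (keep == 0) {
        return std::numeric_limits<double>::infinity();  // drop everything
    }
    if (keep >= values.size()) {
        return 0.0;  // nothing needs to be dropped
    }
    for (auto& v : values) {
        v = std::abs(v);
    }
    // Partially order by descending magnitude around position keep - 1.
    std::nth_element(values.begin(), values.begin() + (keep - 1),
                     values.end(), std::greater<double>());
    return values[keep - 1];
}
```

For example, a lower triangle with 3 non-zeros and fill_in_limit 2.0 gives a limit of 6 entries; the threshold is then the sixth-largest candidate magnitude.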
-
std::shared_ptr<typename matrix_type::strategy_type> l_strategy
Strategy which will be used by the L matrix. The default value nullptr will result in the strategy classical.
-
std::shared_ptr<typename matrix_type::strategy_type> lt_strategy
Strategy which will be used by the L^T matrix. The default value nullptr will result in the strategy classical.