ILU family (ILU, ParILU, ParILUT)#

The Incomplete LU family preconditions a general non-symmetric system \(A\) by computing an approximation \(L U \approx A\) and using its action as the preconditioner — at apply time, the preconditioner solves \(L U z = r\) by chaining a forward triangular solve with \(L\) and a backward triangular solve with \(U\).

Three pieces work together:

The matching preconditioner LinOp, gko::preconditioner::Ilu, wraps the factorisation result and configures the two triangular-solve LinOps used at apply time. Default solvers are direct LowerTrs / UpperTrs; Krylov inner solvers can be substituted for cases where the triangular solves are the bottleneck.

When to use ILU#

  • Non-symmetric matrices where Jacobi is too weak but a full LU factorisation would not fit in memory. ILU’s storage is bounded by the sparsity pattern of \(A\) (for ILU(0)) or by a tunable budget (for ParILUT).

  • As the preconditioner inside BiCGSTAB, GMRES, or IDR(s) — ILU does not preserve symmetry, so non-symmetric Krylov methods are the natural pairing.

  • Pair with a fill-reducing reordering (AMD, nested dissection) and/or an MC64 matching pre-step on indefinite systems to improve diagonal dominance before factorisation.

For SPD matrices, use the IC family instead — it preserves symmetry and lets you pair with CG.

Picking a variant#

Variant

Algorithm

Best when

factorization::Ilu

Sequential ILU(0) on CPU

Small to moderate matrices; the canonical correctness baseline.

factorization::ParIlu

Asynchronous fixed-point ILU(0)

Larger matrices on GPU or many-core CPU; same sparsity pattern as Ilu.

factorization::ParIlut

Threshold-based ILU with sparsity-pattern refinement

When ILU(0) is too sparse — ParILUT improves the pattern at the cost of more storage and longer generation.

Construction#

// Standard ILU(0) preconditioner inside BiCGSTAB
auto bicgstab = gko::solver::Bicgstab<double>::build()
                    .with_criteria(...)
                    .with_preconditioner(
                        gko::preconditioner::Ilu<>::build()
                            .with_factorization(
                                gko::factorization::ParIlu<double, gko::int32>
                                    ::build().on(exec))
                            .on(exec))
                    .on(exec);

// With iterative inner solves for the triangular systems
auto preconditioner =
    gko::preconditioner::Ilu<gko::solver::Bicgstab<double>,
                             gko::solver::Bicgstab<double>>::build()
        .with_factorization(
            gko::factorization::ParIlu<double, gko::int32>::build().on(exec))
        .with_l_solver(
            gko::solver::Bicgstab<double>::build()
                .with_criteria(
                    gko::stop::Iteration::build()
                        .with_max_iters(3u).on(exec))
                .on(exec))
        .with_u_solver(
            gko::solver::Bicgstab<double>::build()
                .with_criteria(
                    gko::stop::Iteration::build()
                        .with_max_iters(3u).on(exec))
                .on(exec))
        .on(exec);

The first template parameter of preconditioner::Ilu is the \(L\) solver type, the second is the \(U\) solver type; both default to solver::LowerTrs / solver::UpperTrs.

Factory parameters#

Parameter

Type

Purpose

factorization

LinOpFactory

The factorisation that produces \(L\) and \(U\). Typically Ilu, ParIlu, or ParIlut. Required.

l_solver

LinOpFactory

Factory for the lower-triangular solver. Defaults to a direct lower-triangular solve.

u_solver

LinOpFactory

Factory for the upper-triangular solver. Defaults to a direct upper-triangular solve.

Implementation notes#

ILU does not preserve symmetry of the system matrix, so the preconditioned operator is generally non-symmetric even when \(A\) is — use non-symmetric Krylov methods (BiCGSTAB, GMRES, IDR(s)) with ILU rather than CG.

ParILU and ParILUT generation are asynchronous fixed-point iterations of Chow & Patel; the number of sweeps needed for a usable preconditioner depends on the parallelism level (typically 3 on OpenMP, 10+ on GPU). ParILUT additionally alternates add-fill-in / fixed-point iterate / drop-smallest steps to refine the sparsity pattern.

See also

References: