gko::experimental::distributed::Vector#

Distributed dense vector. Each rank owns the rows of the global vector that match its partition; per-rank storage is a regular matrix::Dense, accessible through get_local_vector(). Inner products, norms, and other reductions perform a single MPI Allreduce across the communicator.

template<typename ValueType = double>
class Vector #

Inherits from

  • public gko::EnableLinOp<Vector<double>>

  • public ConvertibleTo<Vector<next_precision<double>>>

  • public ConvertibleTo<Vector<next_precision<double, 2>>>

  • public ConvertibleTo<Vector<next_precision<double, 3>>>

  • public gko::EnableAbsoluteComputation<remove_complex<Vector<double>>>

  • public gko::experimental::distributed::DistributedBase

Vector is a format which explicitly stores (multiple) distributed column vectors in a dense storage format.

The (multi-)vector is distributed by row, which is described by a Partition. The local vectors are stored using the Dense format. The vector should be filled using the read_distributed method, e.g.

auto part = Partition<...>::build_from_mapping(...);
auto vector = Vector<...>::create(exec, comm);
vector->read_distributed(matrix_data, part);

Using this approach, the size of the global vectors, as well as the size of the local vectors, will be automatically inferred. Alternatively, it is possible to create a vector with specified global and local sizes and fill the local vectors using the accessor get_local_vector.

Note

Operations between two vectors (axpy, dot product, etc.) are only valid if both vectors were created using the same partition.

Template Parameters:

ValueType – The precision of vector elements.

Public Functions

void read_distributed(
const device_matrix_data<ValueType, int64> &data,
ptr_param<const Partition<int64, int64>> partition,
)#

Reads a vector from the device_matrix_data structure and a global row partition.

The number of rows of the matrix data is ignored, only its number of columns is relevant. Both the number of local and global rows are inferred from the row partition.

Note

The matrix data can contain entries for rows other than those owned by the process. Entries for those rows are discarded.

Parameters:
  • data – The device_matrix_data structure

  • partition – The global row partition
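The row selection described above can be modelled in plain C++ without any library. This is only a conceptual sketch, not Ginkgo code: Entry, OwnedBlock, read_local and the mapping vector (global row to owning rank) are hypothetical stand-ins for matrix_data and Partition, and the sketch assumes there are no duplicate entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one entry of the input data.
struct Entry {
    std::int64_t row;  // global row index
    std::int64_t col;
    double value;
};

// Dense, row-major block holding the rows owned by one rank.
struct OwnedBlock {
    std::int64_t rows;
    std::int64_t cols;
    std::vector<double> values;  // rows * cols entries
};

// mapping[g] is the rank that owns global row g (a simplified partition).
OwnedBlock read_local(const std::vector<Entry>& data, std::int64_t num_cols,
                      const std::vector<int>& mapping, int rank)
{
    // Local index of every owned global row, -1 for rows owned elsewhere.
    std::vector<std::int64_t> local_index(mapping.size(), -1);
    std::int64_t local_rows = 0;
    for (std::size_t g = 0; g < mapping.size(); ++g) {
        if (mapping[g] == rank) {
            local_index[g] = local_rows++;
        }
    }
    // The local row count comes from the partition, not from the data.
    OwnedBlock block{
        local_rows, num_cols,
        std::vector<double>(static_cast<std::size_t>(local_rows * num_cols),
                            0.0)};
    for (const auto& e : data) {
        assert(e.row >= 0 &&
               e.row < static_cast<std::int64_t>(mapping.size()));
        assert(e.col >= 0 && e.col < num_cols);
        const auto l = local_index[static_cast<std::size_t>(e.row)];
        if (l >= 0) {  // entries for rows owned by other ranks are dropped
            block.values[static_cast<std::size_t>(l * num_cols + e.col)] =
                e.value;
        }
    }
    return block;
}
```

With the mapping {0, 1, 0, 1}, rank 0 keeps global rows 0 and 2 as its local rows 0 and 1, and silently ignores the entries for rows 1 and 3.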

void read_distributed(
const matrix_data<ValueType, int64> &data,
ptr_param<const Partition<int64, int64>> partition,
)#

Reads a vector from the matrix_data structure and a global row partition.

See read_distributed().

Note

For efficiency it is advised to use the device_matrix_data overload.

virtual std::unique_ptr<absolute_type> compute_absolute(
) const override#

Gets the AbsoluteLinOp

Returns:

a pointer to the new absolute object

virtual void compute_absolute_inplace() override#

Computes the absolute value of each element in place.

std::unique_ptr<complex_type> make_complex() const#

Creates a complex copy of the original vectors. If the original vectors were real, the imaginary part of the result will be zero.

void make_complex(ptr_param<complex_type> result) const#

Writes a complex copy of the original vectors to given complex vectors. If the original vectors were real, the imaginary part of the result will be zero.

std::unique_ptr<real_type> get_real() const#

Creates new real vectors and extracts the real part of the original vectors into that.

void get_real(ptr_param<real_type> result) const#

Extracts the real part of the original vectors into given real vectors.

std::unique_ptr<real_type> get_imag() const#

Creates new real vectors and extracts the imaginary part of the original vectors into that.

void get_imag(ptr_param<real_type> result) const#

Extracts the imaginary part of the original vectors into given real vectors.

void fill(ValueType value)#

Fill the distributed vectors with a given value.

Parameters:

value – the value to be filled

void scale(ptr_param<const LinOp> alpha)#

Scales the vectors with a scalar (aka: BLAS scal).

Parameters:

alpha – If alpha is a 1x1 Dense matrix, then all vectors are scaled by alpha. If it is a Dense row vector of values, then the i-th column vector is scaled with the i-th element of alpha (the number of columns of alpha has to match the number of vectors).
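The two accepted shapes of alpha can be illustrated with a small stand-alone model. This is a sketch in plain C++, not the Ginkgo implementation; scale_columns and its row-major layout without padding are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scales a row-major (rows x cols) block in place.
// alpha.size() == 1:    every column is scaled by alpha[0].
// alpha.size() == cols: column j is scaled by alpha[j].
void scale_columns(std::vector<double>& values, std::size_t rows,
                   std::size_t cols, const std::vector<double>& alpha)
{
    assert(values.size() == rows * cols);
    assert(alpha.size() == 1 || alpha.size() == cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            const double a = alpha.size() == 1 ? alpha[0] : alpha[j];
            values[i * cols + j] *= a;
        }
    }
}
```

For a 2 x 2 block {1, 2, 3, 4}, alpha = {2} yields {2, 4, 6, 8}, while alpha = {1, 10} yields {1, 20, 3, 40}. The same broadcasting rule applies to the alpha argument of inv_scale, add_scaled and sub_scaled.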

void inv_scale(ptr_param<const LinOp> alpha)#

Scales the vectors with the inverse of a scalar.

Parameters:

alpha – If alpha is a 1x1 Dense matrix, then all vectors are scaled by 1 / alpha. If it is a Dense row vector of values, then the i-th column vector is scaled with the inverse of the i-th element of alpha (the number of columns of alpha has to match the number of vectors).

void add_scaled(
ptr_param<const LinOp> alpha,
ptr_param<const LinOp> b,
)#

Adds b scaled by alpha to the vectors (aka: BLAS axpy).

Parameters:
  • alpha – If alpha is a 1x1 Dense matrix, then all vectors of b are scaled by alpha. If it is a Dense row vector of values, then the i-th column vector of b is scaled with the i-th element of alpha (the number of columns of alpha has to match the number of vectors).

  • b – a (multi-)vector of the same dimension as this

void sub_scaled(
ptr_param<const LinOp> alpha,
ptr_param<const LinOp> b,
)#

Subtracts b scaled by alpha from the vectors (aka: BLAS axpy).

Parameters:
  • alpha – If alpha is a 1x1 Dense matrix, then all vectors of b are scaled by alpha. If it is a Dense row vector of values, then the i-th column vector of b is scaled with the i-th element of alpha (the number of columns of alpha has to match the number of vectors).

  • b – a (multi-)vector of the same dimension as this

void compute_dot(
ptr_param<const LinOp> b,
ptr_param<LinOp> result,
) const#

Computes the column-wise dot product of this (multi-)vector and b using a global reduction.

Parameters:
  • b – a (multi-)vector of same dimension as this

  • result – a Dense row matrix, used to store the dot product (the number of columns in result must match the number of columns of this)
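The phrase "using a global reduction" can be pictured as two steps: each rank forms column-wise partial sums over its own rows, and the partial sums of all ranks are then added. The sketch below simulates this in a single process with plain C++. RankBlock, local_dot and all_reduce_sum are hypothetical names, and all_reduce_sum merely stands in for the MPI sum all-reduce that a real run would use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major block of the rows stored on one rank.
struct RankBlock {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;  // rows * cols entries
};

// Column-wise partial dot products over the rows of a single rank.
std::vector<double> local_dot(const RankBlock& x, const RankBlock& y)
{
    assert(x.rows == y.rows && x.cols == y.cols);
    std::vector<double> partial(x.cols, 0.0);
    for (std::size_t i = 0; i < x.rows; ++i) {
        for (std::size_t j = 0; j < x.cols; ++j) {
            partial[j] += x.values[i * x.cols + j] * y.values[i * x.cols + j];
        }
    }
    return partial;
}

// Single-process stand-in for a sum all-reduce over all ranks.
std::vector<double> all_reduce_sum(
    const std::vector<std::vector<double>>& partials)
{
    assert(!partials.empty());
    std::vector<double> total(partials.front().size(), 0.0);
    for (const auto& p : partials) {
        assert(p.size() == total.size());
        for (std::size_t j = 0; j < total.size(); ++j) {
            total[j] += p[j];
        }
    }
    return total;
}
```

The result has one entry per column, which is why result must be a row matrix with as many columns as the multi-vector. Both operands need the same local row counts on every rank, which is the reason they must share a partition.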

void compute_dot(
ptr_param<const LinOp> b,
ptr_param<LinOp> result,
array<char> &tmp,
) const#

Computes the column-wise dot product of this (multi-)vector and b using a global reduction.

Parameters:
  • b – a (multi-)vector of same dimension as this

  • result – a Dense row matrix, used to store the dot product (the number of columns in result must match the number of columns of this)

  • tmp – the temporary storage to use for partial sums during the reduction computation. It may be resized and/or reset to the correct executor.

void compute_conj_dot(
ptr_param<const LinOp> b,
ptr_param<LinOp> result,
) const#

Computes the column-wise dot product of this (multi-)vector and conj(b) using a global reduction.

Parameters:
  • b – a (multi-)vector of same dimension as this

  • result – a Dense row matrix, used to store the dot product (the number of columns in result must match the number of columns of this)

void compute_conj_dot(
ptr_param<const LinOp> b,
ptr_param<LinOp> result,
array<char> &tmp,
) const#

Computes the column-wise dot product of this (multi-)vector and conj(b) using a global reduction.

Parameters:
  • b – a (multi-)vector of same dimension as this

  • result – a Dense row matrix, used to store the dot product (the number of columns in result must match the number of columns of this)

  • tmp – the temporary storage to use for partial sums during the reduction computation. It may be resized and/or reset to the correct executor.

void compute_squared_norm2(ptr_param<LinOp> result) const#

Computes the square of the column-wise Euclidean (\(L^2\)) norm of this (multi-)vector using a global reduction.

Parameters:

result – a Dense row vector, used to store the norm (the number of columns in the vector must match the number of columns of this)

void compute_squared_norm2(
ptr_param<LinOp> result,
array<char> &tmp,
) const#

Computes the square of the column-wise Euclidean (\(L^2\)) norm of this (multi-)vector using a global reduction.

Parameters:
  • result – a Dense row vector, used to store the norm (the number of columns in the vector must match the number of columns of this)

  • tmp – the temporary storage to use for partial sums during the reduction computation. It may be resized and/or reset to the correct executor.

void compute_norm2(ptr_param<LinOp> result) const#

Computes the Euclidean (L^2) norm of this (multi-)vector using a global reduction.

Parameters:

result – a Dense row matrix, used to store the norm (the number of columns in result must match the number of columns of this)
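As a worked formula, the column-wise norm can be written as a sum of per-rank contributions. The notation is introduced here for illustration and is not part of the original reference: \(P\) is the number of ranks, \(I_p\) the set of rows owned by rank \(p\), and \(x_{ij}\) the entry in row \(i\) and column \(j\).

```latex
\[
\|x_{:,j}\|_2
  = \sqrt{\sum_{p=0}^{P-1} \; \sum_{i \in I_p} |x_{ij}|^2}
\]
```

The inner sum needs only local data, and the outer sum is the global reduction. The square root is applied once, after the reduction, because in general \(\sqrt{a + b} \neq \sqrt{a} + \sqrt{b}\). The quantity under the root is the value reported by compute_squared_norm2.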

void compute_norm2(
ptr_param<LinOp> result,
array<char> &tmp,
) const#

Computes the Euclidean (L^2) norm of this (multi-)vector using a global reduction.

Parameters:
  • result – a Dense row matrix, used to store the norm (the number of columns in result must match the number of columns of this)

  • tmp – the temporary storage to use for partial sums during the reduction computation. It may be resized and/or reset to the correct executor.

void compute_norm1(ptr_param<LinOp> result) const#

Computes the column-wise (L^1) norm of this (multi-)vector.

Parameters:

result – a Dense row matrix, used to store the norm (the number of columns in result must match the number of columns of this)

void compute_norm1(
ptr_param<LinOp> result,
array<char> &tmp,
) const#

Computes the column-wise (L^1) norm of this (multi-)vector using a global reduction.

Parameters:
  • result – a Dense row matrix, used to store the norm (the number of columns in result must match the number of columns of this)

  • tmp – the temporary storage to use for partial sums during the reduction computation. It may be resized and/or reset to the correct executor.

void compute_mean(ptr_param<LinOp> result) const#

Computes the column-wise mean of this (multi-)vector using a global reduction.

Parameters:

result – a Dense row matrix, used to store the mean (the number of columns in result must match the number of columns of this)
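A global mean is not the plain average of the per-rank means when ranks own different numbers of rows. One way to combine local results, shown here as a single-process sketch in plain C++, is to weight every local mean by the share of rows the rank owns. Whether the library proceeds exactly like this is not stated in this reference; local_mean and global_mean are hypothetical names, and the loop over ranks stands in for a sum all-reduce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean of the part of one column stored on a single rank (0 if empty).
double local_mean(const std::vector<double>& local)
{
    if (local.empty()) {
        return 0.0;
    }
    double sum = 0.0;
    for (double v : local) {
        sum += v;
    }
    return sum / static_cast<double>(local.size());
}

// Combines per-rank means, each weighted by local_rows / global_rows.
double global_mean(const std::vector<std::vector<double>>& ranks)
{
    std::size_t global_rows = 0;
    for (const auto& r : ranks) {
        global_rows += r.size();
    }
    assert(global_rows > 0);
    double result = 0.0;
    for (const auto& r : ranks) {
        const double weight = static_cast<double>(r.size()) /
                              static_cast<double>(global_rows);
        result += weight * local_mean(r);  // stand-in for the all-reduce
    }
    return result;
}
```

For one rank holding {1, 2, 3} and another holding {10}, the weighted combination is 0.75 * 2 + 0.25 * 10 = 4, which equals 16 / 4. The unweighted average of the two local means would be 6.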

void compute_mean(
ptr_param<LinOp> result,
array<char> &tmp,
) const#

Computes the column-wise arithmetic mean of this (multi-)vector using a global reduction.

Parameters:
  • result – a Dense row matrix, used to store the mean (the number of columns in result must match the number of columns of this)

  • tmp – the temporary storage to use for partial sums during the reduction computation. It may be resized and/or reset to the correct executor.

value_type &at_local(size_type row, size_type col) noexcept#

Returns a single element of the multi-vector.

Note

The method has to be called on the same Executor the multi-vector is stored at (e.g. trying to call this method on a GPU multi-vector from the OMP executor results in a runtime error).

Parameters:
  • row – the local row of the requested element

  • col – the local column of the requested element

ValueType &at_local(size_type idx) noexcept#

Returns a single element of the multi-vector.

Useful for iterating across all elements of the multi-vector. However, it is less efficient than the two-parameter variant of this method.

Note

The method has to be called on the same Executor the multi-vector is stored at (e.g. trying to call this method on a GPU multi-vector from the OMP executor results in a runtime error).

Parameters:

idx – a linear index of the requested element (ignoring the stride)
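The remark about efficiency follows from how a linear index that ignores the stride has to be translated. The sketch below is a plain C++ model of row-major storage with padding, not the library code; StridedBlock and the two at overloads are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major storage with padding: row i starts at offset i * stride.
struct StridedBlock {
    std::size_t rows;
    std::size_t cols;
    std::size_t stride;          // stride >= cols
    std::vector<double> values;  // rows * stride entries
};

// Two-index access: one multiplication and one addition.
double& at(StridedBlock& m, std::size_t row, std::size_t col)
{
    assert(row < m.rows && col < m.cols);
    return m.values[row * m.stride + col];
}

// Linear access over the logical rows * cols elements. The padding is
// skipped, which costs an extra division and modulo per access.
double& at(StridedBlock& m, std::size_t idx)
{
    assert(idx < m.rows * m.cols);
    return at(m, idx / m.cols, idx % m.cols);
}
```

With 2 rows, 2 columns and a stride of 3, linear index 2 refers to row 1, column 0, which sits at storage offset 3 rather than 2.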

value_type *get_local_values()#

Returns a pointer to the array of local values of the multi-vector.

Returns:

the pointer to the array of local values

const value_type *get_const_local_values() const#

Returns a pointer to the array of local values of the multi-vector.

Note

This is the constant version of the function, which can be significantly more memory efficient than the non-constant version, so always prefer this version.

Returns:

the pointer to the array of local values

const local_vector_type *get_local_vector() const#

Direct (read) access to the underlying local local_vector_type vectors.

Returns:

a constant pointer to the underlying local_vector_type vectors

std::unique_ptr<const real_type> create_real_view() const#

Creates a real view of the (potentially) complex original multi-vector. If the original vector is real, nothing changes. If the original vector is complex, the result is created by viewing the complex vector as real via a reinterpret_cast, with twice the number of columns and double the stride.

std::unique_ptr<real_type> create_real_view()#

Creates a real view of the (potentially) complex original multi-vector. If the original vector is real, nothing changes. If the original vector is complex, the result is created by viewing the complex vector as real via a reinterpret_cast, with twice the number of columns and double the stride.
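Why the column count and the stride both double can be seen in a small stand-alone model. The sketch uses only the C++ standard library; RealView, make_real_view and view_at are hypothetical names, and the code relies on the standard guarantee that an array of std::complex<double> may be accessed as interleaved (real, imaginary) doubles.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Non-owning real view of row-major complex storage.
struct RealView {
    std::size_t rows;
    std::size_t cols;
    std::size_t stride;
    double* values;
};

RealView make_real_view(std::vector<std::complex<double>>& data,
                        std::size_t rows, std::size_t cols,
                        std::size_t stride)
{
    assert(cols <= stride && data.size() >= rows * stride);
    // Each complex entry occupies two doubles, so a row holds twice as
    // many real columns and consecutive rows are twice as far apart.
    return {rows, 2 * cols, 2 * stride,
            reinterpret_cast<double*>(data.data())};
}

double& view_at(const RealView& v, std::size_t row, std::size_t col)
{
    assert(row < v.rows && col < v.cols);
    return v.values[row * v.stride + col];
}
```

Complex entry (r, c) appears in the view as the real part at column 2c and the imaginary part at column 2c + 1 of row r. No data is copied, so a write through the view changes the original storage.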

std::unique_ptr<Vector> create_submatrix(
local_span rows,
local_span columns,
dim<2> global_size,
)#

Creates a view of a submatrix of this vector.

Parameters:
  • rows – The local rows of the submatrix

  • columns – The local columns of the submatrix

  • global_size – The global size of the submatrix

Returns:

A view of a submatrix.

Public Static Functions

static std::unique_ptr<Vector> create_with_config_of(
ptr_param<const Vector> other,
)#

Creates a distributed Vector with the same size and stride as another Vector.

Parameters:

other – The other vector whose configuration needs to be copied.

static std::unique_ptr<Vector> create_with_type_of(
ptr_param<const Vector> other,
std::shared_ptr<const Executor> exec,
)#

Creates an empty Vector with the same type as another Vector, but on a different executor.

Note

The new multi-vector uses the same communicator as other.

Parameters:
  • other – The other multi-vector whose type we target.

  • exec – The executor of the new multi-vector.

Returns:

an empty Vector with the type of other.

static std::unique_ptr<Vector> create_with_type_of(
ptr_param<const Vector> other,
std::shared_ptr<const Executor> exec,
const dim<2> &global_size,
const dim<2> &local_size,
size_type stride,
)#

Creates a Vector with the same type as another Vector, but on a different executor and with a different size.

Parameters:
  • other – The other multi-vector whose type we target.

  • exec – The executor of the new multi-vector.

  • global_size – The global size of the multi-vector.

  • local_size – The local size of the multi-vector.

  • stride – The stride of the new multi-vector.

Returns:

a Vector of specified size with the type of other.

static std::unique_ptr<Vector> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
dim<2> global_size,
dim<2> local_size,
size_type stride,
)#

Creates an empty distributed vector with a specified size.

Parameters:
  • exec – Executor associated with vector

  • comm – Communicator associated with vector

  • global_size – Global size of the vector

  • local_size – Processor-local size of the vector

  • stride – Stride of the local vector.

Returns:

A smart pointer to the newly created vector.

static std::unique_ptr<Vector> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
dim<2> global_size = {},
dim<2> local_size = {},
)#

Creates an empty distributed vector with a specified size.

Parameters:
  • exec – Executor associated with vector

  • comm – Communicator associated with vector

  • global_size – Global size of the vector

  • local_size – Processor-local size of the vector; local_size[1] is used as the stride

Returns:

A smart pointer to the newly created vector.

static std::unique_ptr<Vector> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
dim<2> global_size,
std::unique_ptr<local_vector_type> local_vector,
)#

Creates a distributed vector from local vectors with a specified size.

Note

The data from the local_vector will be moved into the new distributed vector. You could either move in a std::unique_ptr directly, copy a local vector with gko::clone, or create a unique non-owning view of a given local vector with gko::make_dense_view.

Parameters:
  • exec – Executor associated with this vector

  • comm – Communicator associated with this vector

  • global_size – The global size of the vector

  • local_vector – The underlying local vector; its data will be moved into the new vector

Returns:

A smart pointer to the newly created vector.

static std::unique_ptr<Vector> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
std::unique_ptr<local_vector_type> local_vector,
)#

Creates a distributed vector from local vectors. The global size will be deduced from the local sizes, which will incur a collective communication.

Note

The data from the local_vector will be moved into the new distributed vector. You could either move in a std::unique_ptr directly, copy a local vector with gko::clone, or create a unique non-owning view of a given local vector with gko::make_dense_view.

Parameters:
  • exec – Executor associated with this vector

  • comm – Communicator associated with this vector

  • local_vector – The underlying local vector; its data will be moved into the new vector.

Returns:

A smart pointer to the newly created vector.
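The deduction of the global size mentioned above can be sketched as follows. This is a single-process model in plain C++ under the assumption that the rows are split across ranks while every rank stores all columns; Dim2 and deduce_global_size are hypothetical, and the loop stands in for the collective communication.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Dim2 = std::array<std::size_t, 2>;  // {rows, columns}

// local_sizes[p] is the local size reported by rank p.
Dim2 deduce_global_size(const std::vector<Dim2>& local_sizes)
{
    assert(!local_sizes.empty());
    Dim2 global{0, local_sizes.front()[1]};
    for (const auto& local : local_sizes) {
        assert(local[1] == global[1]);  // same column count on every rank
        global[0] += local[0];          // rows are summed across ranks
    }
    return global;
}
```

Because every rank has to contribute its local row count, all ranks of the communicator need to take part in the call. The overload that takes global_size explicitly avoids this step.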

static std::unique_ptr<const Vector> create_const(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
dim<2> global_size,
std::unique_ptr<const local_vector_type> local_vector,
)#

Creates a constant (immutable) distributed Vector from a constant local vector.

Parameters:
  • exec – Executor associated with this vector

  • comm – Communicator associated with this vector

  • global_size – The global size of the vector

  • local_vector – The underlying local vector, of which a view is created

Returns:

A smart pointer to the newly created vector.

static std::unique_ptr<const Vector> create_const(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
std::unique_ptr<const local_vector_type> local_vector,
)#

Creates a constant (immutable) distributed Vector from a constant local vector. The global size will be deduced from the local sizes, which will incur a collective communication.

Parameters:
  • exec – Executor associated with this vector

  • comm – Communicator associated with this vector

  • local_vector – The underlying local vector, of which a view is created

Returns:

A smart pointer to the newly created vector.