Move data between executors
When you need to copy a Ginkgo object from one executor to another — typically host ↔ GPU — there are three idioms. They look similar but they don’t all do the same thing.
The three options
// Deep copy returning a new object on `target`, typed like A.
auto B = gko::clone(target, A);
// Method form: same deep copy, but the return type is the base
// polymorphic type (e.g. unique_ptr<LinOp>, not unique_ptr<Csr<...>>).
auto B_base = A->clone(target);
// Copy in place. B already exists on `target`; overwrite its contents.
B->copy_from(A);
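| Idiom | What it does | When to use |
|---|---|---|
| gko::clone(target, A) | Allocates a new object on target and deep-copies A into it, keeping A's static type. | You want a fresh, typed copy on the target executor and don’t want to cast the result. |
| A->clone(target) | Same deep copy, but returns the base polymorphic type (e.g. unique_ptr<LinOp>). | You’re working through a base pointer anyway, or you’ll downcast afterwards. |
| B->copy_from(A) | Copies A into the existing B, overwriting its contents. | You want to reuse an existing buffer (e.g. inside an iteration loop) without reallocating. |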
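The buffer-reuse case can be illustrated without Ginkgo at all. The sketch below is a plain C++ model, not Ginkgo code: `Vec`, its `copy_from`, and `storage_stays_put` are hypothetical stand-ins, executors are left out entirely, and the model assumes source and destination already have the same size. It only shows the property that matters inside a loop: an in-place copy overwrites the destination's storage instead of allocating a new object each time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector that already lives on the target.
struct Vec {
    std::vector<double> values;

    explicit Vec(std::size_t n, double v = 0.0) : values(n, v) {}

    // In-place copy: the destination exists, so its storage is overwritten.
    // This model assumes matching sizes and never allocates.
    void copy_from(const Vec& src) {
        assert(values.size() == src.values.size());
        std::copy(src.values.begin(), src.values.end(), values.begin());
    }
};

// Refreshes `work` from `input` on every iteration and reports whether
// the destination storage address stayed the same throughout.
inline bool storage_stays_put(const Vec& input, Vec& work, int iterations) {
    const double* before = work.values.data();
    for (int i = 0; i < iterations; ++i) {
        work.copy_from(input);
    }
    return work.values.data() == before;
}
```

A clone-per-iteration loop would instead create and destroy one object per pass, which is the cost the in-place form avoids.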
The header doxygen for gko::clone calls out the type-preservation
point explicitly: “the difference between this function and directly
calling LinOp::clone() is that this one preserves the static type of
the object.” That makes gko::clone the preferred form whenever you
want to keep using the concrete matrix or vector type without an
intermediate cast.
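To make the type-preservation point concrete, here is a minimal plain C++ model of the two clone forms. It is a sketch of the general pattern, not Ginkgo's implementation: `Op`, `Matrix`, and `typed_clone` are hypothetical names, and the executor argument is omitted. The virtual method can only return the base pointer type, while a free function template can cast the result back to the static type of its argument.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a polymorphic base such as LinOp.
struct Op {
    virtual ~Op() = default;
    // Method form: the return type is fixed to the base pointer type.
    virtual std::unique_ptr<Op> clone() const = 0;
};

// Hypothetical stand-in for a concrete matrix type.
struct Matrix : Op {
    int nnz;

    explicit Matrix(int n) : nnz(n) {}

    std::unique_ptr<Op> clone() const override {
        return std::unique_ptr<Op>(new Matrix(*this));
    }
};

// Free-function form: clone through the base interface, then cast the
// result back to the static type T of the argument.
template <typename T>
std::unique_ptr<T> typed_clone(const T& obj) {
    std::unique_ptr<Op> base = obj.clone();
    return std::unique_ptr<T>(static_cast<T*>(base.release()));
}
```

With a `Matrix a{7}`, `a.clone()` yields a `std::unique_ptr<Op>`, so reaching `nnz` needs a downcast, whereas `typed_clone(a)` yields a `std::unique_ptr<Matrix>` that can be used directly.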
Plain gko::array
array has its own cross-executor constructor — pass the target
executor and the source array:
gko::array<int> a_host{host_exec, {1, 2, 3, 4}};
gko::array<int> a_dev{device_exec, a_host}; // copy across
There is no array::copy_to_executor() — use the constructor above or
gko::clone(exec, a) for the generic form.
Skip the copy when possible
Two cases where copy_from is cheap or effectively a no-op:
Same executor instance. Copying inside the same executor still goes through device memcpy semantics, but no transfer crosses the PCIe boundary.
Memory-accessible executors. When the source and destination both use unified memory or pinned host memory, the runtime can resolve the copy without touching the data. See Executor::memory_accessible for the rule.
For non-owning views — where you want to wrap an existing buffer without copying at all — see Zero-copy from application memory.
Common pitfalls
copy_from requires the destination to already exist. Build it on the target executor first (Csr::create(target), Dense::create(target, dim)).
copy_from between different formats does conversion. Going from COO to CSR via copy_from runs the format conversion kernel, not a pure memcpy.
apply(b, x) does not move operands between executors for free. Operand executors should match the operator’s. Mismatches trigger an internal clone: correct, but wasteful inside loops.
See also
Switch executor — set up the target executor in the first place.
Executor model — the conceptual reference.