External Integration Guide
--------------------------

This guide describes how to integrate ``rgpot`` into external codebases,
particularly legacy projects with their own type systems. It covers the
namespace collision problem, the recommended mitigation strategies, and a
worked example using `eOn `_ as a concrete case study.

The Namespace Collision Problem
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``rgpot`` defines two names that are common in atomistic simulation
codebases:

``AtomMatrix``
   A lightweight row-major matrix class in ``rgpot::types::AtomMatrix``.
   However, ``Potential.hpp`` contains a top-level
   ``using rgpot::types::AtomMatrix;`` declaration (line 30) that pulls this
   name into the global scope of any translation unit that includes the
   header.

``Potential``
   The Cap'n Proto schema (``Potentials.capnp``) defines
   ``interface Potential``, which the capnp compiler generates as a global
   ``class Potential``. This collides with any consumer that defines its own
   ``class Potential`` at global scope.

Both names are extremely common in computational chemistry codebases. For
example, `eOn `_ defines:

``AtomMatrix``
   A typedef for ``Eigen::Matrix``.

``class Potential``
   The base class for all eOn potential energy surfaces.

Including both eOn and rgpot headers in the same translation unit produces
hard compilation errors from the conflicting definitions.

Root Cause
^^^^^^^^^^

The collision stems from two issues:

1. The ``using`` declaration in ``Potential.hpp``:

   .. code:: cpp

      // rgpot/CppCore/rgpot/Potential.hpp, line 30
      using rgpot::types::AtomMatrix;

   This leaks ``AtomMatrix`` into the global namespace for any downstream
   consumer. The ``PotentialBase`` class itself is correctly inside
   ``namespace rgpot``, but the ``using`` declaration precedes the namespace
   block.

2. By default, Cap'n Proto code generation does not namespace its C++
   output. Unless the schema carries a ``$Cxx.namespace`` annotation, the
   ``interface Potential`` in ``Potentials.capnp`` produces a top-level
   ``class Potential`` in the generated C++ header. This follows from the
   capnp compiler's defaults rather than from a deliberate rgpot design
   choice.

Mitigation Strategies
~~~~~~~~~~~~~~~~~~~~~

Strategy 1: Namespace the ``using`` declaration (recommended for rgpot)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Move the ``using`` declaration inside the ``rgpot`` namespace:

.. code:: cpp

   // Before (leaks to global scope):
   using rgpot::types::AtomMatrix;
   namespace rgpot { ... }

   // After (contained within rgpot):
   namespace rgpot {
   using types::AtomMatrix;
   ...
   } // namespace rgpot

This is a backward-compatible change for any code that already qualifies
``rgpot::PotentialBase`` or ``rgpot::Potential``. Code that relied on the
global ``AtomMatrix`` leak would need to switch to
``rgpot::types::AtomMatrix`` or add its own ``using`` inside its own
namespace.

Strategy 2: Namespace the consumer's types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the consuming codebase can be refactored, placing its types inside a
project namespace eliminates collisions entirely:

.. code:: cpp

   // Instead of global scope:
   class Potential { ... };
   using AtomMatrix = Eigen::Matrix;

   // Use a project namespace:
   namespace eon {
   class Potential { ... };
   using AtomMatrix = Eigen::Matrix;
   } // namespace eon

This is the cleanest solution but requires touching every file in the
consumer codebase -- impractical for large legacy projects.

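
As a rough illustration of why namespacing removes the clash, the sketch
below defines stand-ins for both libraries' types in a single translation
unit. Everything in it is hypothetical: ``rgpot_like``, ``eon_like``, and
``coexistence_demo`` are names invented for this example and do not exist in
either project, and the matrix types are simplified placeholders.

.. code:: cpp

   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Hypothetical stand-in for rgpot: the using line sits inside the
   // library namespace, so nothing leaks to global scope.
   namespace rgpot_like {
   namespace types {
   struct AtomMatrix {
     std::size_t rows, cols;
     std::vector<double> data; // row-major storage
     AtomMatrix(std::size_t r, std::size_t c)
         : rows(r), cols(c), data(r * c, 0.0) {}
     double &operator()(std::size_t i, std::size_t j) {
       return data[i * cols + j];
     }
   };
   } // namespace types
   using types::AtomMatrix; // visible as rgpot_like::AtomMatrix only
   } // namespace rgpot_like

   // Hypothetical stand-in for a consumer that wraps its types in a namespace.
   namespace eon_like {
   using AtomMatrix = std::vector<std::vector<double>>;
   class Potential {
   public:
     // Placeholder "energy": simply the number of rows (atoms).
     double energy(const AtomMatrix &pos) const {
       return static_cast<double>(pos.size());
     }
   };
   } // namespace eon_like

   // Both AtomMatrix types are usable side by side in one translation unit.
   inline double coexistence_demo() {
     rgpot_like::AtomMatrix a(2, 3);
     a(1, 2) = 4.0;
     eon_like::AtomMatrix b(2, std::vector<double>(3, 0.0));
     eon_like::Potential pot;
     return a(1, 2) + pot.energy(b);
   }

Because each name is reachable only through its namespace, neither
definition shadows or conflicts with the other; removing either namespace
wrapper and adding a global ``using`` would reintroduce the ambiguity.
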
Strategy 3: Separate translation units (for legacy codes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When neither rgpot nor the consumer can be easily modified, use separate
translation units (TUs) so the conflicting headers never appear together:

::

   TU 1 (ConsumerBridge.cpp):
       #include "consumer/Potential.h"    // consumer's Potential
       // Wraps consumer types into a flat-array callback

   TU 2 (RpcServer.cpp):
       #include "Potentials.capnp.h"      // capnp's Potential
       // Implements the RPC server using only flat arrays

The two TUs communicate through a type-free interface such as a
``std::function`` over flat C arrays, avoiding any shared header that
contains either ``Potential`` or ``AtomMatrix``.

Strategy 4: RPC-client-only linking
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For consumers that only need to *call* rgpot potentials (not serve them),
link only the Cap'n Proto schema library (``ptlrpc_dep`` in meson) rather
than the full ``rgpot_dep``. This avoids pulling in ``Potential.hpp`` and its
``AtomMatrix`` leak entirely.

In meson:

.. code:: meson

   rgpot_proj = subproject('rgpot',
     default_options: ['with_rpc_client_only=true'])
   ptlrpc_dep = rgpot_proj.get_variable('ptlrpc_dep')
   # Only capnp schema + generated code, no rgpot types

Flat-Array Callback Pattern
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The recommended integration pattern for legacy codes is a flat-array
callback that requires no shared types between the consumer and rgpot:

.. code:: cpp

   // ForceCallback: a std::function over flat C arrays
   using ForceCallback = std::function<void(
       long nAtoms, const double *positions, const int *atomicNrs,
       double *forces, double *energy, const double *box)>;

This signature maps directly to the Cap'n Proto ``ForceInput`` /
``PotentialResult`` schema without requiring any Eigen, ``AtomMatrix``, or
``rgpot::ForceInput`` types.

The callback is created in the consumer's TU (which sees the consumer's
types) and passed to the RPC server TU (which sees the capnp types). Neither
TU needs to include the other's headers.

Data layout
^^^^^^^^^^^

All arrays use the same flat layout as the Cap'n Proto schema:

.. table::

   +----------------+------------------------------------+----------------+
   | Array          | Layout                             | Size           |
   +================+====================================+================+
   | Positions      | ``[x1,y1,z1, x2,y2,z2, ...]``      | ``nAtoms * 3`` |
   +----------------+------------------------------------+----------------+
   | Atomic numbers | ``[Z1, Z2, ...]``                  | ``nAtoms``     |
   +----------------+------------------------------------+----------------+
   | Box            | ``[ax,ay,az, bx,by,bz, cx,cy,cz]`` | ``9``          |
   +----------------+------------------------------------+----------------+
   | Forces         | ``[Fx1,Fy1,Fz1, ...]``             | ``nAtoms * 3`` |
   +----------------+------------------------------------+----------------+

This is row-major and matches the convention used by both the Rust core
(``rgpot_force_input_t``) and the existing C++ potentials (``ForceInput``).

Worked Example: eOn
~~~~~~~~~~~~~~~~~~~

`eOn `_ is a saddle-point search framework with its own ``class Potential``
and ``AtomMatrix`` (Eigen-based) at global scope. It integrates with rgpot's
RPC server using the two-TU + flat-array pattern.

Architecture
^^^^^^^^^^^^

::

   +---------------------------+     +-------------------------------+
   | ServeMode.cpp (TU 1)      |     | ServeRpcServer.cpp (TU 2)     |
   |                           |     |                               |
   | #include "Potential.h"    |     | #include "Potentials.capnp.h" |
   | (eOn's Potential class)   |     | (capnp's Potential iface)     |
   |                           |     |                               |
   | makeForceCallback(pot)    |---->| startRpcServer(callback)      |
   |   wraps pot->force()      |     |   unwraps capnp structs       |
   |   into ForceCallback      |     |   calls callback(...)         |
   +---------------------------+     +-------------------------------+
             |                                    |
             |   ForceCallback (flat arrays)      |
             +------------------------------------+
                       No shared types

TU 1: ServeMode.cpp
^^^^^^^^^^^^^^^^^^^

This file includes eOn's ``Potential.h`` and wraps any eOn potential into a
``ForceCallback``:

.. code:: cpp

   #include <memory>  // std::shared_ptr
   #include <utility> // std::move

   #include "Potential.h"      // eOn's Potential, AtomMatrix
   #include "ServeRpcServer.h" // ForceCallback typedef only

   namespace {

   ForceCallback makeForceCallback(std::shared_ptr<::Potential> pot) {
     return [pot = std::move(pot)](long nAtoms, const double *positions,
                                   const int *atomicNrs, double *forces,
                                   double *energy, const double *box) {
       double variance = 0.0;
       pot->force(nAtoms, positions, atomicNrs, forces, energy, &variance,
                  box);
     };
   }

   } // anonymous namespace

   void serveMode(const Parameters &params, const std::string &host,
                  uint16_t port) {
     auto eon_pot = helper_functions::makePotential(params);
     auto callback = makeForceCallback(std::move(eon_pot));
     startRpcServer(std::move(callback), host, port);
   }

TU 2: ServeRpcServer.cpp
^^^^^^^^^^^^^^^^^^^^^^^^

This file includes the capnp-generated header and implements the RPC server.
It never includes eOn's ``Potential.h``:

.. code:: cpp

   #include <utility> // std::move

   #include "Potentials.capnp.h" // capnp's Potential interface
   #include "ServeRpcServer.h"   // ForceCallback typedef

   class CallbackPotImpl final : public Potential::Server {
     ForceCallback m_callback;

   public:
     explicit CallbackPotImpl(ForceCallback cb) : m_callback(std::move(cb)) {}

     kj::Promise<void> calculate(CalculateContext ctx) override {
       auto fip = ctx.getParams().getFip();
       auto pos = fip.getPos();
       auto atmnrs = fip.getAtmnrs();
       auto box = fip.getBox();

       long nAtoms = static_cast<long>(atmnrs.size());
       std::vector<double> forces(nAtoms * 3, 0.0);
       double energy = 0.0;

       // Call through the flat-array callback -- no eOn types here
       m_callback(nAtoms, pos.begin(), atmnrs.begin(), forces.data(),
                  &energy, box.begin());

       auto result = ctx.getResults().initResult();
       result.setEnergy(energy);
       auto fout = result.initForces(forces.size());
       for (size_t i = 0; i < forces.size(); ++i) {
         fout.set(i, forces[i]);
       }
       return kj::READY_NOW;
     }
   };

Bridge header: ServeRpcServer.h
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The bridge header defines only the ``ForceCallback`` type and the server
entry points.
It includes neither eOn nor capnp headers:

.. code:: cpp

   #pragma once
   #include <cstdint>
   #include <functional>
   #include <string>
   #include <vector>

   using ForceCallback = std::function<void(
       long nAtoms, const double *positions, const int *atomicNrs,
       double *forces, double *energy, const double *box)>;

   void startRpcServer(ForceCallback callback, const std::string &host,
                       uint16_t port);

   void startPooledRpcServer(std::vector<ForceCallback> pool,
                             const std::string &host, uint16_t port);

Build system integration
^^^^^^^^^^^^^^^^^^^^^^^^

eOn pulls rgpot as a meson subproject with ``with_rpc_client_only=true``,
linking only the capnp schema dependency:

.. code:: meson

   rgpot_proj = subproject('rgpot',
     default_options: ['with_rpc_client_only=true'])
   ptlrpc_dep = rgpot_proj.get_variable('ptlrpc_dep')

   serve_sources = ['ServeMode.cpp', 'ServeRpcServer.cpp']
   # ptlrpc_dep provides capnp schema; no rgpot types linked

Running the server
^^^^^^^^^^^^^^^^^^

.. code:: bash

   # Serve a Lennard-Jones potential on port 12345
   eonclient --serve "lj:12345"

   # Serve a Metatomic ML potential
   eonclient --serve "metatomic:12345" --config model.ini

   # Multiple potentials concurrently
   eonclient --serve "lj:12345,metatomic:12346" --config model.ini

   # Gateway mode: single port, pooled instances
   eonclient -p metatomic --serve-port 12345 --replicas 4 --gateway \
       --config model.ini

Any rgpot-compatible client (Julia, Python, C++) can then connect:

.. code:: julia

   # Julia (ChemGP)
   using ChemGP
   pot = RpcPotential("localhost", 12345, Int32[29, 29],
                      Float64[20,0,0, 0,20,0, 0,0,20])
   E, F = calculate(pot, Float64[0,0,0, 2.2,0,0])

Summary of Approaches
~~~~~~~~~~~~~~~~~~~~~

.. table::

   +---------------------------------+--------------------------+-------------------------------------------------+
   | Approach                        | Effort                   | When to use                                     |
   +=================================+==========================+=================================================+
   | Move ``using`` inside namespace | Low (rgpot change)       | Default for new rgpot releases.                 |
   +---------------------------------+--------------------------+-------------------------------------------------+
   | Namespace consumer types        | High (consumer refactor) | Greenfield projects or major rewrites.          |
   +---------------------------------+--------------------------+-------------------------------------------------+
   | Separate TUs + flat callback    | Medium (build system)    | Legacy codes with global-scope types.           |
   +---------------------------------+--------------------------+-------------------------------------------------+
   | RPC-client-only linking         | Low (build option)       | Consumer only calls potentials, does not serve. |
   +---------------------------------+--------------------------+-------------------------------------------------+

For legacy codebases like eOn where refactoring all types into a namespace
is impractical, the two-TU + flat-array callback pattern provides a clean
integration path with no modifications required to either rgpot or the
consumer's existing type system.
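
The flat-array callback pattern can be tried out without either library. The
following is a minimal sketch under stated assumptions, not code from eOn or
rgpot: ``ToyPotential``, ``makeToyCallback``, and ``evaluateFlat`` are
hypothetical names, the harmonic energy expression is purely illustrative,
and the ``ForceCallback`` signature is assumed from the lambda shown in the
``ServeMode.cpp`` example.

.. code:: cpp

   #include <cassert>
   #include <cmath>
   #include <functional>
   #include <memory>
   #include <utility>
   #include <vector>

   // Assumed signature, mirroring the lambda in the ServeMode.cpp example.
   using ForceCallback = std::function<void(
       long nAtoms, const double *positions, const int *atomicNrs,
       double *forces, double *energy, const double *box)>;

   // Hypothetical consumer-side potential: E = 0.5 * k * sum(x_i^2).
   struct ToyPotential {
     double k = 2.0;
     void force(long nAtoms, const double *pos, double *F, double *E) const {
       *E = 0.0;
       for (long i = 0; i < 3 * nAtoms; ++i) {
         *E += 0.5 * k * pos[i] * pos[i];
         F[i] = -k * pos[i];
       }
     }
   };

   // "TU 1" side: wraps the consumer type; only flat arrays cross the boundary.
   inline ForceCallback makeToyCallback(std::shared_ptr<ToyPotential> pot) {
     return [pot = std::move(pot)](long nAtoms, const double *positions,
                                   const int * /*atomicNrs*/, double *forces,
                                   double *energy, const double * /*box*/) {
       pot->force(nAtoms, positions, forces, energy);
     };
   }

   // "TU 2" side: knows nothing about ToyPotential, only the callback.
   inline double evaluateFlat(const ForceCallback &cb,
                              const std::vector<double> &positions,
                              const std::vector<int> &atomicNrs,
                              const std::vector<double> &box,
                              std::vector<double> &forces) {
     const long nAtoms = static_cast<long>(atomicNrs.size());
     forces.assign(positions.size(), 0.0);
     double energy = 0.0;
     cb(nAtoms, positions.data(), atomicNrs.data(), forces.data(), &energy,
        box.data());
     return energy;
   }

In a real integration the two halves would live in separate source files that
share only a small header declaring ``ForceCallback``, so the server side
could be compiled against the capnp-generated header while the wrapper is
compiled against the consumer's headers.
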