ImpactX
Classes | |
| struct | PushSingleParticle |
| struct | PushSingleParticleSpin |
Functions | |
| template<typename T, typename IndexType> | |
| AMREX_GPU_DEVICE AMREX_FORCE_INLINE decltype(auto) | load_pdata (T *ptr, IndexType const i) |
| template<auto P_Method, int N, bool ForceWriteback = false, typename T, typename IndexType, typename ValType> | |
| AMREX_GPU_DEVICE AMREX_FORCE_INLINE void | store_pdata (ValType const &AMREX_RESTRICT val, T *const AMREX_RESTRICT ptr, IndexType const i) |
| template<typename T_Element, typename F> | |
| void | dispatch_misalignment (T_Element &element, F &&f) |
| template<typename T_Element> | |
| void | push_all_particles (ImpactXParticleContainer::iterator &pti, RefPart &AMREX_RESTRICT ref_part, T_Element &element, bool spin) |
| void impactx::elements::mixin::detail::dispatch_misalignment | ( | T_Element & | element, |
| F && | f ) |
Run a callable with the element's runtime misalignment state lifted to a compile-time bool, selecting a specialized (branch-free) push path either with or without the alignment transforms.
With ImpactX_OPTIMIZE_ALIGNMENT enabled (the default), alignment-capable elements compile two push variants (with and without the shift/rotate transforms) and pick the matching one at runtime, so a perfectly aligned element skips the otherwise-identity transforms. With it disabled, only the always-on variant (always shift/rotate) is compiled, which halves the element-push compile time at the cost of running the transforms on every particle, even when the element is perfectly aligned.
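The runtime-to-compile-time lifting described above can be illustrated with a minimal, self-contained sketch. It is not the ImpactX implementation: `ToyElement`, `is_misaligned`, `dispatch_misalignment_sketch`, `push_x`, and `push_with_dispatch` are hypothetical names chosen for illustration, and passing the flag as `std::true_type`/`std::false_type` is an assumption about one possible way to hand a compile-time bool to the callable.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an alignment-capable element (not the ImpactX type).
struct ToyElement
{
    double dx = 0.0;  // horizontal shift error

    // true if an alignment error is set
    bool is_misaligned () const { return dx != 0.0; }
};

// Lift the runtime misalignment state to a compile-time bool: the callable
// receives std::true_type or std::false_type, so each branch instantiates
// its own specialized body.
template <typename T_Element, typename F>
void dispatch_misalignment_sketch (T_Element & element, F && f)
{
    if (element.is_misaligned()) {
        std::forward<F>(f)(std::true_type{});
    } else {
        std::forward<F>(f)(std::false_type{});
    }
}

// Toy push: shift in, advance by one unit, shift out.
// The shift code is compiled out entirely when Misaligned is false.
template <bool Misaligned>
double push_x (ToyElement const & el, double x)
{
    if constexpr (Misaligned) { x -= el.dx; }
    x += 1.0;
    if constexpr (Misaligned) { x += el.dx; }
    return x;
}

// Push one coordinate and report which compiled variant was selected.
inline double push_with_dispatch (ToyElement & el, double x, bool & used_misaligned)
{
    double result = 0.0;
    dispatch_misalignment_sketch(el, [&] (auto misaligned) {
        constexpr bool M = decltype(misaligned)::value;
        used_misaligned = M;
        result = push_x<M>(el, x);
    });
    return result;
}

// Usage: both variants give the same result here, but take different code paths.
inline void demo_dispatch ()
{
    bool used = true;
    ToyElement aligned;
    assert(push_with_dispatch(aligned, 2.0, used) == 3.0 && !used);

    ToyElement shifted;
    shifted.dx = 0.5;
    assert(push_with_dispatch(shifted, 2.0, used) == 3.0 && used);
}
```

In this sketch the single runtime branch sits outside the per-particle work, so the inner push contains no alignment check. Compiling only the always-on variant would correspond to calling `push_x<true>` unconditionally.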
| AMREX_GPU_DEVICE AMREX_FORCE_INLINE decltype(auto) impactx::elements::mixin::detail::load_pdata | ( | T * | ptr, |
| IndexType const | i ) |
Load particle data from array pointers
On GPU and on CPU without SIMD, this dereferences a particle property at index position i. On CPU with SIMD, this loads a SIMD register that is IndexType::width elements wide at the SIMD index position i.
| T | data type (amrex::ParticleReal or uint64_t) |
| IndexType | int or amrex::SIMDindex<SIMD_WIDTH, int> |
| ptr | pointer to the array data |
| i | index or SIMD index |
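The two load paths can be sketched as follows. This is a simplified illustration under stated assumptions, not the ImpactX or AMReX code: `ToySimdIndex` and `ToyPack` are hypothetical stand-ins for `amrex::SIMDindex` and a SIMD register type, and returning a reference in the scalar case is inferred from the `decltype(auto)` return type and the note that the scalar store has nothing to write back.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a SIMD index: start position plus compile-time width.
template <int W>
struct ToySimdIndex
{
    static constexpr int width = W;
    int index = 0;
};

// Hypothetical stand-in for a SIMD register holding W values.
template <typename T, int W>
using ToyPack = std::array<T, W>;

template <typename T, typename IndexType>
decltype(auto) load_pdata_sketch (T * ptr, IndexType const i)
{
    if constexpr (std::is_integral_v<IndexType>) {
        // scalar path: hand out a reference into the array
        // (the parentheses make decltype(auto) deduce T&)
        return (ptr[i]);
    } else {
        // SIMD-like path: copy `width` consecutive values into a local pack
        ToyPack<std::remove_const_t<T>, IndexType::width> pack{};
        for (int k = 0; k < IndexType::width; ++k) {
            pack[k] = ptr[i.index + k];
        }
        return pack;
    }
}

// Usage: the scalar load aliases memory, the SIMD-like load is a copy.
inline void demo_load ()
{
    double x[4] = {1.0, 2.0, 3.0, 4.0};

    load_pdata_sketch(x, 2) += 10.0;  // writes through the reference
    assert(x[2] == 13.0);

    auto pack = load_pdata_sketch(x, ToySimdIndex<2>{0});
    pack[0] = 99.0;                   // only the local copy changes
    assert(x[0] == 1.0);
    assert(pack[1] == 2.0);
}
```

Because the SIMD-like path works on a copy, any modification has to be written back explicitly afterwards, which is the role of the matching store function.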
| void impactx::elements::mixin::detail::push_all_particles | ( | ImpactXParticleContainer::iterator & | pti, |
| RefPart &AMREX_RESTRICT | ref_part, | ||
| T_Element & | element, | ||
| bool | spin ) |
This pushes all particles on a particle iterator tile/box
| AMREX_GPU_DEVICE AMREX_FORCE_INLINE void impactx::elements::mixin::detail::store_pdata | ( | ValType const &AMREX_RESTRICT | val, |
| T *const AMREX_RESTRICT | ptr, | ||
| IndexType const | i ) |
Store particle data back to array pointers
On GPU and CPU without SIMD, this does nothing because we already modified the (global) RAM directly via pointer.
On CPU with SIMD, this performs a conditional writeback of a SIMD register to RAM (index in pointer array), but only if the argument was not passed as const and thus was likely changed.
Good optimizing compilers can eliminate writebacks of unchanged values on their own, but we skip them explicitly for robustness. Background: https://github.com/AMReX-Codes/amrex/pull/4520#issuecomment-3064064215
The ForceWriteback flag forces the writeback regardless of the push method's argument constness. Use it for slots that may be modified by something other than the named push method itself (e.g., a pre/post step in the caller), where the push method's signature is not the authoritative source of truth for whether the value changed.
| P_Method | pointer to the push method (for is_nth_arg_non_const) |
| N | the argument index (for is_nth_arg_non_const) |
| ForceWriteback | if true, write back regardless of P_Method's argument constness |
| T | data type |
| IndexType | int or SIMD index |
| ValType | the type of the value to store |
| val | the value to store |
| ptr | pointer to the SoA data |
| i | index or SIMD index |
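The conditional writeback can be sketched as follows. Everything here is illustrative rather than the actual implementation: `nth_arg_non_const` is a hypothetical re-creation of the idea behind `is_nth_arg_non_const` (its real definition is not shown on this page), `ToySimdIndex`, `ToyPack`, and `ToyPush` are made-up types, and the zero-based argument index is an assumption.

```cpp
#include <array>
#include <cassert>
#include <tuple>
#include <type_traits>

// Hypothetical stand-ins for a SIMD index and a SIMD register.
template <int W>
struct ToySimdIndex { static constexpr int width = W; int index = 0; };

template <typename T, int W>
using ToyPack = std::array<T, W>;

// Toy trait: is the N-th (zero-based) argument of a const member function
// a non-const lvalue reference, i.e. something the method may modify?
template <typename F, int N>
struct nth_arg_non_const;

template <typename C, typename R, typename... Args, int N>
struct nth_arg_non_const<R (C::*)(Args...) const, N>
{
    using Arg = std::tuple_element_t<N, std::tuple<Args...>>;
    static constexpr bool value =
        std::is_lvalue_reference_v<Arg> &&
        !std::is_const_v<std::remove_reference_t<Arg>>;
};

template <auto P_Method, int N, bool ForceWriteback = false,
          typename T, typename IndexType, typename ValType>
void store_pdata_sketch ([[maybe_unused]] ValType const & val,
                         [[maybe_unused]] T * const ptr,
                         [[maybe_unused]] IndexType const i)
{
    if constexpr (std::is_integral_v<IndexType>) {
        // scalar path: memory was already modified through a reference
    } else if constexpr (ForceWriteback ||
                         nth_arg_non_const<decltype(P_Method), N>::value) {
        // SIMD-like path: copy the register contents back to the array
        for (int k = 0; k < IndexType::width; ++k) {
            ptr[i.index + k] = val[k];
        }
    }
    // otherwise the argument was const in P_Method: skip the writeback
}

// Hypothetical push method: x may be modified, t is read-only.
struct ToyPush
{
    void operator() (ToyPack<double, 2> & x, ToyPack<double, 2> const & t) const
    {
        for (int k = 0; k < 2; ++k) { x[k] += t[k]; }
    }
};

// Usage: only the non-const argument is written back unless forced.
inline void demo_store ()
{
    double x[2] = {1.0, 2.0};
    double t[2] = {5.0, 6.0};
    ToyPack<double, 2> px = {10.0, 20.0};
    ToyPack<double, 2> pt = {50.0, 60.0};
    ToySimdIndex<2> i{0};

    store_pdata_sketch<&ToyPush::operator(), 0>(px, x, i);
    assert(x[0] == 10.0);  // argument 0 is non-const: written back

    store_pdata_sketch<&ToyPush::operator(), 1>(pt, t, i);
    assert(t[0] == 5.0);   // argument 1 is const: skipped

    store_pdata_sketch<&ToyPush::operator(), 1, true>(pt, t, i);
    assert(t[0] == 50.0);  // ForceWriteback overrides the constness check
}
```

Deriving the decision from the push method's signature keeps the call sites uniform: every slot calls the store function, and the compiler drops the writebacks for read-only arguments at compile time.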