Commit Graph

154 Commits

Author SHA1 Message Date
e4c10eca0f Stabilize EScalar CUDA fallback path 2026-05-03 16:05:47 +08:00
4430d04ee7 Stabilize EScalar CUDA sync defaults 2026-05-03 00:24:50 +08:00
74ba5feb86 Pin EScalar scalar CUDA transfers 2026-05-02 19:21:57 +08:00
6f28111a43 Keep EScalar mixed GPU RP opt-in 2026-05-02 18:38:43 +08:00
f638cbc4e8 Add mixed GPU RP path for EScalar 2026-05-02 18:27:26 +08:00
59a216ad93 Optimize BSSN EScalar GPU path baseline 2026-05-02 18:19:15 +08:00
52beb4d153 Checkpoint Z4C CUDA resident sync progress 2026-05-02 10:53:52 +08:00
ba61702fc0 Checkpoint Z4C CUDA throttling progress 2026-05-02 10:04:23 +08:00
fcd98649f6 Checkpoint Z4C CUDA optimization progress 2026-05-02 08:55:25 +08:00
a5c8188305 Disable unsafe Z4C AMR device path by default 2026-05-02 01:36:41 +08:00
383e936e88 Save Z4C CUDA optimization progress 2026-05-02 00:49:02 +08:00
531b31e8db Stabilize cached Z4C CUDA sync after regrid 2026-05-01 20:04:04 +08:00
30b778daa3 Save Z4C CUDA transfer progress 2026-05-01 18:51:19 +08:00
db9383e439 Initialize cached sync runtime in derived evolvers 2026-05-01 18:34:43 +08:00
35b6ceff02 Broaden cached CUDA sync paths 2026-05-01 18:03:04 +08:00
51f3819892 Save generated source formatting state 2026-04-30 20:47:44 +08:00
b1974ef146 Stabilize device AMR restrict across regrid 2026-04-30 20:01:18 +08:00
be9033f449 Add optional CUDA surface interpolation 2026-04-30 19:21:19 +08:00
6835608f92 Add configurable analysis MAP cadence 2026-04-30 19:10:12 +08:00
da4d56ccf7 Optimize BSSN surface interpolation fast path 2026-04-30 18:25:21 +08:00
a6483d013d Add CUDA AMR restrict diagnostics 2026-04-30 12:20:44 +08:00
8486532920 Add resident BSSN GPU point interpolation 2026-04-30 11:39:15 +08:00
18e9c9cc50 Optimize BSSN CUDA resident AMR prolong path 2026-04-30 10:58:15 +08:00
1ee229a91f Add keyed BSSN CUDA resident banks 2026-04-29 19:44:19 +08:00
68eab03bac Add opt-in BSSN CUDA resident AMR path 2026-04-29 19:15:37 +08:00
090d8657ae Optimize BSSN CUDA state transfers 2026-04-29 18:34:31 +08:00
22c1e7168b Optimize BSSN CUDA resident state and CUDA-aware MPI 2026-04-29 17:05:10 +08:00
a0dab90bcb Switch to NVIDIA HPC Toolchain 2026-04-29 08:31:49 +08:00
c689cc8dc9 [WIP] Add CUDA support for Z4C
Rewritten done by Codex.
This still has errors, do not pick this one now.
2026-04-27 11:58:43 +08:00
60fee8f1c1 Fix Z4C C++ gauge damping ordering 2026-04-26 15:38:13 +08:00
843b116954 Add C++ Z4C RHS path and port some BSSN optimizations 2026-04-25 10:39:01 +08:00
c768e1220b Also disable cached sync for Z4C 2026-04-25 10:25:54 +08:00
02f149e2e3 Disable cached sync for BSSN-EScalar 2026-04-25 10:17:47 +08:00
422e8ec4dc Fallback BSSN-EScalar restrict/prolong path 2026-04-25 10:10:34 +08:00
f521a97563 Fix ABE CPU version build error 2026-04-25 09:39:49 +08:00
53c55451b3 Update makefile and scripts for CUDA BSSN configuration and build commands 2026-04-25 09:19:50 +08:00
768345954f Add optional BSSN kernel profiling switches
(cherry picked from commit 9c31384b2f)
2026-04-25 08:39:43 +08:00
8e9463aa90 Localize chi Ricci intermediates in RHS
(cherry picked from commit 65e0f95f40)
2026-04-25 08:37:41 +08:00
7c6f15002e Elide dead stores in BSSN RHS hot path
(cherry picked from commit f9fbf97e64)
2026-04-25 08:37:40 +08:00
6410c62e3e Add fine-grained step timing and trim BH RHS overhead
(cherry picked from commit 968522995b)
2026-04-25 08:37:19 +08:00
11977eb82f Merge wave and mass extraction interpolation
(cherry picked from commit f3988ac8ca)
2026-04-25 08:25:34 +08:00
cce8a44fc4 Cache wave extraction angular kernels
(cherry picked from commit e4c25eb21f)
2026-04-25 08:24:36 +08:00
c589097618 Reuse mass integrand across detector radii
(cherry picked from commit 4b10519876)
2026-04-25 08:24:11 +08:00
b713e5a9be Batch constraint norm reductions
(cherry picked from commit 3a58273501)
2026-04-25 08:22:00 +08:00
0396701572 Optimize constraint refresh after regrid
(cherry picked from commit 5c65cea2f0)
2026-04-25 08:18:51 +08:00
bb20c9a876 fix ADM Constrant Violation Analysis 2026-04-15 19:19:16 +08:00
8fe60ea703 Add zero matter handling and interpolation for resident state in CUDA BSSN 2026-04-15 00:25:53 +08:00
9ab7e7c7f9 Fuse phases 5 and 6 for Gamma_rhs computation and optimize phases 8 and 9 for efficiency 2026-04-14 23:23:04 +08:00
f9119e8a2a Add resident-GA mode switch and simplify sync logic 2026-04-14 21:09:27 +08:00
726d743376 Fuse Ricci assembly and optimize trK/Aij gauge kernels 2026-04-14 19:20:12 +08:00