Commit Graph

2677 Commits

Author SHA1 Message Date
Hansung Kim
7f4adfaaa2 Increase SMEM size for flash 2025-01-30 01:20:17 -08:00
Hansung Kim
30001b7677 Merge branch 'ae' into ae-flash-ampere 2025-01-30 01:16:21 -08:00
Hansung Kim
6bdc6af607 Fix branch name and dims for flash script 2025-01-30 01:15:57 -08:00
Hansung Kim
5ab2cc4334 Switch to fp32 for flash 2025-01-30 01:12:55 -08:00
Hansung Kim
bc64474114 Update config for flash ampere 2025-01-30 01:06:54 -08:00
Hansung Kim
b73147cd06 Add compile and operand generate script for flash 2025-01-30 01:04:20 -08:00
Hansung Kim
471f89e371 Add arg binary for flash 2025-01-30 01:02:12 -08:00
Hansung Kim
7e1fc54c97 Fix typo in path 2025-01-30 00:41:42 -08:00
Hansung Kim
50c8f1c410 Add operand generate script for tcore 2025-01-29 23:33:09 -08:00
Richard Yan
dc46135f66 fix compile tcore script 2025-01-29 23:31:09 -08:00
Richard Yan
91a82c9f0f merge kernel changes from kernels-asplos-ae 2025-01-29 22:11:25 -08:00
Richard Yan
a61bf257ff modify makefile to point to new locations 2025-01-29 21:27:59 -08:00
Richard Yan
0d842a5930 more renaming and cleanup 2025-01-29 21:22:41 -08:00
Richard Yan
f98cd9bc22 remove old ci 2025-01-29 20:39:47 -08:00
Richard Yan
d4b78377a1 fix virgo kernel scripts 2025-01-29 20:19:42 -08:00
Richard Yan
0e6bcf51f1 cleanup 2025-01-29 18:38:49 -08:00
Richard Yan
5ba132e87b regression restructure 2025-01-29 18:30:32 -08:00
Hansung Kim
3de51577ef Check-in gemmini headers instead of submodule 2025-01-29 17:10:37 -08:00
Richard Yan
e86aac3a6f Merge branch 'new-cisc' into kernels-asplos-ae 2025-01-29 17:03:54 -08:00
Richard Yan
24894b1712 Merge branch 'new-cisc' of https://github.com/hansungk/vortex into new-cisc 2025-01-29 17:03:05 -08:00
Richard Yan
d47ef75614 update idle kernel 2025-01-29 17:00:08 -08:00
Richard Yan
ec41200845 updated no dma gemmini kernel 2025-01-29 16:59:44 -08:00
Hansung Kim
c26558bc93 Add fence before rescale 2025-01-28 23:48:02 -08:00
Hansung Kim
198a25cb16 Set NUM_CORES to 8 for Volta/Ampere 2025-01-28 22:49:36 -08:00
Hansung Kim
f2b5a3409d Merge branch 'new-cisc' into kernels-asplos-ae 2025-01-28 21:18:12 -08:00
Richard Yan
8c45b8b4b7 Merge branch 'new-cisc' of https://github.com/hansungk/vortex-private into new-cisc 2025-01-28 17:14:49 -08:00
Hansung Kim
e43f3c02a9 sgemm_impl: FP_SIZE to 16 2025-01-28 17:06:04 -08:00
Richard Yan
b1e6495630 update kernels 2025-01-28 16:39:17 -08:00
Hansung Kim
d98a414765 Change gemmini_mmio.h to fp16 GEMM setting 2025-01-28 16:36:55 -08:00
Hansung Kim
e4c0bbd039 sgemm: Check-in argument binaries 2025-01-28 16:04:56 -08:00
Hansung Kim
45e9407c99 sgemm: Check-in argument binaries 2025-01-28 15:58:27 -08:00
Hansung Kim
9894efe6c9 Update toolchain env paths for dork 2025-01-28 15:04:14 -08:00
Hansung Kim
5ef4c8023e sgemm_impl: Disable wmma fast store
Doesn't seem to have a big impact on tcore util.
2024-11-11 14:06:15 -08:00
Hansung Kim
7d7cb5f60a flash: Disable perf loop multiplier 2024-11-10 22:44:02 -08:00
Hansung Kim
4448f31fdc fence: Fix moving fence to start of loop
For unknown reasons, guarding the fence with a tid == 0 branch causes a
TL source ID re-used assertion.  Just call the fence from all
thread/warps as a workaround.  At least, all threads in a warp will
coalesce into one request.
2024-11-09 22:04:45 -08:00
Hansung Kim
cb916ead39 Fix potential bitwidth bug in compute API 2024-11-09 20:59:58 -08:00
Hansung Kim
68054689c9 flash: Move fence to start of loop; wrap all MMIO in one tid=0 branch 2024-11-09 20:59:26 -08:00
Hansung Kim
fcd8b0b892 flash: Disable rescale flag check
GEMM-II finishes much earlier than softmax for this to be a problem.
2024-11-09 20:37:58 -08:00
Hansung Kim
1c9b022156 flash: Rename nowarpspec to default 2024-11-09 19:58:45 -08:00
Hansung Kim
8fe6d918f2 flash: Update tcore kernel to use new CISC 2024-11-09 19:49:20 -08:00
Hansung Kim
76a6aaf085 flash: doc update 2024-11-09 19:09:09 -08:00
Hansung Kim
673e07ed43 flash: Add non-warp-specialized gemmini flash kernel 2024-11-09 19:08:39 -08:00
Hansung Kim
ac42f2dbba sgemm_gemmini_dma: Update with new compute API 2024-11-09 16:49:39 -08:00
Hansung Kim
ad75561efe flash: Reduce fence calls to improve util 2024-11-09 16:44:17 -08:00
Hansung Kim
6990fcc1e6 Add compute-and-mvout-to-spad API 2024-11-09 16:43:45 -08:00
Hansung Kim
952b8debbb flash: Update to use new CISC interface 2024-11-09 16:21:34 -08:00
Hansung Kim
dc89309ad0 Merge branch 'kernels-flash' into new-cisc 2024-11-09 14:42:46 -08:00
Hansung Kim
365b1d8e67 flash: Add begin end markers 2024-11-09 10:19:16 -08:00
Hansung Kim
1e3d476e70 Switch header configs to flash 2024-11-08 21:56:42 -08:00
Richard Yan
c114a7a4ab new gemm kernel 2024-11-08 20:55:27 -08:00