e7229dae27
Checkpoint wu arch cases before scalar spawn wrapper
2026-05-24 10:51:58 +08:00
8f7dba5920
Add Blackwell instructions test kernel and update linker script
- Add kernels/blackwell_insts/ with test kernel and input data
- Update linker script with extended memory layout
- Remove obsolete sgemm_tcore_blackwell Makefile
- Update VX_types.h and common.h
2026-05-06 14:50:28 +08:00
bcc566b621
Add Blackwell SGEMM kernel scaffolding
2026-04-25 10:15:31 +08:00
Virgo-AE Eval
71f713b9fc
Disable git pull for archive
Only use local refs in the archive for reproducibility.
2025-02-07 14:51:25 -08:00
Richard Yan
9847072eff
fix hexadecile
2025-01-31 02:02:18 -08:00
Richard Yan
f8c51669c1
fix toolchain env sh
2025-01-30 21:17:12 -08:00
Richard Yan
17a9d31be5
fix dma invocation
2025-01-30 15:33:58 -08:00
Hansung Kim
238b942133
Add missing library remake
2025-01-30 13:24:23 -08:00
Hansung Kim
2c1ac4e938
Do git pull to make sure up-to-date
2025-01-30 01:47:35 -08:00
Richard Yan
9cdee597b6
Merge branch 'ae' of https://github.com/richardyrh/virgo-kernels into ae
2025-01-30 01:34:29 -08:00
Richard Yan
dde3602046
disable prints for virgo gemm
2025-01-30 01:34:22 -08:00
Hansung Kim
6bdc6af607
Fix branch name and dims for flash script
2025-01-30 01:15:57 -08:00
Hansung Kim
b73147cd06
Add compile and operand generate script for flash
2025-01-30 01:04:20 -08:00
Hansung Kim
471f89e371
Add arg binary for flash
2025-01-30 01:02:12 -08:00
Hansung Kim
7e1fc54c97
Fix typo in path
2025-01-30 00:41:42 -08:00
Hansung Kim
50c8f1c410
Add operand generate script for tcore
2025-01-29 23:33:09 -08:00
Richard Yan
dc46135f66
fix compile tcore script
2025-01-29 23:31:09 -08:00
Richard Yan
91a82c9f0f
merge kernel changes from kernels-asplos-ae
2025-01-29 22:11:25 -08:00
Richard Yan
a61bf257ff
modify makefile to point to new locations
2025-01-29 21:27:59 -08:00
Richard Yan
0d842a5930
more renaming and cleanup
2025-01-29 21:22:41 -08:00
Richard Yan
f98cd9bc22
remove old ci
2025-01-29 20:39:47 -08:00
Richard Yan
d4b78377a1
fix virgo kernel scripts
2025-01-29 20:19:42 -08:00
Richard Yan
0e6bcf51f1
cleanup
2025-01-29 18:38:49 -08:00
Richard Yan
5ba132e87b
regression restructure
2025-01-29 18:30:32 -08:00
Hansung Kim
3de51577ef
Check-in gemmini headers instead of submodule
2025-01-29 17:10:37 -08:00
Richard Yan
e86aac3a6f
Merge branch 'new-cisc' into kernels-asplos-ae
2025-01-29 17:03:54 -08:00
Richard Yan
24894b1712
Merge branch 'new-cisc' of https://github.com/hansungk/vortex into new-cisc
2025-01-29 17:03:05 -08:00
Richard Yan
d47ef75614
update idle kernel
2025-01-29 17:00:08 -08:00
Richard Yan
ec41200845
updated no dma gemmini kernel
2025-01-29 16:59:44 -08:00
Hansung Kim
c26558bc93
Add fence before rescale
2025-01-28 23:48:02 -08:00
Hansung Kim
198a25cb16
Set NUM_CORES to 8 for Volta/Ampere
2025-01-28 22:49:36 -08:00
Hansung Kim
f2b5a3409d
Merge branch 'new-cisc' into kernels-asplos-ae
2025-01-28 21:18:12 -08:00
Richard Yan
8c45b8b4b7
Merge branch 'new-cisc' of https://github.com/hansungk/vortex-private into new-cisc
2025-01-28 17:14:49 -08:00
Hansung Kim
e43f3c02a9
sgemm_impl: FP_SIZE to 16
2025-01-28 17:06:04 -08:00
Richard Yan
b1e6495630
update kernels
2025-01-28 16:39:17 -08:00
Hansung Kim
d98a414765
Change gemmini_mmio.h to fp16 GEMM setting
2025-01-28 16:36:55 -08:00
Hansung Kim
e4c0bbd039
sgemm: Check-in argument binaries
2025-01-28 16:04:56 -08:00
Hansung Kim
45e9407c99
sgemm: Check-in argument binaries
2025-01-28 15:58:27 -08:00
Hansung Kim
9894efe6c9
Update toolchain env paths for dork
2025-01-28 15:04:14 -08:00
Hansung Kim
5ef4c8023e
sgemm_impl: Disable wmma fast store
Doesn't seem to have a big impact on tcore util.
2024-11-11 14:06:15 -08:00
Hansung Kim
7d7cb5f60a
flash: Disable perf loop multiplier
2024-11-10 22:44:02 -08:00
Hansung Kim
4448f31fdc
fence: Fix moving fence to start of loop
For unknown reasons, guarding the fence with a tid == 0 branch causes a
TL source ID re-used assertion. Just call the fence from all
threads/warps as a workaround. At least, all threads in a warp will
coalesce into one request.
2024-11-09 22:04:45 -08:00
Hansung Kim
cb916ead39
Fix potential bitwidth bug in compute API
2024-11-09 20:59:58 -08:00
Hansung Kim
68054689c9
flash: Move fence to start of loop; wrap all MMIO in one tid=0 branch
2024-11-09 20:59:26 -08:00
Hansung Kim
fcd8b0b892
flash: Disable rescale flag check
GEMM-II finishes much earlier than softmax, so this is not a problem.
2024-11-09 20:37:58 -08:00
Hansung Kim
1c9b022156
flash: Rename nowarpspec to default
2024-11-09 19:58:45 -08:00
Hansung Kim
8fe6d918f2
flash: Update tcore kernel to use new CISC
2024-11-09 19:49:20 -08:00
Hansung Kim
76a6aaf085
flash: doc update
2024-11-09 19:09:09 -08:00
Hansung Kim
673e07ed43
flash: Add non-warp-specialized gemmini flash kernel
2024-11-09 19:08:39 -08:00
Hansung Kim
ac42f2dbba
sgemm_gemmini_dma: Update with new compute API
2024-11-09 16:49:39 -08:00