763 Commits

Author SHA1 Message Date
Hansung Kim
6cad8edd18 tensor: Fix operand alignment in pipelining 2024-10-16 22:01:02 -07:00
Hansung Kim
77dae3e1f9 tensor: Write staging pipeline for A tile 2024-10-16 21:21:48 -07:00
Richard Yan
9e86007e90 add imp component to rad smem, add core serialized access, update 2p smem 2024-10-16 16:20:58 -07:00
Hansung Kim
444dd5d7e1 tensor: Add destination reg to IO 2024-10-16 14:25:38 -07:00
Hansung Kim
e2abe1cffd tensor: Sequence set/steps in the execute-side 2024-10-15 19:12:45 -07:00
Hansung Kim
efaf599fbe tensor: Assert alignment of A and B response queues 2024-10-15 17:08:32 -07:00
Hansung Kim
de393115cd tensor: Translate TL response source to set/step tag 2024-10-15 16:50:40 -07:00
Hansung Kim
2ca2ee37b0 tensor: Fix writeback datawidth 2024-10-15 15:45:59 -07:00
Hansung Kim
ab8d3554bb Bump vortex to tensor-decoupled 2024-10-15 15:45:52 -07:00
Hansung Kim
90949f488b tensor: Add memory response queue 2024-10-14 22:36:18 -07:00
Hansung Kim
8d2e13b4ee tensor: Hold step until req fired for both A and B 2024-10-14 22:07:20 -07:00
Hansung Kim
14a640bf2d tensor: Do proper source generation
SourceGenerator keeps on givin'
2024-10-14 21:38:54 -07:00
Hansung Kim
bf6f7210b7 tensor: Generate TL traffic, separate edges for A and B 2024-10-14 21:15:07 -07:00
Hansung Kim
9ac8f2492c tensor: Minimal diplomacy config for unittest 2024-10-14 20:54:33 -07:00
Hansung Kim
01f53a8be1 tensor: Sequence through set/steps 2024-10-14 20:20:30 -07:00
Hansung Kim
3165108c8b Add synthesizable unit test for tensor 2024-10-14 19:47:00 -07:00
Hansung Kim
327615e330 Add state regs and init/writeback transition 2024-10-14 17:28:51 -07:00
Hansung Kim
6a3aa549d3 Add skeleton for Hopper Tensor Core 2024-10-14 15:02:25 -07:00
Hansung Kim
447977bd89 addResource hopper tensor core 2024-10-14 15:02:08 -07:00
Richard Yan
0989d90dd2 connect tc nodes and maybe fix distributor node 2024-10-07 02:59:22 -07:00
Richard Yan
4f057c6994 Merge branch 'main' of https://github.com/ucb-bar/radiance into main 2024-10-05 02:48:48 -07:00
Richard Yan
c6df484c00 add tensor core read client 2024-10-05 02:48:47 -07:00
Hansung Kim
b4bd9ecbc9 Dummy comment 2024-10-02 15:18:56 -07:00
Richard Yan
2929a84ecc get smem params from key 2024-09-26 16:49:06 -07:00
Richard Yan
f11385218f move virgo components into shared mem module, more cleanup 2024-09-26 14:41:46 -07:00
Richard Yan
998f73b54a general cleanup 2024-09-24 18:17:00 -07:00
Richard Yan
3b8c9812b4 refactor smem counter 2024-09-24 17:24:52 -07:00
Richard Yan
85336399c2 refactor radiance cluster shared memory into components 2024-09-24 03:14:32 -07:00
Richard Yan
20cf4609b7 camelCase 2024-09-22 01:21:37 -07:00
Richard Yan
daacae9edc fallback for hint select 2024-09-11 15:09:52 -07:00
Richard Yan
f1a1b77828 actually support large smem subbanks 2024-09-10 23:24:02 -07:00
Richard Yan
13142ab0b9 Merge branch 'main' of https://github.com/ucb-bar/radiance into main 2024-09-10 18:30:54 -07:00
Richard Yan
810db6a1ea new crossbar w/ individual select and group hint, subbanks > num lanes support 2024-09-10 18:30:48 -07:00
Hansung Kim
b335132c34 Parameterize tensor core FP16 2024-09-10 15:38:12 -07:00
Hansung Kim
4b031d1ade radiance.mk: Remove SMEM_LOG_SIZE override 2024-09-10 15:38:12 -07:00
Richard Yan
3fd0fd296b queued cisc commands 2024-09-09 22:38:16 -07:00
Richard Yan
06edba2a78 fix comb loop & revert xbar temporarily 2024-09-09 02:27:08 -07:00
Richard Yan
afc6ba7eca fix ext policy xbar, add rectangular tile support 2024-09-08 13:21:31 -07:00
Richard Yan
378b3531d4 balanced shared memory across cores 2024-09-07 20:29:27 -07:00
Richard Yan
84972181a5 large smem size, fix single gemmini, bump vortex 2024-09-05 16:50:03 -07:00
Hansung Kim
24df14d7af Bump vortex 2024-08-28 16:23:49 -07:00
Hansung Kim
e31f25b432 Switch to FP32 tensor core for use in flash 2024-08-28 16:23:27 -07:00
Hansung Kim
ec0c8750d3 Bump vortex 2024-08-20 14:47:18 -07:00
Hansung Kim
2364cd213e Bump vortex 2024-08-15 13:40:09 -07:00
Hansung Kim
2e6221661a radiance.mk: Reenable LSU_DUP_DISABLE
LSU dedup is useful for sharedmem for which we don't have a coalescer
on, esp. when broadcasting a single value that's cached in smem to all
threads in the kernel.
2024-08-15 13:38:02 -07:00
Hansung Kim
d8823a0416 Add back generated verilog for FP32 TensorDPU 2024-08-12 19:52:13 -07:00
Hansung Kim
7b06c1778c Bump vortex 2024-08-07 11:30:09 -07:00
Hansung Kim
c1d95ff205 Revert rename 2024-08-07 11:29:42 -07:00
Hansung Kim
477f3955ed Update generated SV for tensordpu 2024-08-07 11:09:57 -07:00
Hansung Kim
32c7aed263 Fix fp exception by rounding right after MulRawFN 2024-08-07 11:09:55 -07:00