Hansung Kim
6cad8edd18
tensor: Fix operand alignment in pipelining
2024-10-16 22:01:02 -07:00
Hansung Kim
77dae3e1f9
tensor: Write staging pipeline for A tile
2024-10-16 21:21:48 -07:00
Richard Yan
9e86007e90
add imp component to rad smem, add core serialized access, update 2p smem
2024-10-16 16:20:58 -07:00
Hansung Kim
444dd5d7e1
tensor: Add destination reg to IO
2024-10-16 14:25:38 -07:00
Hansung Kim
e2abe1cffd
tensor: Sequence set/steps in the execute-side
2024-10-15 19:12:45 -07:00
Hansung Kim
efaf599fbe
tensor: Assert alignment of A and B response queues
2024-10-15 17:08:32 -07:00
Hansung Kim
de393115cd
tensor: Translate TL response source to set/step tag
2024-10-15 16:50:40 -07:00
Hansung Kim
2ca2ee37b0
tensor: Fix writeback datawidth
2024-10-15 15:45:59 -07:00
Hansung Kim
ab8d3554bb
Bump vortex to tensor-decoupled
2024-10-15 15:45:52 -07:00
Hansung Kim
90949f488b
tensor: Add memory response queue
2024-10-14 22:36:18 -07:00
Hansung Kim
8d2e13b4ee
tensor: Hold step until req fired for both A and B
2024-10-14 22:07:20 -07:00
Hansung Kim
14a640bf2d
tensor: Do proper source generation
SourceGenerator keeps on givin'
2024-10-14 21:38:54 -07:00
Hansung Kim
bf6f7210b7
tensor: Generate TL traffic, separate edges for A and B
2024-10-14 21:15:07 -07:00
Hansung Kim
9ac8f2492c
tensor: Minimal diplomacy config for unittest
2024-10-14 20:54:33 -07:00
Hansung Kim
01f53a8be1
tensor: Sequence through set/steps
2024-10-14 20:20:30 -07:00
Hansung Kim
3165108c8b
Add synthesizable unit test for tensor
2024-10-14 19:47:00 -07:00
Hansung Kim
327615e330
Add state regs and init/writeback transition
2024-10-14 17:28:51 -07:00
Hansung Kim
6a3aa549d3
Add skeleton for Hopper Tensor Core
2024-10-14 15:02:25 -07:00
Hansung Kim
447977bd89
addResource hopper tensor core
2024-10-14 15:02:08 -07:00
Richard Yan
0989d90dd2
connect tc nodes and maybe fix distributor node
2024-10-07 02:59:22 -07:00
Richard Yan
4f057c6994
Merge branch 'main' of https://github.com/ucb-bar/radiance into main
2024-10-05 02:48:48 -07:00
Richard Yan
c6df484c00
add tensor core read client
2024-10-05 02:48:47 -07:00
Hansung Kim
b4bd9ecbc9
Dummy comment
2024-10-02 15:18:56 -07:00
Richard Yan
2929a84ecc
get smem params from key
2024-09-26 16:49:06 -07:00
Richard Yan
f11385218f
move virgo components into shared mem module, more cleanup
2024-09-26 14:41:46 -07:00
Richard Yan
998f73b54a
general cleanup
2024-09-24 18:17:00 -07:00
Richard Yan
3b8c9812b4
refactor smem counter
2024-09-24 17:24:52 -07:00
Richard Yan
85336399c2
refactor radiance cluster shared memory into components
2024-09-24 03:14:32 -07:00
Richard Yan
20cf4609b7
camelCase
2024-09-22 01:21:37 -07:00
Richard Yan
daacae9edc
fallback for hint select
2024-09-11 15:09:52 -07:00
Richard Yan
f1a1b77828
actually support large smem subbanks
2024-09-10 23:24:02 -07:00
Richard Yan
13142ab0b9
Merge branch 'main' of https://github.com/ucb-bar/radiance into main
2024-09-10 18:30:54 -07:00
Richard Yan
810db6a1ea
new crossbar w/ individual select and group hint, subbanks > num lanes support
2024-09-10 18:30:48 -07:00
Hansung Kim
b335132c34
Parameterize tensor core FP16
2024-09-10 15:38:12 -07:00
Hansung Kim
4b031d1ade
radiance.mk: Remove SMEM_LOG_SIZE override
2024-09-10 15:38:12 -07:00
Richard Yan
3fd0fd296b
queued cisc commands
2024-09-09 22:38:16 -07:00
Richard Yan
06edba2a78
fix comb loop & revert xbar temporarily
2024-09-09 02:27:08 -07:00
Richard Yan
afc6ba7eca
fix ext policy xbar, add rectangular tile support
2024-09-08 13:21:31 -07:00
Richard Yan
378b3531d4
balanced shared memory across cores
2024-09-07 20:29:27 -07:00
Richard Yan
84972181a5
large smem size, fix single gemmini, bump vortex
2024-09-05 16:50:03 -07:00
Hansung Kim
24df14d7af
Bump vortex
2024-08-28 16:23:49 -07:00
Hansung Kim
e31f25b432
Switch to FP32 tensor core for use in flash
2024-08-28 16:23:27 -07:00
Hansung Kim
ec0c8750d3
Bump vortex
2024-08-20 14:47:18 -07:00
Hansung Kim
2364cd213e
Bump vortex
2024-08-15 13:40:09 -07:00
Hansung Kim
2e6221661a
radiance.mk: Reenable LSU_DUP_DISABLE
LSU dedup is useful for sharedmem for which we don't have a coalescer
on, esp. when broadcasting a single value that's cached in smem to all
threads in the kernel.
2024-08-15 13:38:02 -07:00
Hansung Kim
d8823a0416
Add back generated verilog for FP32 TensorDPU
2024-08-12 19:52:13 -07:00
Hansung Kim
7b06c1778c
Bump vortex
2024-08-07 11:30:09 -07:00
Hansung Kim
c1d95ff205
Revert rename
2024-08-07 11:29:42 -07:00
Hansung Kim
477f3955ed
Update generated SV for tensordpu
2024-08-07 11:09:57 -07:00
Hansung Kim
32c7aed263
Fix fp exception by rounding right after MulRawFN
2024-08-07 11:09:55 -07:00