Commit Graph

715 Commits

Author SHA1 Message Date
Hansung Kim
0ba61aabb6 tensor: Instantiate correct fake tcore module according to parameter
This has to align with what the Verilog source actually uses.
2024-10-27 18:48:44 -07:00
Hansung Kim
13b9577723 Instantiate fake tensor modules outside of diplomacy
Adding them to the Diplomacy graph causes source widths to widen,
which can have area implications.

This removes the need for addResource() calls on the manually
generated Verilog files.  Their module parameters should still be kept
the same as those used in the parent Verilog module.
2024-10-25 23:02:25 -07:00
Hansung Kim
543eb2feb4 tensor: Support FP16 in TensorCoreDecoupled 2024-10-25 22:26:04 -07:00
Hansung Kim
eed821eda6 tensor: Add test for 8-dim fp16 DPU 2024-10-25 21:57:28 -07:00
Hansung Kim
46a57fdf9b tensor: Parameterize dimension in TensorDotProductUnit 2024-10-25 21:57:22 -07:00
Hansung Kim
51dfebb6a7 tensor: Support pipe = 1 in FillBuffer for higher throughput 2024-10-25 20:20:53 -07:00
Hansung Kim
d46a343239 tensor: Fix metadata of C req; fix dequeue / req gen timing 2024-10-25 19:13:42 -07:00
Hansung Kim
1a1a4a088d tensor: Fix access state transition to consider C req 2024-10-25 18:23:51 -07:00
Hansung Kim
991025e896 tensor: Fix C req being dropped by checking space in respQueueC 2024-10-25 18:10:35 -07:00
Hansung Kim
81efecb3c8 tensor: Fix timing of fullCTag 2024-10-25 17:29:35 -07:00
Hansung Kim
43e064fe82 tensor: Add access logic for C from regfile 2024-10-25 15:22:52 -07:00
Hansung Kim
fc5b864b86 Bump vortex; addResource tensor regfile if 2024-10-24 20:35:14 -07:00
Hansung Kim
31fa440000 Bump vortex 2024-10-24 15:25:12 -07:00
Hansung Kim
ccfb467587 Bump vortex 2024-10-24 15:24:28 -07:00
Hansung Kim
988f0e3174 smem: Disable sanity check on partialData 2024-10-24 15:24:28 -07:00
Hansung Kim
f989bfccc2 Add tensorCoreDecoupled param to WithRadianceCores 2024-10-24 15:24:28 -07:00
Richard Yan
68e715e284 fix unaligned port 2024-10-24 13:42:45 -07:00
Richard Yan
9b8d16d184 Merge branch 'main' of https://github.com/ucb-bar/radiance into main 2024-10-23 15:09:48 -07:00
Richard Yan
0a54018650 dual read port srams 2024-10-23 15:09:43 -07:00
Hansung Kim
2a8c488d28 tensor: Reassert initiate.ready as soon as access ready 2024-10-22 23:10:11 -07:00
Hansung Kim
95ecc5180f tensor: Decouple warp in execute from access
This allows the access stage to accept new initiate back-to-back without
waiting for the previous writeback to finish.
2024-10-22 22:44:33 -07:00
Hansung Kim
072904a82b Bump vortex 2024-10-22 22:06:24 -07:00
Hansung Kim
0a682fb6ef tensor: dontTouch TensorDPU io
Prevents bits.c from being optimized out and set to Z in
TensorCoreDecoupled.
2024-10-22 17:55:14 -07:00
Hansung Kim
85eb5e334f Bump vortex 2024-10-22 17:47:54 -07:00
Hansung Kim
b566748bcb tensor: Address gen for block-wise contiguous layout
Necessary to meet 32B-alignment requirement for SMEM.
2024-10-22 17:17:08 -07:00
Hansung Kim
54ce0f7c34 tensor: Increase numSourceId to 16 to match RadianceTile 2024-10-22 17:08:38 -07:00
Hansung Kim
8818fc9203 tensor: Fix tagWidth for tensor mem io 2024-10-22 16:29:33 -07:00
Hansung Kim
c613341a77 Disable addPath for old verilog; Deassert valid for tensor core
There's an uncaught TL source bug when the core is busy, which doesn't
really need to be fixed with this.
2024-10-22 15:02:55 -07:00
Hansung Kim
83c1e9a0be Merge branch 'tensor-decoupled' 2024-10-22 14:35:44 -07:00
Hansung Kim
e705e8557f Fake tensor core at RadianceTile for Verilog unique-ification 2024-10-22 14:33:10 -07:00
Hansung Kim
d705843c9c Merge commit 'origin/main~1' 2024-10-21 22:41:03 -07:00
Hansung Kim
0fe2b3b07e Bump vortex 2024-10-21 22:39:28 -07:00
Hansung Kim
408888ae8f tensor: addPath()s for hopper generated chisel
FIXME: SourceGenerator has a name-clash.
2024-10-21 22:38:53 -07:00
Hansung Kim
a98cb32343 tensor: Inject stalls to A ram for fuzzing 2024-10-21 22:02:51 -07:00
Richard Yan
8307d8d154 emergency push 2024-10-21 13:50:26 -07:00
Hansung Kim
b3c328b1be tensor: Assert minimum response queue depth with doc 2024-10-18 23:11:32 -07:00
Hansung Kim
e946403d78 tensor: Fix typo, reduce resp queue depth 2024-10-18 22:55:00 -07:00
Hansung Kim
0aadc6074a tensor: Decouple A and B access states
Get rid of set/stepAccess states and let A and B access progress
independently.
2024-10-18 22:42:41 -07:00
Hansung Kim
c0292dd0aa tensor: Enlarge operand buffer for A for better SMEM reuse 2024-10-18 21:53:24 -07:00
Hansung Kim
93c9bcc32f tensor: Stage B as well for full throughput 2024-10-18 20:12:15 -07:00
Hansung Kim
c4b5a11fde tensor: Replace staging logic for A with FillBuffer 2024-10-18 19:54:20 -07:00
Hansung Kim
7fab6f89ad tensor: Properly route FillBuffer to DPU 2024-10-18 17:33:55 -07:00
Hansung Kim
91d9897c27 tensor: Write FillBuffer for tile buffering 2024-10-18 17:17:41 -07:00
Hansung Kim
c2f39f7474 tensor: Rename substepExecute 2024-10-18 16:21:43 -07:00
Hansung Kim
64ea48ace3 tensor: Consider data reuse for B memory request
B is reused every 4 steps because of the k->i->j iteration order.
2024-10-18 13:46:04 -07:00
Hansung Kim
a2519da58f tensor: SMEM address generation 2024-10-17 16:36:18 -07:00
Hansung Kim
2741af0b2b tensor: Keep set/step in the tag writeback queue 2024-10-17 15:44:05 -07:00
Hansung Kim
7de8e86d4f tensor: Sync rd with DPU using a queue 2024-10-17 15:18:47 -07:00
Richard Yan
ffdabf9184 add tag to tc smem interface, bump vortex 2024-10-17 14:49:11 -07:00
Hansung Kim
8847278ad1 tensor: Instantiate actual DPU 2024-10-17 14:44:34 -07:00