Hansung Kim
0ba61aabb6
tensor: Instantiate correct fake tcore module according to parameter
...
This has to match what the Verilog source actually uses.
2024-10-27 18:48:44 -07:00
Hansung Kim
13b9577723
Instantiate fake tensor modules outside of Diplomacy
...
Adding them to the Diplomacy graph would widen the source widths,
which can have area implications.
This also removes the need for addResource() calls for the manually
generated Verilog files. Their module parameters must still be kept the
same as those used in the parent Verilog module, however.
2024-10-25 23:02:25 -07:00
Hansung Kim
543eb2feb4
tensor: Support FP16 in TensorCoreDecoupled
2024-10-25 22:26:04 -07:00
Hansung Kim
eed821eda6
tensor: Add test for 8-dim fp16 DPU
2024-10-25 21:57:28 -07:00
Hansung Kim
46a57fdf9b
tensor: Parameterize dimension in TensorDotProductUnit
2024-10-25 21:57:22 -07:00
Hansung Kim
51dfebb6a7
tensor: Support pipe = 1 in FillBuffer for higher throughput
2024-10-25 20:20:53 -07:00
Hansung Kim
d46a343239
tensor: Fix metadata of C req; fix dequeue / req gen timing
2024-10-25 19:13:42 -07:00
Hansung Kim
1a1a4a088d
tensor: Fix access state transition to consider C req
2024-10-25 18:23:51 -07:00
Hansung Kim
991025e896
tensor: Fix C reg being dropped by checking space in respQueueC
2024-10-25 18:10:35 -07:00
Hansung Kim
81efecb3c8
tensor: Fix timing of fullCTag
2024-10-25 17:29:35 -07:00
Hansung Kim
43e064fe82
tensor: Add access logic for C from regfile
2024-10-25 15:22:52 -07:00
Hansung Kim
fc5b864b86
Bump vortex; addResource tensor regfile if
2024-10-24 20:35:14 -07:00
Hansung Kim
31fa440000
Bump vortex
2024-10-24 15:25:12 -07:00
Hansung Kim
ccfb467587
Bump vortex
2024-10-24 15:24:28 -07:00
Hansung Kim
988f0e3174
smem: Disable sanity check on partialData
2024-10-24 15:24:28 -07:00
Hansung Kim
f989bfccc2
Add tensorCoreDecoupled param to WithRadianceCores
2024-10-24 15:24:28 -07:00
Richard Yan
68e715e284
fix unaligned port
2024-10-24 13:42:45 -07:00
Richard Yan
9b8d16d184
Merge branch 'main' of https://github.com/ucb-bar/radiance into main
2024-10-23 15:09:48 -07:00
Richard Yan
0a54018650
dual read port srams
2024-10-23 15:09:43 -07:00
Hansung Kim
2a8c488d28
tensor: Reassert initiate.ready as soon as access ready
2024-10-22 23:10:11 -07:00
Hansung Kim
95ecc5180f
tensor: Decouple warp in execute from access
...
This allows the access stage to accept new initiate back-to-back without
waiting for the previous writeback to finish.
2024-10-22 22:44:33 -07:00
Hansung Kim
072904a82b
Bump vortex
2024-10-22 22:06:24 -07:00
Hansung Kim
0a682fb6ef
tensor: dontTouch TensorDPU io
...
Prevents bits.c from being optimized out and set to Z in
TensorCoreDecoupled.
2024-10-22 17:55:14 -07:00
Hansung Kim
85eb5e334f
Bump vortex
2024-10-22 17:47:54 -07:00
Hansung Kim
b566748bcb
tensor: Address gen for block-wise contiguous layout
...
Necessary to meet 32B-alignment requirement for SMEM.
2024-10-22 17:17:08 -07:00
Hansung Kim
54ce0f7c34
tensor: Increase numSourceId to 16 to match RadianceTile
2024-10-22 17:08:38 -07:00
Hansung Kim
8818fc9203
tensor: Fix tagWidth for tensor mem io
2024-10-22 16:29:33 -07:00
Hansung Kim
c613341a77
Disable addPath for old Verilog; Deassert valid for tensor core
...
There's an uncaught TL source bug when the core is busy, which doesn't
really need to be fixed with this.
2024-10-22 15:02:55 -07:00
Hansung Kim
83c1e9a0be
Merge branch 'tensor-decoupled'
2024-10-22 14:35:44 -07:00
Hansung Kim
e705e8557f
Fake tensor core at RadianceTile for Verilog unique-ification
2024-10-22 14:33:10 -07:00
Hansung Kim
d705843c9c
Merge commit 'origin/main~1'
2024-10-21 22:41:03 -07:00
Hansung Kim
0fe2b3b07e
Bump vortex
2024-10-21 22:39:28 -07:00
Hansung Kim
408888ae8f
tensor: addPath()s for hopper-generated Chisel
...
FIXME: SourceGenerator has a name-clash.
2024-10-21 22:38:53 -07:00
Hansung Kim
a98cb32343
tensor: Inject stalls to A ram for fuzzing
2024-10-21 22:02:51 -07:00
Richard Yan
8307d8d154
emergency push
2024-10-21 13:50:26 -07:00
Hansung Kim
b3c328b1be
tensor: Assert minimum response queue depth with doc
2024-10-18 23:11:32 -07:00
Hansung Kim
e946403d78
tensor: Fix typo, reduce resp queue depth
2024-10-18 22:55:00 -07:00
Hansung Kim
0aadc6074a
tensor: Decouple A and B access states
...
Get rid of set/stepAccess states and let A and B access progress
independently.
2024-10-18 22:42:41 -07:00
Hansung Kim
c0292dd0aa
tensor: Enlarge operand buffer for A for better SMEM reuse
2024-10-18 21:53:24 -07:00
Hansung Kim
93c9bcc32f
tensor: Stage B as well for full throughput
2024-10-18 20:12:15 -07:00
Hansung Kim
c4b5a11fde
tensor: Replace staging logic for A with FillBuffer
2024-10-18 19:54:20 -07:00
Hansung Kim
7fab6f89ad
tensor: Properly route FillBuffer to DPU
2024-10-18 17:33:55 -07:00
Hansung Kim
91d9897c27
tensor: Write FillBuffer for tile buffering
2024-10-18 17:17:41 -07:00
Hansung Kim
c2f39f7474
tensor: Rename substepExecute
2024-10-18 16:21:43 -07:00
Hansung Kim
64ea48ace3
tensor: Consider data reuse for B memory request
...
B is reused every 4 steps because of the k->i->j iteration order.
2024-10-18 13:46:04 -07:00
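The reuse pattern named in the commit above can be illustrated with a small sketch. It is not taken from the repository. It assumes that k is the outermost loop and j the innermost, that the B operand tile is indexed by (k, j), and that there are 4 steps along j. The function names and tile counts are hypothetical.

```python
from itertools import product


def tile_schedule(num_k, num_i, num_j):
    """Yield (step, k, i, j) in k -> i -> j order (k outermost, j innermost)."""
    order = product(range(num_k), range(num_i), range(num_j))
    for step, (k, i, j) in enumerate(order):
        yield step, k, i, j


def b_reuse_distances(num_k, num_i, num_j):
    """Return the step distance between consecutive uses of each B tile.

    Assumption: the B tile depends only on (k, j), not on i.
    """
    last_seen = {}
    distances = []
    for step, k, i, j in tile_schedule(num_k, num_i, num_j):
        key = (k, j)
        if key in last_seen:
            distances.append(step - last_seen[key])
        last_seen[key] = step
    return distances


if __name__ == "__main__":
    # Hypothetical tile counts: 2 steps along k, 4 along i, 4 along j.
    print(set(b_reuse_distances(2, 4, 4)))
```

Under these assumptions, every B tile comes up again exactly num_j steps after its previous use, so a request for B only needs to go to memory when the tile is not already buffered.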
Hansung Kim
a2519da58f
tensor: SMEM address generation
2024-10-17 16:36:18 -07:00
Hansung Kim
2741af0b2b
tensor: Keep set/step in the tag writeback queue
2024-10-17 15:44:05 -07:00
Hansung Kim
7de8e86d4f
tensor: Sync rd with DPU using a queue
2024-10-17 15:18:47 -07:00
Richard Yan
ffdabf9184
add tag to tc smem interface, bump vortex
2024-10-17 14:49:11 -07:00
Hansung Kim
8847278ad1
tensor: Instantiate actual DPU
2024-10-17 14:44:34 -07:00