Hansung Kim
|
543eb2feb4
|
tensor: Support FP16 in TensorCoreDecoupled
|
2024-10-25 22:26:04 -07:00 |
|
Hansung Kim
|
eed821eda6
|
tensor: Add test for 8-dim fp16 DPU
|
2024-10-25 21:57:28 -07:00 |
|
Hansung Kim
|
46a57fdf9b
|
tensor: Parameterize dimension in TensorDotProductUnit
|
2024-10-25 21:57:22 -07:00 |
|
Hansung Kim
|
51dfebb6a7
|
tensor: Support pipe = 1 in FillBuffer for higher throughput
|
2024-10-25 20:20:53 -07:00 |
|
Hansung Kim
|
d46a343239
|
tensor: Fix metadata of C req; fix dequeue / req gen timing
|
2024-10-25 19:13:42 -07:00 |
|
Hansung Kim
|
1a1a4a088d
|
tensor: Fix access state transition to consider C req
|
2024-10-25 18:23:51 -07:00 |
|
Hansung Kim
|
991025e896
|
tensor: Fix C reg being dropped by checking space in respQueueC
|
2024-10-25 18:10:35 -07:00 |
|
Hansung Kim
|
81efecb3c8
|
tensor: Fix timing of fullCTag
|
2024-10-25 17:29:35 -07:00 |
|
Hansung Kim
|
43e064fe82
|
tensor: Add access logic for C from regfile
|
2024-10-25 15:22:52 -07:00 |
|
Hansung Kim
|
fc5b864b86
|
Bump vortex; addResource tensor regfile if
|
2024-10-24 20:35:14 -07:00 |
|
Hansung Kim
|
31fa440000
|
Bump vortex
|
2024-10-24 15:25:12 -07:00 |
|
Hansung Kim
|
ccfb467587
|
Bump vortex
|
2024-10-24 15:24:28 -07:00 |
|
Hansung Kim
|
988f0e3174
|
smem: Disable sanity check on partialData
|
2024-10-24 15:24:28 -07:00 |
|
Hansung Kim
|
f989bfccc2
|
Add tensorCoreDecoupled param to WithRadianceCores
|
2024-10-24 15:24:28 -07:00 |
|
Richard Yan
|
68e715e284
|
fix unaligned port
|
2024-10-24 13:42:45 -07:00 |
|
Richard Yan
|
9b8d16d184
|
Merge branch 'main' of https://github.com/ucb-bar/radiance into main
|
2024-10-23 15:09:48 -07:00 |
|
Richard Yan
|
0a54018650
|
dual read port srams
|
2024-10-23 15:09:43 -07:00 |
|
Hansung Kim
|
2a8c488d28
|
tensor: Reassert initiate.ready as soon as access ready
|
2024-10-22 23:10:11 -07:00 |
|
Hansung Kim
|
95ecc5180f
|
tensor: Decouple warp in execute from access
This allows the access stage to accept new initiate back-to-back without
waiting for the previous writeback to finish.
|
2024-10-22 22:44:33 -07:00 |
|
Hansung Kim
|
072904a82b
|
Bump vortex
|
2024-10-22 22:06:24 -07:00 |
|
Hansung Kim
|
0a682fb6ef
|
tensor: dontTouch TensorDPU io
Prevents bits.c from being optimized out and set to Z in
TensorCoreDecoupled.
|
2024-10-22 17:55:14 -07:00 |
|
Hansung Kim
|
85eb5e334f
|
Bump vortex
|
2024-10-22 17:47:54 -07:00 |
|
Hansung Kim
|
b566748bcb
|
tensor: Address gen for block-wise contiguous layout
Necessary to meet 32B-alignment requirement for SMEM.
|
2024-10-22 17:17:08 -07:00 |
|
Hansung Kim
|
54ce0f7c34
|
tensor: Increase numSourceId to 16 to match RadianceTile
|
2024-10-22 17:08:38 -07:00 |
|
Hansung Kim
|
8818fc9203
|
tensor: Fix tagWidth for tensor mem io
|
2024-10-22 16:29:33 -07:00 |
|
Hansung Kim
|
c613341a77
|
Disable addPath for old verilog; Deassert valid for tensor core
There's an uncaught TL source bug when the core is busy, which doesn't
really need to be fixed with this.
|
2024-10-22 15:02:55 -07:00 |
|
Hansung Kim
|
83c1e9a0be
|
Merge branch 'tensor-decoupled'
|
2024-10-22 14:35:44 -07:00 |
|
Hansung Kim
|
e705e8557f
|
Fake tensor core at RadianceTile for Verilog unique-ification
|
2024-10-22 14:33:10 -07:00 |
|
Hansung Kim
|
d705843c9c
|
Merge commit 'origin/main~1'
|
2024-10-21 22:41:03 -07:00 |
|
Hansung Kim
|
0fe2b3b07e
|
Bump vortex
|
2024-10-21 22:39:28 -07:00 |
|
Hansung Kim
|
408888ae8f
|
tensor: addPath()s for hopper generated chisel
FIXME: SourceGenerator has a name-clash.
|
2024-10-21 22:38:53 -07:00 |
|
Hansung Kim
|
a98cb32343
|
tensor: Inject stalls to A ram for fuzzing
|
2024-10-21 22:02:51 -07:00 |
|
Richard Yan
|
8307d8d154
|
emergency push
|
2024-10-21 13:50:26 -07:00 |
|
Hansung Kim
|
b3c328b1be
|
tensor: Assert minimum response queue depth with doc
|
2024-10-18 23:11:32 -07:00 |
|
Hansung Kim
|
e946403d78
|
tensor: Fix typo, reduce resp queue depth
|
2024-10-18 22:55:00 -07:00 |
|
Hansung Kim
|
0aadc6074a
|
tensor: Decouple A and B access states
Get rid of set/stepAccess states and let A and B access progress
independently.
|
2024-10-18 22:42:41 -07:00 |
|
Hansung Kim
|
c0292dd0aa
|
tensor: Enlarge operand buffer for A for better SMEM reuse
|
2024-10-18 21:53:24 -07:00 |
|
Hansung Kim
|
93c9bcc32f
|
tensor: Stage B as well for full throughput
|
2024-10-18 20:12:15 -07:00 |
|
Hansung Kim
|
c4b5a11fde
|
tensor: Replace staging logic for A with FillBuffer
|
2024-10-18 19:54:20 -07:00 |
|
Hansung Kim
|
7fab6f89ad
|
tensor: Properly route FillBuffer to DPU
|
2024-10-18 17:33:55 -07:00 |
|
Hansung Kim
|
91d9897c27
|
tensor: Write FillBuffer for tile buffering
|
2024-10-18 17:17:41 -07:00 |
|
Hansung Kim
|
c2f39f7474
|
tensor: Rename substepExecute
|
2024-10-18 16:21:43 -07:00 |
|
Hansung Kim
|
64ea48ace3
|
tensor: Consider data reuse for B memory request
B is reused every 4 steps because of the k->i->j iteration order.
|
2024-10-18 13:46:04 -07:00 |
|
Hansung Kim
|
a2519da58f
|
tensor: SMEM address generation
|
2024-10-17 16:36:18 -07:00 |
|
Hansung Kim
|
2741af0b2b
|
tensor: Keep set/step in the tag writeback queue
|
2024-10-17 15:44:05 -07:00 |
|
Hansung Kim
|
7de8e86d4f
|
tensor: Sync rd with DPU using a queue
|
2024-10-17 15:18:47 -07:00 |
|
Richard Yan
|
ffdabf9184
|
add tag to tc smem interface, bump vortex
|
2024-10-17 14:49:11 -07:00 |
|
Hansung Kim
|
8847278ad1
|
tensor: Instantiate actual DPU
|
2024-10-17 14:44:34 -07:00 |
|
Hansung Kim
|
e1e3ac8274
|
tensor: Fix busy state
|
2024-10-16 22:22:27 -07:00 |
|
Hansung Kim
|
23edc34c7e
|
tensor: Add two TLRAM config for full throughput test
|
2024-10-16 22:15:35 -07:00 |
|