Hansung Kim
|
b3c328b1be
|
tensor: Assert minimum response queue depth with doc
|
2024-10-18 23:11:32 -07:00 |
|
Hansung Kim
|
e946403d78
|
tensor: Fix typo, reduce resp queue depth
|
2024-10-18 22:55:00 -07:00 |
|
Hansung Kim
|
0aadc6074a
|
tensor: Decouple A and B access states
Get rid of set/stepAccess states and let A and B access progress
independently.
|
2024-10-18 22:42:41 -07:00 |
|
Hansung Kim
|
c0292dd0aa
|
tensor: Enlarge operand buffer for A for better SMEM reuse
|
2024-10-18 21:53:24 -07:00 |
|
Hansung Kim
|
93c9bcc32f
|
tensor: Stage B as well for full throughput
|
2024-10-18 20:12:15 -07:00 |
|
Hansung Kim
|
c4b5a11fde
|
tensor: Replace staging logic for A with FillBuffer
|
2024-10-18 19:54:20 -07:00 |
|
Hansung Kim
|
7fab6f89ad
|
tensor: Properly route FillBuffer to DPU
|
2024-10-18 17:33:55 -07:00 |
|
Hansung Kim
|
91d9897c27
|
tensor: Write FillBuffer for tile buffering
|
2024-10-18 17:17:41 -07:00 |
|
Hansung Kim
|
c2f39f7474
|
tensor: Rename substepExecute
|
2024-10-18 16:21:43 -07:00 |
|
Hansung Kim
|
64ea48ace3
|
tensor: Consider data reuse for B memory request
B is reused every 4 steps because of the k->i->j iteration order.
|
2024-10-18 13:46:04 -07:00 |
|
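Note on 64ea48ace3: the reuse pattern described in that commit message can be illustrated with a small sketch. It assumes, hypothetically, that a B tile is indexed by (k, j) and that there are 4 j-steps per i-step. The function names and loop bounds are illustrative only and are not taken from the repository.

```python
from itertools import product

def b_tile_schedule(num_k, num_i, num_j):
    """B tile (k, j) touched at each step of a k -> i -> j loop nest.

    Assumption: B depends only on (k, j), so the i loop revisits the same tiles.
    """
    return [(k, j) for k, i, j in product(range(num_k), range(num_i), range(num_j))]

def reuse_gaps(schedule):
    """Set of step distances between consecutive uses of the same tile."""
    last_seen = {}
    gaps = set()
    for step, tile in enumerate(schedule):
        if tile in last_seen:
            gaps.add(step - last_seen[tile])
        last_seen[tile] = step
    return gaps

# Hypothetical sizes: 2 k-steps, 2 i-steps, 4 j-steps.
schedule = b_tile_schedule(num_k=2, num_i=2, num_j=4)
print(reuse_gaps(schedule))
```

Under these assumptions every B tile recurs exactly num_j steps after its previous use, so a memory request for B is only needed when the tile is first seen for a given k.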
Hansung Kim
|
a2519da58f
|
tensor: SMEM address generation
|
2024-10-17 16:36:18 -07:00 |
|
Hansung Kim
|
2741af0b2b
|
tensor: Keep set/step in the tag writeback queue
|
2024-10-17 15:44:05 -07:00 |
|
Hansung Kim
|
7de8e86d4f
|
tensor: Sync rd with DPU using a queue
|
2024-10-17 15:18:47 -07:00 |
|
Hansung Kim
|
8847278ad1
|
tensor: Instantiate actual DPU
|
2024-10-17 14:44:34 -07:00 |
|
Hansung Kim
|
e1e3ac8274
|
tensor: Fix busy state
|
2024-10-16 22:22:27 -07:00 |
|
Hansung Kim
|
23edc34c7e
|
tensor: Add two TLRAM config for full throughput test
|
2024-10-16 22:15:35 -07:00 |
|
Hansung Kim
|
6cad8edd18
|
tensor: Fix operand alignment in pipelining
|
2024-10-16 22:01:02 -07:00 |
|
Hansung Kim
|
77dae3e1f9
|
tensor: Write staging pipeline for A tile
|
2024-10-16 21:21:48 -07:00 |
|
Hansung Kim
|
444dd5d7e1
|
tensor: Add destination reg to IO
|
2024-10-16 14:25:38 -07:00 |
|
Hansung Kim
|
e2abe1cffd
|
tensor: Sequence set/steps in the execute-side
|
2024-10-15 19:12:45 -07:00 |
|
Hansung Kim
|
efaf599fbe
|
tensor: Assert alignment of A and B response queues
|
2024-10-15 17:08:32 -07:00 |
|
Hansung Kim
|
de393115cd
|
tensor: Translate TL response source to set/step tag
|
2024-10-15 16:50:40 -07:00 |
|
Hansung Kim
|
2ca2ee37b0
|
tensor: Fix writeback datawidth
|
2024-10-15 15:45:59 -07:00 |
|
Hansung Kim
|
ab8d3554bb
|
Bump vortex to tensor-decoupled
|
2024-10-15 15:45:52 -07:00 |
|
Hansung Kim
|
90949f488b
|
tensor: Add memory response queue
|
2024-10-14 22:36:18 -07:00 |
|
Hansung Kim
|
8d2e13b4ee
|
tensor: Hold step until req fired for both A and B
|
2024-10-14 22:07:20 -07:00 |
|
Hansung Kim
|
14a640bf2d
|
tensor: Do proper source generation
SourceGenerator keeps on givin'
|
2024-10-14 21:38:54 -07:00 |
|
Hansung Kim
|
bf6f7210b7
|
tensor: Generate TL traffic, separate edges for A and B
|
2024-10-14 21:15:07 -07:00 |
|
Hansung Kim
|
9ac8f2492c
|
tensor: Minimal diplomacy config for unittest
|
2024-10-14 20:54:33 -07:00 |
|
Hansung Kim
|
01f53a8be1
|
tensor: Sequence through set/steps
|
2024-10-14 20:20:30 -07:00 |
|
Hansung Kim
|
3165108c8b
|
Add synthesizable unit test for tensor
|
2024-10-14 19:47:00 -07:00 |
|
Hansung Kim
|
327615e330
|
Add state regs and init/writeback transition
|
2024-10-14 17:28:51 -07:00 |
|
Hansung Kim
|
6a3aa549d3
|
Add skeleton for Hopper Tensor Core
|
2024-10-14 15:02:25 -07:00 |
|
Hansung Kim
|
447977bd89
|
addResource hopper tensor core
|
2024-10-14 15:02:08 -07:00 |
|
Hansung Kim
|
b4bd9ecbc9
|
Dummy comment
|
2024-10-02 15:18:56 -07:00 |
|
Richard Yan
|
2929a84ecc
|
get smem params from key
|
2024-09-26 16:49:06 -07:00 |
|
Richard Yan
|
f11385218f
|
move virgo components into shared mem module, more cleanup
|
2024-09-26 14:41:46 -07:00 |
|
Richard Yan
|
998f73b54a
|
general cleanup
|
2024-09-24 18:17:00 -07:00 |
|
Richard Yan
|
3b8c9812b4
|
refactor smem counter
|
2024-09-24 17:24:52 -07:00 |
|
Richard Yan
|
85336399c2
|
refactor radiance cluster shared memory into components
|
2024-09-24 03:14:32 -07:00 |
|
Richard Yan
|
20cf4609b7
|
camelCase
|
2024-09-22 01:21:37 -07:00 |
|
Richard Yan
|
daacae9edc
|
fallback for hint select
|
2024-09-11 15:09:52 -07:00 |
|
Richard Yan
|
f1a1b77828
|
actually support large smem subbanks
|
2024-09-10 23:24:02 -07:00 |
|
Richard Yan
|
13142ab0b9
|
Merge branch 'main' of https://github.com/ucb-bar/radiance into main
|
2024-09-10 18:30:54 -07:00 |
|
Richard Yan
|
810db6a1ea
|
new crossbar w/ individual select and group hint, subbanks > num lanes support
|
2024-09-10 18:30:48 -07:00 |
|
Hansung Kim
|
b335132c34
|
Parameterize tensor core FP16
|
2024-09-10 15:38:12 -07:00 |
|
Hansung Kim
|
4b031d1ade
|
radiance.mk: Remove SMEM_LOG_SIZE override
|
2024-09-10 15:38:12 -07:00 |
|
Richard Yan
|
3fd0fd296b
|
queued cisc commands
|
2024-09-09 22:38:16 -07:00 |
|
Richard Yan
|
06edba2a78
|
fix comb loop & revert xbar temporarily
|
2024-09-09 02:27:08 -07:00 |
|
Richard Yan
|
afc6ba7eca
|
fix ext policy xbar, add rectangular tile support
|
2024-09-08 13:21:31 -07:00 |
|