Commit Graph

2487 Commits

Author SHA1 Message Date
0ad87bde81 Implement WU architecture support 2026-05-25 19:25:05 +08:00
323ed7d7e9 Update Vortex core for Blackwell tensor instructions
- Add Blackwell tensor core support in VX_tensor_blackwell_core.sv
- Update decode, execute, and dispatch logic for new instructions
- Extend VX_define.vh and VX_types.vh with Blackwell ISA definitions
2026-05-06 14:50:54 +08:00
cb912d3b8b Add Blackwell tensor RTL scaffolding 2026-04-25 10:15:31 +08:00
Hansung Kim
f1d0fac518 Change to 8-core Volta/Ampere config 2025-01-28 22:36:58 -08:00
Hansung Kim
c8529c4339 Disable EXT_T_HOPPER atm for flash runs 2024-11-08 21:52:52 -08:00
Hansung Kim
cf000afc8f tensor: Remove unused a[2] and a[3] ports for FP32 DPU 2024-11-08 14:34:47 -08:00
Richard Yan
8dc2a25e32 oopsie 3 2024-11-02 14:52:43 -07:00
Richard Yan
d794e055b6 oopsie 2 2024-11-02 14:51:52 -07:00
Richard Yan
ed61418ebf oopsie 2024-11-01 02:55:38 -07:00
Richard Yan
2e3ea060a5 gate operand read 2024-11-01 02:44:35 -07:00
Hansung Kim
ef902614ff tensor: Fix race in inflight_tensor counter 2024-10-29 14:14:31 -07:00
Hansung Kim
1013a74abd tensor: Switch back to hopper + 4 cores 2024-10-28 23:39:39 -07:00
Hansung Kim
19876ab9fd tensor: Fix wrong writeback bit 2024-10-28 21:47:25 -07:00
Hansung Kim
8a66b5ed89 tensor: Connect SMEM addr/rf IO 2024-10-28 19:42:02 -07:00
Hansung Kim
4376bd33a2 tensor: Decode rs1/rs2 of HGMMA for smem addresses 2024-10-28 19:41:37 -07:00
Hansung Kim
72db04cec0 tensor: Switch to 8cores, non-hopper config 2024-10-27 19:47:22 -07:00
Hansung Kim
3e67ddd6c6 tensor: Properly guard tc_rf_if for non-hopper 2024-10-27 17:55:09 -07:00
Hansung Kim
1bc4afe2bb tensor: Bore tensor regfile IO to execute units 2024-10-24 20:32:18 -07:00
Hansung Kim
c88fd89f1f tensor: Don't make initiate_valid depend on ready 2024-10-24 19:29:21 -07:00
Richard Yan
b64e53ff02 Merge branch 'rtl' of github.com:hansungk/vortex-private into rtl 2024-10-24 16:51:22 -07:00
Richard Yan
155cbb0abc tc rf read port 2024-10-24 16:51:15 -07:00
Hansung Kim
40565de8cd tensor: Fix initiate sync with meta queue when !commit.ready 2024-10-24 16:41:54 -07:00
Hansung Kim
3ebeb43568 tensor: Fix inflight_tensor decrement, add under/overflow checks 2024-10-24 14:36:29 -07:00
Hansung Kim
8337488ed3 tensor: Don't check invalid writeback reg for ghost writes 2024-10-24 14:36:18 -07:00
Hansung Kim
e855a47295 Add missing commit_if.tensor bit inits 2024-10-24 13:28:30 -07:00
Hansung Kim
c77a25c968 tensor: Add missing HOPPER guard 2024-10-23 20:33:45 -07:00
Hansung Kim
78df981366 tensor: Simplify metadata queue
Enqueue all different-warp reqs into the queue. There is a slight chance
that an HGMMA_WAIT might be blocked from commit when there are multiple
different-warp HGMMAs blocking the dequeue end, but it should be
uncommon.
2024-10-22 22:01:18 -07:00
Hansung Kim
69cbbdd89b tensor: Consider inflight ops for HGMMA blocking
This allows for back-to-back issue of HGMMA past the scoreboard, which
helps to minimize downtime in DPU activity in-between operations.
HGMMA_WAIT now only unblocks when *all* previous HGMMAs have finished
writeback.
2024-10-22 21:32:33 -07:00
Hansung Kim
98eb7cb594 tensor: Block both HGMMA/HGMMA_WAIT at scoreboard
If we let back-to-back HGMMAs pass at scoreboard, we can't accurately
keep track of the busy state of the tensor core and block WAITs
accordingly.

TODO: Distinguish "ready-to-fire" from "ready-to-use-writeback".
2024-10-22 21:10:55 -07:00
Hansung Kim
83979c3341 tensor: Fully connect writeback IO 2024-10-22 20:17:00 -07:00
Hansung Kim
47dff74d3a tensor: Fix commit/metadata logic for HGMMA
Block HGMMA commit until previous ones are all done; always commit
HGMMA_WAIT after it passes the scoreboard.
2024-10-22 20:01:37 -07:00
Hansung Kim
3abaaff16f tensor: Fix tag and data assignment for p0/p1 bus 2024-10-22 17:47:04 -07:00
Hansung Kim
8a8f682194 tensor: Bore smem IO from core to tensor core 2024-10-22 17:42:30 -07:00
Hansung Kim
9131558950 tensor: Connect Chisel-generated TensorCoreDecoupled module
Elaborates, but most of the IOs are tied to fake.
2024-10-22 15:16:24 -07:00
Hansung Kim
32ccdeef01 Merge branch 'tensor-decoupled' into rtl 2024-10-21 22:57:07 -07:00
Hansung Kim
0f06afc3ef Update doc 2024-10-21 22:37:20 -07:00
Richard Yan
cde8da1f3b add tag to tc smem interface 2024-10-17 14:48:39 -07:00
Hansung Kim
4dcbc31a88 tensor: Separate async commit from tensor commit
With this we can prioritize commit of the async hgmma instructions over
the "ghost" commits from the TC.
2024-10-11 21:32:20 -07:00
Hansung Kim
717fe7ff29 tensor: Fix FSM when commit not ready 2024-10-11 20:24:31 -07:00
Hansung Kim
2934b1bd94 tensor: Split execution module from pipeline logic 2024-10-11 20:09:09 -07:00
Hansung Kim
f7f23e0c05 tensor: Doc update 2024-10-11 18:00:36 -07:00
Hansung Kim
42b9d23f83 tensor: Write release logic for hgmma
Upon completion of an op, tensor_core_hopper sends a "ghost" commit
signal down the pipeline with the `wb` and `tensor` bit set in
commit_if.  The scoreboard receives this signal via writeback_if and
resets the inuse_tensor status bit back to zero, which unblocks the
HGMMA_WAIT instruction.
2024-10-11 17:58:44 -07:00
Hansung Kim
408a9b5d2a tensor: Write stall logic for hgmma_wait
HGMMA_WAIT instruction stalls at issue when inuse_tensor is set, which
is done by the previous HGMMA insn. Currently inuse_tensor is never set
back to zero.
2024-10-11 17:18:01 -07:00
Hansung Kim
72f9dedce3 tensor: Disable micro-ops for hopper
Have a uarch FSM handle the stepping mechanism entirely.
2024-10-11 15:59:31 -07:00
Hansung Kim
100d69ef21 Doc update on accumulator regs 2024-10-11 15:47:58 -07:00
Hansung Kim
d9ad4809ec Add 'tensor' bit to commit_if and writeback_if
For use in the asynchronous tensor instruction.  When 1'b1, sets/unsets
the inuse_tensor status bit in the scoreboard to signal
kickoff/completion of the asynchronous tensor op.
2024-10-11 15:42:25 -07:00
Hansung Kim
58c9761829 Revert decode change for hopper
Share the same insn as non-hopper TC.
2024-10-09 21:53:04 -07:00
Hansung Kim
7ab14445f0 tensor: Test many-commit per execute with an FSM
Trick is to set commit_if.data.eop to 0, since the commit module only
signals instruction completion to VX_schedule if the eop bit is 1.
Otherwise it underflows the pending_instr buffer.

The same eop trick works for VX_scoreboard, which works around the
invalid rd writeback error.
2024-10-07 21:29:44 -07:00
Hansung Kim
e8ca4677df Remove old code for pending_instr underflow fix 2024-10-07 20:21:35 -07:00
Hansung Kim
4cac1adf7d Add dummy code for decoupled Hopper tensor core
Define EXT_T_HOPPER that, when EXT_T_ENABLE is defined, distinguishes
whether to instantiate core-coupled Volta-style or decoupled
Hopper-style Tensor Core.
2024-10-07 17:10:59 -07:00