Commit Graph

21 Commits

Author SHA1 Message Date
Hansung Kim
cb916ead39 Fix potential bitwidth bug in compute API 2024-11-09 20:59:58 -08:00
Hansung Kim
6990fcc1e6 Add compute-and-mvout-to-spad API 2024-11-09 16:43:45 -08:00
Hansung Kim
dc89309ad0 Merge branch 'kernels-flash' into new-cisc 2024-11-09 14:42:46 -08:00
Hansung Kim
1e3d476e70 Switch header configs to flash 2024-11-08 21:56:42 -08:00
Richard Yan
c114a7a4ab new gemm kernel 2024-11-08 20:55:27 -08:00
Hansung Kim
6f6ee5616f Add convergent attribute to vx_barrier
Note this attribute is only supported by Clang, so this will only be
applied to the kernel binary but not runtime.
2024-10-02 10:57:45 -07:00
Hansung Kim
4d6cdeb00b Fallback to 4 cores for flash 2024-09-07 17:40:49 -07:00
Hansung Kim
87a1c2bbfc Cores per cluster 4 to 8 2024-09-05 16:23:32 -07:00
Hansung Kim
dcd69ea304 Increase SMEM size to 256KB 2024-09-05 16:23:32 -07:00
Richard Yan
ea4819702e oopsie doopsie 2024-08-06 02:43:27 -07:00
Richard Yan
f73029889b oopsie 2024-06-12 13:34:19 -07:00
Richard Yan
7cf59c9480 dma and demo kernels 2024-06-07 18:11:19 -07:00
Richard Yan
33066af56e cisc gemmini 2024-05-08 15:46:20 -07:00
Richard Yan
01f4a69ae9 dma mvout, double buffering & other opts 2024-04-28 01:18:51 -07:00
Richard Yan
449d99f0bb dram gemm kernel 2024-04-16 17:15:22 -07:00
Hansung Kim
870846f20f vx_spawn.c: Create separate vx_spawn_tasks_contiguous 2024-03-27 15:38:52 -07:00
Hansung Kim
3729a05adc vx_spawn.c: Separate cluster-based scheduling code from original 2024-03-26 16:36:57 -07:00
Hansung Kim
f050a08d77 Write vx_spawn_tasks_cluster
This scheduling logic tries to evenly distribute warps across *all* cores,
instead of trying to fill up the first cores as much as possible.  This scheme
is necessary for the intra-cluster cores which are assumed to have equal
workloads distributed.
2024-03-26 10:45:14 -07:00
Hansung Kim
7d177492b2 Move CORES_PER_CLUSTER to vx_spawn.h 2024-03-24 01:45:30 -07:00
Blaise Tine
65ca0fff3a minor update 2023-10-20 00:48:05 -07:00
Blaise Tine
d47cccc157 Vortex 2.0 changes:
+ Microarchitecture optimizations
+ 64-bit support
+ Xilinx FPGA support
+ LLVM-16 support
+ Refactoring and quality control fixes
2023-10-19 20:51:22 -07:00
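
Note on f050a08d77 (Write vx_spawn_tasks_cluster): the message contrasts two ways of placing warps on cores, spreading them evenly across all cores versus filling up the first cores as far as possible. The sketch below only illustrates that difference. It is not taken from vx_spawn.c; the function names, the parameters, and the per-core capacity `warps_per_core` are all hypothetical and chosen for this example.

```c
#include <assert.h>

/* Hypothetical helper (not the Vortex API): number of warps given to
 * core `core_id` when `total_warps` are spread evenly over `num_cores`.
 * The first (total_warps % num_cores) cores receive one extra warp. */
int warps_even(int total_warps, int num_cores, int core_id) {
    assert(total_warps >= 0);
    assert(num_cores > 0 && core_id >= 0 && core_id < num_cores);
    int base = total_warps / num_cores;
    int rem = total_warps % num_cores;
    return base + (core_id < rem ? 1 : 0);
}

/* Hypothetical helper (not the Vortex API): number of warps given to
 * core `core_id` when each core is filled up to `warps_per_core`
 * before the next core receives any work. */
int warps_fill_first(int total_warps, int warps_per_core,
                     int num_cores, int core_id) {
    assert(total_warps >= 0 && warps_per_core > 0);
    assert(num_cores > 0 && core_id >= 0 && core_id < num_cores);
    int already_placed = core_id * warps_per_core;
    if (total_warps <= already_placed) {
        return 0; /* earlier cores absorbed all the work */
    }
    int left = total_warps - already_placed;
    return left < warps_per_core ? left : warps_per_core;
}
```

With 10 warps, 4 cores, and an assumed capacity of 8 warps per core, `warps_even` yields 3, 3, 2, 2 across the cores, while `warps_fill_first` yields 8, 2, 0, 0. The even split is the behaviour the commit message describes as necessary when the cores in a cluster are assumed to carry equal workloads.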