sgemm_impl: Disable wmma fast store
Doesn't seem to have a big impact on tcore util.
@@ -108,7 +108,7 @@ static_assert(WMITER * WNITER * TCM * TCN * NUM_WARPS * CORES_PER_CLUSTER ==
 // scheme and instead do a fast coalesced GMEM writes for move out. This
 // doesn't necessarily mean breaking correctness; it means that the final
 // result matrix will be stored in a swizzled form in the global memory.
-#define WMMA_STORE_FAST 1
+#define WMMA_STORE_FAST 0
 
 #define GEMMINI_DMA 1
 #define GEMMINI_DMA_FAST 1