The GEMM part is disabled for faster debugging; the kernel reads the result of A*B directly from the input binary.