CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers

cutlass → arch Relation

| File in include/cutlass | Includes file in include/cutlass/arch |
|---|---|
| gemm / kernel / default_gemm.h | wmma.h |
| gemm / device / default_gemm_configuration.h | arch.h |
| gemm / device / default_gemm_configuration.h | arch/mma.h |
| gemm / device / default_gemm_configuration.h | wmma.h |
| gemm / threadblock / default_mma.h | arch.h |
| gemm / threadblock / default_mma.h | wmma.h |
| gemm / threadblock / default_mma_core_wmma.h | wmma.h |
| gemm / warp / default_mma_wmma_tensor_op.h | wmma.h |
| gemm / device / device/gemm_batched.h | arch.h |
| gemm / device / device/gemm_splitk_parallel.h | arch.h |
| gemm / thread / gemm/thread/mma.h | arch/mma.h |
| gemm / thread / gemm/thread/mma_sm50.h | arch/mma.h |
| gemm / device / include/cutlass/gemm/device/gemm.h | arch.h |
| gemm / device / include/cutlass/gemm/device/gemm_complex.h | arch.h |
| gemm / threadblock / mma_base.h | memory.h |
| gemm / warp / mma_complex_tensor_op.h | memory_sm75.h |
| gemm / warp / mma_complex_tensor_op.h | mma_sm75.h |
| gemm / warp / mma_tensor_op.h | memory_sm75.h |
| gemm / warp / mma_tensor_op.h | mma_sm75.h |
| gemm / warp / mma_tensor_op_sm70.h | arch/mma.h |
| gemm / warp / mma_tensor_op_tile_iterator.h | memory_sm75.h |
| gemm / warp / mma_tensor_op_tile_iterator_wmma.h | wmma.h |
| gemm / warp / mma_tensor_op_wmma.h | wmma.h |
| transform / threadblock / transform/threadblock/predicated_tile_iterator.h | memory.h |
| wmma_array.h | wmma.h |
| epilogue / warp / wmma_tensor_op_policy.h | wmma.h |