Abstract: Efforts to reduce the high computational demands of large language models (LLMs) are limited by the lack of GPU hardware support for heterogeneous quantization, which mixes integer and floating-point formats. To ...