Abstract: Structured pruning and quantization are fundamental techniques used to reduce the size of deep neural networks (DNNs), and typically are applied independently. Applying these techniques ...