This repo contains the official code of our ICLR'25 paper: ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation. We introduce ViDiT-Q, a quantization ...
Running the example script llm-compressor/examples/quantization_w4a4_fp4/llama3_example.py results in a runtime error. Full traceback is included below.
Hands on: If you hop on Hugging Face and start browsing through large language models, you'll quickly notice a trend: most have been trained in 16-bit floating-point or brain-float precision. FP16 and ...
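As a minimal sketch of how one might check this for a given checkpoint (not taken from the article; the model ID below is an arbitrary, assumed example and the lookup requires network access), the declared weight precision is often recorded in the model's config:

```python
# Sketch: read the declared weight dtype of a Hugging Face checkpoint from its config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-0.5B")  # hypothetical example model ID
# Many configs record the precision the weights were saved in, e.g. float16 or bfloat16.
print(getattr(config, "torch_dtype", None))
```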
Abstract: Quantization has enabled the widespread deployment of deep learning algorithms on resource-constrained Internet of Things (IoT) devices; it compresses neural networks by reducing the ...
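The compression mechanism the abstract alludes to, reducing the bit width of network weights, can be illustrated with a generic symmetric int8 scheme. This is a sketch of standard uniform quantization, not of this particular paper's method:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats onto 255 integer levels."""
    scale = np.abs(w).max() / 127.0                       # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Approximate reconstruction of the original floats."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)              # toy weight tensor
q, s = quantize_int8(w)
print(np.abs(w - dequantize(q, s)).max())                 # worst-case quantization error
```

Storing `q` (1 byte per weight) plus one scale in place of 32-bit floats is what yields the roughly 4x memory reduction such papers target.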
The quantization of classical theories that admit more than one Hamiltonian description is considered. This is done from a geometrical viewpoint, both at the quantization level (geometric quantization ...
Canonical quantization of gravitational systems is obstructed by the problem of time. Due to diffeomorphism symmetry, the Hamiltonian vanishes: dynamics with respect to a background time parameter ...
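The "vanishing Hamiltonian" referred to here is the standard Hamiltonian constraint; schematically (a generic statement of the problem of time, not drawn from this abstract):

```latex
% Classically the Hamiltonian is constrained to vanish, and its quantum version
% (the Wheeler--DeWitt equation) removes any dependence on an external time parameter.
H \approx 0
\quad\Longrightarrow\quad
\hat{H}\,\Psi = 0
\quad\Longrightarrow\quad
i\hbar\,\partial_t \Psi = \hat{H}\,\Psi = 0 .
```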