optimum/quanto/library/qbytes_mm.py (1 line):
- line 94: # FIXME: accuracy issues with 2.4.x

optimum/quanto/tensor/weights/qbits.py (1 line):
- line 99: and size[0] >= 128 # FIXME Workaround AWQ GEMM crash (GEMV might work for short inputs)

optimum/quanto/tensor/weights/marlin/fp8/qbits.py (1 line):
- line 70: # TODO: Here we should use `not isinstance(data, MarlinF8PackedTensor)`, but `torch.compile` is bugged when using that.

optimum/quanto/library/extensions/cuda/awq/v2/gemv_cuda.cu (1 line):
- line 110: // TODO: use make_divisible

optimum/quanto/nn/qmodule.py (1 line):
- line 202: # FIXME: here we should copy frozen weights into frozen module, but this leads to grad error