optimum/exporters/onnx/model_configs.py (13 lines):
- line 109: # TODO: move back the onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after the refactorization
- line 518: # TODO: fix inference for transformers < v4.41 for beam_search > 1
- line 660: # TODO: remove this else in optimum 2.0 and make onnx_input_names a required argument [sketched below]
- line 1349: # TODO: add addition_embed_type == text_image, image and image_embeds
- line 1981: # TODO: support audio-prompted generation (audio_encoder_encode.onnx: corresponds to the audio encoder part
- line 2100: # TODO: validate the axis name for attention_mask
- line 2161: # TODO: we only need to call it encoder_sequence_length_out in the merge case - but at torch.onnx.export()
- line 2262: # TODO: Transformers batched generation for Speecht5 is BROKEN (https://github.com/huggingface/transformers/pull/25943),
- line 2266: num_attention_heads="encoder_attention_heads", # TODO: bugged in case encoder and decoder have different number of heads
- line 2484: # TODO: Replace the TextSeq2SeqOnnxConfig inheritance with VisionToTextOnnxConfig when added.
- line 2631: HIDDEN_SIZE = "text_config.hidden_size" # TODO: Isn't this bug prone?
- line 2740: # TODO: remove this else in optimum 2.0 and make onnx_input_names a required argument
- line 2758: # TODO: we should probably pass preprocessors to all dummy input generators.

optimum/exporters/onnx/convert.py (6 lines):
- line 62: # TODO: move back the onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after the refactorization
- line 1037: # TODO: support onnx_config.py in the model repo
- line 1074: # TODO: this may rather be moved to the OnnxConfig to avoid bloating this script.
- line 1216: # TODO: treating diffusion separately is quite ugly
- line 1229: # TODO: fix Can't pickle local object 'get_stable_diffusion_models_for_export..'
- line 1233: # TODO: fix "Cowardly refusing to serialize non-leaf tensor" error for wav2vec2-conformer

optimum/exporters/onnx/base.py (6 lines):
- line 55: # TODO: move back the onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after the refactorization
- line 340: # TODO: figure out a smart way of re-ordering potential nested structures.
- line 555: # TODO: The check `self.task != "text-generation" and self.legacy` is added following the use of a single ONNX for both without/with KV cache, without subgraphs.
- line 772: # TODO: we only need to call it encoder_sequence_length_out in the merge case - but at torch.onnx.export()
- line 862: # TODO: remove this else in optimum 2.0 and make onnx_input_names a required argument
- line 938: # TODO: doesn't this break attention_mask generation?

optimum/exporters/tasks.py (6 lines):
- line 200: # TODO: having several tasks pointing to the same auto-model class is bug prone to auto-detect the
- line 394: # TODO: some models here support text-generation export but are not supported in ORTModelForCausalLM
- line 396: # TODO: remove `-with-past` tasks and rather rely on `variant`.
- line 1083: # TODO: SpeechT5 can also support audio-to-audio and automatic-speech-recognition.
- line 1711: # TODO: maybe implement that.
- line 2205: # TODO: fix EulerDiscreteScheduler loading to enable for SD models

optimum/onnx/transformations_utils.py (4 lines):
- line 116: TODO: short documentation.
- line 130: TODO: short documentation.
- line 234: # TODO: relax this, and keep the most permissive output shape between model1 and model2
- line 274: TODO: short documentation.
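The deprecation note "remove this else in optimum 2.0 and make onnx_input_names a required argument" recurs three times above (model_configs.py lines 660 and 2740, base.py line 862). Below is a minimal sketch of the pattern it describes, assuming the usual optional-argument fallback; `resolve_input_names` and its signature are illustrative stand-ins, not the actual optimum helpers.

```python
# Illustrative sketch only: a hypothetical helper showing the fallback
# branch the "remove this else in optimum 2.0" TODOs refer to.
from typing import List, Optional


def resolve_input_names(
    graph_input_names: List[str],
    onnx_input_names: Optional[List[str]] = None,
) -> List[str]:
    if onnx_input_names is not None:
        # Caller passed explicit input names: the intended optimum 2.0 path.
        return onnx_input_names
    else:
        # The branch the TODOs target: silently re-derive the names from the
        # graph. In optimum 2.0 this fallback is meant to go away and
        # `onnx_input_names` becomes a required argument.
        return list(graph_input_names)
```

Dropping the fallback turns a silent default into an explicit caller responsibility, a breaking API change, which is presumably why it is gated on the 2.0 major release.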
optimum/exporters/onnx/config.py (4 lines):
- line 39: # TODO: move back the onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after the refactorization
- line 206: # TODO: validate the axis name for attention_mask
- line 409: # TODO: it is likely this pop() is unwanted as we then always hit
- line 414: # TODO: validate the axis name for attention_mask

optimum/onnxruntime/modeling_seq2seq.py (4 lines):
- line 415: # TODO: make this less hacky.
- line 558: # TODO: this should be improved
- line 607: # TODO: using a new variable out_past_key_values is memory inefficient,
- line 616: # TODO: this is extremely ugly and unreadable. What if cross-attention k/v change?

optimum/onnxruntime/modeling_diffusion.py (4 lines):
- line 80: # TODO: support from_pipe()
- line 165: self.image_encoder = kwargs.pop("image_encoder", None) # TODO: maybe implement ORTImageEncoder
- line 166: self.safety_checker = kwargs.pop("safety_checker", None) # TODO: maybe implement ORTSafetyChecker
- line 206: # TODO: all components should be ORTSessionMixin's at some point

optimum/exporters/onnx/model_patcher.py (4 lines):
- line 256: # TODO: remove that once we get rid of OnnxConfigWithLoss or implement it better.
- line 919: # TODO: and self.real_config.use_past_in_inputs
- line 1136: # TODO: Is this OK?
- line 1218: # TODO: As of torch==2.2.0, the `attention_mask` passed to the model in `generate` is 2D and of dynamic length even when the static

optimum/utils/input_generators.py (3 lines):
- line 759: # TODO: should it just be merged into DummyTextInputGenerator?
- line 793: # TODO: find out why this fails with the commented code.
- line 1322: self.batch_size = 1 # TODO: SpeechT5 does not support batch inference in Transformers for now.

optimum/utils/normalized_config.py (2 lines):
- line 159: # TODO: this config is bug prone, as `encoder_attention_heads` and `decoder_attention_heads` may be different [sketched below]
- line 205: TODO: missing normalized configs (currently not useful)

optimum/runs_base.py (2 lines):
- line 252: # TODO: support grayscale?
- line 270: # TODO: do not track GPU/CPU <--> numpy/torch; need to change the implementation of forward

optimum/modeling_base.py (2 lines):
- line 92: # TODO: Should be removed when we no longer use OptimizedModel for everything
- line 203: # FIXME: when huggingface_hub fixes the return of upload_file

optimum/bettertransformer/models/attention.py (2 lines):
- line 23: # TODO (CRITICAL): Layer-wise attention scaling is broken for several archs.
- line 313: # TODO: raise on batch_size = 1 + padding

optimum/exporters/tflite/model_configs.py (2 lines):
- line 93: # TODO: needs to be investigated.
- line 98: # TODO: no TensorFlow implementation, but a Jax implementation is available.

optimum/onnxruntime/modeling_ort.py (2 lines):
- line 125: # TODO: remove OptimizedModel and use a HubMixin to be able to combine it freely with other mixins
- line 704: # TODO: This allows supporting sentence-transformers models (sentence embedding), but is not validated.

optimum/exporters/utils.py (2 lines):
- line 520: # TODO: more flexibility in the vocoder class?
- line 623: # TODO: this succession of if/else strongly suggests a refactor is needed.
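The head-count warnings above (normalized_config.py line 159, echoed by model_configs.py line 2266 in the first group) come down to one normalized attribute standing in for two distinct config fields. Here is a minimal sketch of the failure mode, assuming an attribute-map style of indirection; `NormalizedSeq2SeqConfig` and `FakeConfig` below are illustrative stand-ins, not the optimum classes.

```python
# Illustrative failure mode: a single normalized name mapped onto the
# encoder field only. These classes are stand-ins, not optimum's code.
class NormalizedSeq2SeqConfig:
    ATTRIBUTE_MAP = {"num_attention_heads": "encoder_attention_heads"}

    def __init__(self, config):
        self.config = config

    def __getattr__(self, name):
        # Only called for attributes not found on the instance itself:
        # resolve the normalized name against the wrapped config.
        return getattr(self.config, self.ATTRIBUTE_MAP.get(name, name))


class FakeConfig:
    encoder_attention_heads = 16
    decoder_attention_heads = 8  # differs from the encoder


normalized = NormalizedSeq2SeqConfig(FakeConfig())
# Decoder-side code asking for the normalized head count silently gets the
# encoder value, e.g. when shaping past key/value buffers.
assert normalized.num_attention_heads == 16  # not the decoder's 8
```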
optimum/onnxruntime/optimization.py (2 lines):
- line 161: # TODO: this is quite inefficient, as we load models into memory if they are <2GB without external data
- line 222: # TODO: ORT save_model_to_file will save as `.data` although we save as `.onnx_data` in the export

optimum/onnx/graph_transformations.py (1 line):
- line 302: # TODO: update IR version in the future.

optimum/onnxruntime/quantization.py (1 line):
- line 362: # TODO: maybe this logic can be moved to a method in the configuration class (get_ort_quantizer_kwargs()) [sketched below]

optimum/bettertransformer/transformation.py (1 line):
- line 368: # TODO: fix once this is fixed in pytorch

optimum/fx/parallelization/op_registry/op_handlers.py (1 line):
- line 418: # TODO: append all-gather comm ops before all parallelized output nodes if instructed.

setup.py (1 line):
- line 23: # TODO: unpin pytest once https://github.com/huggingface/transformers/pull/29154 is merged & released

optimum/onnxruntime/runs/calibrator.py (1 line):
- line 71: # TODO: estimate memory needed for entropy/percentile to auto-choose the number of shards

optimum/commands/export/tflite.py (1 line):
- line 235: # TODO: hack until TFLiteExportCommand no longer uses subprocess.

optimum/exporters/onnx/_traceable_cache.py (1 line):
- line 47: # TODO: deprecate this function in favor of `cache_position`

optimum/onnxruntime/utils.py (1 line):
- line 84: # TODO: for encoder-decoder models, validate if bert or gpt2 optimization is better

optimum/exporters/tflite/convert.py (1 line):
- line 275: # TODO: maybe use calibration.select_columns(columns_needed_by_all_signatures) instead, did not work?

optimum/utils/preprocessing/text_classification.py (1 line):
- line 114: # TODO: do we want to do that here?

optimum/onnxruntime/trainer_seq2seq.py (1 line):
- line 229: # TODO: remove this hack when the legacy code that initializes generation_config from a model config is

optimum/exporters/tflite/__main__.py (1 line):
- line 72: # TODO: find a cleaner way to do this.

optimum/utils/import_utils.py (1 line):
- line 297: # TODO: Remove check_if_transformers_greater, check_if_diffusers_greater, check_if_torch_greater

optimum/onnxruntime/trainer.py (1 line):
- line 888: # TODO: ipex only works for inference with PyTorch, will move `inference_with_ort` to training arguments and

optimum/bettertransformer/models/encoder_models.py (1 line):
- line 900: # TODO: Kind of stupid to do that at each layer, should be fixed in transformers

optimum/bettertransformer/models/base.py (1 line):
- line 152: # TODO: remove the clone once https://github.com/huggingface/transformers/pull/27314 & https://github.com/huggingface/safetensors/pull/379 are released.

optimum/fx/parallelization/api.py (1 line):
- line 115: # TODO: remove this once we support training-time trace

optimum/utils/preprocessing/token_classification.py (1 line):
- line 98: # TODO: do we want to do that here?

optimum/onnxruntime/modeling_decoder.py (1 line):
- line 619: # TODO: not sure if setting config.use_cache is needed for older versions of transformers

optimum/configuration_utils.py (1 line):
- line 218: # TODO: remove condition once transformers release version is way above 4.22.

optimum/exporters/onnx/__main__.py (1 line):
- line 338: # TODO: Fix in Transformers so that SdpaAttention class can be exported to ONNX.

optimum/utils/preprocessing/image_classification.py (1 line):
- line 113: # TODO: do we want to do that here?
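The quantization.py note (line 362) suggests concentrating the quantizer-kwarg assembly in the configuration class. A sketch of what that could look like, assuming a dataclass-style config: only the method name `get_ort_quantizer_kwargs` comes from the TODO itself; the fields shown are illustrative.

```python
# Sketch under assumptions: dataclass config with illustrative fields; only
# the method name `get_ort_quantizer_kwargs` appears in the TODO text.
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class QuantizationConfig:
    per_channel: bool
    reduce_range: bool = False

    def get_ort_quantizer_kwargs(self) -> Dict[str, Any]:
        # One place that translates config fields into the kwargs handed to
        # the ONNX Runtime quantizer, instead of rebuilding the dict inline
        # at every call site in quantization.py.
        return {
            "per_channel": self.per_channel,
            "reduce_range": self.reduce_range,
        }


config = QuantizationConfig(per_channel=True)
quantizer_kwargs = config.get_ort_quantizer_kwargs()
```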