optimum/exporters/onnx/model_configs.py (13 lines):
- line 109: # TODO : moved back onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after refactorization
- line 518: # TODO: fix inference for transformers < v4.41 for beam_search > 1
- line 660: # TODO: remove this else in optimum 2.0 and make onnx_input_names a required argument
- line 1349: # TODO : add addition_embed_type == text_image, image and image_embeds
- line 1981: # TODO: support audio-prompted generation (audio_encoder_encode.onnx: corresponds to the audio encoder part
- line 2100: # TODO: validate the axis name for attention_mask
- line 2161: # TODO: we only need to call it encoder_sequence_length_out in the merge case - but at torch.onnx.export()
- line 2262: # TODO: Transformers batched generation for Speecht5 is BROKEN (https://github.com/huggingface/transformers/pull/25943),
- line 2266: num_attention_heads="encoder_attention_heads", # TODO: bugged in case encoder and decoder have different number of heads
- line 2484: # TODO: Replace the TextSeq2SeqOnnxConfig inheritance with VisionToTextOnnxConfig when added.
- line 2631: HIDDEN_SIZE = "text_config.hidden_size" # TODO: Isn't this bug prone?
- line 2740: # TODO: remove this else in optimum 2.0 and make onnx_input_names a required argument
- line 2758: # TODO: we should probably pass preprocessors to all dummy input generators.

optimum/exporters/onnx/convert.py (6 lines):
- line 63: # TODO : moved back onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after refactorization
- line 1039: # TODO: support onnx_config.py in the model repo
- line 1076: # TODO: this may be moved rather to the OnnxConfig to avoid bloating this script.
- line 1218: # TODO: treating diffusion separately is quite ugly
- line 1231: # TODO: fix Can't pickle local object 'get_stable_diffusion_models_for_export..'
- line 1235: # TODO: fix "Cowardly refusing to serialize non-leaf tensor" error for wav2vec2-conformer

optimum/exporters/onnx/base.py (6 lines):
- line 55: # TODO : moved back onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after refactorization
- line 341: # TODO: figure out a smart way of re-ordering potential nested structures.
- line 556: # TODO: The check `self.task != "text-generation" and self.legacy` is added following the use of a single ONNX for both without/with KV cache, without subgraphs.
- line 773: # TODO: we only need to call it encoder_sequence_length_out in the merge case - but at torch.onnx.export()
- line 863: # TODO: remove this else in optimum 2.0 and make onnx_input_names a required argument
- line 939: # TODO: doesn't this break attention_mask generation?

optimum/exporters/onnx/config.py (4 lines):
- line 39: # TODO : moved back onnx imports applied in https://github.com/huggingface/optimum/pull/2114/files after refactorization
- line 206: # TODO: validate the axis name for attention_mask
- line 409: # TODO: it is likely this pop() is unwanted as we then always hit
- line 414: # TODO: validate the axis name for attention_mask

optimum/onnxruntime/modeling_seq2seq.py (4 lines):
- line 415: # TODO: make this less hacky.
- line 558: # TODO: this should be improved
- line 607: # TODO: using a new variable out_past_key_values is memory inefficient,
- line 616: # TODO: this is extremely ugly and unreadable. What if cross-attention k/v change?

optimum/onnx/transformations_utils.py (4 lines):
- line 115: TODO: short documentation.
- line 129: TODO: short documentation.
- line 233: # TODO: relax this, and keep the most permissive output shape between model1 and model2
- line 273: TODO: short documentation.

optimum/onnxruntime/modeling_diffusion.py (4 lines):
- line 80: # TODO: support from_pipe()
- line 165: self.image_encoder = kwargs.pop("image_encoder", None) # TODO: maybe implement ORTImageEncoder
- line 166: self.safety_checker = kwargs.pop("safety_checker", None) # TODO: maybe implement ORTSafetyChecker
- line 206: # TODO: all components should be ORTSessionMixin's at some point

optimum/exporters/onnx/model_patcher.py (4 lines):
- line 256: # TODO: remove that once we got rid of OnnxConfigWithLoss or we implemented it better.
- line 919: # TODO: and self.real_config.use_past_in_inputs
- line 1136: # TODO: Is this OK?
- line 1218: # TODO: As of torch==2.2.0, the `attention_mask` passed to the model in `generate` is 2D and of dynamic length even when the static

optimum/onnxruntime/modeling_ort.py (2 lines):
- line 125: # TODO: remove OptimizedModel and use a HubMixin to be able to combine it freely with other mixins
- line 704: # TODO: This allows to support sentence-transformers models (sentence embedding), but is not validated.

optimum/onnxruntime/optimization.py (2 lines):
- line 161: # TODO: this is quite inefficient as we load in memory if models are <2GB without external data
- line 222: # TODO: ORT save_model_to_file will save as `.data` although we save as `.onnx_data` in the export

optimum/onnxruntime/trainer_seq2seq.py (1 line):
- line 229: # TODO: remove this hack when the legacy code that initializes generation_config from a model config is

optimum/onnx/graph_transformations.py (1 line):
- line 301: # TODO: update IR version in the future.

optimum/onnxruntime/quantization.py (1 line):
- line 362: # TODO: maybe this logic can be moved to a method in the configuration class (get_ort_quantizer_kwargs())

optimum/onnxruntime/runs/calibrator.py (1 line):
- line 71: # TODO estimate memory needed for entropy/percentile to autochoose number of shards

optimum/onnxruntime/trainer.py (1 line):
- line 888: # TODO: ipex only works with inference with PyTorch, will move `inference_with_ort` to training arguments and

optimum/exporters/onnx/_traceable_cache.py (1 line):
- line 47: # TODO: deprecate this function in favor of `cache_position`

optimum/onnxruntime/modeling_decoder.py (1 line):
- line 619: # TODO: not sure if setting config.use_cache is needed for older versions of transformers

optimum/onnxruntime/utils.py (1 line):
- line 84: # TODO: for encoder-decoder models, validate if bert or gpt2 optimization is better

optimum/exporters/onnx/__main__.py (1 line):
- line 338: # TODO: Fix in Transformers so that SdpaAttention class can be exported to ONNX.