optimum/exporters/openvino/model_patcher.py (7 lines):
- line 442: # TODO : remove after support of transformers >= v4.40.0
- line 499: # TODO: As of torch==2.2.0, the `attention_mask` passed to the model in `generate` is 2D and of dynamic length even when the static
- line 578: # TODO : deprecate _llama_gemma_update_causal_mask_legacy when transformers>=4.41.0
- line 657: # TODO: As of torch==2.2.0, the `attention_mask` passed to the model in `generate` is 2D and of dynamic length even when the static
- line 2242: # TODO: For dynamo, rather use a check on fullgraph=True once this is possible (https://github.com/pytorch/pytorch/pull/120400).
- line 2269: # TODO: As of torch==2.2.0, the `attention_mask` passed to the model in `generate` is 2D and of dynamic length even when the static
- line 2729: # TODO: As of torch==2.2.0, the `attention_mask` passed to the model in `generate` is 2D and of dynamic length even when the static

optimum/exporters/openvino/stateful.py (6 lines):
- line 31: # TODO: Provide a better way based on the variables availability, but OV Python API doesn't expose required methods
- line 40: ov_model (ov.Model): # TODO: Can we derive the dimensions from the model topology?
- line 150: # TODO: Can we derive the dimensions from the model topology?
- line 255: # TODO: Use symbols on dimensions to filter out ShapeOf subexpressions that do not bring new symbols in the subgraph
- line 266: # FIXME: get_any_name is not reliable as tensor may not have any names
- line 306: # TODO: Deduce from a model via ordinal reshape (?) and topology

optimum/intel/openvino/modeling_diffusion.py (6 lines):
- line 133: # TODO: support DiffusionPipeline.from_pipe()
- line 134: # TODO: makes more sense to have a compositional OVMixin class
- line 135: # TODO: instead of one bloated __init__, we should consider an __init__ per pipeline
- line 238: self.image_encoder = kwargs.pop("image_encoder", None) # TODO: maybe mplement OVModelImageEncoder
- line 239: self.safety_checker = kwargs.pop("safety_checker", None) # TODO: maybe mplement OVModelSafetyChecker
- line 1004: # TODO: deprecate and add warnings when a random state is passed

optimum/intel/openvino/quantization.py (5 lines):
- line 253: # TODO: deprecate "signature_columns": model.forward() may not be the method which is called during inference,
- line 813: # TODO: deprecate because model.forward() may not be the method which is called during inference,
- line 1013: # TODO : Create model
- line 1341: # TODO : can be set to self.model.config.name_or_path for OVModels when not provided
- line 1515: # TODO: consider in the future for this method to return OVCalibrationDataset instance from either datasets.Dataset instance or its name as input.

optimum/exporters/openvino/convert.py (4 lines):
- line 387: # TODO: Consider applying bettertransformer regardless of stateful flag -- requires additional validation.
- line 389: # TODO: Consider unpatching model after export is done in the end of this function.
- line 468: ov_model.validate_nodes_and_infer_types() # TODO: remove as unnecessary validation?
- line 635: # TODO: support onnx_config.py in the model repo

optimum/intel/neural_compressor/trainer.py (3 lines):
- line 168: # TODO : To deprecate once support transformers > 4.30.0
- line 692: # TODO: push to hub if self.args.push_to_hub and not _internal_call
- line 899: # TODO : can be removed once transformers >= v4.38.0

optimum/commands/export/openvino.py (2 lines):
- line 336: # TODO: add revision, subfolder and token to args
- line 540: # TODO : add input shapes

optimum/intel/openvino/modeling_decoder.py (2 lines):
- line 793: # TODO: Apply it differently based on model type
- line 794: # TODO: At least for bloom we need to replicate values for each attention head

optimum/intel/neural_compressor/trainer_seq2seq.py (1 line):
- line 182: # TODO: remove this hack when the legacy code that initializes generation_config from a model config is

optimum/intel/neural_compressor/quantization.py (1 line):
- line 117: # TODO : Create model

optimum/intel/neural_compressor/configuration.py (1 line):
- line 55: # TODO : add activations_dtype and weights_dtype

optimum/intel/openvino/modeling_base.py (1 line):
- line 261: # TODO: remove this way of applying quantization; instead apply it after instance of OVModel* is loaded

optimum/exporters/ipex/cache_utils.py (1 line):
- line 105: # TODO: unify API definition between CPU and XPU in IPEX version > 2.6

optimum/intel/openvino/configuration.py (1 line):
- line 1170: # TODO: remove once there is support for FP8 weight compression in NNCF

optimum/intel/openvino/modeling_visual_language.py (1 line):
- line 2158: # remove the padding using attention_mask -- TODO: double check

optimum/exporters/ipex/modeling_utils.py (1 line):
- line 814: # TODO: remove this WA after IPEX 2.7