Summary: 28 instances, 22 unique

| Text | Count |
| --- | --- |
| if len(row_lin) > 1: # TODO: change to checking cell value tokens | 3 |
| # TODO: long term HF state compatibility fix | 1 |
| # TODO: move to a conf group | 4 |
| # TODO: move max len to methods params? | 1 |
| # TODO: check if pytorch process group is initialized | 1 |
| if len(row_lin) > 1: # TODO: change to checking cell value tokens | 1 |
| # TODO: tmp workaround for EL, remove or revise | 2 |
| # TODO: offset doesn't work for multiset currently | 1 |
| # TODO: reset the iteration status? | 1 |
| # TODO: support other types | 1 |
| # TODO: remove? | 1 |
| # TODO delete once moved to the new method | 1 |
| # TODO: clear iterators in some non-hacky way | 1 |
| # TODO: move to hydra config group | 1 |
| # TODO: provide segment values | 1 |
| # TODO: to be merged with conf_utils.py | 1 |
| # TODO: it is only used by _select_span_with_token. Move them to utils | 1 |
| # TODO: this is a hack-y logic that uses some private tokenizer structure which can be changed in HF code | 1 |
| # TODO: refactor to avoid 'if' | 1 |
| # TODO: make a long term HF compatibility fix | 1 |
| # TODO: ideally we'd want to just call | 1 |
| # TODO: merge with iterate_ds_sampled_data | 1 |