duplicated block id: 1
size: 34 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 2
size: 22 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 3
size: 17 cleaned lines of code
in 2 files:
- scripts/train_momentum.py (0:0)
- scripts/train_mhop.py (0:0)

duplicated block id: 4
size: 17 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 5
size: 17 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 6
size: 19 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 7
size: 19 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 8
size: 15 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/utils.py (0:0)

duplicated block id: 9
size: 15 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/utils.py (0:0)

duplicated block id: 10
size: 16 cleaned lines of code
in 2 files:
- mdr/qa/qa_dataset.py (0:0)
- mdr/retrieval/data/data_utils.py (0:0)

duplicated block id: 11
size: 13 cleaned lines of code
in 2 files:
- mdr/retrieval/data/sp_datasets.py (0:0)
- mdr/retrieval/data/sp_datasets.py (0:0)

duplicated block id: 12
size: 21 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 13
size: 12 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 14
size: 12 cleaned lines of code
in 3 files:
- mdr/qa/qa_trainer.py (0:0)
- mdr/retrieval/single_trainer.py (0:0)
- mdr/retrieval/mhop_trainer.py (0:0)

duplicated block id: 15
size: 11 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 16
size: 11 cleaned lines of code
in 2 files:
- mdr/retrieval/utils/mhop_utils.py (0:0)
- mdr/retrieval/data/data_utils.py (0:0)

duplicated block id: 17
size: 9 cleaned lines of code
in 3 files:
- mdr/qa/qa_trainer.py (0:0)
- mdr/retrieval/single_trainer.py (0:0)
- mdr/retrieval/mhop_trainer.py (0:0)

duplicated block id: 18
size: 13 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 19
size: 9 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 20
size: 13 cleaned lines of code
in 3 files:
- mdr/qa/qa_trainer.py (0:0)
- mdr/retrieval/single_trainer.py (0:0)
- mdr/retrieval/mhop_trainer.py (0:0)

duplicated block id: 21
size: 12 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 22
size: 8 cleaned lines of code
in 2 files:
- mdr/retrieval/data/encode_datasets.py (0:0)
- mdr/retrieval/data/fever_dataset.py (0:0)

duplicated block id: 23
size: 10 cleaned lines of code
in 2 files:
- submitit/submitit_train_qa.py (0:0)
- submitit/submitit_train.py (0:0)

duplicated block id: 24
size: 9 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 25
size: 9 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 26
size: 6 cleaned lines of code
in 4 files:
- mdr/retrieval/models/retriever.py (0:0)
- mdr/retrieval/models/retriever.py (0:0)
- mdr/retrieval/models/unified_retriever.py (0:0)
- mdr/retrieval/models/mhop_retriever.py (0:0)

duplicated block id: 27
size: 6 cleaned lines of code
in 2 files:
- submitit/submitit_train_qa.py (0:0)
- submitit/submitit_train.py (0:0)

duplicated block id: 28
size: 6 cleaned lines of code
in 2 files:
- mdr/qa/utils.py (0:0)
- mdr/retrieval/utils/tokenizer.py (0:0)

duplicated block id: 29
size: 6 cleaned lines of code
in 3 files:
- mdr/qa/qa_trainer.py (0:0)
- mdr/retrieval/single_trainer.py (0:0)
- mdr/retrieval/mhop_trainer.py (0:0)

duplicated block id: 30
size: 6 cleaned lines of code
in 3 files:
- mdr/qa/qa_trainer.py (0:0)
- mdr/retrieval/single_trainer.py (0:0)
- mdr/retrieval/mhop_trainer.py (0:0)

duplicated block id: 31
size: 8 cleaned lines of code
in 3 files:
- mdr/qa/qa_trainer.py (0:0)
- mdr/retrieval/single_trainer.py (0:0)
- mdr/retrieval/mhop_trainer.py (0:0)

duplicated block id: 32
size: 8 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 33
size: 6 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 34
size: 6 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)

duplicated block id: 35
size: 6 cleaned lines of code
in 2 files:
- mdr/qa/basic_tokenizer.py (0:0)
- mdr/retrieval/utils/basic_tokenizer.py (0:0)