facebookresearch / multihop_dense_retrieval
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
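The rules above can be illustrated with a minimal sketch. It is not the report tool's actual implementation: the cleaning rules (dropping blank lines, `#` comment lines, and `import`/`from` lines, and stripping indentation) and the function names `clean` and `find_duplicates` are assumptions made for this example. It also reports fixed-size windows of 6 lines, whereas a real tool would presumably merge overlapping windows into the longest matching block.

```python
import re
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Drop blank lines, '#' comment lines, and import lines (assumed rules)."""
    cleaned = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if re.match(r"(import|from)\s", stripped):
            continue
        cleaned.append(stripped)
    return cleaned


def find_duplicates(files: dict[str, str], min_lines: int = MIN_LINES):
    """Map each window of `min_lines` cleaned lines to its locations,
    keeping only windows that occur in more than one place."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_lines + 1):
            window = tuple(lines[start:start + min_lines])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}


if __name__ == "__main__":
    body = "\n".join(f"x{i} = {i}" for i in range(6))
    files = {
        "a.py": "import os\n# setup\n" + body + "\n",
        "b.py": "\n" + body + "\ny = 1\n",
    }
    for window, locations in find_duplicates(files).items():
        print(len(window), "lines duplicated at", locations)
```

Locations are reported as offsets into the cleaned line list, not as original file line numbers; mapping back to source lines would need the original indices to be carried through `clean`.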
Duplication Overall
  • 49% duplication:
    • 6,090 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 3,012 duplicated lines
  • 242 duplicates
  • system: 49% (3,012 lines)
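As a quick check of the headline figure, the sketch below recomputes the duplication share from the two line counts reported above and compares it with the 5% target from the intro. The function name `duplication_share` is hypothetical, and rounding to a whole percent is an assumption about how the report arrives at 49%.

```python
def duplication_share(duplicated_lines: int, cleaned_lines: int) -> float:
    """Fraction of cleaned lines that are part of a duplicate."""
    return duplicated_lines / cleaned_lines


CLEANED = 6090     # lines of cleaned code, from the report
DUPLICATED = 3012  # duplicated lines, from the report

share = duplication_share(DUPLICATED, CLEANED)
print(f"duplication: {share:.1%}")  # 49.5%, shown as 49% in the report

# Largest number of duplicated lines that would stay within the 5% target.
target_lines = CLEANED * 5 // 100
print(f"5% target: at most {target_lines} duplicated lines")
```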
Duplication per Extension
  • py: 49% (3,012 lines)
Duplication per Component (primary)
  • mdr/qa: 53% (817 lines)
  • scripts: 57% (580 lines)
  • mdr/retrieval: 52% (493 lines)
  • mdr/retrieval/utils: 59% (373 lines)
  • mdr/retrieval/data: 38% (356 lines)
  • scripts/eval: 30% (168 lines)
  • mdr/retrieval/models: 36% (129 lines)
  • submitit: 66% (96 lines)
  • mdr: 0% (0 lines)
  • ROOT: 0% (0 lines)

Duplication Between Components (50+ lines)

  • mdr/qa -- mdr/retrieval/utils: 708 lines
  • mdr/qa -- scripts: 545 lines
  • mdr/qa -- mdr/retrieval: 533 lines
  • mdr/qa -- mdr/retrieval/data: 54 lines
  • mdr/retrieval -- scripts: 353 lines

Longest Duplicates
The 20 longest of the 242 duplicates.
| Size | # | Folders | Files | Lines |
|---|---|---|---|---|
| 172 | x 2 | mdr/qa; mdr/retrieval/utils | basic_tokenizer.py; basic_tokenizer.py | 16:275 (100%); 18:277 (81%) |
| 73 | x 2 | mdr/qa; mdr/retrieval | qa_trainer.py; single_trainer.py | 37:146 (23%); 37:147 (34%) |
| 72 | x 2 | mdr/qa; mdr/retrieval | qa_trainer.py; mhop_trainer.py | 37:144 (23%); 38:146 (36%) |
| 72 | x 2 | mdr/retrieval; mdr/retrieval | mhop_trainer.py; single_trainer.py | 38:146 (36%); 37:145 (34%) |
| 54 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py | 255:326 (19%); 88:158 (48%) |
| 49 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; utils.py | 41:106 (17%); 21:84 (41%) |
| 44 | x 2 | scripts; scripts | train_mhop.py; train_momentum.py | 156:206 (25%); 135:185 (24%) |
| 34 | x 2 | mdr/qa; scripts | train_ranker.py; train_mhop.py | 44:84 (19%); 61:100 (20%) |
| 34 | x 2 | mdr/qa; scripts | train_ranker.py; train_momentum.py | 44:84 (19%); 35:74 (19%) |
| 34 | x 2 | scripts; scripts | train_mhop.py; train_momentum.py | 61:100 (20%); 35:74 (19%) |
| 32 | x 2 | mdr/retrieval/models; mdr/retrieval/models | mhop_retriever.py; unified_retriever.py | 56:109 (41%); 125:171 (24%) |
| 29 | x 2 | scripts; scripts | train_mhop.py; train_momentum.py | 121:155 (17%); 96:130 (16%) |
| 25 | x 2 | scripts; scripts | train_mhop.py; train_momentum.py | 225:254 (14%); 206:235 (14%) |
| 25 | x 2 | submitit; submitit | submitit_train.py; submitit_train_qa.py | 78:110 (36%); 88:120 (32%) |
| 24 | x 2 | scripts; scripts | train_momentum.py; train_qa.py | 150:177 (13%); 162:188 (6%) |
| 24 | x 2 | mdr/qa; scripts | qa_trainer.py; train_qa.py | 302:329 (7%); 235:264 (6%) |
| 24 | x 2 | scripts/eval; scripts/eval | eval_mhop_fever.py; eval_single_fever.py | 57:85 (21%); 46:74 (31%) |
| 24 | x 2 | mdr/qa; mdr/retrieval | qa_trainer.py; mhop_trainer.py | 229:253 (7%); 223:250 (12%) |
| 24 | x 2 | scripts; scripts | train_mhop.py; train_qa.py | 171:198 (14%); 162:188 (6%) |
| 23 | x 2 | submitit; submitit | submitit_train.py; submitit_train_qa.py | 23:57 (33%); 22:56 (29%) |
Duplicated Units
The top 20 of the 35 duplicated units.
| Size | # | Folders | Files |
|---|---|---|---|
| 34 | x 2 | mdr/qa; mdr/retrieval/utils | basic_tokenizer.py; basic_tokenizer.py |
| 22 | x 2 | mdr/qa; mdr/retrieval/utils | basic_tokenizer.py; basic_tokenizer.py |
| 17 | x 2 | scripts; scripts | train_momentum.py; train_mhop.py |
| 17 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py |
| 17 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py |
| 19 | x 2 | mdr/qa; mdr/retrieval/utils | basic_tokenizer.py; basic_tokenizer.py |
| 19 | x 2 | mdr/qa; mdr/retrieval/utils | basic_tokenizer.py; basic_tokenizer.py |
| 15 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; utils.py |
| 15 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; utils.py |
| 16 | x 2 | mdr/qa; mdr/retrieval/data | qa_dataset.py; data_utils.py |
| 13 | x 2 | mdr/retrieval/data; mdr/retrieval/data | sp_datasets.py; sp_datasets.py |
| 21 | x 2 | mdr/qa; mdr/retrieval/utils | basic_tokenizer.py; basic_tokenizer.py |
| 12 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py |
| 12 | x 3 | mdr/qa; mdr/retrieval; mdr/retrieval | qa_trainer.py; single_trainer.py; mhop_trainer.py |
| 11 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py |
| 11 | x 2 | mdr/retrieval/utils; mdr/retrieval/data | mhop_utils.py; data_utils.py |
| 9 | x 3 | mdr/qa; mdr/retrieval; mdr/retrieval | qa_trainer.py; single_trainer.py; mhop_trainer.py |
| 13 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py |
| 9 | x 2 | mdr/qa; mdr/retrieval/utils | utils.py; tokenizer.py |
| 13 | x 3 | mdr/qa; mdr/retrieval; mdr/retrieval | qa_trainer.py; single_trainer.py; mhop_trainer.py |