facebookresearch / DPR
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • You should aim to keep duplicated code as low as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
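The rule described above (clean the code, then look for runs of 6 or more identical lines) can be illustrated with a minimal sketch. This is not the report tool's implementation. The cleaning rules (drop blank lines, `#` comment lines, and import statements, and compare lines after stripping whitespace) and the names `clean` and `find_duplicate_windows` are assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[tuple[int, str]]:
    """Return (original_line_number, stripped_text) for lines kept after cleaning.

    Assumption: cleaning drops blank lines, '#' comment lines, and import
    statements. The real tool's rules may differ.
    """
    kept = []
    for number, raw in enumerate(source.splitlines(), start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue
        if text.startswith(("import ", "from ")):
            continue
        kept.append((number, text))
    return kept


def find_duplicate_windows(files: dict[str, str], size: int = MIN_BLOCK):
    """Group every window of `size` consecutive cleaned lines by its content.

    Returns a dict mapping each repeated window (a tuple of lines) to the list
    of (file_name, first_line, last_line) locations where it occurs.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for i in range(len(lines) - size + 1):
            window = tuple(text for _, text in lines[i:i + size])
            first, last = lines[i][0], lines[i + size - 1][0]
            seen[window].append((name, first, last))
    # Keep only windows that occur at least twice.
    return {window: locs for window, locs in seen.items() if len(locs) > 1}
```

A real detector would also merge overlapping windows into maximal blocks, which is how a single duplicate can span far more than 6 lines. The sketch stops at fixed-size windows to keep the idea visible.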
Duplication Overall
  • 15% duplication:
    • 5,479 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 845 duplicated lines
  • 41 duplicates
system: 15% (845 lines)
Duplication per Extension
py: 14% (766 lines)
yaml: 30% (79 lines)
Duplication per Component (primary)
dpr/data: 17% (390 lines)
ROOT: 14% (210 lines)
dpr/models: 19% (159 lines)
conf/train: 70% (49 lines)
conf: 25% (30 lines)
dpr/utils: 1% (7 lines)
dpr: 0% (0 lines)
dpr/indexer: 0% (0 lines)
conf/encoder: 0% (0 lines)
conf/datasets: 0% (0 lines)
conf/ctx_sources: 0% (0 lines)
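The figures above can be cross-checked with simple arithmetic: the per-extension and per-component duplicated line counts should each add up to the overall total of 845, and the overall percentage is duplicated lines divided by cleaned lines. The short sketch below does this using only numbers copied from this report. The helper name `duplication_percent` is an assumption made for illustration.

```python
# Figures copied from the report above.
CLEANED_LINES = 5479
DUPLICATED_LINES = 845

PER_EXTENSION = {"py": 766, "yaml": 79}
PER_COMPONENT = {
    "dpr/data": 390,
    "ROOT": 210,
    "dpr/models": 159,
    "conf/train": 49,
    "conf": 30,
    "dpr/utils": 7,
}


def duplication_percent(duplicated: int, total: int) -> float:
    """Duplicated lines as a percentage of all cleaned lines."""
    return 100.0 * duplicated / total


# Both breakdowns partition the same 845 duplicated lines.
assert sum(PER_EXTENSION.values()) == DUPLICATED_LINES
assert sum(PER_COMPONENT.values()) == DUPLICATED_LINES

# The report rounds this value to 15%.
print(round(duplication_percent(DUPLICATED_LINES, CLEANED_LINES), 1))  # 15.4
```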
Longest Duplicates
The list of 20 longest duplicates.
Size  #    Folders     Files                             Lines
122   x 2  dpr/data    biencoder_data.py                 198:352 (27%)
           dpr/data    tables.py                         25:179 (26%)
16    x 2  dpr/data    biencoder_data.py                 374:398 (3%)
           dpr/data    tables.py                         203:233 (3%)
14    x 2  dpr/data    biencoder_data.py                 356:372 (3%)
           dpr/data    tables.py                         183:201 (2%)
14    x 2  dpr/models  biencoder.py                      218:234 (4%)
           dpr/models  biencoder.py                      321:337 (4%)
13    x 2  ROOT        train_dense_encoder.py            763:779 (2%)
           ROOT        train_extractive_reader.py        555:570 (2%)
13    x 2  conf/train  biencoder_local.yaml              9:27 (68%)
           conf/train  biencoder_nq.yaml                 9:27 (68%)
13    x 2  dpr/models  biencoder.py                      203:216 (4%)
           dpr/models  biencoder.py                      298:311 (4%)
12    x 2  ROOT        train_dense_encoder.py            167:184 (2%)
           ROOT        train_extractive_reader.py        159:173 (2%)
12    x 2  dpr/models  biencoder.py                      182:198 (4%)
           dpr/models  biencoder.py                      280:296 (4%)
11    x 2  ROOT        train_dense_encoder.py            525:536 (1%)
           ROOT        train_extractive_reader.py        336:347 (2%)
10    x 2  ROOT        train_dense_encoder.py            690:699 (1%)
           ROOT        train_dense_encoder.py            702:711 (1%)
9     x 2  conf/train  biencoder_default.yaml            17:27 (47%)
           conf/train  biencoder_nq.yaml                 17:27 (47%)
9     x 2  conf/train  biencoder_default.yaml            17:27 (47%)
           conf/train  biencoder_local.yaml              17:27 (47%)
8     x 2  ROOT        train_dense_encoder.py            250:257 (1%)
           ROOT        train_dense_encoder.py            336:343 (1%)
8     x 2  conf        dense_retriever.yaml              61:71 (22%)
           conf        extractive_reader_train_cfg.yaml  60:70 (26%)
8     x 2  dpr/models  hf_models.py                      54:62 (3%)
           dpr/models  hf_models.py                      82:90 (3%)
7     x 2  ROOT        train_dense_encoder.py            745:753 (1%)
           ROOT        train_extractive_reader.py        538:546 (1%)
7     x 2  dpr/models  biencoder.py                      153:173 (2%)
           dpr/models  biencoder.py                      248:269 (2%)
7     x 2  dpr/data    tables.py                         519:531 (1%)
           dpr/data    tables.py                         559:568 (1%)
7     x 2  ROOT        train_dense_encoder.py            563:569 (1%)
           ROOT        train_extractive_reader.py        375:381 (1%)
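To inspect one of the reported pairs in a local checkout, the two line ranges can be read and compared directly. The sketch below assumes that a range such as 198:352 means "first line:last line", 1-indexed and inclusive, and that the files live at the listed folder paths relative to the repository root. Neither is stated in the report. The helper names are hypothetical, and the filtering of blank and `#` comment lines only approximates the tool's cleaning step.

```python
import difflib
from itertools import islice


def read_range(path, start, end):
    """Return lines start..end (1-indexed, inclusive) without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, start - 1, end)]


def significant(lines):
    """Strip whitespace and drop blank and '#' comment lines (assumed cleaning)."""
    stripped = (line.strip() for line in lines)
    return [text for text in stripped if text and not text.startswith("#")]


def compare_ranges(path_a, range_a, path_b, range_b):
    """Similarity in [0, 1] between the cleaned lines of two file ranges."""
    a = significant(read_range(path_a, *range_a))
    b = significant(read_range(path_b, *range_b))
    return difflib.SequenceMatcher(a=a, b=b, autojunk=False).ratio()
```

For the first row of the table, a call would look like `compare_ranges("dpr/data/biencoder_data.py", (198, 352), "dpr/data/tables.py", (25, 179))`. A value close to 1 would be consistent with the reported duplicate, though the exact value depends on how closely this filtering matches the tool's own cleaning.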
Duplicated Units
The list of top 7 duplicated units.
Size  #    Folders     Files                             Lines
21    x 2  dpr/data    tables.py                         0:0
           dpr/data    biencoder_data.py                 0:0
12    x 2  dpr/data    tables.py                         0:0
           dpr/data    biencoder_data.py                 0:0
9     x 2  dpr/data    tables.py                         0:0
           dpr/data    biencoder_data.py                 0:0
7     x 2  dpr/data    tables.py                         0:0
           dpr/data    biencoder_data.py                 0:0
7     x 2  dpr/data    tables.py                         0:0
           dpr/data    biencoder_data.py                 0:0
6     x 2  dpr/data    tables.py                         0:0
           dpr/data    biencoder_data.py                 0:0
6     x 2  dpr/data    qa_validation.py                  0:0
           dpr/data    tables.py                         0:0