facebookresearch / BLINK
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
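The rule described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names `clean` and `find_duplicate_windows` are hypothetical, and the cleaning step only approximates "comments and imports" with simple prefix checks.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comment lines, and import statements (rough approximation)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files: dict[str, str], min_lines: int = MIN_LINES):
    """Return every window of `min_lines` cleaned lines that occurs more than once.

    Keys are the windows (tuples of lines); values are (file name, start index)
    pairs, where the index refers to the cleaned line list, not the raw file.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_lines + 1):
            window = tuple(lines[start:start + min_lines])
            seen[window].append((name, start))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}
```

A real detector would also merge overlapping windows into maximal duplicates and map positions back to original line numbers; this sketch only flags the fixed-size windows.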
Duplication Overall
  • 25% duplication:
    • 9,055 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,347 duplicated lines
  • 153 duplicates
system: 25% (2,347 lines)
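The overall figure follows directly from the two line counts above. The short calculation below reproduces it; the variable names are illustrative, and the observation that the report truncates rather than rounds is an inference from the numbers, not something the report states.

```python
cleaned_lines = 9_055     # cleaned lines of code, from the report
duplicated_lines = 2_347  # duplicated lines, from the report

ratio = duplicated_lines / cleaned_lines
exact_pct = f"{ratio:.1%}"          # "25.9%"
truncated_pct = int(ratio * 100)    # 25, matching the reported 25%

# Number of duplicated lines that would correspond to the suggested 5% target.
target_lines = 0.05 * cleaned_lines  # about 453 lines

print(exact_pct, truncated_pct, round(target_lines))
```

Under that reading, meeting the 5% guideline would mean reducing duplicated lines from 2,347 to roughly 450 at the current code size.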
Duplication per Extension
py: 25% (2,347 lines)
Duplication per Component (primary)
elq/biencoder: 21% (362 lines)
blink/biencoder: 38% (355 lines)
blink/candidate_ranking: 25% (272 lines)
blink/candidate_retrieval: 14% (233 lines)
blink/crossencoder: 41% (203 lines)
blink: 18% (203 lines)
elq/common: 53% (196 lines)
blink/common: 62% (194 lines)
elq: 15% (126 lines)
scripts: 11% (55 lines)
elq/candidate_ranking: 91% (52 lines)
elq/index: 64% (50 lines)
blink/indexer: 51% (46 lines)
elq/vcg_utils: 0% (0 lines)
ROOT: 0% (0 lines)

Duplication Between Components (50+ lines)

  • blink/biencoder -- elq/biencoder: 567 lines
  • blink/biencoder -- blink/crossencoder: 328 lines
  • blink/biencoder -- scripts: 64 lines
  • blink/common -- elq/common: 390 lines
  • blink/crossencoder -- elq/biencoder: 230 lines
  • blink -- elq: 190 lines
  • blink -- blink/candidate_retrieval: 104 lines
  • blink/candidate_ranking -- blink/common: 98 lines
  • blink/candidate_ranking -- elq/common: 98 lines
  • blink/candidate_ranking -- blink: 67 lines
  • blink/candidate_ranking -- elq/candidate_ranking: 104 lines
  • blink/indexer -- elq/index: 96 lines
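The component pairs and shared line counts in this section form an undirected weighted graph. As a rough illustration, the sketch below renders them as Graphviz DOT text using plain string formatting. The helper name `to_dot` is hypothetical, and the output is not claimed to match the report's own DOT file; only the pairs and counts come from the report.

```python
# (component A, component B, duplicated lines shared between them)
EDGES = [
    ("blink/biencoder", "elq/biencoder", 567),
    ("blink/biencoder", "blink/crossencoder", 328),
    ("blink/biencoder", "scripts", 64),
    ("blink/common", "elq/common", 390),
    ("blink/crossencoder", "elq/biencoder", 230),
    ("blink", "elq", 190),
    ("blink", "blink/candidate_retrieval", 104),
    ("blink/candidate_ranking", "blink/common", 98),
    ("blink/candidate_ranking", "elq/common", 98),
    ("blink/candidate_ranking", "blink", 67),
    ("blink/candidate_ranking", "elq/candidate_ranking", 104),
    ("blink/indexer", "elq/index", 96),
]


def to_dot(edges, name="G"):
    """Build an undirected DOT graph with one labelled edge per component pair."""
    lines = [f"graph {name} {{"]
    for a, b, weight in edges:
        lines.append(f'    "{a}" -- "{b}" [label="{weight}"];')
    lines.append("}")
    return "\n".join(lines)


print(to_dot(EDGES))
```

The printed text can be pasted into any Graphviz renderer; the heaviest edges (blink/biencoder with elq/biencoder, and blink/common with elq/common) point at the blink/elq package split as the main source of duplication.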

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
--- | --- | --- | --- | ---
77 | 2 | blink/common, elq/common | params.py, params.py | 64:144 (34%), 64:144 (23%)
50 | 2 | blink/biencoder, elq/biencoder | train_biencoder.py, train_biencoder.py | 81:149 (24%), 167:234 (11%)
42 | 2 | blink/candidate_ranking, elq/candidate_ranking | utils.py, utils.py | 91:141 (44%), 43:93 (73%)
36 | 2 | blink/indexer, elq/index | faiss_indexer.py, faiss_indexer.py | 19:67 (40%), 19:67 (46%)
33 | 2 | blink/candidate_retrieval, blink/candidate_retrieval | process_wiki_extractor_output.py, process_wiki_extractor_output_full.py | 15:60 (66%), 15:61 (67%)
29 | 2 | blink/common, elq/common | params.py, params.py | 207:235 (12%), 240:268 (8%)
25 | 2 | blink/crossencoder, elq/biencoder | train_cross.py, train_biencoder.py | 131:163 (8%), 167:199 (5%)
25 | 2 | blink/candidate_retrieval, blink/candidate_retrieval | process_wiki_extractor_output.py, process_wiki_extractor_output_links.py | 15:49 (50%), 17:52 (32%)
25 | 2 | blink/biencoder, blink/crossencoder | train_biencoder.py, train_cross.py | 81:113 (12%), 131:163 (8%)
25 | 2 | blink/candidate_retrieval, blink/candidate_retrieval | process_wiki_extractor_output_full.py, process_wiki_extractor_output_links.py | 15:50 (51%), 17:52 (32%)
23 | 2 | blink/biencoder, blink/crossencoder | train_biencoder.py, train_cross.py | 193:224 (11%), 261:293 (8%)
23 | 2 | blink/biencoder, blink/crossencoder | train_biencoder.py, train_cross.py | 234:261 (11%), 305:332 (8%)
23 | 2 | blink, elq | build_faiss_index.py, build_faiss_index.py | 33:59 (45%), 32:58 (46%)
22 | 2 | blink/biencoder, blink/crossencoder | train_biencoder.py, train_cross.py | 117:145 (10%), 167:197 (7%)
22 | 2 | blink/biencoder, elq/biencoder | train_biencoder.py, train_biencoder.py | 234:260 (10%), 489:515 (5%)
22 | 2 | blink/biencoder, elq/biencoder | data_process.py, data_process.py | 69:95 (14%), 318:343 (4%)
22 | 2 | blink/crossencoder, elq/biencoder | train_cross.py, train_biencoder.py | 167:197 (7%), 203:230 (5%)
22 | 2 | blink/crossencoder, elq/biencoder | train_cross.py, train_biencoder.py | 305:331 (7%), 489:515 (5%)
21 | 2 | blink/common, elq/common | params.py, params.py | 145:165 (9%), 146:166 (6%)
21 | 2 | blink/common, elq/common | params.py, params.py | 246:266 (9%), 350:370 (6%)
Duplicated Units
The list of top 11 duplicated units.
Size | # | Folders | Files | Lines
--- | --- | --- | --- | ---
23 | 2 | blink/candidate_ranking, elq/candidate_ranking | utils.py, utils.py | 0:0, 0:0
14 | 2 | blink, blink/candidate_retrieval | candidate_generation.py, candidate_generators.py | 0:0, 0:0
11 | 3 | blink/crossencoder, blink/biencoder, elq/biencoder | train_cross.py, train_biencoder.py, train_biencoder.py | 0:0, 0:0, 0:0
9 | 2 | blink/candidate_ranking, elq/candidate_ranking | utils.py, utils.py | 0:0, 0:0
10 | 2 | blink/indexer, elq/index | faiss_indexer.py, faiss_indexer.py | 0:0, 0:0
9 | 2 | blink/candidate_ranking, elq/candidate_ranking | utils.py, utils.py | 0:0, 0:0
7 | 3 | blink/crossencoder, blink/biencoder, elq/biencoder | crossencoder.py, biencoder.py, biencoder.py | 0:0, 0:0, 0:0
7 | 2 | blink/biencoder, elq/biencoder | data_process.py, data_process.py | 0:0, 0:0
6 | 3 | blink/crossencoder, blink/biencoder, elq/biencoder | crossencoder.py, biencoder.py, biencoder.py | 0:0, 0:0, 0:0
6 | 3 | blink/crossencoder, blink/biencoder, elq/biencoder | train_cross.py, train_biencoder.py, train_biencoder.py | 0:0, 0:0, 0:0
6 | 2 | blink/indexer, elq/index | faiss_indexer.py, faiss_indexer.py | 0:0, 0:0