facebookresearch / DisCo
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured by finding places in the code where 6 or more lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (under 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
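The procedure described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the report tool's actual implementation: the helper names `clean` and `duplicated_line_count` are hypothetical, and the cleaning step is simplified to dropping blank lines, `#` comment lines, and import lines.

```python
from __future__ import annotations

from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop blank lines, '#' comment lines, and import lines (a simplification)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def duplicated_line_count(files: dict[str, str], min_block: int = MIN_BLOCK) -> int:
    """Count cleaned lines that sit inside a window of `min_block` lines seen more than once."""
    windows = defaultdict(list)
    for name, text in files.items():
        lines = clean(text)
        # Index every window of `min_block` consecutive cleaned lines.
        for i in range(len(lines) - min_block + 1):
            windows[tuple(lines[i:i + min_block])].append((name, i))

    duplicated = set()
    for positions in windows.values():
        if len(positions) > 1:  # the same window occurs at least twice
            for name, start in positions:
                duplicated.update((name, start + k) for k in range(min_block))
    return len(duplicated)
```

Dividing the result by the total number of cleaned lines gives a duplication percentage comparable in spirit to the figures in this report, though the exact cleaning rules of the report tool are not reproduced here.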
Duplication Overall
  • 24% duplication:
    • 13,554 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 3,254 duplicated lines
  • 102 duplicates
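As a quick check of the arithmetic, the overall percentage follows from the two line counts above. The variable names are illustrative only, and the 5% figure is the target mentioned in the intro.

```python
cleaned_lines = 13_554     # lines of cleaned code, from the report
duplicated_lines = 3_254   # duplicated lines, from the report

duplication_pct = 100 * duplicated_lines / cleaned_lines
print(f"{duplication_pct:.1f}%")  # prints 24.0%

# Number of duplicated lines that would correspond to the 5% target.
target_lines = 0.05 * cleaned_lines
print(f"{target_lines:.0f} lines")  # prints 678 lines
```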
system: 24% (3,254 lines)
Duplication per Extension
py: 24% (3,254 lines)
Duplication per Component (primary)
ROOT: 80% (926 lines)
fairseq_cli: 99% (887 lines)
fairseq/modules: 22% (511 lines)
fairseq/models: 24% (408 lines)
fairseq/optim: 15% (181 lines)
scripts: 15% (118 lines)
fairseq/data: 4% (70 lines)
fairseq/tasks: 17% (70 lines)
fairseq/criterions: 19% (37 lines)
fairseq: <1% (28 lines)
fairseq/strategies: 10% (18 lines)
fairseq/clib: 0% (0 lines)
fairseq/fb_tbmf_wrapper: 0% (0 lines)

Duplication Between Components (50+ lines)

fairseq_cli -- ROOT: 1797 lines

Longest Duplicates
The 20 longest duplicates, out of 102 in total.
Size | # | Folders | Files | Lines
213 | x 2 | fairseq_cli, ROOT | train.py, train.py | 25:306 (100%), 25:306 (100%)
191 | x 2 | fairseq_cli, ROOT | preprocess.py, preprocess.py | 26:254 (100%), 26:254 (100%)
157 | x 2 | ROOT, fairseq_cli | eval_lm.py, eval_lm.py | 22:225 (100%), 22:225 (100%)
140 | x 2 | fairseq_cli, ROOT | generate.py, generate.py | 19:199 (100%), 19:199 (100%)
134 | x 2 | fairseq_cli, ROOT | interactive.py, interactive.py | 21:195 (100%), 21:195 (100%)
90 | x 2 | fairseq/modules, fairseq/modules | masked_multihead_attention.py, multihead_attention.py | 58:171 (63%), 131:242 (35%)
59 | x 2 | scripts, scripts | oracle.py, oracle_s2s.py | 9:86 (100%), 9:86 (100%)
52 | x 2 | fairseq_cli, ROOT | setup.py, setup.py | 13:68 (100%), 13:68 (100%)
31 | x 2 | fairseq/models, fairseq/models | bert_seq2seq.py, disco_transformer.py | 695:728 (6%), 342:375 (13%)
28 | x 2 | fairseq/models, fairseq/models | bert_seq2seq.py, transformer.py | 123:150 (5%), 76:103 (5%)
25 | x 2 | fairseq/modules, fairseq/modules | sparse_transformer_sentence_encoder.py, transformer_sentence_encoder.py | 19:43 (39%), 69:93 (17%)
23 | x 2 | fairseq/models, fairseq/models | bert_seq2seq.py, disco_transformer.py | 198:223 (4%), 47:72 (9%)
23 | x 2 | fairseq/models, fairseq/models | bert_seq2seq.py, disco_transformer.py | 470:493 (4%), 313:338 (9%)
20 | x 2 | fairseq/modules, fairseq/modules | masked_multihead_attention.py, multihead_attention.py | 179:205 (14%), 240:266 (7%)
17 | x 2 | fairseq/data, fairseq/data | language_pair_dataset.py, language_pair_self_dataset_mask.py | 105:121 (14%), 130:146 (10%)
17 | x 2 | fairseq/optim, fairseq/optim | fp16_optimizer.py, fp16_optimizer.py | 69:87 (7%), 240:258 (7%)
13 | x 2 | fairseq/modules, fairseq/modules | dynamic_convolution.py, lightweight_convolution.py | 200:216 (7%), 216:232 (6%)
13 | x 2 | fairseq/models, fairseq/models | bert_seq2seq.py, disco_transformer.py | 366:383 (2%), 251:268 (5%)
12 | x 2 | fairseq/modules, fairseq/modules | sparse_transformer_sentence_encoder_l..., transformer_sentence_encoder_layer.py | 17:28 (36%), 25:36 (20%)
12 | x 2 | fairseq/optim, fairseq/optim | fp16_optimizer.py, fp16_optimizer.py | 180:194 (5%), 352:367 (5%)
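Several of the longest duplicates above are same-named files in ROOT and fairseq_cli that are reported as 100% duplicated. A rough way to check such pairs locally is sketched below, using the standard library's `filecmp`. The function name `identical_same_named_files` is hypothetical, and the check is stricter than the report's: it compares raw bytes, whereas the report compares cleaned lines, so files that differ only in comments or imports would not be listed.

```python
from __future__ import annotations

import filecmp
from pathlib import Path


def identical_same_named_files(dir_a: str, dir_b: str) -> list[str]:
    """Return names of .py files present in both directories with byte-identical contents."""
    names_a = {p.name for p in Path(dir_a).glob("*.py")}
    names_b = {p.name for p in Path(dir_b).glob("*.py")}
    common = sorted(names_a & names_b)
    return [
        name
        for name in common
        # shallow=False forces a content comparison instead of a stat() comparison.
        if filecmp.cmp(Path(dir_a) / name, Path(dir_b) / name, shallow=False)
    ]
```

Run from a checkout of the repository, a call such as `identical_same_named_files(".", "fairseq_cli")` would list the script pairs that are exact copies. Whether the files in this repository are byte-identical is not stated in the report.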
Duplicated Units
The top 20 duplicated units, out of 21 unit duplicates in total.
Size | # | Folders | Files | Lines
148 | x 2 | ROOT, fairseq_cli | preprocess.py, preprocess.py | 0:0, 0:0
136 | x 2 | fairseq_cli, ROOT | eval_lm.py, eval_lm.py | 0:0, 0:0
145 | x 2 | ROOT, fairseq_cli | generate.py, generate.py | 0:0, 0:0
107 | x 2 | fairseq_cli, ROOT | interactive.py, interactive.py | 0:0, 0:0
57 | x 2 | fairseq_cli, ROOT | train.py, train.py | 0:0, 0:0
60 | x 2 | fairseq_cli, ROOT | train.py, train.py | 0:0, 0:0
43 | x 2 | fairseq_cli, ROOT | train.py, train.py | 0:0, 0:0
30 | x 2 | fairseq/models, fairseq/models | disco_transformer.py, bert_seq2seq.py | 0:0, 0:0
32 | x 2 | fairseq_cli, ROOT | train.py, train.py | 0:0, 0:0
24 | x 2 | ROOT, fairseq_cli | preprocess.py, preprocess.py | 0:0, 0:0
22 | x 2 | fairseq_cli, ROOT | train.py, train.py | 0:0, 0:0
28 | x 2 | scripts, scripts | oracle_s2s.py, oracle.py | 0:0, 0:0
18 | x 2 | fairseq_cli, ROOT | interactive.py, interactive.py | 0:0, 0:0
13 | x 2 | fairseq_cli, ROOT | train.py, train.py | 0:0, 0:0
9 | x 2 | fairseq_cli, ROOT | interactive.py, interactive.py | 0:0, 0:0
6 | x 2 | ROOT, fairseq_cli | preprocess.py, preprocess.py | 0:0, 0:0
6 | x 2 | fairseq_cli, ROOT | eval_lm.py, eval_lm.py | 0:0, 0:0
10 | x 2 | fairseq_cli, ROOT | eval_lm.py, eval_lm.py | 0:0, 0:0
11 | x 2 | fairseq/optim, fairseq/optim | nag.py, sgd.py | 0:0, 0:0
28 | x 2 | fairseq/data, fairseq/data | language_pair_self_dataset_mask.py, language_pair_dataset.py | 0:0, 0:0