amazon-research / BartGraphSumm
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (under 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
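The cleaning and matching steps described above can be sketched roughly as follows. This is a minimal illustration, not the analyzer's actual implementation: the function names `clean` and `find_duplicate_blocks` are hypothetical, cleaning handles only Python-style `#` comments and `import`/`from` lines, lines are compared after stripping surrounding whitespace, and the duplicated lines are assumed to be consecutive. The sketch reports fixed-size windows of 6 lines and does not merge overlapping windows into longer duplicates.

```python
import re
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop blank lines, '#' comment lines, and import lines (Python-style only)."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if re.match(r"(import|from)\s", stripped):
            continue
        kept.append(stripped)
    return kept


def find_duplicate_blocks(files: dict[str, str], min_block: int = MIN_BLOCK) -> dict:
    """Return every window of `min_block` cleaned lines that occurs more than once."""
    seen = defaultdict(list)  # window text -> [(file name, index in cleaned lines)]
    for name, source in files.items():
        lines = clean(source)
        for i in range(len(lines) - min_block + 1):
            window = "\n".join(lines[i:i + min_block])
            seen[window].append((name, i))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}
```

Hashing each window keeps the search close to linear in the number of cleaned lines, at the cost of reporting a long duplicate as several overlapping 6-line windows.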
Duplication Overall
  • 16% duplication:
    • 32,040 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 5,179 duplicated lines
  • 419 duplicates
system: 16% (5,179 lines)
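The headline percentage follows directly from the two line counts above. A small sketch of the arithmetic, where the helper name `duplication_percent` is an assumption made for illustration:

```python
CLEANED_LINES = 32_040     # lines of cleaned code, from the report
DUPLICATED_LINES = 5_179   # duplicated lines, from the report
TARGET_PERCENT = 5.0       # recommended upper bound stated in the intro


def duplication_percent(duplicated: int, total: int) -> float:
    """Share of duplicated lines among all cleaned lines, in percent."""
    return 100.0 * duplicated / total


pct = duplication_percent(DUPLICATED_LINES, CLEANED_LINES)
print(f"{pct:.2f}%")          # 16.16%, shown as 16% in the report
print(pct > TARGET_PERCENT)   # True: above the recommended 5%
```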
Duplication per Extension
py: 15% (4,751 lines)
cu: 31% (279 lines)
cpp: 21% (76 lines)
cuh: 63% (73 lines)
Duplication per Component (primary)
src/fairseq/fairseq/models: 19% (1,527 lines)
src/fairseq/fairseq/modules: 25% (1,311 lines)
src/fairseq/fairseq/tasks: 23% (642 lines)
src/fairseq/fairseq/model_parallel: 48% (380 lines)
src/fairseq/fairseq/data: 6% (271 lines)
src/fairseq/fairseq/optim: 15% (243 lines)
src/fairseq/fairseq/clib: 34% (192 lines)
src/fairseq/fairseq_cli: 12% (150 lines)
src: 14% (140 lines)
src/fairseq/fairseq/benchmark: 58% (138 lines)
src/fairseq/fairseq/criterions: 12% (101 lines)
src/fairseq/fairseq/logging: 8% (46 lines)
src/fairseq/fairseq: <1% (38 lines)
src/fairseq: 0% (0 lines)
src/fairseq/scripts: 0% (0 lines)
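As a quick consistency check, the duplicated-line counts in both breakdowns (per extension and per component) add up to the overall total of 5,179 lines. The sketch below only re-adds the figures printed in this report; the helper name `matches_total` is hypothetical.

```python
TOTAL_DUPLICATED = 5_179

PER_EXTENSION = {"py": 4_751, "cu": 279, "cpp": 76, "cuh": 73}

PER_COMPONENT = {
    "src/fairseq/fairseq/models": 1_527,
    "src/fairseq/fairseq/modules": 1_311,
    "src/fairseq/fairseq/tasks": 642,
    "src/fairseq/fairseq/model_parallel": 380,
    "src/fairseq/fairseq/data": 271,
    "src/fairseq/fairseq/optim": 243,
    "src/fairseq/fairseq/clib": 192,
    "src/fairseq/fairseq_cli": 150,
    "src": 140,
    "src/fairseq/fairseq/benchmark": 138,
    "src/fairseq/fairseq/criterions": 101,
    "src/fairseq/fairseq/logging": 46,
    "src/fairseq/fairseq": 38,
    "src/fairseq": 0,
    "src/fairseq/scripts": 0,
}


def matches_total(breakdown: dict[str, int], total: int) -> bool:
    """True when the per-group duplicated-line counts sum to the overall total."""
    return sum(breakdown.values()) == total


print(matches_total(PER_EXTENSION, TOTAL_DUPLICATED))  # True
print(matches_total(PER_COMPONENT, TOTAL_DUPLICATED))  # True
```

Both sums match, which suggests that each duplicated line is attributed to exactly one extension and one primary component.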

Duplication Between Components (50+ lines)

[Graph: duplicated lines shared between pairs of components]
src/fairseq/fairseq/model_parallel -- src/fairseq/fairseq/modules: 336 lines
src/fairseq/fairseq/model_parallel -- src/fairseq/fairseq/models: 220 lines
src/fairseq/fairseq/models -- src/fairseq/fairseq/modules: 52 lines
src/fairseq/fairseq/benchmark -- src/fairseq/fairseq/tasks: 50 lines
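The component graph above has four undirected edges, each weighted by the number of duplicated lines two components share. A minimal sketch of how such an edge list could be written out as Graphviz DOT text follows; the `to_dot` helper and the choice to put the weight in a `label` attribute are assumptions for illustration, not the report generator's own output format.

```python
# (component A, component B, duplicated lines shared), taken from the report
EDGES = [
    ("src/fairseq/fairseq/model_parallel", "src/fairseq/fairseq/modules", 336),
    ("src/fairseq/fairseq/model_parallel", "src/fairseq/fairseq/models", 220),
    ("src/fairseq/fairseq/models", "src/fairseq/fairseq/modules", 52),
    ("src/fairseq/fairseq/benchmark", "src/fairseq/fairseq/tasks", 50),
]


def to_dot(edges, name: str = "G", min_lines: int = 50) -> str:
    """Render edges with at least `min_lines` shared lines as an undirected DOT graph."""
    lines = [f"graph {name} {{"]
    for a, b, weight in edges:
        if weight >= min_lines:
            lines.append(f'    "{a}" -- "{b}" [label="{weight}"];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(EDGES)
print(dot)
```

The resulting text can be pasted into any Graphviz viewer to redraw the graph.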

Longest Duplicates
The 20 longest of all 419 duplicates.
| Size | # | Folder | File | Lines |
|---|---|---|---|---|
| 58 | 2 | src/fairseq/fairseq/benchmark | dummy_lm.py | 42:118 (76%) |
| | | src/fairseq/fairseq/benchmark | dummy_masked_lm.py | 51:127 (69%) |
| 53 | 2 | src/fairseq/fairseq/clib/libnat_cuda | edit_dist.cu | 101:161 (18%) |
| | | src/fairseq/fairseq/clib/libnat_cuda | edit_dist.cu | 181:241 (18%) |
| 38 | 2 | src/fairseq/fairseq/models/nat | iterative_nonautoregressive_transform... | 151:191 (24%) |
| | | src/fairseq/fairseq/models/nat | nonautoregressive_transformer.py | 376:416 (12%) |
| 38 | 2 | src/fairseq/fairseq/modules | dynamic_convolution.py | 136:183 (19%) |
| | | src/fairseq/fairseq/modules/dynamicconv_layer | dynamicconv_layer.py | 136:183 (23%) |
| 38 | 2 | src | graph_construction.py | 35:75 (11%) |
| | | src | prepare_data.py | 92:132 (12%) |
| 35 | 2 | src/fairseq/fairseq/models/nat | insertion_transformer.py | 242:277 (17%) |
| | | src/fairseq/fairseq/models/nat | iterative_nonautoregressive_transform... | 151:186 (22%) |
| 35 | 2 | src/fairseq/fairseq/models/nat | insertion_transformer.py | 242:277 (17%) |
| | | src/fairseq/fairseq/models/nat | nonautoregressive_transformer.py | 376:411 (11%) |
| 34 | 2 | src/fairseq/fairseq/models/nat | iterative_nonautoregressive_transform... | 151:185 (21%) |
| | | src/fairseq/fairseq/models/nat | levenshtein_transformer.py | 402:436 (9%) |
| 34 | 2 | src/fairseq/fairseq/models/nat | insertion_transformer.py | 242:276 (16%) |
| | | src/fairseq/fairseq/models/nat | levenshtein_transformer.py | 402:436 (9%) |
| 34 | 2 | src/fairseq/fairseq/models/nat | levenshtein_transformer.py | 402:436 (9%) |
| | | src/fairseq/fairseq/models/nat | nonautoregressive_transformer.py | 376:410 (10%) |
| 31 | 2 | src/fairseq/fairseq/clib/libnat | edit_dist.cpp | 58:98 (21%) |
| | | src/fairseq/fairseq/clib/libnat | edit_dist.cpp | 130:170 (21%) |
| 31 | 2 | src/fairseq/fairseq/models/nat | nonautoregressive_transformer.py | 375:405 (9%) |
| | | src/fairseq/fairseq/models | transformer.py | 1087:1117 (3%) |
| 30 | 2 | src/fairseq/fairseq/models/nat | insertion_transformer.py | 242:271 (14%) |
| | | src/fairseq/fairseq/models | transformer.py | 1088:1117 (3%) |
| 30 | 2 | src/fairseq/fairseq/models/nat | levenshtein_transformer.py | 402:431 (8%) |
| | | src/fairseq/fairseq/models | transformer.py | 1088:1117 (3%) |
| 30 | 2 | src/fairseq/fairseq/models/nat | iterative_nonautoregressive_transform... | 151:180 (18%) |
| | | src/fairseq/fairseq/models | transformer.py | 1088:1117 (3%) |
| 29 | 2 | src/fairseq/fairseq/tasks | masked_lm.py | 173:202 (17%) |
| | | src/fairseq/fairseq/tasks | multilingual_masked_lm.py | 264:293 (11%) |
| 28 | 2 | src/fairseq/fairseq/models/bart | model.py | 170:199 (9%) |
| | | src/fairseq/fairseq/models/roberta | model.py | 191:220 (9%) |
| 26 | 2 | src/fairseq/fairseq/model_parallel/modules | multihead_attention.py | 108:138 (10%) |
| | | src/fairseq/fairseq/modules | multihead_attention.py | 177:207 (7%) |
| 25 | 2 | src/fairseq/fairseq/modules | longformer_multihead_attention.py | 136:180 (7%) |
| | | src/fairseq/fairseq/modules | multihead_attention.py | 98:141 (6%) |
| 25 | 2 | src/fairseq/fairseq/models/nat | cmlm_transformer.py | 107:131 (22%) |
| | | src/fairseq/fairseq/models/nat | insertion_transformer.py | 242:266 (12%) |
Duplicated Units
The top 20 of all 22 duplicated units.
| Size | # | Folder | File | Lines |
|---|---|---|---|---|
| 40 | 2 | src | graph_construction.py | 0:0 |
| | | src | prepare_data.py | 0:0 |
| 26 | 2 | src/fairseq/fairseq/tasks | masked_lm.py | 0:0 |
| | | src/fairseq/fairseq/tasks | multilingual_masked_lm.py | 0:0 |
| 24 | 2 | src/fairseq/fairseq/modules | longformer_multihead_attention.py | 0:0 |
| | | src/fairseq/fairseq/modules | multihead_attention.py | 0:0 |
| 24 | 2 | src/fairseq/fairseq/benchmark | dummy_lm.py | 0:0 |
| | | src/fairseq/fairseq/benchmark | dummy_masked_lm.py | 0:0 |
| 12 | 2 | src/fairseq/fairseq/models/roberta | model.py | 0:0 |
| | | src/fairseq/fairseq/model_parallel/models/roberta | model.py | 0:0 |
| 11 | 2 | src/fairseq/fairseq/modules | longformer_multihead_attention.py | 0:0 |
| | | src/fairseq/fairseq/modules | multihead_attention.py | 0:0 |
| 11 | 3 | src/fairseq/fairseq/models/roberta | model_camembert.py | 0:0 |
| | | src/fairseq/fairseq/models/roberta | model_xlmr.py | 0:0 |
| | | src/fairseq/fairseq/models/roberta | model.py | 0:0 |
| 11 | 2 | src/fairseq/fairseq/criterions | sentence_prediction.py | 0:0 |
| | | src/fairseq/fairseq/criterions | sentence_ranking.py | 0:0 |
| 10 | 2 | src/fairseq/fairseq/model_parallel/criterions | vocab_parallel_cross_entropy.py | 0:0 |
| | | src/fairseq/fairseq/criterions | cross_entropy.py | 0:0 |
| 7 | 2 | src/fairseq/fairseq/models/roberta | model.py | 0:0 |
| | | src/fairseq/fairseq/model_parallel/models/roberta | model.py | 0:0 |
| 6 | 2 | src/fairseq/fairseq/models/roberta | model.py | 0:0 |
| | | src/fairseq/fairseq/model_parallel/models/roberta | model.py | 0:0 |
| 6 | 2 | src/fairseq/fairseq/models/roberta | model.py | 0:0 |
| | | src/fairseq/fairseq/model_parallel/models/roberta | model.py | 0:0 |
| 6 | 2 | src/fairseq/fairseq/data | append_token_dataset.py | 0:0 |
| | | src/fairseq/fairseq/data | prepend_token_dataset.py | 0:0 |
| 8 | 3 | src/fairseq/fairseq/data | monolingual_dataset.py | 0:0 |
| | | src/fairseq/fairseq/data/audio | raw_audio_dataset.py | 0:0 |
| | | src/fairseq/fairseq/data | subsample_dataset.py | 0:0 |
| 11 | 2 | src/fairseq/fairseq/optim | nag.py | 0:0 |
| | | src/fairseq/fairseq/optim | sgd.py | 0:0 |
| 6 | 2 | src/fairseq/fairseq/benchmark | dummy_lm.py | 0:0 |
| | | src/fairseq/fairseq/benchmark | dummy_masked_lm.py | 0:0 |
| 8 | 2 | src/fairseq/fairseq/tasks | multilingual_denoising.py | 0:0 |
| | | src/fairseq/fairseq/tasks | multilingual_masked_lm.py | 0:0 |
| 19 | 2 | src/fairseq/fairseq/models/roberta | model.py | 0:0 |
| | | src/fairseq/fairseq/model_parallel/models/roberta | model.py | 0:0 |
| 7 | 2 | src/fairseq/fairseq/tasks | sentence_prediction.py | 0:0 |
| | | src/fairseq/fairseq/tasks | sentence_ranking.py | 0:0 |
| 8 | 2 | src/fairseq/fairseq/models | fairseq_model.py | 0:0 |
| | | src/fairseq/fairseq/models | fairseq_model.py | 0:0 |