pytorch / fairseq
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
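The procedure described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the report tool's actual implementation: the helper names `clean` and `find_duplicate_windows` are hypothetical, the cleaning rules only cover Python-style comments and imports, and ignoring indentation when comparing lines is an assumption.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold from the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop blank lines, '#' comments, and import statements (a simplification)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: indentation is ignored when comparing
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files: dict[str, str]) -> dict[tuple[str, ...], list[tuple[str, int]]]:
    """Map each MIN_BLOCK-line window that occurs more than once to its locations.

    Locations are (file name, index into the cleaned lines of that file).
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - MIN_BLOCK + 1):
            window = tuple(lines[start:start + MIN_BLOCK])
            seen[window].append((name, start))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}
```

A real detector would also merge overlapping windows into maximal blocks, which is how a single duplicate can be much longer than 6 lines; that step is omitted here to keep the sketch short.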
Duplication Overall
  • 18% duplication:
    • 62,711 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 11,324 duplicated lines
  • 1,190 duplicates
system: 18% (11,324 lines)
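The overall figure follows directly from the two line counts above. A small sketch of the arithmetic, using the 5% target stated in the intro (the function name `duplication_ratio` is illustrative only):

```python
def duplication_ratio(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to a duplicate."""
    return duplicated_lines / cleaned_lines


TARGET = 0.05  # "<5%" guideline from the intro

ratio = duplication_ratio(11_324, 62_711)
print(f"{ratio:.1%}")   # 18.1%, shown rounded to 18% in the report
print(ratio <= TARGET)  # False: well above the 5% guideline
```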
Duplication per Extension
py: 17% (10,539 lines)
cu: 31% (328 lines)
yaml: 76% (274 lines)
cpp: 20% (94 lines)
cuh: 80% (89 lines)
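As a quick consistency check, the per-extension duplicated line counts should add up to the overall total of 11,324 duplicated lines. A minimal sketch with the numbers copied from this report:

```python
# Duplicated lines per file extension, as listed above.
duplicated_by_extension = {
    "py": 10_539,
    "cu": 328,
    "yaml": 274,
    "cpp": 94,
    "cuh": 89,
}

total = sum(duplicated_by_extension.values())
print(total)  # 11324, matching the overall duplicated line count
```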
Duplication per Component (primary)
fairseq/models: 26% (4,933 lines)
fairseq/modules: 20% (1,414 lines)
fairseq/tasks: 21% (1,368 lines)
fairseq/data: 10% (1,147 lines)
fairseq/model_parallel: 30% (569 lines)
fairseq/criterions: 21% (486 lines)
fairseq/optim: 15% (399 lines)
fairseq/config: 76% (274 lines)
fairseq: 3% (221 lines)
fairseq/clib: 24% (192 lines)
fairseq_cli: 9% (156 lines)
fairseq/benchmark: 41% (121 lines)
fairseq/logging: 5% (38 lines)
fairseq/distributed: <1% (6 lines)
ROOT: 0% (0 lines)
fairseq/scoring: 0% (0 lines)
fairseq/dataclass: 0% (0 lines)
scripts: 0% (0 lines)
scripts/constraints: 0% (0 lines)

Duplication Between Components (50+ lines)

  • fairseq/model_parallel -- fairseq/models: 582 lines
  • fairseq/model_parallel -- fairseq/modules: 280 lines
  • fairseq/data -- fairseq/tasks: 248 lines
  • fairseq/models -- fairseq/modules: 152 lines
  • fairseq/criterions -- fairseq/model_parallel: 104 lines
  • fairseq/benchmark -- fairseq/tasks: 57 lines

Longest Duplicates
The 20 longest of the 1,190 duplicates.
Size x #, then folder/file and lines for each copy:
  • 116 x 2
    • fairseq/models/speech_to_speech/s2s_transformer.py, 248:365 (20%)
    • fairseq/models/speech_to_text/s2t_transformer.py, 123:241 (26%)
  • 72 x 2
    • fairseq/models/speech_to_text/convtransformer.py, 46:117 (19%)
    • fairseq/models/speech_to_text/s2t_transformer.py, 144:215 (16%)
  • 72 x 2
    • fairseq/models/speech_to_speech/s2s_transformer.py, 268:339 (12%)
    • fairseq/models/speech_to_text/convtransformer.py, 46:117 (19%)
  • 64 x 2
    • fairseq/models/speech_to_speech/s2s_transformer.py, 423:488 (11%)
    • fairseq/models/speech_to_text/s2t_transformer.py, 123:189 (14%)
  • 64 x 2
    • fairseq/models/speech_to_speech/s2s_transformer.py, 248:313 (11%)
    • fairseq/models/speech_to_speech/s2s_transformer.py, 423:488 (11%)
  • 59 x 2
    • fairseq/clib/libnat_cuda/edit_dist.cu, 101:166 (19%)
    • fairseq/clib/libnat_cuda/edit_dist.cu, 185:250 (19%)
  • 46 x 2
    • fairseq/models/speech_to_speech/s2s_transformer.py, 443:488 (8%)
    • fairseq/models/speech_to_text/convtransformer.py, 46:91 (12%)
  • 42 x 2
    • fairseq/tasks/language_modeling.py, 321:370 (15%)
    • fairseq/tasks/multilingual_language_modeling.py, 565:614 (8%)
  • 41 x 2
    • fairseq/models/bart/model.py, 170:214 (13%)
    • fairseq/models/roberta/model.py, 403:445 (7%)
  • 38 x 2
    • fairseq/models/nat/iterative_nonautoregressive_transform..., 174:214 (20%)
    • fairseq/models/nat/nonautoregressive_transformer.py, 408:448 (10%)
  • 36 x 2
    • fairseq/modules/dynamic_convolution.py, 187:236 (16%)
    • fairseq/modules/dynamicconv_layer/dynamicconv_layer.py, 143:192 (21%)
  • 35 x 2
    • fairseq/config/model/transformer_lm/transformer_lm_baevski_wiki103.yaml, 2:36 (100%)
    • fairseq/config/model/transformer_lm/transformer_lm_wiki103.yaml, 2:36 (100%)
  • 35 x 2
    • fairseq/models/nat/insertion_transformer.py, 242:277 (17%)
    • fairseq/models/nat/iterative_nonautoregressive_transform..., 174:209 (19%)
  • 35 x 2
    • fairseq/models/nat/insertion_transformer.py, 242:277 (17%)
    • fairseq/models/nat/nonautoregressive_transformer.py, 408:443 (9%)
  • 35 x 2
    • fairseq/config/model/transformer_lm/transformer_lm_baevski_gbw.yaml, 2:36 (100%)
    • fairseq/config/model/transformer_lm/transformer_lm_gbw.yaml, 2:36 (100%)
  • 34 x 2
    • fairseq/models/nat/iterative_nonautoregressive_transform..., 174:208 (18%)
    • fairseq/models/nat/levenshtein_transformer.py, 432:466 (8%)
  • 34 x 2
    • fairseq/models/nat/insertion_transformer.py, 242:276 (16%)
    • fairseq/models/nat/levenshtein_transformer.py, 432:466 (8%)
  • 34 x 2
    • fairseq/models/nat/levenshtein_transformer.py, 432:466 (8%)
    • fairseq/models/nat/nonautoregressive_transformer.py, 408:442 (9%)
  • 33 x 2
    • fairseq/models/speech_to_text/s2t_transformer.py, 385:421 (7%)
    • fairseq/models/speech_to_text/xm_transformer.py, 326:362 (5%)
  • 31 x 2
    • fairseq/models/nat/nonautoregressive_transformer.py, 407:437 (8%)
    • fairseq/models/transformer/transformer_legacy.py, 169:199 (14%)
Duplicated Units
The top 20 of the 35 duplicated units.
Size x #, then folder/file for each copy (line ranges are reported as 0:0 for every unit):
  • 32 x 2
    • fairseq/models/speech_to_text/xm_transformer.py
    • fairseq/models/speech_to_text/s2t_transformer.py
  • 24 x 2
    • fairseq/benchmark/dummy_masked_lm.py
    • fairseq/benchmark/dummy_lm.py
  • 17 x 2
    • fairseq/tasks/fairseq_task.py
    • fairseq/tasks/translation_multi_simple_epoch.py
  • 14 x 2
    • fairseq/models/lightconv.py
    • fairseq/model_parallel/m...ne_parallel_transformer/layers.py
  • 15 x 2
    • fairseq/criterions/speech_to_speech_criterion.py
    • fairseq/criterions/speech_to_speech_criterion.py
  • 13 x 3
    • fairseq/models/speech_to_text/convtransformer.py
    • fairseq/models/speech_to_text/xm_transformer.py
    • fairseq/models/speech_to_text/s2t_transformer.py
  • 12 x 2
    • fairseq/models/transformer/transformer_decoder.py
    • fairseq/models/text_to_speech/tts_transformer.py
  • 11 x 2
    • fairseq/models/wav2vec/wav2vec2_asr.py
    • fairseq/model_parallel/m...ne_parallel_transformer/model.py
  • 12 x 2
    • fairseq/tasks/language_modeling.py
    • fairseq/tasks/multilingual_language_modeling.py
  • 10 x 2
    • fairseq/models/speech_to_speech/s2s_transformer.py
    • fairseq/models/transformer/transformer_decoder.py
  • 10 x 2
    • fairseq/data/audio/speech_to_text_dataset.py
    • fairseq/data/audio/text_to_speech_dataset.py
  • 8 x 2
    • fairseq/models/speech_to_text/modules/emformer.py
    • fairseq/models/speech_to_text/modules/emformer.py
  • 8 x 2
    • fairseq/models/text_to_speech/fastspeech2.py
    • fairseq/models/text_to_speech/tts_transformer.py
  • 8 x 2
    • fairseq/models/lightconv.py
    • fairseq/models/lightconv.py
  • 7 x 2
    • fairseq/models/speech_to_text/convtransformer.py
    • fairseq/models/transformer/transformer_decoder.py
  • 7 x 2
    • fairseq/models/roberta/model.py
    • fairseq/model_parallel/models/roberta/model.py
  • 12 x 2
    • fairseq/model_parallel/m...ne_parallel_transformer/layers.py
    • fairseq/modules/transformer_layer.py
  • 7 x 2
    • fairseq/checkpoint_utils.py
    • fairseq/checkpoint_utils.py
  • 7 x 2
    • fairseq/modules/downsampled_multihead_attention.py
    • fairseq/modules/downsampled_multihead_attention.py
  • 7 x 2
    • fairseq/criterions/tacotron2_loss.py
    • fairseq/criterions/speech_to_speech_criterion.py