facebookresearch / pytext
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
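The rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analyzer's actual implementation: the `clean` and `find_duplicate_blocks` helpers are hypothetical names, the cleaning step only handles `#` comments and `import`/`from` lines, and leading indentation is ignored when comparing lines.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop blank lines, '#' comment lines and import statements (simplified)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: indentation is ignored when comparing
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def find_duplicate_blocks(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Map each window of `min_block` cleaned lines seen more than once
    to its locations, given as (file name, index into the cleaned lines)."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_block + 1):
            window = tuple(lines[start:start + min_block])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}
```

A real tool would also merge overlapping windows into one longer duplicate and map positions back to original line numbers; the sketch skips both steps to stay short.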
Duplication Overall
  • 17% duplication:
    • 42,137 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 7,168 duplicated lines
  • 636 duplicates
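The headline percentage follows directly from the two line counts above. The short check below reproduces it; the `duplication_ratio` helper and the `TARGET` constant are illustrative names, with `TARGET` taken from the 5% guideline in the intro.

```python
def duplication_ratio(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to a duplicated block."""
    if cleaned_lines <= 0:
        raise ValueError("cleaned_lines must be positive")
    return duplicated_lines / cleaned_lines


TARGET = 0.05  # guideline from the intro: keep duplication below 5%

ratio = duplication_ratio(7_168, 42_137)
print(f"{ratio:.0%}")  # 17%
print("above target" if ratio > TARGET else "within target")  # above target
```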
system: 17% (7,168 lines)
Duplication per Extension
py: 17% (7,168 lines)
Duplication per Component (primary)
pytext/models: 15% (2,027 lines)
pytext/data: 19% (1,520 lines)
pytext/torchscript: 38% (1,463 lines)
pytext/metric_reporters: 17% (564 lines)
pytext/metrics: 20% (333 lines)
pytext/legacy: 13% (305 lines)
pytext/optimizer: 12% (283 lines)
pytext/fields: 29% (174 lines)
pytext/task: 11% (140 lines)
pytext/loss: 13% (109 lines)
pytext: 8% (98 lines)
pytext/trainers: 12% (78 lines)
pytext/utils: 5% (74 lines)
ROOT: 0% (0 lines)
pytext/config: 0% (0 lines)
pytext/exporters: 0% (0 lines)
pytext/resources: 0% (0 lines)
pytext/common: 0% (0 lines)

Duplication Between Components (50+ lines)

pytext/fields -- pytext/legacy: 130 lines
pytext/models -- pytext/torchscript: 83 lines
pytext/data -- pytext/torchscript: 72 lines
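One plausible way to derive such component-pair totals from individual duplicate records is sketched below. How the report actually aggregates them is not stated, so this is an assumption: the `cross_component_totals` helper is a hypothetical name, and the sample records use made-up component names and sizes rather than data from this report.

```python
from collections import Counter

MIN_SHARED = 50  # the section only lists pairs sharing 50+ lines


def cross_component_totals(duplicates, min_shared: int = MIN_SHARED):
    """Sum duplicate sizes per unordered pair of distinct components.

    `duplicates` is an iterable of (size, component_a, component_b).
    """
    totals = Counter()
    for size, comp_a, comp_b in duplicates:
        if comp_a != comp_b:  # duplicates inside one component are skipped
            totals[tuple(sorted((comp_a, comp_b)))] += size
    return {pair: n for pair, n in totals.items() if n >= min_shared}


# Hypothetical records, for illustration only.
sample = [
    (30, "comp_x", "comp_y"),
    (25, "comp_y", "comp_x"),
    (40, "comp_x", "comp_x"),
    (10, "comp_x", "comp_z"),
]
print(cross_component_totals(sample))  # {('comp_x', 'comp_y'): 55}
```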

Longest Duplicates
The 20 longest of the 636 duplicates.
Size | # | File 1 | Lines 1 | File 2 | Lines 2
61 | 2 | pytext/data/squad_for_bert_tensorizer.py | 211:276 (16%) | pytext/data/squad_for_bert_tensorizer.py | 391:455 (16%)
50 | 2 | pytext/data/tensorizers.py | 371:425 (2%) | pytext/data/token_tensorizer.py | 191:245 (18%)
48 | 2 | pytext/models/representations/lightconv.py | 173:231 (28%) | pytext/models/seq_models/conv_encoder.py | 317:375 (16%)
35 | 2 | pytext/models/decoders/mlp_decoder_n_tower.py | 61:100 (40%) | pytext/models/decoders/mlp_decoder_tri_tower.py | 82:121 (28%)
35 | 2 | pytext/torchscript/module.py | 1087:1125 (2%) | pytext/torchscript/module.py | 1983:2021 (2%)
33 | 2 | pytext/data/squad_for_bert_tensorizer.py | 169:206 (8%) | pytext/data/squad_for_bert_tensorizer.py | 347:384 (8%)
30 | 2 | pytext/data/squad_for_bert_tensorizer.py | 347:378 (8%) | pytext/data/squad_tensorizer.py | 218:249 (12%)
30 | 2 | pytext/data/squad_for_bert_tensorizer.py | 169:200 (8%) | pytext/data/squad_tensorizer.py | 218:249 (12%)
28 | 2 | pytext/models/decoders/mlp_decoder_n_tower.py | 68:100 (32%) | pytext/models/decoders/mlp_decoder_two_tower.py | 120:152 (18%)
28 | 2 | pytext/torchscript/module.py | 789:817 (1%) | pytext/torchscript/module.py | 847:875 (1%)
28 | 2 | pytext/models/decoders/mlp_decoder_tri_tower.py | 89:121 (22%) | pytext/models/decoders/mlp_decoder_two_tower.py | 120:152 (18%)
27 | 2 | pytext/metric_reporters/mask_compositional.py | 149:183 (8%) | pytext/metric_reporters/mask_seq2seq_topk.py | 80:114 (17%)
26 | 2 | pytext/torchscript/batchutils.py | 357:394 (7%) | pytext/torchscript/batchutils.py | 449:486 (7%)
26 | 2 | pytext/torchscript/module.py | 1229:1254 (1%) | pytext/torchscript/module.py | 1286:1311 (1%)
24 | 2 | pytext/torchscript/module.py | 686:716 (1%) | pytext/torchscript/module.py | 1161:1191 (1%)
24 | 2 | pytext/models/representations/transformer/luna_attention.py | 393:422 (4%) | pytext/models/representations/transformer/luna_attention.py | 724:753 (4%)
23 | 2 | pytext/torchscript/module.py | 71:100 (1%) | pytext/torchscript/module.py | 1364:1392 (1%)
23 | 2 | pytext/models/doc_model.py | 157:182 (5%) | pytext/models/word_model.py | 159:184 (12%)
22 | 2 | pytext/models/representations/augmented_lstm.py | 438:460 (6%) | pytext/models/representations/augmented_lstm.py | 497:520 (6%)
22 | 2 | pytext/metrics/__init__.py | 1076:1100 (2%) | pytext/metrics/__init__.py | 1242:1266 (2%)
Duplicated Units
The top 20 duplicated units, out of 46 unit duplicates.
Size | # | File 1 | Lines 1 | File 2 | Lines 2
30 | 2 | pytext/data/squad_for_bert_tensorizer.py | 0:0 | pytext/data/squad_for_bert_tensorizer.py | 0:0
29 | 2 | pytext/data/squad_for_bert_tensorizer.py | 0:0 | pytext/data/squad_for_bert_tensorizer.py | 0:0
24 | 2 | pytext/models/representations/transformer/luna_attention.py | 0:0 | pytext/models/representations/transformer/luna_attention.py | 0:0
15 | 2 | pytext/models/representations/huggingface_bert_sentence_encoder.py | 0:0 | pytext/models/representations/huggingface_electra_sentence_encoder.py | 0:0
12 | 2 | pytext/data/token_tensorizer.py | 0:0 | pytext/data/tensorizers.py | 0:0
11 | 2 | pytext/metric_reporters/channel.py | 0:0 | pytext/metric_reporters/channel.py | 0:0
14 | 2 | pytext/metric_reporters/mask_compositional.py | 0:0 | pytext/metric_reporters/mask_seq2seq_topk.py | 0:0
12 | 2 | pytext/metric_reporters/multi_span_qa_metric_reporter.py | 0:0 | pytext/metric_reporters/squad_metric_reporter.py | 0:0
11 | 2 | pytext/data/token_tensorizer.py | 0:0 | pytext/data/tensorizers.py | 0:0
10 | 2 | pytext/models/embeddings/int_weighted_multi_category_embedding.py | 0:0 | pytext/models/embeddings/int_single_category_embedding.py | 0:0
9 | 2 | pytext/models/representations/lightconv.py | 0:0 | pytext/models/seq_models/nar_length.py | 0:0
9 | 2 | pytext/models/representations/lightconv.py | 0:0 | pytext/models/seq_models/conv_encoder.py | 0:0
13 | 2 | pytext/torchscript/tensorizer/tensorizer.py | 0:0 | pytext/data/tensorizers.py | 0:0
9 | 2 | pytext/models/representations/lightconv.py | 0:0 | pytext/models/seq_models/nar_length.py | 0:0
8 | 2 | pytext/models/joint_model.py | 0:0 | pytext/models/doc_model.py | 0:0
8 | 2 | pytext/metric_reporters/seq2seq_compositional.py | 0:0 | pytext/metric_reporters/seq2seq_metric_reporter.py | 0:0
8 | 2 | pytext/data/token_tensorizer.py | 0:0 | pytext/data/tensorizers.py | 0:0
8 | 2 | pytext/data/decoupled_data.py | 0:0 | pytext/data/data.py | 0:0
8 | 2 | pytext/data/squad_tensorizer.py | 0:0 | pytext/data/squad_for_bert_tensorizer.py | 0:0
8 | 2 | pytext/data/sources/tsv.py | 0:0 | pytext/data/sources/tsv.py | 0:0