facebookresearch / MultipleAttributeTextRewriting
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
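The cleaning and matching steps described above can be sketched as follows. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, only full-line `#` comments and `import`/`from` statements are filtered, and stripping indentation before comparison is a simplifying assumption.

```python
import re
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Drop empty lines, full-line '#' comments, and import statements."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()  # assumption: indentation is ignored
        if not stripped or stripped.startswith("#"):
            continue
        if re.match(r"(import|from)\s", stripped):
            continue
        kept.append(stripped)
    return kept


def find_duplicates(files: dict[str, str], min_lines: int = MIN_LINES):
    """Map each window of min_lines cleaned lines to the places it occurs.

    Only windows that occur more than once are returned. Positions are
    indices into the cleaned lines, not original line numbers.
    """
    windows = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for i in range(len(lines) - min_lines + 1):
            windows[tuple(lines[i:i + min_lines])].append((name, i))
    return {w: locs for w, locs in windows.items() if len(locs) > 1}
```

A real detector would also merge overlapping windows into maximal blocks and map positions back to original line numbers; the sketch omits both to stay short.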
Duplication Overall
  • 18% duplication:
    • 5,157 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 936 duplicated lines
  • 55 duplicates
system: 18% (936 lines)
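The overall figure follows directly from the two line counts above. A small sketch of the arithmetic, assuming the percentage is simply duplicated lines divided by cleaned lines (the helper name `duplication_percent` is hypothetical):

```python
def duplication_percent(duplicated_lines: int, cleaned_lines: int) -> float:
    """Share of cleaned lines that belong to a duplicate, in percent."""
    if cleaned_lines == 0:
        return 0.0
    return 100.0 * duplicated_lines / cleaned_lines


# Figures taken from the report: 936 duplicated lines out of 5,157 cleaned lines.
overall = duplication_percent(936, 5157)
print(f"overall duplication: {overall:.0f}%")
print("meets the <5% target:", overall < 5.0)
```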
Duplication per Extension
py: 18% (936 lines)
Duplication per Component (primary)
code/src/model: 26% (365 lines)
code/src: 10% (196 lines)
code: 30% (137 lines)
data/Amazon: 40% (109 lines)
data/Yelp: 44% (109 lines)
code/scripts: 18% (20 lines)
code/src/modules: 0% (0 lines)
code/src/data: 0% (0 lines)
data: 0% (0 lines)

Duplication Between Components (50+ lines)

data/Amazon -- data/Yelp: 218 lines

Longest Duplicates
The list of 20 longest duplicates.
| Size | # | Folders | Files | Lines |
|---|---|---|---|---|
| 37 | x 2 | data/Amazon, data/Yelp | amazon_fader_process.py, yelp_fader_process.py | 35:76 (21%), 29:72 (24%) |
| 23 | x 2 | code/src/model, code/src/model | attention.py, seq2seq.py | 189:221 (6%), 177:209 (8%) |
| 22 | x 2 | code, code | train-classifier.py, train-lm.py | 61:88 (26%), 63:90 (26%) |
| 21 | x 2 | code/src/model, code/src/model | attention.py, seq2seq.py | 151:177 (5%), 144:170 (8%) |
| 20 | x 2 | code/src/model, code/src/model | attention.py, seq2seq.py | 544:569 (5%), 375:400 (7%) |
| 19 | x 2 | code/src/model, code/src/model | attention.py, seq2seq.py | 440:464 (4%), 322:346 (7%) |
| 18 | x 2 | code/src/model, code/src/model | lm.py, seq2seq.py | 156:178 (15%), 319:341 (6%) |
| 16 | x 2 | code/src/model, code/src/model | seq2seq.py, transformer.py | 186:209 (6%), 146:169 (5%) |
| 16 | x 2 | code, code | train-classifier.py, train-lm.py | 43:59 (19%), 45:61 (19%) |
| 16 | x 2 | code/src/model, code/src/model | attention.py, transformer.py | 198:221 (4%), 146:169 (5%) |
| 15 | x 2 | data/Amazon, data/Yelp | amazon_fader_process.py, yelp_fader_process.py | 181:203 (8%), 149:173 (9%) |
| 15 | x 2 | code/src/model, code/src/model | attention.py, lm.py | 440:459 (3%), 159:178 (12%) |
| 15 | x 2 | code/src, code/src | trainer.py, trainer.py | 869:888 (2%), 1190:1209 (2%) |
| 14 | x 2 | code/src, code/src | trainer.py, trainer.py | 870:888 (1%), 1042:1060 (1%) |
| 14 | x 2 | code/src, code/src | trainer.py, trainer.py | 814:829 (1%), 1004:1019 (1%) |
| 14 | x 2 | code/src/model, code/src/model | attention.py, lm.py | 151:168 (3%), 45:62 (11%) |
| 14 | x 2 | code/src, code/src | trainer.py, trainer.py | 1004:1019 (1%), 1153:1168 (1%) |
| 14 | x 2 | code/src, code/src | trainer.py, trainer.py | 1042:1060 (1%), 1191:1209 (1%) |
| 14 | x 2 | code/src/model, code/src/model | lm.py, seq2seq.py | 45:62 (11%), 144:161 (5%) |
| 14 | x 2 | code/src, code/src | trainer.py, trainer.py | 814:829 (1%), 1153:1168 (1%) |
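To inspect one of the listed duplicates locally, the line ranges can be used to slice the corresponding file. The sketch below assumes the `start:end` values are 1-indexed, inclusive line numbers in the original (uncleaned) file; the report does not state this explicitly, and the helper names `parse_range` and `fragment` are hypothetical.

```python
def parse_range(spec: str) -> tuple[int, int]:
    """Parse a range such as '35:76 (21%)' into (35, 76).

    Any trailing percentage is ignored.
    """
    start, end = spec.split()[0].split(":")
    return int(start), int(end)


def fragment(text: str, spec: str) -> list[str]:
    """Return the lines of text covered by spec (assumed 1-indexed, inclusive)."""
    start, end = parse_range(spec)
    return text.splitlines()[start - 1:end]


# The first listed duplicate covers lines 35 to 76 of amazon_fader_process.py.
start, end = parse_range("35:76 (21%)")
print(end - start + 1)  # number of raw lines in the range
```

Under that assumption, the first range covers 42 raw lines while its listed size is 37, which would be consistent with sizes being counted on cleaned code (empty lines and comments removed). To view the fragment itself, pass the file contents, for example `fragment(open(path).read(), "35:76")` with `path` pointing at a local checkout.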
Duplicated Units
The list of top 4 duplicated units.
| Size | # | Folders | Files | Lines |
|---|---|---|---|---|
| 9 | x 2 | code/scripts, code/scripts | check_vocab.py, common_words.py | 0:0, 0:0 |
| 8 | x 4 | code/src/model, code/src/model, code/src/model, code/src/model | attention.py, transformer.py, lm.py, seq2seq.py | 0:0, 0:0, 0:0, 0:0 |
| 8 | x 4 | code/src/model, code/src/model, code/src/model, code/src/model | attention.py, transformer.py, lm.py, seq2seq.py | 0:0, 0:0, 0:0, 0:0 |
| 7 | x 2 | code/src, code/src | trainer.py, trainer.py | 0:0, 0:0 |