facebookresearch / parcus
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured as places in the code where 6 or more lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
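
The measurement described above can be sketched in a few lines of Python. This is a minimal illustration and not the report tool's actual implementation: the function names `clean` and `duplicated_line_ratio` are hypothetical, the cleaning rules are simplified to Python-style comments and import statements, and lines are compared after stripping surrounding whitespace, which is an assumption.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop blank lines, '#' comment lines, and import statements (simplified)."""
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        kept.append(line)
    return kept


def duplicated_line_ratio(files: dict[str, str]) -> float:
    """Return the share of cleaned lines that sit inside a repeated 6-line window."""
    cleaned = {name: clean(src) for name, src in files.items()}

    # Index every window of MIN_BLOCK consecutive cleaned lines by its content.
    windows = defaultdict(list)
    for name, lines in cleaned.items():
        for i in range(len(lines) - MIN_BLOCK + 1):
            windows[tuple(lines[i:i + MIN_BLOCK])].append((name, i))

    # A line counts as duplicated if it belongs to a window seen more than once.
    duplicated = set()
    for places in windows.values():
        if len(places) > 1:
            for name, i in places:
                duplicated.update((name, i + k) for k in range(MIN_BLOCK))

    total = sum(len(lines) for lines in cleaned.values())
    return len(duplicated) / total if total else 0.0
```

Calling `duplicated_line_ratio({"a.py": text_a, "b.py": text_b})` returns a fraction between 0 and 1, which can be compared against the 5% target mentioned above.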
Duplication Overall
  • 59% duplication:
    • 3,891 cleaned lines of code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,309 duplicated lines
  • 279 duplicates
system: 59% (2,309 lines)
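
As a quick check, the overall figure follows from the two line counts in the report. The snippet below is only a worked calculation; the helper name `duplication_percent` is hypothetical.

```python
# Figures copied from the report.
cleaned_lines = 3_891
duplicated_lines = 2_309


def duplication_percent(duplicated: int, cleaned: int) -> float:
    """Duplicated lines as a percentage of cleaned lines."""
    return 100.0 * duplicated / cleaned


overall = duplication_percent(duplicated_lines, cleaned_lines)
print(f"{overall:.1f}%")  # prints "59.3%"
print(overall > 5.0)      # prints "True": above the 5% target
```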
Duplication per Extension
py: 59% (2,309 lines)
Duplication per Component (primary)
parsers/MovieReview: 95% (588 lines)
training: 42% (517 lines)
parsers/Spouse: 68% (477 lines)
parsers/Hatespeech: 55% (343 lines)
datasets: 70% (256 lines)
models: 36% (113 lines)
utils: 37% (15 lines)
ROOT: 0% (0 lines)
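
To see where the duplication concentrates, the per-component counts can be summed and grouped by top-level folder. This is a small sketch over the numbers listed above; the variable names are illustrative only.

```python
# Duplicated lines per component, copied from the report.
duplicated_by_component = {
    "parsers/MovieReview": 588,
    "training": 517,
    "parsers/Spouse": 477,
    "parsers/Hatespeech": 343,
    "datasets": 256,
    "models": 113,
    "utils": 15,
    "ROOT": 0,
}

total = sum(duplicated_by_component.values())

# Group by top-level folder (the part before the first "/").
by_top_level = {}
for component, lines in duplicated_by_component.items():
    top = component.split("/")[0]
    by_top_level[top] = by_top_level.get(top, 0) + lines

# Largest contributors first.
for top, lines in sorted(by_top_level.items(), key=lambda kv: kv[1], reverse=True):
    share = 100 * lines / total
    print(f"{top}: {lines} lines ({share:.0f}% of all duplicated lines)")
```

The component counts add up to 2,309, matching the overall figure, and the three parsers components together contribute 1,408 of those lines.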

Duplication Between Components (50+ lines)

  • parsers/MovieReview -- parsers/Spouse: 695 lines
  • parsers/Hatespeech -- parsers/MovieReview: 504 lines
  • parsers/Hatespeech -- parsers/Spouse: 552 lines
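
The component graph can be rebuilt as Graphviz DOT text from the three pairs above. This is a reconstruction under assumptions, not the report's original DOT file: the helper `to_dot` is hypothetical, and the edges are treated as undirected with the line count as the edge label.

```python
# Component pairs and shared duplicated lines, copied from the report.
edges = [
    ("parsers/MovieReview", "parsers/Spouse", 695),
    ("parsers/Hatespeech", "parsers/MovieReview", 504),
    ("parsers/Hatespeech", "parsers/Spouse", 552),
]


def to_dot(edges, name="G"):
    """Render an undirected graph in DOT syntax, one labelled edge per pair."""
    lines = [f"graph {name} {{"]
    for left, right, weight in edges:
        lines.append(f'    "{left}" -- "{right}" [label="{weight}"];')
    lines.append("}")
    return "\n".join(lines)


print(to_dot(edges))
```

The printed text can be pasted into any Graphviz renderer to draw the three components and their shared duplication.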

Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
90 | 2 | parsers/MovieReview, parsers/MovieReview | MovieReview_Finetune_Preprocess.py, MovieReview_Preprocess.py | 16:158 (43%), 16:158 (42%)
70 | 2 | parsers/MovieReview, parsers/Spouse | MovieReview_Preprocess.py, Spouse_Preprocess.py | 133:247 (32%), 212:326 (25%)
45 | 2 | parsers/MovieReview, parsers/Spouse | MovieReview_Finetune_Preprocess.py, Spouse_Finetune_Preprocess.py | 133:188 (21%), 99:154 (22%)
40 | 2 | datasets, datasets | NREDataset.py, NREDataset.py | 16:68 (32%), 73:126 (32%)
35 | 2 | datasets, datasets | NREDataset.py, NREDataset.py | 73:116 (28%), 131:173 (28%)
35 | 2 | datasets, datasets | NREDataset.py, NREDataset.py | 16:58 (28%), 131:173 (28%)
33 | 2 | parsers/Hatespeech, parsers/MovieReview | Hatespeech_Dataset_Builder.py, MovieReview_Dataset_Builder.py | 55:107 (31%), 56:108 (31%)
33 | 2 | parsers/MovieReview, parsers/Spouse | MovieReview_Finetune_Preprocess.py, Spouse_Finetune_Preprocess.py | 217:257 (15%), 188:228 (16%)
32 | 2 | parsers/Hatespeech, parsers/Spouse | Hatespeech_Preprocess.py, Spouse_Preprocess.py | 171:202 (13%), 236:267 (11%)
32 | 2 | parsers/Hatespeech, parsers/MovieReview | Hatespeech_Preprocess.py, MovieReview_Preprocess.py | 171:202 (13%), 157:188 (14%)
29 | 2 | datasets, datasets | BertBaselineDataset.py, SpouseBaselineDataset.py | 14:58 (51%), 15:60 (37%)
27 | 2 | parsers/MovieReview, parsers/MovieReview | MovieReview_Finetune_Preprocess.py, MovieReview_Preprocess.py | 283:318 (13%), 291:326 (12%)
26 | 2 | parsers/Spouse, parsers/Spouse | Spouse_Finetune_Preprocess.py, Spouse_Preprocess.py | 269:302 (12%), 369:402 (9%)
24 | 2 | parsers/Hatespeech, parsers/MovieReview | Hatespeech_Preprocess.py, MovieReview_Finetune_Preprocess.py | 93:123 (10%), 85:115 (11%)
24 | 2 | parsers/MovieReview, parsers/Spouse | MovieReview_Finetune_Dataset_Builder.py, Spouse_Finetune_Dataset_Builder.py | 104:143 (25%), 103:142 (25%)
24 | 2 | parsers/Spouse, parsers/Spouse | Spouse_Finetune_Preprocess.py, Spouse_Preprocess.py | 200:228 (11%), 313:341 (8%)
24 | 2 | parsers/MovieReview, parsers/Spouse | MovieReview_Preprocess.py, Spouse_Preprocess.py | 250:288 (11%), 329:367 (8%)
24 | 2 | parsers/Hatespeech, parsers/MovieReview | Hatespeech_Preprocess.py, MovieReview_Preprocess.py | 93:123 (10%), 85:115 (11%)
24 | 2 | parsers/MovieReview, parsers/Spouse | MovieReview_Finetune_Preprocess.py, Spouse_Preprocess.py | 229:257 (11%), 313:341 (8%)
23 | 2 | parsers/Hatespeech, parsers/MovieReview | Hatespeech_Dataset_Builder.py, MovieReview_Dataset_Builder.py | 119:156 (21%), 120:157 (22%)
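
Each entry in the Lines column, such as `16:158 (43%)`, combines a start line, an end line, and a percentage. The sketch below parses one entry. It assumes the range is inclusive and that the percentage is the share of the file covered by the duplicate; the report states neither. `LineSpan` and `parse_span` are hypothetical names.

```python
import re
from typing import NamedTuple


class LineSpan(NamedTuple):
    start: int    # first line of the duplicated block
    end: int      # last line of the duplicated block (assumed inclusive)
    percent: int  # percentage shown in parentheses


_SPAN = re.compile(r"(\d+):(\d+)\s*\((\d+)%\)")


def parse_span(text: str) -> LineSpan:
    """Parse an entry like '16:158 (43%)' into its three numbers."""
    match = _SPAN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised span: {text!r}")
    return LineSpan(*map(int, match.groups()))


span = parse_span("16:158 (43%)")
raw_lines = span.end - span.start + 1
print(span, raw_lines)
```

For the longest duplicate, the range covers 143 raw lines while the Size column reports 90. That gap is consistent with Size counting cleaned lines only, with blank lines, comments, and imports inside the range left out.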
Duplicated Units
The list of top 10 duplicated units.
Size | # | Folders | Files | Lines
29 | 2 | parsers/MovieReview, parsers/MovieReview | MovieReview_Finetune_Preprocess.py, MovieReview_Preprocess.py | 0:0, 0:0
29 | 2 | datasets, datasets | NREDataset.py, NREDataset.py | 0:0, 0:0
30 | 2 | parsers/MovieReview, parsers/MovieReview | MovieReview_Finetune_Preprocess.py, MovieReview_Preprocess.py | 0:0, 0:0
13 | 2 | parsers/MovieReview, parsers/MovieReview | MovieReview_Finetune_Preprocess.py, MovieReview_Preprocess.py | 0:0, 0:0
8 | 2 | parsers/MovieReview, parsers/MovieReview | MovieReview_Finetune_Dataset_Builder.py, MovieReview_Dataset_Builder.py | 0:0, 0:0
7 | 6 | parsers/Hatespeech, parsers/Hatespeech, parsers/Spouse, parsers/Spouse, parsers/MovieReview, parsers/MovieReview | Hatespeech_Fasttext_Preprocess.py, Hatespeech_Preprocess.py, Spouse_Finetune_Preprocess.py, Spouse_Preprocess.py, MovieReview_Finetune_Preprocess.py, MovieReview_Preprocess.py | 0:0, 0:0, 0:0, 0:0, 0:0, 0:0
7 | 4 | parsers/Hatespeech, parsers/Hatespeech, parsers/Spouse, parsers/Spouse | Hatespeech_Dataset_Builder.py, Hatespeech_Dataset_Fasttext_Builder.py, Spouse_Dataset_Builder.py, Spouse_Finetune_Dataset_Builder.py | 0:0, 0:0, 0:0, 0:0
6 | 2 | datasets, datasets | NREDataset.py, NREDataset.py | 0:0, 0:0
8 | 2 | models, models | NPM.py, NPM.py | 0:0, 0:0
8 | 2 | models, models | NPM.py, NPM.py | 0:0, 0:0
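
The report does not say how units are identified or compared. One way to find duplicated units in Python code, assuming a unit is a function or method, is to group functions whose bodies have the same syntax tree. The function `duplicated_units` below is a hypothetical sketch of that idea using the standard `ast` module, not the report tool's method.

```python
import ast
from collections import defaultdict


def duplicated_units(sources: dict[str, str]) -> list[list[tuple[str, str]]]:
    """Group functions with identical bodies; returns lists of (file, function name)."""
    groups = defaultdict(list)
    for filename, source in sources.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Compare bodies only, so copies with different names still match.
                # ast.dump omits line numbers by default; comments never reach the tree.
                key = "\n".join(ast.dump(stmt) for stmt in node.body)
                groups[key].append((filename, node.name))
    return [units for units in groups.values() if len(units) > 1]
```

Passing a mapping of file names to source text returns one list per group of identical units, which corresponds to one row of the table above. Because the comparison covers the body only, two functions with different parameter names but the same statements would also be grouped; stricter matching would need to include the signature.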